- Feb 2025
www.biorxiv.org
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
The model of phosphotransfer from Y169 IKK to S32 IkBa is compelling and an important new contribution to the field. In fact, this model will not be without controversy, and publishing the work will catalyze follow-up studies for this kinase and others as well. As such, I am supportive of this paper, though I do also suggest some shortening and modification.
We appreciate the reviewer’s candid response on the difficulty of this study and the need for follow-up studies to confirm a direct transfer of the phosphate. We have also edited the manuscript to make it shorter.
Generally, the paper is well written, but several figures should be quantified, and experimental reproducibility is not always clear. The first 4 figures are slow-going and could be condensed to show the key points, so that the reader gets to Figures 6 and 7 which contain the "meat" of the paper.
We have indicated the experimental reproducibility for each assay in the methodology section. We have shortened the sections of the manuscript describing Figures 1-4. However, when we talked to some of our colleagues whose expertise does not lie in kinases and IKK, we realized that some description was necessary to introduce them to the later figures. Additionally, we added Fig. S6, which shows that radiolabelled phospho-IKK2 Y169F is unable to transfer its own phosphate group(s) to the substrate IkBa.
Reviewer #2 (Public Review):
Phosphorylation of IκBα is observed after ATP removal, although there are ambiguous requirements for ADP.
We agree with the reviewer that this observation is puzzling. We hypothesize that ADP simultaneously regulates the transfer process, likely through binding to the active site.
It seems that the analysis hinges on the fidelity of pan-specific phosphotyrosine antibodies.
We agree with the reviewer. To bolster our conclusion, we used antibodies from two different sources: monoclonal mouse anti-phospho-tyrosine antibodies from BD Biosciences (catalogue no. 610000) and from EMD Millipore (catalogue no. 05-321X).
The analysis often returns to the notion that tyrosine phosphorylation(s) (and critical active site Lys44) dictate IKK2 substrate specificity, but evidence for this seems diffuse and indirect. This is an especially difficult claim to make with in vitro assays, omitting the context of other cellular specificity determinants (e.g., localization, scaffolding, phosphatases).
We agree that the specificity could depend on other cellular specificity determinants, and we have toned down our claims where necessary. However, we would like to point out that the specificity of IKK2 towards S32 and S36 of IkBa in cells in response to specific stimuli is well-established. It is also well-established that its non-catalytic scaffolding partner NEMO is critical in selectively bringing IkBa to IKK from a large pool of proteins. The exact mechanism by which IKK2 chooses the two serines amongst many others in the substrate is not clear.
Multiple phosphorylated tyrosines in IKK2 were apparently identified by mass spectrometric analyses, but the data and methods are not described. It is common to find non-physiological post-translational modifications in over-expressed proteins from recombinant sources. Are these IKK2 phosphotyrosines evident by MS in IKK2 immunoprecipitated from TNFa-stimulated cells? Identifying IKK2 phosphotyrosine sites from cells would be especially helpful in supporting the proposed model.
Mass spectrometric data identifying phosphotyrosines in purified IKK2 are now incorporated (Figure S3A). Although we have not analyzed IKK2 from TNF-a-treated cells in this study, a different study of the phospho-status of cellular IKK2 indicated tyrosine phosphorylation (Meyer et al 2013).
Reviewer #3 (Public Review):
The identity and purity of the used proteins is not clear. Since the findings are so unexpected and potentially of wide-reaching interest - this is a weakness. Similar specific detection of phospho-Ser/Thr vs phospho-Tyr relies largely on antibodies which can have varying degrees of specificity.
We followed a stringent purification protocol of several steps (optimized for the successful crystallization of the IKK2) that removed most impurities (PMID: 23776406, PMID: 39227404). The samples analysed with ESI MS did not show any significant contaminating kinase from the Sf9 cells.
The sequence-specific phospho-antibodies used in this study are very well characterized and have been used in the field for years (Basak et al 2007, PMID: 17254973). We share the reviewer’s concerns about the pan-specific phospho-antibodies. Since phospho-tyrosine detection is the crucial aspect of this study, we minimized such bias by using pan-specific phosphotyrosine antibodies from two independent sources.
Reviewer #1 (Recommendations For The Authors):
I understand that Figure 3 shows that K44M abolishes both S32/36 phosphorylation and tyrosine phosphorylation, but not PEST region phosphorylation. This suggests that autophosphorylation is reflective of its known specific biological role in signal transduction. But I do not understand why "these results strongly suggest that IKK2-autophosphorylation is critical for its substrate specificity". That statement would be supported by a mutant that no longer autophosphorylates, and as a result shows a loss of substrate specificity, i.e. phosphorylates non-specific residues more strongly. Is that the case? Maybe Darwech et al 2010 or Meyer et al 2013 showed this.
Later figures seem to address this point, so maybe this conclusion should be stated later in the paper.
We have now clarified this in the manuscript and moved the comment to the next section. We have consolidated the results in Figures 3 and 4 of the previous version into a single figure. The text has also been modified accordingly.
Page 10: mentions DFG+1 without a proper introduction. The Chen et al 2014 paper appears to inform the author's interest in Y169 phosphorylation, or is it just an additional interesting finding? Does this publication belong in the Introduction or the Discussion?
The position of Y169 at DFG+1 was intriguing, and the 2014 article by Chen et al further bolstered our interest in investigating this residue. We think this publication is important in both sections.
To understand the significance of Figure 4D, we need a WT IKK2 control: or is there prior literature to cite? This is relevant to the conclusion that Y169 phosphorylation is particularly important for S32 phosphorylation.
We have now added a new supplementary figure in which the activities of WT and Y169F IKK2 towards WT and S32/S36 mutants are compared (Figure S3F). At a similar concentration, the activity of WT-IKK2 is many-fold higher than that of the YtoF mutants (Fig. 4C). The experiments were performed simultaneously; the samples were loaded on different gels but were otherwise processed in a similar way.
The cold ATP quenching experiment is nice for testing the model that Y169 functions as a phospho sink that allows for a transfer reaction. However, there is only a single timepoint and condition, which does not allow for a quantitative analysis. Furthermore, a positive control would make this experiment more compelling, and Y169F mutant should show that cold ATP quenching reduces the phosphorylation of IkBa.
We thank the reviewer for appreciating our experimental design and for pointing out these concerns. We set the ATP time point to the maximum used in the non-competition experiment. We also used 50 mM ATP so that its competition could be compared with the highest concentration of ADP used. The idea behind using the maximum time and an ATP concentration comparable to that of ADP was to capture any competitive effect of ATP, which would be maximal under the given assay conditions, in comparison with the phospho-transfer set-up in the absence of cold ATP. We agree that finer ranges of ATP concentration and time points would have enabled more quantitative analyses. We have now included data where different time intervals are tested (Figure S5D).
Why is the EE mutant recognized by anti-phospho-serine antibodies? In Figure 2F.
We anticipate that serine residues besides those in the activation loop are phosphorylated when IKK2 is overexpressed in and purified from Sf9 cells. Since Glu (E) mimics phospho-Ser, the said antibody cross-reacts with IKK2-EE, which mimics IKK2 phosphorylated at Ser177 and 181.
Figure 7B is clear, but 7C does not add much.
We have now removed Fig. 7C from the current version. Figure 7 is now renumbered as Figure 6 and does not contain the said cartoon.
Reviewer #2 (Recommendations For The Authors):
Regarding the specificity arguments (see above in public review), the authors note that NEMO is very important in IKK specificity, and - if I'm understanding correctly - most of these assays were performed without NEMO. Would the IKK2-NEMO complex change these conclusions?
NEMO is a scaffolding protein whose action goes beyond the activation of the IKK-complex. In cells, NEMO brings IkBa from a pool of thousands of proteins to its bona fide kinase when the cells encounter specific signals. In other words, NEMO channels IKK-activity towards its bona fide substrate IkBa at that moment. Though direct proof is lacking, it is likely that NEMO presents IkBa to IKK in the correct pose such that the S32/S36 region of IkBa is poised for phosphorylation. The proposed mechanism in the current study further ensures the specificity and fidelity of that phosphorylation event. We believe this mechanism will be preserved in the IKK-NEMO complex unless proven otherwise. As shown below, IKK2 undergoes tyrosine autophosphorylation in the presence of NEMO.
Author response image 1.
The work primarily focuses on Y169 as a candidate target for IKK autophosphorylation. This seems reasonable given the proximity to the ATP gamma phosphate. However, Y188F more potently disrupted IκBα phosphorylation. The authors note that this could be due to folding perturbations, but this caveat would also apply to Y169F. A test for global fold perturbations for both Tyr mutants would be helpful.
Y188 is conserved in S/T kinases, and the equivalent residue in PKA (Y204) has been studied extensively using structural, biochemical and biophysical tools. In the case of PKA, Y204 was found to participate in packing of the hydrophobic core of the large lobe. Disruption of this core structure by mutation allosterically affects the activity of the kinase. We observed a similar engagement of Y188 in IKK2’s large lobe, and speculated on folding perturbations by analogy with the experimental evidence from PKA. What we meant was that mutation of Y188 would allosterically affect the kinase activity. Y169, on the other hand, is unique at that position, and no experimental evidence on the effect of a phospho-ablative mutation of this residue exists in the literature. Hence, we refrained from speculating on its effect on folding or conformational allostery; however, such a possibility cannot be ruled out.
I struggled to follow the rationalization of the results of Figure 4D, the series of phosphorylation tests of Y169F against IκBα with combinations of phosphoablative or phosphomimetic variants at Ser32 and Ser36. This experiment is hard to interpret without a direct comparison to WT IKK2.
We agree with the reviewer’s concerns. Through this experiment we wanted to inform about the importance of Tyr-phosphorylation of IKK2 in phosphorylating S32 of IκBα which is of vital importance in NF-kB signaling. We have now provided a comparison with WT-IKK2 in the supplementary Figure S3F. We hope this will help bring more clarity to the issue.
MD simulations were performed to compare structures of unphosphorylated vs. Ser-phosphorylated (p-IKK2) vs. Ser+Tyr-phosphorylated (P-IKK2) forms of IKK2. These simulations were performed without ATP bound, and then a representative pose was subject to ADP or ATP docking. The authors note distortions in the simulated P-IKK2 kinase fold and clashes with ATP docking. Given the high cellular concentration of ATP, it seems more logical to approach the MD with the assumption of nucleotide availability. Most kinase domains are highly dynamic in the absence of substrate. Is it possible that the P-IKK2 poses are a result of simulation in a non-physiological absence of bound ATP? Ultimately, this MD observation is linked to the proposed model where ADP-binding is required for efficient phospho-relay to IκBα. Therefore, this observation warrants scrutiny. Perhaps the authors could follow up with binding experiments to directly test whether P-IKK2 binds ADP and fails to bind ATP.
We thank the reviewer for bringing up this issue. It is an important one, and we must agree that we don’t fully understand it yet. We took a more rigorous approach this time, using three docking programs: ATP and ADP were docked to the kinase structures using LeDock and GOLD, followed by rescoring with AutoDock Vina. We found that ATP binding to P-IKK2 is highly unfavourable compared to ADP binding. To further address these issues, we performed detailed MM-PBSA (Molecular Mechanics Poisson-Boltzmann Surface Area) analyses after MD-simulation to estimate binding free energies and affinities of ADP and ATP for each of the three differently phosphorylated states of IKK2. These analyses (Figure S4 E and F) clearly indicate that phosphorylated IKK2 has a much higher preference for ADP over ATP. However, this does not rule out ATP binding by P-IKK2 in a different pose that may not support kinase activity.
We could not perform any binding experiment for the following reason. We incubated FL IKK2 WT with or without cold ATP for 30 min, then incubated these samples with ³²P-ATP and analysed them by autoradiography after resolving them on a 10% SDS-PAGE gel. We found that even after pre-incubation with excess cold ATP, the kinase still underwent autophosphorylation when radioactive ATP was added, as shown below. This prevented us from doing a direct binding experiment with ATP, as it would not represent a true binding event. We also noticed that after removal of bulk ATP post autophosphorylation, phosphorylated IKK2 is capable of further autophosphorylation when freshly incubated with ATP. We have not been able to come up with a condition that would account only for binding of ATP and not hydrolysis.
Author response image 2.
The authors could comment on whether robust phosphorylation of NEMO was expected (Figure 1D). On a related note, why is NEMO a single band in the 1D left panel and double bands on the right?
No, we did not expect robust phosphorylation of NEMO. However, robust phosphorylation of NEMO is observed only in the absence of IκBα. In the presence of IκBα, phosphorylation of NEMO goes down drastically. The two panels used two different preparations of NEMO. When TEV digestion to remove the His-tag is incomplete, the preparation gives two bands, as the tagged and untagged versions cannot be separated by size exclusion chromatography, which is the final step.
Page 14, line 360. "...observed phosphorylation of tyrosine residue(s) only upon fresh ATP-treatment..." I'm not sure I understand the wording here (or the relevance of the citation). Is this a comment on unreported data demonstrating the rapid hydrolysis of the putative phosphotyrosine(s)? If so, that would be helpful to clarify and report in the supporting information.
In our X-ray crystallographic studies with phosphorylated IKK2, we failed to observe any density for a phosphate moiety. Furthermore, this IKK2 showed further autophosphorylation when incubated with fresh ATP. These two observations led us to believe that some of the autophosphorylation events are transient in nature. However, quantitative kinetic analyses of this dephosphorylation have not been performed.
Figure S3 middle panel: The PKA substrate overlaid on the IKK2 seems sterically implausible for protein substrate docking. Is that just a consequence of the viewing angle? On a related note, Figure S3 may be mislabeled as S4 in the main text.
It is a consequence of the viewing angle. Also, we apologize for this inadvertent mislabelling. It has been corrected in the current version.
Reviewer #3 (Recommendations For The Authors):
The detection of phosphorylated amino acids relies largely on antibodies which can have a varying degree of specificity. An alternative detection mode of the phospho-amino acids for example by MS would strengthen the evidence.
We agree with the concern about the specificity bias of antibodies. We tried to minimize such bias by using two different p-Tyr antibodies, as noted previously and in the methodology section. We were also able to detect phospho-tyrosine residues by MS/MS analyses; representative spectra are now added (Figure S3A).
IKK2 purity - protocol states "desired purity". What was the actual purity and how was it checked? MS would be useful to check for the presence of other kinases.
The purity of the recombinantly purified IKK2s is routinely checked by silver staining. A representative silver-stained SDS-PAGE gel is shown (Figure S1C). It may be noted that expression level and solubility, and hence purification yield and quality, correlate directly with the activity of the kinase. Active IKK2s express at a much higher level and yield a cleaner prep. In our experience, inactive IKKs such as K44M give poor yield and purity. We analysed K44M by LC-MS/MS to identify other proteins present in the sample and did not find any significant contaminant kinase in the sample (Figure S1D). The MS/MS result is attached.
Figure 1C&D: where are the Mw markers? What is the size of the band? What is the MS evidence for tyrosine phosphorylation?
We have now indicated MW marker positions on these figures.
MS/MS scan data for the two peptides containing pTyr169 and pTyr188 are shown separately (Figure S3A).
Figure 2F: Why is fresh ATP necessary? Why was Tyr not already phosphorylated? The kinetics of this process appear to be unusual when the reaction runs to completion within 5 minutes?
As stated earlier, we believe some of the autophosphorylation events are transient in nature. We think the Tyr-phosphorylations are lost due to the action of cellular phosphatases. We agree with the reviewer’s concern that the reaction appears to reach completion within 5 minutes in Fig. 2F. We believe this is probably because the amount of kinase used in this experiment exceeds the linear portion of the dynamic range of the antibody used. At lower concentrations of the kinase, the reaction does not reach completion until 60 min, as shown in Fig. 2A.
Figure 3: Can the authors exclude contamination with a Tyr kinase in the IKK2-K44M prep? The LC/MS/MS data should be included.
We have reanalysed the sample on an Orbitrap to check whether there is any contamination by a Tyr-kinase or any other kinase. We used the Spodoptera frugiperda proteome available on the UniProt website for this analysis. These analyses confirmed that no significant kinase contaminant is present in the fraction (Figure S1D).
What is the specificity of IKK-2 Inhibitor VII? Could it inhibit a contaminant kinase?
This inhibitor is highly potent against IKK2 and the IKK-complex, and to a lesser extent against IKK1. We have not found any literature on its activity on other kinases or on any non-specific activity. In an unrelated study, this compound was used alongside the MAPK inhibitor SB202190, and completely different outcomes were observed for the two inhibitors (Matou-Nasri S, Najdi M, AlSaud NA, Alhaidan Y, Al-Eidi H, Alatar G, et al. (2022) Blockade of p38 MAPK overcomes AML stem cell line KG1a resistance to 5-Fluorouridine and the impact on miRNA profiling. PLoS ONE 17(5):e0267855. https://doi.org/10.1371/journal.pone.0267855). That study indirectly indicates that IKK inhibitor VII does not interfere with the MAPK pathways.
Figure 6B: the band corresponding to "p-IkBa" appears to be similar in the presence of ADP (lanes 4-7) or in the absence of ADP but the presence of ATP (lane 8).
The level of radioactive p-IκBα is higher when ADP is added than in the absence of ADP. In the presence of cold ATP, the radioactive p-IκBα level remains unchanged. This result strongly indicates that the phosphate group is added to IκBα directly from the radioactively labelled kinase, a transfer that is not competed out by the cold ATP.
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
In this work, Harpring et al. investigated divisome assembly in Chlamydia trachomatis serovar L2 (Ct), an obligate intracellular bacterium that lacks FtsZ, the canonical master regulator of bacterial cell division. They find that divisome assembly is initiated by the protein FtsK in Ct by showing that it forms discrete foci at the septum and future division sites. Additionally, knocking down ftsK prevents divisome assembly and inhibits cell division, further supporting their hypothesis that FtsK regulates divisome assembly. Finally, they show that MreB is one of the last chlamydial divisome proteins to arrive at the site of division and is necessary for the formation of septal peptidoglycan rings but does not act as a scaffold for division assembly as previously proposed.
Strengths:
The authors use microscopy to clearly show that FtsK forms foci both at the septum as well as at the base of the progenitor cell where the next septum will form. They also show that the Ct proteins PBP2, PBP3, MreC, and MreB localize to these same sites suggesting they are involved in the divisome complex.
Using CRISPRi the authors knock down ftsK and find that most cells are no longer able to divide and that PBP2 and PBP3 no longer localized to sites of division suggesting that FtsK is responsible for initiating divisome assembly. They also performed a knockdown of pbp2 using the same approach and found that this also mostly inhibited cell division. Additionally, FtsK was still able to localize in this strain, however PBP3 did not, suggesting that FtsK acts upstream of PBP2 in the divisome assembly process while PBP2 is responsible for the localization of PBP3.
The authors also find that performing a knockdown of ftsK also prevents new PG synthesis further supporting the idea that FtsK regulates divisome assembly. They also find that inhibiting MreB filament formation using A22 results in diffuse PG, suggesting that MreB filament formation is necessary for proper PG synthesis to drive cell division.
Overall the authors propose a new hypothesis for divisome assembly in an organism that lacks FtsZ and use a combination of microscopy and genetics to support their model that is rigorous and convincing. The finding that FtsK, rather than a cytoskeletal or "scaffolding" protein is the first division protein to localize to the incipient division site is unexpected and opens up a host of questions about its regulation. The findings will progress our understanding of how cell division is accomplished in bacteria with non-canonical cell wall structure and/or that lack FtsZ.
Weaknesses:
No major weaknesses were noted in the data supporting the main conclusions. However, there was a claim of novelty in showing that multiple divisome complexes can drive cell wall synthesis simultaneously that was not well-supported (i.e. this has been shown previously in other organisms). In addition, there were minor weaknesses in data presentation that do not substantially impact interpretation (e.g. presenting the number of cells rather than the percentage of the population when quantifying phenotypes and showing partial western blots instead of total western blots).
We agree with the weaknesses identified by the reviewer. We removed the statements in the Results and Discussion that multiple independent divisome complexes can simultaneously direct PG synthesis. We presented the data in Figs. 3-5 as % of the cells in the population, and complete western blots are shown in Supp. Fig. S1.
Reviewer #2 (Public review):
Summary:
Chlamydial cell division is a peculiar event, whose mechanism was mysterious for many years. C. trachomatis division was shown to be polar and involve a minimal divisome machinery composed of both homologues of divisome and elongasome components, in the absence of an homologue of the classical division organizer FtsZ. In this paper, Harpring et al., show that FtsK is required at an early stage of the chlamydial divisome formation.
Strengths:
The manuscript is well-written and the results are convincing. Quantification of divisome component localization is well performed, number of replicas and number of cells assessed are sufficient to get convincing data. The use of a CRISPRi approach to knock down some divisome components is an asset and allows a mechanistic understanding of the hierarchy of divisome components.
Weaknesses:
The authors did not analyse the role of all potential chlamydial divisome components and did not show how FtsK may initiate the positioning of the divisome. Their conclusion that FtsK initiates the assembly of the divisome is an overinterpretation and is not backed by the data. However, data show convincingly that FtsK, if perhaps not the initiator of chlamydial division, is definitely an early and essential component of the chlamydial divisome.
The following statement has been included in the Discussion (pg. 16 of the revised manuscript) “Although we focused our study on a subset of the divisome and elongasome proteins that Chlamydia expresses (bolded in Fig. 6G), our results support our conclusion that chlamydial budding is dependent upon a hybrid divisome complex and that FtsK is required for the assembly of this hybrid divisome. At this time, we cannot rule out that other proteins act upstream of FtsK to initiate divisome assembly in this obligate intracellular bacterial pathogen.”
We will soon be submitting another manuscript that addresses how FtsK specifies the site of divisome assembly. This work is too extensive to be included in this manuscript.
Reviewer #3 (Public review):
Summary:
The obligate intracellular bacterium Chlamydia trachomatis (Ct) divides by binary fission. It lacks FtsZ, but still has many other proteins that regulate the synthesis of septal peptidoglycan, including FtsW and FtsI (PBP3) as well as divisome proteins that recruit and activate them, such as FtsK and FtsQLB. Interestingly, MreB is also required for the division of Ct cells, perhaps by polymerizing to form an FtsZ-like scaffold. Here, Harpring et al. show that MreB does not act early in division and instead is recruited to a protein complex that includes FtsK and PBP2/PBP3. This indicates that Ct cell division is organized by a chimera between conserved divisome and elongasome proteins. Their work also shows convincingly that FtsK is the earliest known step of divisome activity, potentially nucleating the divisome as a single protein complex at the future division site. This is reminiscent of the activity of FtsZ, yet fundamentally different.
Strengths:
The study is very well written and presented, and the data are convincing and rigorous. The data underlying the proposed localization dependency order of the various proteins for cell division is well justified by several different approaches using small molecule inhibitors, knockdowns, and fluorescent protein fusions. The proposed dependency pathway of divisome assembly is consistent with the data and with a novel mechanism for MreB in septum synthesis in Ct.
Weaknesses:
The paper could be improved by including more information about FtsK, the "focus" of this study. For example, if FtsK really is the FtsZ-like nucleator of the Ct divisome, how is the Ct FtsK different sequence-wise or structurally from FtsK of, e.g. E. coli? Is the N-terminal part of FtsK sufficient for cell division in Ct like it is in E. coli, or is the DNA translocase also involved in focus formation or localization? Addressing those questions would put the proposed initiator role of FtsK in Ct in a better context and make the conclusions more attractive to a wider readership.
We will be submitting another manuscript soon that details the conserved domain organization of FtsK from different bacteria, and the role of the various domains of chlamydial FtsK (including the N-terminus and the C-terminal translocase domain) in directing its localization in dividing Chlamydia. We have added text to the discussion (pg. 16 of the revised manuscript) that describes the sequence homology of chlamydial FtsK to FtsK from E. coli.
Another weakness is that the title of the paper implies that FtsK alone initiates divisome assembly. However, the data indicate only that FtsK is important at an early stage of divisome assembly, not that it is THE initiator. I suggest modifying the title to account for this--perhaps "FtsK is required to initiate....".
We agree with the reviewer and modified the title to “FtsK is Critical for the Assembly of the Unique Divisome Complex of the FtsZ-less Chlamydia trachomatis”. We have also modified the text throughout to indicate that FtsK is required for the assembly of the hybrid divisome of Chlamydia.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Suggestions for improvement (mostly minor):
(1) For several of the graphs, the authors plot the number of cells with a given phenotype on the y-axis, but then describe percentages of cells in the text. It would make it clearer if all the graphs had the percentage of cells on the y-axis instead.
We have modified the figures to indicate the percentage of cells on the y-axis with a given phenotype.
(2) In Figures 3, 4, and 5 the authors show separate graphs for plus/minus drug or inducer. These should be on the same graph as they are directly comparing these two different conditions. Having them on separate graphs makes it less clear whether these differences are significant between the two conditions.
We modified Fig. 4 to show +/- inducer in the ftsK and pbp2 knockdown strains in the same graph. Regarding Figures 3 and 5, we believe the figures in the original submission effectively demonstrate the +/- drug conditions, so these figures remain unchanged in the revised manuscript.
(3) In Figure 2 the authors show microscopy of the colocalization of FtsK with several other divisome proteins from Ct. Quantification of the colocalization of FtsK with these other proteins would provide a more holistic understanding of their colocalization and help further support their argument that FtsK initiates the assembly of the divisome.
Supp. Fig. S4A of the revised manuscript contains images showing the colocalization of FtsK with the fusions at the septum and the base of dividing cells, and the colocalization of FtsK with the fusions that are only at the base of dividing cells. Supp. Fig. S4B quantified the percentage of dividing cells where FtsK overlaps the localization of each of the fusions at the septum, at the septum and the base, and at the base alone.
(4) In Figure 6 the authors mention that the PG ring was at a slight angle relative to the MOMP-stained septum. What is the significance of this? The authors mention it several times but do not explain its relevance to divisome assembly. It is not really evident in the images presented.
We mention in the Discussion (pgs. 17-18 of the revised manuscript) that “The relevance of the angled orientation of PG and MreC rings relative to the MOMP-stained septum in division intermediates is unclear. However, it appears to be a conserved feature of the cell division process and may arise because the divisome proteins are often positioned slightly above or below the plane of the MOMP-stained septum. The positioning of divisome proteins above or below the septum is indicated in Figs. 1 and 2.”
We included cartoons in Fig. 6C of the revised manuscript to assist the reader in visualizing the angled orientation of the PG ring relative to the MOMP-stained septum.
(5) In line 270 the authors claim that "these are the first data in any system to suggest that septal PG synthesis/modification is simultaneously directed by multiple independent divisome complexes." However, their experiments do not demonstrate that multiple divisome complexes are active at the same time. They show that multiple foci of FtsK etc. are present at sites where PG synthesis has occurred, but that does not necessarily mean that each focus/complex was actively synthesizing PG at the same time. Moreover, similar approaches were used to support a claim that septal PG synthesis is directed by multiple discrete divisome complexes previously (e.g. in Figure 1 of Bisson-Filho et al. 2017 (PMID: 28209898) in Bacillus subtilis and in Perez et al 2021 (PMID: 33269494) in Streptococcus pneumoniae). This claim is not central to the main conclusions of the study and could just be removed.
This statement has been removed from the Results and the Discussion.
(6) In Figure 6B the authors see three distinct FtsK foci. Why is this the only place in the manuscript where they see three foci? They mentioned previously that they saw foci at the septum and at the base of the progenitor mother cell, but why are there three foci here?
The vast majority of dividing cells displayed one focus at the septum and/or the base. Representative images were chosen that reflected the localization profiles observed in the majority of cells. While we observed cells with multiple foci, as shown in Figure 6C, these cells were relatively rare (~2% of cells for all the divisome proteins in 3 independent experiments). Since cells with multiple foci were relatively rare, we chose to group these cells with the cells that had single foci in the septum, septum and base, or base alone categories in the quantification shown in Fig. 2C. This is stated in the legend of Fig. 2 of the revised manuscript.
(7) The Discussion section is lacking a couple of things that would put the data in a broader context. Can the authors speculate on how FtsK knows how to find the division site? I.e. what might be upstream of FtsK localization? Additionally, the authors do not talk about the FtsK sequence or domains at any point in the paper. Does Ct FtsK have a similar sequence/structure to FtsKs from other bacteria? Are there any differences in sequence/structure that might tell us about its function in Ct?
We will be submitting another manuscript soon that examines how the site of assembly of the divisome is defined in dividing Chlamydia. This manuscript will also define the localization of the different sub-domains of chlamydial FtsK during cell division. For this manuscript, we added a paragraph in the Discussion (pg. 16 of the revised manuscript) that states the domain organization is conserved in FtsK proteins from different bacteria. This paragraph includes information regarding the % sequence identity of the C-terminus and the N-terminus of chlamydial FtsK when compared to E. coli FtsK.
(8) For Supplementary Figure S1B-C. The authors should show the full blots rather than just the single band of the protein of interest to show that the antibodies are specific. Additionally, the authors should include a loading control to show that they loaded the same amount of protein for each sample.
We have included the full blots in Supp. Fig. S1 of the revised manuscript. We do not see the need for including a loading control for these blots because we are not making arguments about the relative level of the proteins that were assayed. We only use the blots to show that the fusion proteins are primarily a single species of the predicted molecular mass.
(9) In Supplementary Figure S4A the authors use RT-qPCR to measure ftsK and pbp2 transcript levels. Since they have antibodies against these proteins, they should also include Western blots to show that the proteins are not being produced when targeted using CRISPRi.
We have included data in Supp. Fig. S5E of the resubmission that indicate foci of FtsK and PBP2 could not be detected following the knockdown of ftsk and pbp2. We feel that these data support our conclusion that the induced expression of dCas12 in the ftsk and pbp2 knockdown strains results in the downregulation of the endogenous FtsK and PBP2 polypeptides.
(10) In lines 261-262 the authors say that "PG organization was the same or differed at the septum." What is the PG organization being compared to? Same or different from what?
We agree with the reviewer that the text in lines 261-262 in the original submission was confusing. The text has been modified.
(11) Lines 201-215 the authors refer to Supplementary Figure S3 throughout this section, but they should refer to Supplementary Figure S4.
This has been corrected.
Reviewer #2 (Recommendations for the authors):
I am not convinced that this paper shows that FtsK initiates the assembly of the divisome since the authors did not analyse the role and localization of all other chlamydial divisome components. Out of the ten homologues of divisome and elongasome components encoded by C. trachomatis genome, only five are investigated in this study. There is no explanation about how these five were chosen.
We state on pg. 16 of the revised manuscript that “Although we focused our study on a subset of the divisome and elongasome proteins that Chlamydia expresses (bolded in Fig. 6G), our results support our conclusion that chlamydial budding is dependent upon a hybrid divisome complex and that FtsK is required for the assembly of this hybrid divisome. At this time, we cannot rule out that other proteins act upstream of FtsK to initiate divisome assembly in this obligate intracellular bacterial pathogen.”
Results convincingly indicate that FtsK is an early divisome component, but proofs are lacking to indicate that it initiates the divisome formation. Indeed, the authors do not show how FtsK would be the first protein to selectively accumulate at a given location to initiate the divisome formation. For this reason, the model they propose at the end of their study is not backed by sufficient data, to my opinion.
We agree with the reviewer that our data do not show that FtsK initiates divisome assembly. The title of the manuscript has been modified to “FtsK is Critical for the Assembly of the Unique Divisome Complex of the FtsZ-less Chlamydia trachomatis” and the text throughout has been modified to indicate that FtsK is the first protein we assayed that associates with nascent divisomes at the base of dividing cells. We will soon be submitting another manuscript that details how FtsK is recruited to a specific site to initiate nascent divisome assembly. This work is too extensive to be included in this manuscript.
There are also discrepancies in the number of cells analysed to quantify the localization of divisome components, ranging from 50 to 250 cells. The authors could better explain why there are such variations.
There were differences in the number of cells analyzed in the various experiments, but in every instance the effect of inhibitors (A22 and mecillinam) or ftsk and pbp2 knockdown on divisome assembly was statistically significant.
There are a few mistakes in the text regarding figure numbering (Figure S4 is mentioned as S3 in the text). Figures 5B and D are not specifically cited.
These mistakes have been corrected in the revised manuscript.
Line 261-262: the sentence starting "Our imaging analysis.." is not clear to me.
We agree with the reviewer that the text in lines 261-262 was confusing. The text has been modified (pg. 14 of the revised manuscript).
Line 270-271: there are insufficient proofs to say that there are multiple independent divisome complexes. This is in my opinion an overinterpretation of the data, since there is no proof that these complexes are independent.
This statement has been removed from the text.
A few details are lacking in the figure legends:
Figure 2C: when was the expression of the different mCherry and 6xHis constructs induced?
The onset and length of the induction of the fusions have been included in the legend of Fig. 2.
Bars are sometimes mentioned as uM and should be um. Bars sizes, number of replicates, and/or meaning of the error bars are lacking in legends of Figures S2, S3, and S4
This has been corrected in the revised manuscript.
The consistency of Figures could be improved between Figures 3A, 4A, B, and 5A. The results of treated cells could be always shown as dark grey. It would help the reader.
We have used consistent coloring in Figs. 3-5 to indicate the treated cells.
Reviewer #3 (Recommendations for the authors):
(1) Lines 113-118: do Ct cells increase in size as they get closer to starting division? If so, could a pseudo-time course (demograph) be done to bolster the evidence that the base foci formed mainly in predivisional cells and not newborn cells? This evidence might be more convincing than the data in Figures 1F and G.
Chlamydial cells in the population were heterogeneous in size at the timepoint we are studying. This observation is consistent with previous reports in the literature (Liechti et al., 2021). While we agree that a pseudo-time course could potentially bolster the evidence about when FtsK foci appear, we believe our current analysis sufficiently demonstrates that basal foci of FtsK appear prior to the appearance of new buds at the base of dividing cells.
(2) Figure 3E: It looks like MreC localization to foci doesn't strictly require MreB polymerization. Is this known for E. coli or other species?
To our knowledge, MreC assembly into a filament has not been shown to be dependent upon MreB in other bacteria. In Caulobacter crescentus, MreC forms a helical structure that is not dependent upon MreB or MreB filament formation (Dye et al., 2005. PNAS; Divakaruni et al., 2005. PNAS).
(3) Figure 5E: why is nearly half of PBP2 and PBP3 still localized to foci at the membrane even after treatment with mecillinam? This suggests, as the authors mention, that mecillinam reduces the efficiency of localization to the divisome but does not eliminate it. Any ideas why?
At this time, we do not know why inhibiting the catalytic activity of PBP2 with mecillinam does not fully prevent the association of PBP2 with the chlamydial divisome. We have included a statement in the Results (pg. 13 of the revised manuscript) that inhibiting the catalytic activity of PBP2 prevents it from efficiently associating with or maintaining its association with polarized divisome complexes.
(4) Line 262-263: This sentence is confusing-please rephrase. The same as what? Differed from what?
We agree with the reviewer. The wording in lines 262-263 of the original submission has been modified.
(5) Lines 265-267 and Figure 6: Adding cartoon schematics might help readers visualize cell orientations in Fig. 6 (especially 6B).
Cartoons have been added to Fig. 6C (Fig. 6B in the original submission) to orient the reader.
(6) Line 294-298: Do the authors think that the residual 5-10% of PG foci after FtsK knockdown is due to the ability of residual FtsK to organize divisomes?
We show that knockdown of FtsK is not complete, and while we cannot be certain, it is likely that the PG foci detected in FtsK knockdown cells are due to the ability of the residual FtsK to organize divisomes that direct PG synthesis.
(7) Do the authors have any evidence that FtsK foci are mobile like treadmilling FtsZ?
We have not performed real-time imaging studies, and we currently have no evidence that FtsK foci are mobile.
(8) FtsK foci here are reminiscent of mobile foci formed by the FtsK-like SpoIIIE at the Bacillus subtilis sporulation septum. This might be a good idea to mention in the Discussion. Is it possible that Ct FtsK is also involved in coordinating chromosome partitioning through the developing septum? (That is another reason why it would be useful to know if the translocase domain was dispensable for localization/activity).
We are currently preparing another manuscript that documents the contribution of the various domains of FtsK to its localization profile and whether the division defect in ftsk knockdown cells can be suppressed by specific subdomains of FtsK. That manuscript will include these data as well as experiments that address how the site of polarized budding is defined. In the revised manuscript, we have included a description of how the domain organization of chlamydial FtsK is similar to E. coli FtsK (pg. 16 of revision). Chlamydial FtsK also has a domain organization similar to that of SpoIIIE from B. subtilis. The C-terminal catalytic domain of SpoIIIE is 45% identical to chlamydial FtsK. The N-terminus of SpoIIIE is predicted to encode 4 transmembrane spanning helices, like chlamydial FtsK. However, the N-terminus of SpoIIIE shares no sequence homology with the N-terminus of chlamydial FtsK. We have not included the similar domain organization of SpoIIIE and chlamydial FtsK in the revised manuscript.
(9) It seems that FtsK foci localize to a particular spot opposite from the active septum, although how this spot is specified is not clear. Is there any geometric clue for FtsK's localization like there is for Min-specified FtsZ localization?
As mentioned above, we are currently preparing another manuscript that documents our efforts to understand how the site of polarized budding is defined. This analysis is too extensive to include in this study.
(10) As mentioned in the Summary, do the authors know whether the N-terminal membrane binding part of FtsK (FtsKn) is sufficient for localization/divisome assembly in Ct as it is in other species? Ouellette et al. 2012 showed that FtsKn could interact with MreB in BACTH.
We are currently preparing another manuscript that documents the contribution of the various domains of FtsK to its localization profile.
(11) The previous BACTH result with MreB and FtsKn implies that this interaction is direct, yet the current data suggest that this is not the case. Can the authors comment on this? Is this due to bridging effects inherent in the BACTH system?
We have not presented any data to indicate that FtsK and MreB do not interact. We have only shown that FtsK localization is not dependent upon MreB filament formation (Fig. 3).
(12) The FtsZ-independent role of FtsK in nucleating the divisome suggests that Ct FtsK may differ from other FtsKs structurally - can this be explored, perhaps with AlphaFold 3?
As mentioned above, we have included a paragraph in the discussion of the revised manuscript (pg. 16 of the revised manuscript) that states the domain organization of chlamydial FtsK is similar to E. coli FtsK. This conserved domain organization is evident when we view the structures of the proteins using AlphaFold.
(13) Typo on line 559: should be HeLa.
This has been corrected.
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
The Bagnat and Rawls groups' previous published work (Park et al., 2019) described the kinetics and genetic basis of protein absorption in a specialized cell population of young vertebrates termed lysosome-rich enterocytes (LREs). In this study they seek to understand how the presence and composition of the microbiota impacts the protein absorption function of these cells and reciprocally, how diet and intestinal protein absorption function impact the microbiome.
Strengths of the study include the functional assays for protein absorption performed in live larval zebrafish, which provides detailed kinetics on protein uptake and degradation with anatomic precision, and the gnotobiotic manipulations. The authors clearly show that the presence of the microbiota or of certain individual bacterial members slows the uptake and degradation of multiple different tester fluorescent proteins.
To understand the mechanistic basis for these differences, the authors also provide detailed single-cell transcriptomic analyses of cells isolated based on both an intestinal epithelial cell identity (based on a transgenic marker) and their protein uptake activity. The data generated from these analyses, presented in Figures 3-5, are valuable for expanding knowledge about zebrafish intestinal epithelial cell identities, but of more limited interest to a broader readership. Some of the descriptive analysis in this section is circular because the authors define subsets of LREs (termed anterior and posterior) based on their fabp2 expression levels, but then go on to note transcriptional differences between these cells (for example in fabp2) that are a consequence of this initial subsetting.
Inspired by their single-cell profiling and by previous characterization of the genes required for protein uptake and degradation in the LREs, the authors use quantitative hybridization chain reaction RNA-fluorescent in situ hybridization to examine transcript levels of several of these genes along the length of the LRE intestinal region of germ-free versus mono-associated larvae. They provide good evidence for reduced transcript levels of these genes that correlate with the reduced protein uptake in the mono-associated larval groups.
The final part of the study (shown in Figure 7) characterized the microbiomes of 30-day-old zebrafish reared from 6-30 days on defined diets of low and high protein and with or without homozygous loss of the cubn gene required for protein uptake. The analysis of these microbiomes notes some significant differences between fish genotypes by diet treatments, but the discussion of these data does not provide strong support for the hypothesis that "LRE activity has reciprocal effects on the gut microbiome". The most striking feature of the MDS plot of Bray Curtis distance between zebrafish samples shown in Figure 7B is the separation by diet independent of host genotype, which is not discussed in the associated text. Additionally, the high protein diet microbiomes have a greater spread than those of the low protein treatment groups, with the high protein diet cubn mutant samples being the most dispersed. This pattern is consistent with the intestinal microbiota under a high protein diet regimen and in the absence of protein absorption machinery being more perturbed in stochastic ways than in hosts competent for protein uptake, in line with the greater beta dispersal associated with more dysbiotic microbiomes (described as the Anna Karenina principle here: https://pubmed.ncbi.nlm.nih.gov/28836573/). It would be useful for the authors to provide statistics on the beta dispersal of each treatment group.
Overall, this study provides strong evidence that specific members of the microbiota differentially impact gene expression and cellular activities of enterocyte protein uptake and degradation, findings that have a significant impact on the field of gastrointestinal physiology. The work refines our understanding of intestinal cell types that contribute to protein uptake and their respective transcriptomes. The work also provides some evidence that microbiomes are modulated by enterocyte protein uptake capacity in a diet-dependent manner. These latter findings provide valuable datasets for future related studies.
We thank the Reviewer for their thorough and kind assessment. We appreciate the suggestion for edits and for pointing out areas that needed further clarification.
One point in need of further explanation is the use of fabp6 (referred to as fabp2 by the reviewer) to define anterior LREs and their gene expression pattern, which includes high levels of fabp6, something that was deemed a “circular argument” by the reviewer. The rationale for using fabp6 as a reference is that we were able to define its spatial pattern in relation to other LRE markers and the neighboring ileocyte population using transgenic markers (Lickwar et al., 2017; Wen et al., 2021). Thus, far from being a circular argument, using fabp6 allowed us to identify other markers that are differentially expressed between anterior and posterior LREs, which share a core program that we highlight in our study. In the revised manuscript, we clarified this point (lines 166 – 169).
We followed the Reviewer’s suggestion to test if LRE activity and dietary protein affected beta dispersal. Our analyses revealed that beta dispersion was not significantly different between our experimental conditions. We added details about this analysis (lines 384 – 386) and a new supplemental figure panel (Figure S7C).
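For readers unfamiliar with beta dispersion, the following is a minimal sketch of one way such a comparison can be set up. It is not the analysis pipeline used in the manuscript. It assumes Bray-Curtis dissimilarity, measures each sample's distance to its group centroid from pairwise dissimilarities alone, and compares groups with a simple label-permutation F test. All function names and the choice of permutation scheme are illustrative assumptions.

```python
import random
from math import sqrt


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def distances_to_centroid(samples):
    """Distance of each sample to its group centroid, from pairwise
    dissimilarities: d_i^2 = mean_j(d_ij^2) - sum_{j<k}(d_jk^2) / n^2."""
    n = len(samples)
    d2 = [[bray_curtis(a, b) ** 2 for b in samples] for a in samples]
    pair_sum = sum(d2[j][k] for j in range(n) for k in range(j + 1, n))
    # abs() guards against negative values, which can occur because
    # Bray-Curtis is not a Euclidean distance (an assumption of this sketch).
    return [sqrt(abs(sum(d2[i]) / n - pair_sum / n ** 2)) for i in range(n)]


def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups of numbers."""
    values = [x for g in groups for x in g]
    grand = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def dispersion_test(groups_of_samples, n_perm=999, seed=0):
    """Permutation test for equal dispersion across treatment groups.

    Returns (observed F, permutation p-value)."""
    dists = [distances_to_centroid(g) for g in groups_of_samples]
    observed = f_statistic(dists)
    pooled = [d for g in dists for d in g]
    sizes = [len(g) for g in dists]
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

A non-significant p-value from a test of this kind would indicate that the spread of samples around their group centroids does not differ detectably between treatment groups, which is the sense in which dispersion is reported as "not significantly different" above.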
Reviewer #2 (Public review):
Summary:
The authors set out to determine how the microbiome and host genotype impact host protein-based nutrition.
Strengths:
The quantification of protein uptake dynamics is a major strength of this work and the sensitivity of this assay shows that the microbiome and even mono-associated bacterial strains dampen protein uptake in the host by causing down-regulation of genes involved in this process rather than a change in cell type.
The use of fluorescent proteins in combination with transcript clustering in the single cell seq analysis deepens our understanding of the cells that participate in protein uptake along the intestine. In addition to the lysosome-rich enterocytes (LRE), subsets of enteroendocrine cells, acinar, and goblet cells also take up protein. Intriguingly, these non-LRE cells did not show lysosomal-based protein degradation; but importantly, the transcripts upregulated in these cells include dab2 and cubn, genes shown previously as being essential to protein uptake.
The derivation of zebrafish mono-associated with single strains of microbes paired with HCR to localize and quantify the expression of host protein absorption genes shows that different bacterial strains suppress these genes to variable extents.
The analysis of microbiome composition, when host protein absorption is compromised in cubn-/- larvae or by reducing protein in the food, demonstrates that changes to host uptake can alter the abundance of specific microbial taxa like Aeromonas.
Weaknesses:
The finding that neurons are positive for protein uptake in the single-cell data set is not adequately discussed. It is curious because the cldn:GFP line used for sorting does not mark neurons and if the neurons are taking up mCherry via trans-synaptic uptake from EECs, those neurons should be mCherry+/GFP-; yet methods indicate GFP+ and GFP+/mCherry+ cells were the ones collected and analyzed.
We thank the Reviewer for the kind and positive assessment of our work, for suggestions to improve the accessibility and clarity of the manuscript, and for pointing out an issue related to a neuronal population that needed further clarification.
It turns out that there is a population of neurons that express cldn15la. They are not easily visualized by microscopy because IECs express this gene much more highly. However, the endogenous cldn15la transcripts can be found in neurons as shown in a recently published dataset (PMID: 35108531) as well as in this study. We added a discussion point to clarify this issue (lines 463 – 465).
Reviewer #3 (Public review):
Summary:
Childers et al. address a fundamental question about the complex relationship within the gut: the link between nutrient absorption, microbial presence, and intestinal physiology. They focus on the role of lysosome-rich enterocytes (LREs) and the microbiota in protein absorption within the intestinal epithelium. By using germ-free and conventional zebrafish, they demonstrate that microbial association leads to a reduction in protein uptake by LREs. Through impressive in vivo imaging of gavaged fluorescent proteins, they detail the degradation rate within the LRE region, positioning these cells as key players in the process. Additionally, the authors map protein absorption in the gut using single-cell sequencing analysis, extensively describing LRE subpopulations in terms of clustering and transcriptomic patterns. They further explore the monoassociation of ex-germ-free animals with specific bacterial strains, revealing that the reduction in protein absorption in the LRE region is strain-specific.
Strengths:
The authors employ state-of-the-art imaging to provide clear evidence of the protein absorption rate phenotype, focusing on a specific intestinal region. This innovative method of fluorescent protein tracing expands the field of in vivo gut physiology.
Using both conventional and germ-free animals for single-cell sequencing analysis, they offer valuable epithelial datasets for researchers studying host-microbe interactions. By capitalizing on fluorescently labelled proteins in vivo, they create a new and specific atlas of cells involved in protein absorption, along with a detailed LRE single-cell transcriptomic dataset.
Weaknesses:
While the authors present tangible hypotheses, the data are primarily correlative, and the statistical methods are inadequate. They examine protein absorption in a specific, normalized intestinal region but do not address confounding factors between germ-free and conventional animals, such as size differences, transit time, and oral gavage, which may impact their in vivo observations. This oversight can lead to bold conclusions, where the data appear valuable but require more nuance.
The sections of the study describing the microbiota or attempting functional analysis are elusive, with related data being overinterpreted. The microbiome field has long used 16S sequencing to characterize the microbiota, but its variability due to experimental parameters limits the ability to draw causative conclusions about the link between LRE activity, dietary protein, and microbial composition. Additionally, the complex networks involved in dopamine synthesis and signalling cannot be fully represented by RNA levels alone. The authors' conclusions on this biological phenomenon based on single-cell data need support from functional and in vivo experiments.
We thank the Reviewer for their assessment and for pointing out some areas that needed to be explained better and/or discussed.
The Reviewer mentions some potential confounding factors (i.e., size differences, transit time, oral gavage) in the gnotobiology experiments. We would like to convey that these aspects have been addressed in our experimental design and are now clarified in the revised manuscript: 1- larval sizes were recorded and found to be similar between GF and monoassociated larvae (Figure S6A); 2- while intestinal transit time may be affected by microbes and is a topic of interest, in our assay luminal mCherry cargo is present at high levels throughout the gut and is not limiting at any point during the experiment; 3- gavage, which is necessary for quantitative assays, is indeed an experimental manipulation that may somehow alter the subjects (the same is true for microscopy and virtually any research method). However, it cannot explain differences between GF and CV or alter our conclusions via microbial or dietary effects. We now elaborate on the former point in the revised discussion (line 426). A new panel has been added to Fig. S6 to show that standard length was similar in GF and monoassociated larvae (Figure S6A).
We are aware that microbial community composition is often highly variable between experiments and this necessitates adequately high biological replication and inclusion of internal controls to allow conclusions to be drawn. Nevertheless, studies evaluating the utility of 16S rRNA gene sequencing have found that this analysis reveals important impacts of environmental factors on the gut microbiome (PMIDs: 21346791, 31409661, 31324413). Our results provide further evidence that 16S rRNA gene sequencing remains a useful method to detect perturbations to the zebrafish gut microbiome. Reproducing previous findings, we detected many of the core zebrafish microbiota strains in our samples that have been identified by other studies (PMIDs: 26339860, 21472014, 17055441). To ensure the robustness of our results, we included several biological replicates for each condition, co-housed genotypes and included large sample sizes to minimize environmental variability between groups. In response to this reviewer concern, we have added a supplemental beta diversity plot and statistical analyses showing that the microbiomes in our larvae were significantly different from the diets or tank water (Figure S7A). This analysis shows that the host environment influenced microbial community composition (lines 376 – 378). We also added an additional supplemental panel and performed analysis showing that the experimental replicates (i.e., different tanks) were not a significant source of variation in this study (lines 378 – 380) (Figure S7B). This result underscores that the microbiota in these larvae were influenced by both the host and diet.
Regarding dopamine pathways, we acknowledge that it involves complex biology that will require dedicated studies. In this work, we simply point out gene expression patterns we find interesting as they may inform future studies.
Finally, the Reviewer mentions the use of inadequate statistical methods for some analyses without specifying or indicating alternative analyses; only the need to justify the use of two-way ANOVA is made explicit. On this point, we respectfully disagree and would like to emphasize that we use statistical methods that are standard in the field (PMID: 37707499). We nevertheless added a justification for the use of two-way ANOVA where appropriate (lines 635-637, 653-654, 773-776). The two-way ANOVA test was used to compare fluorescence profiles of gavaged cargoes or HCR probes along the length of the LRE region. This test accounts for differences in fluorescence between experimental conditions in segments (30 μm) along the LRE region (~300 μm). This allows us to capture differences in fluorescence between experimental conditions while accounting for heterogeneity in the LRE region. Please see our comment below for more information about our use of the 2-way ANOVA.
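To illustrate how a two-way ANOVA separates a condition effect from a position effect, here is a minimal sketch of the sums-of-squares decomposition for a balanced design with two factors: experimental condition (for example GF versus CV) and segment along the LRE region. It assumes equal numbers of replicates per cell and treats every measurement as independent. The function name and data layout are hypothetical, and the code is not taken from the manuscript's analysis.

```python
def _mean(xs):
    return sum(xs) / len(xs)


def two_way_anova(data):
    """Balanced two-way ANOVA with interaction.

    data[i][j] is a list of replicate fluorescence values for
    condition i and segment j (same number of replicates everywhere).
    Returns {effect: (sum_of_squares, degrees_of_freedom, F)}.
    """
    a, b, n = len(data), len(data[0]), len(data[0][0])
    cell = [[_mean(data[i][j]) for j in range(b)] for i in range(a)]
    cond = [_mean(cell[i]) for i in range(a)]                       # condition means
    seg = [_mean([cell[i][j] for i in range(a)]) for j in range(b)]  # segment means
    grand = _mean(cond)

    ss_cond = b * n * sum((m - grand) ** 2 for m in cond)
    ss_seg = a * n * sum((m - grand) ** 2 for m in seg)
    ss_inter = n * sum(
        (cell[i][j] - cond[i] - seg[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_resid = sum(
        (x - cell[i][j]) ** 2
        for i in range(a) for j in range(b) for x in data[i][j]
    )

    df_cond, df_seg = a - 1, b - 1
    df_inter, df_resid = (a - 1) * (b - 1), a * b * (n - 1)
    ms_resid = ss_resid / df_resid
    return {
        "condition": (ss_cond, df_cond, ss_cond / df_cond / ms_resid),
        "segment": (ss_seg, df_seg, ss_seg / df_seg / ms_resid),
        "interaction": (ss_inter, df_inter, ss_inter / df_inter / ms_resid),
        "residual": (ss_resid, df_resid, None),
    }
```

Each F value would then be compared against an F distribution with the listed degrees of freedom to obtain a p-value. The "condition" row captures the overall difference between conditions, the "segment" row captures heterogeneity along the LRE region, and the "interaction" row captures whether the condition effect varies by position.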
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Please provide in the materials and methods the strain identifiers and sources of the bacteria used in the study.
Thank you for the suggestions. Strain identifiers and source information were added to the methods (lines 576-579).
Reviewer #2 (Recommendations for the authors):
(1) This is a very satisfying and thorough analysis of the reciprocal influence of diet, microbiome, and host genotype on protein absorption by the host. Below I make suggestions that mainly relate to making the paper more accessible to a broader audience.
(2) Line 233 Starts a section that reports the findings of the scRNA dataset. The writing is inconsistent with respect to how the genes are listed: whether abbreviation only or spelled out followed by abbreviation. I prefer the latter. For example, slc10a2 is a bile acid Na cotransporter but for those not in the know, they would have to look this up. Perhaps adding a supplementary table that provides a gene list of those discussed in the text with abbreviation/spelled-out, and KEGG terms.
Thank you for pointing out inconsistent gene labeling. We have revised the text with spelled out gene names followed by abbreviations.
(3) Line 461 Where did the neurons come from when you were sorting cldn+ cells?
Neuronal expression of cldn15la was detected in our data and other published datasets (PMID: 37995681, 35108531). We added a note to the text clarifying that neuronal cells can express cldn15la (lines 463-465).
(4) Line 561 1x tricaine should be converted to percentage in solution or concentration throughout.
The tricaine concentration was 0.2 mg/mL. We added this detail to the methods (line 596).
(5) Line 612 Please clarify how normalizations are carried out: is it to the peak value in the germ-free condition? CV never reaches 1.
AUC values were normalized to the peak value in the GF condition at 60 minutes PG. We clarified this step in the methods (lines 618-619).
(6) Line 654-663 I think mCherry here should be mTurquoise?
Thank you for catching this typo. We corrected it in the text.
(7) In Figure 1 Please consider adding a color so that magenta does not represent BOTH germ-free AND mCherry.
Due to the many colors of fluorescent proteins and HCR probes in this paper, we were not able to find an alternative plot line color to represent GF.
(8) In Figure 2 I suggest consistency with respect to the order you present GF/CV
Figure 1 GF->CV
Figure 2 CV->GF
My preference is GF->CV
Images in Figure 2 were re-ordered following the reviewer’s recommendation.
Here, 20 minute time point also appears qualitatively different between GF and CV.
There can be slight differences in LREs between individuals. These images were selected because they represented the average differences in the amount of mTurquoise degradation activity that occurred between 20 – 60 minutes post-flushing in the GF and CV conditions.
In Figure 3E Figure legend refers to being able to see BSA in vacuoles. The image should be modified to show this- currently too small.
In response, we enlarged the confocal microscopy images showing DQ red BSA in the LRE region (Figure 3E). We also added a panel with confocal microscopy images of the LREs in 6 dpf larvae gavaged with DQ red BSA (Figure S3F). These images show that DQ red BSA fluorescence was localized to the LRE lysosomal vacuole.
In Figure 5D, Posterior LRE should be pink not green in the key to the right of the heatmap.
Thank you for catching this error. We have corrected the colors (Figure 5D).
Reviewer #3 (Recommendations for the authors):
(1) Introduction and context:
Expand the introduction to include more background on microbial-mediated protein absorption, with references to relevant findings in Drosophila. This will provide a stronger foundation for the study's contributions to the field.
Thank you for this suggestion. We added information about microbe-mediated amino acid harvest in Drosophila to the introduction (lines 49-53).
(12) Methodological suggestions:
Measure and report differences between germ-free (GF) and conventional (CV) animals, such as transit time, to account for potential confounding factors in protein absorption dynamics.
We respectfully assert that a transit assay is not required for this study and could actually create confusion: an effect on transit time could be interpreted as a contributing factor when, given the experimental design, it is not. The concentration of luminal protein was equivalent in GF and CV larvae (Figure S1E), so the LREs had equal saturating access to those proteins in both conditions. Furthermore, we showed that the microbiota did not degrade fluorescent protein (Figure S1F). Therefore, we feel confident that there was lower protein uptake in the LREs of CV larvae because the microbiome exerted regulatory effects on LRE activity.
Provide detailed information on the gating strategy used for single-cell sorting to enhance the dataset's utility and support claims about cell changes.
The methods we used for sorting cells were previously described (PMID: 31474562). In this manuscript, we describe them under the heading “Fluorescence activated cell sorting for single cell RNA-sequencing.”
Explain the "GeneRatio" metric in figure legends for clarity.
The GeneRatio is the ratio of genes associated with each individual GO term to the number of genes associated with the domain. An explanation was added to the caption (Figure S3C).
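The definition above can be illustrated with a minimal sketch. It assumes that "genes associated with the domain" means the set of genes annotated to the GO domain in question and that only term genes falling within that set are counted. The function name and the placeholder gene identifiers are hypothetical and are not taken from the manuscript.

```python
def gene_ratio(term_genes, domain_genes):
    """Fraction of the domain's genes that are associated with one GO term.

    Both arguments are iterables of gene identifiers. Only term genes that
    also belong to the domain set are counted (an assumption of this sketch).
    """
    domain = set(domain_genes)
    if not domain:
        raise ValueError("domain gene set is empty")
    return len(set(term_genes) & domain) / len(domain)


# Placeholder identifiers for illustration only (not real data).
domain_genes = ["gene1", "gene2", "gene3", "gene4",
                "gene5", "gene6", "gene7", "gene8"]
term_genes = ["gene2", "gene5"]

ratio = gene_ratio(term_genes, domain_genes)  # 2 of 8 domain genes
```

With these placeholder lists, two of the eight domain genes carry the term, so the ratio is 0.25.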
(13) Visual and statistical improvements:
Include images of labeled peptidases within lysosome-rich enterocytes (LREs) to reinforce findings.
Thank you for the suggestion. We added images of labeled peptidases in the LRE region (Figure S6E-D).
For Panels 4-F and 5-D, consider using violin plots of selected genes to improve clarity and emphasize major ideas.
In Figure 4F, the heatmap shows multiple genes were upregulated in mCherry-positive cells. We tried the plotting suggested by the reviewer and felt that violin plots could not convey this message as clearly. Likewise, the heatmap in Figure 5D effectively shows the gradient of expression between ileocytes, anterior and posterior LREs.
Strengthen statistical analysis by employing more rigorous methods and justifying their selection, such as using two-way ANOVA where appropriate.
The two-way ANOVA was used to quantify protein uptake or HCR probe fluorescence along the length of the LRE region. This statistical test allowed us to compare differences in fluorescence between experimental conditions in multiple LRE segments (see Author response image 1 below for an example). As our assays show, the LRE region is heterogeneous, with segments showing different levels of activity and gene expression. The two-way ANOVA is appropriate because it allows us to account for this heterogeneity by comparing fluorescence across multiple segments.
Author response image 1.
Our figures display these fluorescence levels in line plots (above, left) rather than bar plots (above, right). The results are easier to visualize and interpret in line plots, and they display the fluorescence profiles in greater detail.
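To make the structure of this test concrete, the following is a minimal sketch of a balanced two-way ANOVA with the factors "condition" and "segment". It is not the analysis code used for the manuscript: the function name, the toy values and the restriction to equal numbers of larvae per cell are assumptions made for illustration, and each observation is treated as independent. Only sums of squares, degrees of freedom and F statistics are computed; p-values would additionally require an F distribution from a statistics package.

```python
from statistics import mean


def two_way_anova(data):
    """Balanced two-way ANOVA with replication.

    data[condition][segment] is a list of fluorescence values (one per larva).
    Every cell must hold the same number of values.
    Returns (sums of squares, degrees of freedom, F statistics) as dicts.
    """
    conditions = list(data)
    segments = list(data[conditions[0]])
    a, b = len(conditions), len(segments)
    n = len(data[conditions[0]][segments[0]])  # replicates per cell

    grand = mean(v for c in conditions for s in segments for v in data[c][s])
    cond_mean = {c: mean(v for s in segments for v in data[c][s])
                 for c in conditions}
    seg_mean = {s: mean(v for c in conditions for v in data[c][s])
                for s in segments}
    cell_mean = {(c, s): mean(data[c][s])
                 for c in conditions for s in segments}

    ss = {
        # Main effect of experimental condition (e.g. GF vs CV).
        "condition": b * n * sum((cond_mean[c] - grand) ** 2
                                 for c in conditions),
        # Main effect of position along the LRE region.
        "segment": a * n * sum((seg_mean[s] - grand) ** 2 for s in segments),
        # Condition-by-segment interaction.
        "interaction": n * sum(
            (cell_mean[c, s] - cond_mean[c] - seg_mean[s] + grand) ** 2
            for c in conditions for s in segments),
        # Within-cell (residual) variation.
        "error": sum((v - cell_mean[c, s]) ** 2
                     for c in conditions for s in segments
                     for v in data[c][s]),
    }
    df = {"condition": a - 1, "segment": b - 1,
          "interaction": (a - 1) * (b - 1), "error": a * b * (n - 1)}
    ms_error = ss["error"] / df["error"]
    f_stats = {k: (ss[k] / df[k]) / ms_error
               for k in ("condition", "segment", "interaction")}
    return ss, df, f_stats


# Toy values for illustration only (arbitrary units, not real measurements).
toy = {
    "GF": {"segment_1": [10, 12], "segment_2": [20, 22]},
    "CV": {"segment_1": [6, 8], "segment_2": [10, 12]},
}
ss, df, f_stats = two_way_anova(toy)
```

In this layout, the "segment" term absorbs the heterogeneity along the LRE region, so the "condition" and "interaction" terms describe how the conditions differ overall and whether that difference changes from segment to segment.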
(14) Technical corrections:
Correct figure references: Figure 5 about tryptophan metabolism should be 5A, S5G-S5H.
We corrected the figure references.
Line 518: Spell out "heterozygotes" instead of using "hets".
We changed the term from “hets” to “heterozygotes.”
(15) Revise Figure S2 citation to match the actual figure labeling.
We corrected the text to indicate “Figure S2” rather than “Figure S2A.”
Additional manuscript modification
· Figure panels 3B-C, S3A-B, 4A-C: Two clusters were relabeled with improved descriptors based on our updated annotations. The clusters “Pharynx-esophagus-cloaca 1” (PEC1) and PEC2 were relabeled as “Pharynx-cloaca 1” and “Pharynx-cloaca 2.”
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
In this study, Basha and colleagues aim to test whether the thalamic nucleus reuniens can facilitate the hippocampus/prefrontal cortex coupling during sleep. Considering the importance of sleep in memory consolidation, this study is important to understand the functional interaction between these three majorly involved regions. This work suggests that the thalamic nucleus reuniens has a functional role in synchronizing the hippocampus and prefrontal cortex.
Strengths:
The authors performed recordings in naturally sleeping cats, and analysed the correlation between the main slow wave sleep oscillatory hallmarks: slow waves, spindles, and hippocampal ripples, and with reuniens' neurons firing. They also associated intracellular recordings to assess the reuniens-prefrontal connectivity, and computational models of large networks in which they determined that the coupling of oscillations is modulated by the strength of hippocampal-thalamic connections.
Thank you for your positive evaluation.
Weaknesses:
The authors' main claim is made on slow waves and spindle coupling, which are recorded both in the prefrontal cortex and surprisingly in reuniens. Known to be generated in the cortex by cortico-thalamic mechanisms, the slow waves and spindles recorded in reuniens show no evidence of local generation in the reuniens, which is not anatomically equipped to generate such activities. Until shown differently, these oscillations recorded in reuniens are most likely volume-conducted from nearby cortices. Therefore, such a caveat is a major obstacle to analysing their correlation (in time or frequency domains) with oscillations in other regions.
(1) We fully agree with the reviewer that reuniens likely generates neither slow waves nor spindles. We do not make such a claim, as we clearly stated in the discussion (lines 319-324). We propose that Reuniens neurons mediate different forms of activity. In the model, we introduced the MD nucleus only because without MD we were unable to generate spindles. While the slow waves and spindles are generated in other thalamocortical regions, the REU neurons show these rhythms due to long-range projections from these regions to REU, as has been shown in the model.
(2) We certainly cannot exclude some influence of volume conduction on the LFP recordings obtained in the REU nucleus. However, we show modulation of spiking activity within REU by spindles. Spike modulation cannot be explained by volume conduction but can be explained by either synaptic drive (likely the case here) or some intrinsic neuronal processes (such as the T-current).
(3) For spike identification in our REU recordings, we used tetrodes. If slow waves and spindles were volume conducted, then slow waves and spindles recorded with tetrodes should have identical shapes. Following the reviewer's comment, we took these recordings and subtracted one channel from another. The difference in signal during slow waves is on the order of 0.1 mV. Considering that the distance between electrodes is on the order of 20 µm, such a difference in voltage is major and can only be explained by local extracellular currents, likely due to synaptic activities originating in afferent structures.
Finally, the choice of the animal model (cats) is not the best-suited one, as too few data, particularly anatomical ones regarding reuniens connectivity, are available to support functional results.
(1) The thalamus of the majority of mammals (definitely primates and carnivores, including cats) contains local circuit interneurons (about 30 % of all neurons). A vast majority of studies in rodents (except in the LGN nucleus) report either an absence or an extremely low number of thalamic interneurons (e.g., Jager P, Moore G, Calpin P, et al. Dual midbrain and forebrain origins of thalamic inhibitory interneurons. eLife. 2021; 10: e59272). Therefore, studies on species other than rodents are necessary and bring new information that is impossible to obtain in rodents.
(2) The cat brain is much larger than that of mice or rats; therefore, the effects of volume conduction from cortex to REU are much smaller, if not negligible. The distance between REU and the closest cortical structure (ectosylvian gyrus) in cats is about 15 mm.
(3) Indeed, there are far fewer anatomical data on cats than on rodents. This is why we performed the experiments shown in Figure 1. This figure contains functional anatomy data. Antidromic responses show that the recorded structure projects to the stimulated structure. Orthodromic responses show that the stimulated structure projects to the recorded structure.
Reviewer #2 (Public Review):
Summary:
The interplay between the medial prefrontal cortex and ventral hippocampal system is critical for many cognitive processes, including memory and its consolidation over time. A prominent idea in recent research is that this relationship is mediated at least in part by the midline nucleus reuniens with respect to consolidation in particular. Whereas the bulk of evidence has focused on neuroanatomy and the effects of temporary or permanent lesions of the nucleus reuniens, the current work examined the electrophysiology of these three structures and how they inter-relate, especially during sleep, which is anticipated to be critical for consolidation. They provide evidence from intracellular recordings of the bi-directional functional connectivity among these structures. There is an emphasis on the interactions between these regions during sleep, especially slow-wave sleep. They provide evidence, in cats, that cortical slow waves precede reuniens slow waves and hippocampal sharp-wave ripples, which may reflect prefrontal control of the timing of thalamic and hippocampal events. They also find evidence that hippocampal sharp-wave ripples trigger thalamic firing and precede the onset of reuniens and medial prefrontal cortex spindles. The authors suggest that the effectiveness of bidirectional connections between the reuniens and the (ventral) CA1 is particularly strong during non-rapid eye movement sleep in the cat. This is a very interesting, complex study on a highly topical subject.
Strengths:
An excellent array of different electrophysiological techniques and analyses are conducted. The temporal relationships described are novel findings that suggest mechanisms behind the interactions between the key regions of interest. These may be of value for future experimental studies to test more directly their association with memory consolidation.
We thank this reviewer for the very positive evaluation of our study.
Weaknesses:
Given the complexity and number of findings provided, clearer explanation(s) and organisation that directed the specific value and importance of different findings would improve the paper. Most readers may then find it easier to follow the specific relevance of key approaches and findings and their emphasis. For example, the fact that bidirectional connections exist in the model system is not new per se. How and why the specific findings add to existing literature would have more impact if this information was addressed more directly in the written text and in the figure legends.
Thank you for this comment. In the revised version, we will do our best to simplify presentation and more clearly explain our findings.
Reviewing Editor (Recommendations for Authors):
Please discuss the ability of reuniens to generate spindles?
We briefly discussed this in the previous version. We have now extended the discussion (p. 18).
For population data, how many cats were used in acute and chronic experiments, where does the population data originate in Fig. 2? How repeatable were the findings across animals? Was histology verified in each animal?
As stated at the beginning of the Methods section, we used a total of 20 cats: 16 anesthetized (acute) and 4 non-anesthetized (chronic). We added the number of cats in the appropriate places in the Results section. Population data in Figure 2 come from 48, 49 or 52 recording sessions (depending on the type of analysis, as indicated in the figure legend) from 4 chronic cats; we clarified this information in the legend. Results were highly repeatable across animals. Histology was verified in all chronic and acute animals; we added a sentence to the Methods section.
Explanation of figures is very poor, values in figures should be reported in results so they can be compared in the context of the description.
In this revised version, we report most of the numbers shown in the figures and their legends in the main text (Results section).
The depth of the recording tungsten electrodes are meaningless without the AP and ML coordinates given how heterogenous mPFC is. What is the ventromedial wall of the mPFC in the cat?
We added the ML and AP coordinates to the Methods section. We replaced “ventromedial wall” with “ventroposterior part of the mPFC.”
What are the two vertical lines in 1F?
This was an error while preparing the figure. The panel was corrected.
Line 90 mean +-SD of what? There are no numbers.
Thanks, we now indicate the values.
Panel 2L does not show increased spindling in reuniens prior to PFC as indicated in the results, please explain. It does show SWR in the hippocampus prior to spindles, what is the meaning of such a time relationship?
Panel 2L did show increased spindling in reuniens prior to mPFC, but at the time scale shown it was indeed not very clear. In this revised manuscript, we added an inset zooming in around time zero to make this point clearer.
Panel 2L indeed shows an increase in SWRs prior to the increase in spindles in both Reuniens and mPFC.
As stated in the discussion, ‘We found that hippocampal SWRs trigger thalamic firing and precede the onset of reuniens and mPFC spindles, which points to SWRs as one of candidate events for spindle initiation.’
It is unclear what the slow waves of PFC mean, these represent filtered PFC lfp, but is this a particular oscillation? They continue to occur during the spindle, while the slow waves supposedly trigger the spindle. Please explain and clarify.
We recently published a review article, involving several scientists studying both human and animal sleep, that includes Box 1 (Timofeev I, Schoch S, LeBourgeois M, Huber R, Riedner B, Kurth S. Spatio-temporal properties of sleep slow waves and implications for development. Current Opinion in Physiology. 2020; 15: 172–182). In this box, among other terms, we provide the current definitions of slow waves versus slow oscillation. Briefly, if slow waves are repeated with a given rhythm, they typically form a slow oscillation. However, if they occur in isolation or are not rhythmic, they remain slow waves but cannot be called a slow oscillation.
Regarding the relation between spindles and the slow oscillation: we are currently systematically analyzing data on spindles and slow waves obtained from head-restrained and freely behaving cats. One of the main findings is that a majority of ‘cortical’ spindles are local, to the extent that spindles can occur in alternation in two neighboring cortical cells. Largely, LFP sleep spindles occur more or less synchronously within the suprasylvian gyrus of cats, where indeed a large majority of them were triggered by slow waves. The synchrony between LFP spindles in the suprasylvian gyrus and other cortical areas is much less clear. So, it is not surprising that spindles in one brain region can occur when a slow wave is present in some other brain region. Something similar was also shown in humans (Mölle M, Bergmann TO, Marshall L, Born J. Fast and slow spindles during the sleep slow oscillation: disparate coalescence and engagement in memory processing. Sleep. 2011; 34 (10): 1411-1421).
In this regard, we are not ready to include modifications in the manuscript.
Line 134, where is spindle amplitude shown? Plots report power within the spindle frequency band, which obviously captures more than just spindles.
No, the plots in Figure 3B, C show the phase-amplitude coupling (PAC) strength. These were calculated with detected spindles; therefore, while we cannot exclude some false spindle detections, we are confident that they are at a negligible level. We modified the text and, instead of spindle amplitude, we now describe SW-spindle amplitude coupling, which reflects our analysis exactly.
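For readers unfamiliar with PAC strength, one common estimator is the mean vector length: the amplitude of the fast rhythm is attached to the phase of the slow rhythm, and the length of the resulting average vector is taken. The response does not state which estimator was used, so the sketch below is an illustration under that assumption rather than a description of the analysis. The function name and the synthetic inputs are hypothetical, and the filtering and Hilbert-transform steps that would normally produce the phase and amplitude series are omitted.

```python
import cmath
import math


def mean_vector_length(phases, amplitudes):
    """Mean vector length as a phase-amplitude coupling measure.

    phases: slow-wave phase (radians) at each sample.
    amplitudes: spindle-band amplitude envelope at the same samples.
    Returns |mean(A * exp(i * phi))|; values near 0 indicate no coupling.
    """
    if len(phases) != len(amplitudes) or not phases:
        raise ValueError("phases and amplitudes must be non-empty and equal in length")
    total = sum(a * cmath.exp(1j * p) for p, a in zip(phases, amplitudes))
    return abs(total) / len(phases)


# Synthetic example (not real data): phases sweep one full cycle uniformly.
n_samples = 360
phases = [2 * math.pi * k / n_samples for k in range(n_samples)]

# Amplitude that peaks at phase 0 (coupled) versus a flat amplitude (uncoupled).
coupled = [1 + math.cos(p) for p in phases]
flat = [1.0 for _ in phases]

mvl_coupled = mean_vector_length(phases, coupled)
mvl_flat = mean_vector_length(phases, flat)
```

In this synthetic case the coupled envelope gives a clearly non-zero value while the flat envelope gives a value close to zero, which is the contrast a PAC strength plot is meant to convey.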
The discussion must include the medio dorsal nucleus which is the largest thalamic input to the prefrontal cortex and also receives input from the hippocampus. In particular, the case must be made for why reuniens would play a more important or different role than MD? (For example: Occurrence of Hippocampal Ripples is Associated with Activity Suppression in the Mediodorsal Thalamic Nucleus - PMC (nih.gov)).
We cited the suggested study. We cannot say whether reuniens plays a more or less important role. What is clear is that hippocampal ripples at the onset of spindles trigger increased firing in both MD and reuniens. Our extracellular recordings (Fig. 4K) suggest that the increased firing is associated with spike-bursts. We also have a parallel unpublished study, done on anesthetized mice, showing SWR-triggered inhibitory potentials in both reuniens and MD that reverse at around -65 mV to -70 mV. Because the majority of SWRs occurred at the onset of the cortical up state, the relative roles of cortico-thalamic and hippocampo-thalamic drive are not easy to separate. We hope to do this convincingly in our forthcoming study, with the limitation that it was done on anesthetized mice.
Reviewer #1 (Recommendations For The Authors):
I strongly encourage the authors to perform current source density analyses on the LFP signals recorded in the nucleus reuniens to make sure that the observed oscillations are indeed locally generated. So far, the anatomical organisation in reuniens cannot support the local generation of oscillations, such as spindles and slow wave. At least in rodents (the cat reuniens does not seem too different, until shown differently), there were no oscillators found in reuniens, and at least not arranged like in cortical areas, allowing the summation in time, and particularly space, of rhythmic input currents. Bipolar recordings with pairs of twisted electrodes might also be useful to assess the local existence of spindles and slow waves.
Current source density calculation is possible when one knows the exact distance between recording sites. As we used tetrodes made with 4 twisted platinum-iridium wires, we know more or less the range of distance between recording sites, but not the exact distance between any given pair of electrodes.
Then, the physical distance between the reuniens and any cortical structure is about 8-9 mm. Therefore, with such distances, volume conduction is expected to be negligible. If slow waves and spindles were volume conducted, then slow waves and spindles recorded with tetrodes should have identical shapes. Following the reviewer's comment, we took these recordings and subtracted one channel from another. The difference in signal during slow waves is on the order of 0.1 mV. Considering that the distance between electrodes is on the order of 20 µm, such a difference in voltage is major and can only be explained by local extracellular currents, likely due to synaptic activities originating in afferent structures.
Below, we plotted the voltage of one channel of the tetrode versus another channel of the same tetrode. If the signal was simply volume conducted, one would expect to see the vast majority of points on the x=y line (red).
Author response image 1.
Below is a segment of mPFC LFP recording (upper black trace), the mPFC LFP filtered for spindle frequency (7-15 Hz), and the detected spindles (black lines above the filtered trace). Two LFP traces from a tetrode in the Reuniens (orange and light blue) are then overlaid. The second trace from the bottom (blue) represents the subtraction of Reuniens channel 2 from Reuniens channel 1, and just below it (lower blue trace) is this subtraction trace filtered for spindle frequency (7-15 Hz), showing a clear voltage difference in the spindle range between the two electrodes. Note also that around 179-179.5 s there is a clear spindle oscillation in the mPFC recording that is not present in the Reuniens recordings.
Author response image 2.
Therefore, we are convinced that in our recordings, volume conduction did not play any significant role.
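The logic of the channel-subtraction argument can be sketched as follows. This is an illustration only: the function names, the tolerance and the toy traces are hypothetical, and the only figures taken from the response are the approximate 0.1 mV difference and the approximate 20 µm electrode spacing. A purely volume-conducted signal would give near-identical traces on both channels, so nearly all points would lie on the x=y line.

```python
def channel_difference(ch_a, ch_b):
    """Sample-by-sample difference (mV) between two channels of one tetrode."""
    if len(ch_a) != len(ch_b):
        raise ValueError("channels must have the same number of samples")
    return [a - b for a, b in zip(ch_a, ch_b)]


def off_diagonal_fraction(ch_a, ch_b, tolerance_mv):
    """Fraction of samples lying farther than tolerance_mv from the x=y line."""
    diffs = channel_difference(ch_a, ch_b)
    return sum(abs(d) > tolerance_mv for d in diffs) / len(diffs)


def voltage_gradient_mv_per_mm(delta_mv, spacing_um):
    """Voltage difference divided by electrode spacing, in mV per mm."""
    return delta_mv / (spacing_um / 1000.0)


# Toy traces in mV for illustration only (not recorded data).
reuniens_1 = [0.00, 0.10, 0.25, 0.10, -0.05]
reuniens_2 = [0.00, 0.05, 0.15, 0.12, -0.05]

difference = channel_difference(reuniens_1, reuniens_2)
fraction_off_line = off_diagonal_fraction(reuniens_1, reuniens_2, tolerance_mv=0.03)

# Figures quoted in the response: about 0.1 mV across about 20 µm.
gradient = voltage_gradient_mv_per_mm(0.1, 20)
```

With the quoted figures, the gradient is about 5 mV per mm, which gives a sense of why a 0.1 mV difference between neighboring tetrode wires is considered large.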
Another concern regarding delays between events, like slow waves, measured between two regions (as exemplified by Figure 3). It appears that the delays were calculated from the filtered signal. Figure 3G shows a delay between the peak of the mPFC slow wave between the raw and the filtered signal, which might be artifactual of the processing. It is though not (or less) visible for the reuniens recording. Such mismatch might explain the observed differences in delays.
Thanks for this comment. We recomputed the analysis using the original signal (smoothed) and obtained very similar results. Panels H and I of figure 3 were updated using the new analysis performed on original signal.
The overall analyses of LFP-triggered reuniens MUA activity lack of statistics (at least z-scored firing to normalise the firings).
Fig. 2H and I are representative examples of histograms; statistical data are shown in circular plots, as explained in the legend. Fig. 2L shows population data, for which we now provide the standard error. Fig. 4C and D show individual examples. Fig. 4E shows histograms of the activity of all identified putative single units. Units that show significant modulation are displayed above the white line. Fig. 4F shows population data for significantly modulated units.
A last point of detail in the model, which surprisingly shows reuniens to excitatory hippocampal cells' connectivity. Recent literature reports that reuniens only connect hippocampal interneurons, and not principal cells (at least in rodents, I could not find any report in cats). I wonder how changing this parameter would affect the results of the computational investigation, particularly the results shown in Figure 6.
There are several studies in the literature showing a direct excitation from the Reuniens to pyramidal cells in the CA1, here are three of them:
Goswamee, P., et al. (2021). "Nucleus Reuniens Afferents in Hippocampus Modulate CA1 Network Function via Monosynaptic Excitation and Polysynaptic Inhibition." Frontiers in Cellular Neuroscience 15.
Dolleman-Van der Weel MJ, Lopes da Silva FH, Witter MP (1997) Nucleus Reuniens Thalami Modulates Activity in Hippocampal Field CA1 through Excitatory and Inhibitory Mechanisms. The Journal of Neuroscience 17:5640.
Dolleman-van der Weel MJ, Lopes da Silva FH, Witter MP (2017) Interaction of nucleus reuniens and entorhinal cortex projections in hippocampal field CA1 of the rat. Brain Structure and Function 222:2421-2438.
Because this is not a review paper, we opted to not cite all the papers describing connectivity between mPFC, hippocampus and thalamus.
Reviewer #2 (Recommendations For The Authors):
I respectfully suggest that the earlier (public) comments listed above should be addressed. In addition, it would be useful to make it clearer when non-rapid eye movement sleep was being addressed and when rapid eye movement was being addressed. Is it of value to use a single term instead of adding "slow wave sleep" or else clarify when either term is used? The addition of more subheadings might help. Moreover, the relative contribution/value of evidence from these two sleep states was not addressed or was not very clear.
We tried to make it clearer when NREM and when REM was analysed.
We replaced slow-wave sleep with NREM sleep in the figure 5 title.
We added several subheadings in the discussion.
Was the relative contribution of NREM vs REM sleep not addressed? Sorry, but we do not clearly understand your question. Figs. 2 and 3 deal mainly with NREM sleep (Fig. 2B has an example of REM sleep). Fig. 4 essentially describes results obtained during REM sleep.
I was not sure if the Abstract summarised the key take-home messages from the large amount of evidence provided. Some choices are needed, of course, but "evidence of bidirectional connectivity" struck me as less novel than other evidence provided. Given the huge amount of findings provided, which is commendable, it is still useful to present it perhaps in a more digestible fashion. For example, the headings or the first sentence(s) below headings could indicate the aim or the outcome of the specific method/analysis/findings.
We rewrote the abstract and also added a conclusion to highlight the major findings and their meaning.
It is more common to use NRe or Re, rather than REU.
We avoided using RE because, for decades, we have used RE to abbreviate the thalamic reticular nucleus in several publications. In this revised version, we spell it out in full: Reuniens.
Line 49 mentions "short-term" memory. Please specify this more clearly as it is otherwise ambiguous. Also, line 303.
We rephrased the sentence: In particular, the hierarchical coupling of slow waves, spindles and SWRs is thought to play a key role in memory consolidation.
Line 303 was likely about the ventromedial wall: we corrected that sentence.
Line 62: the word, "required" (for memory function) is too strong because there is evidence that it is not always required.
We changed the wording to “plays a major role.”
The focus within the medial prefrontal cortex could be specified more clearly / earlier.
The mPFC is mentioned in the second sentence of the abstract and in the first sentence of the introduction.
Line 134: The heading states "determine" and then mentions modulation. These terms may not be interchangeable or they need clarification.
We changed it to slow wave-spindle amplitude coupling. This represents exactly our analysis.
Line 204: Does "cortical network" mean "prefrontal cortex network"?
Yes, as described in lines 192-193, the two cortical networks (N1 and N2) of the model represent the mPFC layer 5 and 6 respectively.
Lines 283 to 289: These were not very clear to me.
These lines described the potential mechanisms for the responses to hippocampal and reuniens stimulation recorded intracellularly (results in figure 1). We modified this paragraph for clarity.
Line 296: Specify the "claim".
We changed the sentence to “[…] provides supporting evidence for this claim that nucleus Reuniens might synchronize the activity of ventral hippocampus and mPFC.”
The discussion naturally focuses on the thalamic nucleus reuniens, but also occasionally mentions the thalamic mediodorsal nucleus. The distinction, assuming this is highly relevant, could be expressed more clearly (direct comparison with their previous papers).
We never published a study on the mediodorsal nucleus. We do have some unpublished results from recordings in the MD nucleus and they reveal the presence of an inhibitory component at the beginning of cortical active states, therefore behaving in a similar way to first order nuclei. It is then possible that spindles recorded in the reuniens are actually generated in the MD nucleus and then transmitted to Reuniens through the thalamic reticular nucleus, as both MD and reuniens are connected to the rostral thalamic reticular nucleus. We added some discussion about this.
Figure 1B: Do the authors have any additional evidence of the placements in the reuniens, because the photo provided suggests a large area beyond the reuniens boundary. Also, please confirm whether the CEM is between Rh and Re in the cat (I think the Rh and Re are adjacent in the rat).
Figure 1B is from an electrolytic lesion, which is necessarily bigger than the tip of the electrode. Therefore, the center of the electrolytic lesion indicates where the electrode tip was located, which is well within the reuniens nucleus.
Also, yes, CE (Nucleus centralis thalami, pars medialis) is located between the reuniens and rhomboid in cats. This can be found in two cat atlases:
Reinoso-Suárez, F. (1961). Topographischer Hirnatlas der Katze für experimental-physiologische Untersuchungen (Merck).
Berman AL, Jones EG (1982) The Thalamus and Basal Telencephalon of the Cat: A Cytoarchitectonic Atlas with Stereotaxic Coordinates: University of Wisconsin Press.
The first mention of hippocampus in the figure legends should remind the reader by stating "ventral hippocampus".
In this revised version, we added “ventral” in several instances, both in the main text and in the figure legends.
Figure 2: It seems unusual to mention "unusually short NREM". Presumably, things are the same otherwise - if so, perhaps mention that, especially if some of the effects reflect an "unusual" episode.
We display this particular segment because we want to show a continuous recording in which the individual elements characterizing specific states are still visible.
Some effects look like they are strong and others perhaps weaker. If so, how do these impact the final conclusions?
Sorry, we did not clearly understand what the reviewer means here. In general, if an effect shows a statistically significant difference (at the conventional 0.05 threshold), we consider it significant. Any other cases are described on an individual basis.
Perhaps "MAD" should be in full on the first occasion, if not already.
It was spelled out at line 659, but we now spell it out also in the results section and in figure 2 legend.
Methods: the key question is the use of rodent recordings to classify cat recordings. It would be good to have a reference indicating that this can be directly used for cats, which may have different sleep cycles and patterns compared to rats.
We did not use rodent recordings to classify cat recordings; however, we did use a state detection script that was developed with rodent recordings. As mentioned in the Methods section, we adapted the script to cat mPFC recordings, and manual corrections were then made to correctly detect REM episodes. Respectfully, our lab has investigated sleep-wake states in non-anesthetized animals for a few decades; we have developed state detection algorithms for mice, cats and marmosets when needed (to analyse months of recordings), and we have extensive expertise in identifying states of vigilance from electrophysiological recordings.
www.biorxiv.org
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
This manuscript presents a comprehensive exploration of the role of liver-specific Survival Motor Neuron (SMN) depletion in peripheral and central nervous system tissue pathology through a well-constructed mouse model. This study is pioneering in its approach, focusing on the broader physiological implications of SMN, which has traditionally been associated predominantly with spinal muscular atrophy (SMA).
Strengths:
(1) Novelty and Relevance: The study addresses a significant gap in understanding the role of liver-specific SMN depletion in the context of SMA. This is a novel approach that adds valuable insights into the multi-organ impact of SMN deficiency.
(2) Comprehensive Methodology: The use of a well-characterized mouse model with liver-specific SMN depletion is a strength. The study employs a robust set of techniques, including genetic engineering, histological analysis, and various biochemical assays.
(3) Detailed Analysis: The manuscript provides a thorough analysis of liver pathology and its potential systemic effects, particularly on the pancreas and glucose metabolism.
(4) Clear Presentation: The manuscript is well written. The results are presented clearly with well-designed figures and detailed legends.
We thank the reviewer for their positive comments. They had some concerns for us to consider (see below). We provide a point-by-point response to their comments.
Weaknesses:
(1) Limited Time Points: The study primarily focuses on a single time point (P19). This limits the understanding of the temporal progression of liver and pancreatic pathology in the context of SMN depletion. Longitudinal studies would provide a better understanding of disease progression.
We thank the reviewer for the suggestion. We extended our analysis to include P60 mice and performed both liver and pancreatic analyses at this time point to address this suggestion.
(2) Incomplete Recombination: The mosaic pattern of Cre-mediated excision leads to variability in SMN depletion, which complicates the interpretation of some results. Ensuring more consistent recombination across samples would strengthen the conclusions.
The variability in Cre-mediated excision is inherently stochastic, influenced by factors such as Cre expression levels, timing of recombination, and the accessibility of the target locus in individual cells. Achieving complete consistency across samples is particularly challenging, especially given the complexity of our breeding scheme, which occasionally results in litters without any animals of the desired genotype. Importantly, our study not only demonstrates that liver-specific SMN depletion results in liver alterations and pancreatic dysfunction but also highlights the limitations and challenges associated with this mouse model. By doing so, we aim to provide valuable insights for other researchers considering similar approaches in future studies.
Reviewer #2 (Public review):
Summary:
Marylin Alves de Almeida et al. developed a novel mouse cross via conditionally depleting functional SMN protein in the liver (AlbCre/+;Smn2B/F7). This mouse model retains a proportion of SMN in the liver, which better recapitulates SMN deficiency observed in SMA patients and allows further investigation into liver-specific SMN deficiency and its systemic impact. They show that AlbCre/+;Smn2B/F7 mice do not develop an apparent SMA phenotype as mice did not develop motor neuron death, neuromuscular pathology or muscle atrophy, which is observed in the Smn2B/- controls. Nonetheless, at P19, these mice develop mild liver steatosis, and interestingly, this conditional depletion of SMN in the liver impacts cells in the pancreas.
Strengths:
The current model has clearly delineated the apparent metabolic perturbations which involve a significantly increased lipid accumulation in the liver and pancreatic cell defects in AlbCre/+;Smn2B/F7 mice at P19. Standard methods like H&E and Oil Red-O staining show that in AlbCre/+;Smn2B/F7 mice, their livers closely mimic the livers of Smn2B/- mice, which have the full body knockout of SMN protein. Unlike previous work, this liver-specific conditional depletion of SMN is superior in that it is not lethal to the mouse, which allows an opportunity to investigate the long-term effects of liver-specific SMN on the pathology of SMA.
We thank the reviewer for their positive comments. They had some concerns for us to consider (see below). We provide a point-by-point response to their comments (review comments in black, our response in red).
Weaknesses:
Given that SMA often involves fatty liver, dyslipidemia and insulin resistance, using the current mouse model, the authors could have explored the long-term effects of liver-specific depletion of SMN on metabolic phenotypes beyond P19, as well as systemic effects like glucose homeostasis. Given that the authors also report pancreatic cell defects, the long-term effect on insulin secretion and resistance could be further explored. The mechanistic link between a liver-specific SMN depletion and apparent pancreatic cell defects is also unclear.
We extended our analysis to include P60 mice and performed both liver and pancreatic analyses at this time point to address this suggestion. In addition, we discussed the liver-pancreas axis in the Discussion.
Discussion:
This current work explores a novel mouse cross in order to specifically deplete liver SMN using an Albumin-Cre driver line. This provides insight into the contribution of liver-specific SMN protein to the pathology of SMA, which is relevant for understanding metabolic perturbations in SMA patients. Nonetheless, given that SMA in patients involves a systemic deletion or mutation of the SMN gene, the authors could emphasize the utility of this liver-specific mouse model, as opposed to using in vitro models, which have been recently reported (Leow et al, 2024, JCI). Authors should also discuss why a mild metabolic phenotype is observed in this current mouse model, as opposed to other SMA mouse models described in literature.
We appreciate the reviewer’s insightful comment. We have thoroughly addressed this suggestion in the Discussion section, particularly in lines 284-298; 309-322 and 334-359.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
(1) Longitudinal Studies: Conduct studies at one or more additional postnatal time points to provide a clearer picture of how liver-specific SMN depletion affects tissue pathology over time.
We extended our analysis to include P60 mice and performed both liver and pancreatic analyses at this time point to address this suggestion.
(2) Functional Assays: Incorporate glucose tolerance tests, insulin sensitivity tests, and more detailed metabolic profiling to better understand the physiological consequences of liver-specific SMN depletion on glucose metabolism and pancreatic function.
We sincerely thank the reviewer for this suggestion. We have included a full panel of metabolic hormones associated with glucose metabolism from animals at P19 and P60. These new data, along with additional figures, have now been provided in our revised manuscript.
(3) Mechanism: Discuss the molecular pathways affected by SMN depletion in the liver and pancreas. Mechanistic studies including transcriptomic or proteomic analyses to identify dysregulated pathways will help.
We appreciate the reviewer’s insightful comment. We have thoroughly addressed this suggestion in the Discussion section, particularly in lines 284-298 and 334-359.
(4) Typos in the abstract: beta cells secrete insulin and alpha cells produce glucagon.
Thank you for catching this error. It has been corrected to reflect that insulin is produced by beta cells and glucagon by alpha cells.
(5) Efficiency and specificity of the Alb-Cre: if possible, cross the Alb-Cre with the Rosa26 reporter line to test the efficiency and specificity of the Alb-Cre.
We agree that this would provide valuable insights. However, initiating a new breeding program to generate the required genotypes would take over a year and is beyond the scope of this study. To address this in part, we performed Cre immunostaining of the liver, pancreas, and spinal cord at P19, as well as the liver at P60. These results, now included in Supplemental Figure 1, demonstrate liver-specific expression and variability across hepatocytes.
Reviewer #2 (Recommendations for the authors):
The title of this manuscript is potentially misleading. The manuscript largely investigates the involvement of SMN protein on peripheral organs such as the liver, muscles, neuromuscular junction, and the pancreas. Yet, the title could be interpreted that the peripheral nervous system or central nervous system is the main focus. The title should be edited to indicate key terms such as "motor neuron and peripheral tissue pathology".
Thank you for pointing this out. We have revised the title to better represent the study’s focus. It is now “Impact of liver-specific survival motor neuron (SMN) depletion on central nervous system and peripheral tissue pathology”.
Suggestions:
Please clarify and explain clearly the various mouse lines (Smn2B/+, Smn2B/- and +/+; Smn2B/F7 ) used as controls as the nomenclature used is confusing. In addition, authors could consider the use of a wild-type mouse line to be used as a control to validate changes in AlbCre/+;Smn2B/F7 mice.
We have now provided clarification on mouse line nomenclature in the Results section (lines 104–124). Full-body heterozygous mice (Smn2B/+) are used as controls due to their slightly reduced SMN protein levels and absence of phenotypic changes compared to wild-type mice.
Given that the main phenotype implicated by the liver-specific depletion of SMN protein in AlbCre/+;Smn2B/F7 mice is pancreatic abnormalities (changes in alpha- and beta- cell numbers and blood glucose levels), authors should expand further on the pancreatic phenotype.
We added a full panel of metabolic hormones related to glucose metabolism in animals at P19 and P60. Furthermore, this has been discussed in detail in lines 284-298 and 334-344 of the Discussion.
A pancreas-specific depletion of SMN would provide this current manuscript with a better understanding of the role of SMN in regulating SMA pathology and provide more definitive conclusions on the contribution of liver-specific SMN depletion on normal pancreatic function.
We agree that this would be very informative. However, to do this would require initiation of a new breeding program that will take more than a year to arrive at the right genotypes. Although valuable, it is beyond the scope of the present study.
The authors should also delineate the role of hepatic SMN in pancreatic function, and how the intrinsic liver-specific loss of SMN directly impacts the pancreas. Currently, literature demonstrates that the fatty liver phenotype in SMA patients is a primary SMN-dependent hepatocyte-intrinsic liver defect associated with mitochondrial and other hepatic metabolism implications (see Leow et al, 2024 J Clin Invest). Given that the authors describe that SMN protein levels are not altered in the pancreas of AlbCre/+;Smn2B/F7 mice at P19, the authors ought to clarify how pancreas development and function is impacted in this mouse model, whether in-utero or postnatally. This could potentially underscore the cross-talk between liver SMN and pancreas function.
We have discussed the relationship between hepatic SMN and pancreatic function in the Discussion at lines 284-298 and 334-359.
Authors should also perform some metabolic tolerance tests to both oral glucose and insulin at an older age (e.g. P60) to study their homeostasis in these mice. These would help to substantiate the authors' conclusion and provide the paper with a greater level of novelty.
We thank the reviewer for this suggestion. A full panel of metabolic hormones related to glucose metabolism at P19 and P60 has been included, supported by additional figures that enhance the manuscript's novelty and depth.
Authors mentioned in the Discussion in lines 238 to 240: "Altogether, our findings underscore the necessity of conducting further investigations at later time points to unveil potential modifications in other pathways and their repercussions on liver physiology". Please elucidate the effects of longer term liver-specific depletion of SMN beyond P19, such as the onset of NAFLD or a diabetic phenotype due to pancreatic dysfunctions.
We extended our data to include P60 mice and performed liver and pancreatic analyses at this time point. The observed effects were transient, possibly due to the stochastic nature of Cre expression.
In addition, while AlbCre/+;Smn2B/F7 mice had similar weight gain trends as controls, it does appear that AlbCre/+;Smn2B/F7 mice weigh more than their controls by P60 (Figure 9C). This data would provide more convincing evidence of the metabolic defects observed in these mice.
As per the reviewer’s suggestion, we included new data (Figure 9D) showing % weight gain at P60 normalized to basal weight at P7. However, no statistically significant differences were detected.
Other than protein quantification, authors should perform immunohistochemistry or in-situ hybridization of SMN and imaging of AlbCre/+;Smn2B/F7 organs to validate the loss of liver-specific SMN. It is unclear from western blots that the expression of SMN is only in hepatocytes.
We thank the reviewer for the suggestion. Unfortunately, SMN antibodies have not produced reliable tissue immunostaining. To address this, we performed Cre immunostaining of the liver, pancreas, and spinal cord at P19, and the liver at P60, which demonstrated liver-specific expression. These results are now included in Supplemental Figure 1.
Authors should consider re-wording lines 228 through 231: "While our current analysis did not reveal significant differences in AlbCre/+;Smn2B/F7 mice, the observed upward trend in transferrin and HO levels suggests ongoing changes in iron metabolism, which may not be fully manifested at P19". Alternatively, a higher number of mouse samples would allow them to qualify this statement. Authors should also consider comparing levels of liver biomarkers such as ALT and AST, to check for liver homeostatic function.
We have removed speculative statements to avoid unsupported claims.
Recommendations:
The methods and additional details to generate the AlbCre/+;Smn2B/F7 should be explained better in section 2.1 of the Results. It is potentially confusing as to why these mice had to carry both 2B and F7 alleles. Additionally, the role of the F7 allele is not deliberately clear in the Introduction.
Additional details are now included in the Introduction (lines 87-90) and the Results section (lines 104-124).
Authors should refer to Leow et al 2024 (J Clin Invest) and discuss how their current findings compare with their hepatocyte-intrinsic SMN deficiency IPSCs model.
We note a previous publication (Deguise et al 2021 Cell Mol Gastroenterol Hepatol) by the authors which characterized the Smn2B/- mouse model and its NAFLD/NASH features. From our understanding, the Smn2B/- mouse model appears to recapitulate SMA phenotype well, such as the early onset of hepatic steatosis and neurological conditions. As a follow-up to this publication, authors should discuss why this current study of a liver-specific SMN depletion is important and relevant to the study of SMA pathology.
We thank the reviewer for the insightful suggestions. We have included a discussion of these findings and their relevance to the study of SMA pathology in lines 284-298 and 309-322.
Minor corrections:
Abstract (line 32) reads: "a decrease in insulin producing alpha-cells and an increase in glucagon producing beta-cells". The authors should clarify and correct as insulin producing beta-cells and glucagon producing alpha-cells.
Thank you for catching the error. We corrected the description of insulin- and glucagon-producing cells.
Please clarify the number and gender of mice used for weight tracking and motor function experiments up to P60 (Figure 9C). It would be inappropriate if male and female mice were plotted together. If so, authors should stratify data by gender.
We thank the reviewer for the suggestion. Unfortunately, we did not stratify the animals by sex due to the unequal and insufficient number of males and females in our study. To address this, we normalized weight gain to each animal’s starting weight, and no significant differences were observed (now shown in Figure 9D).
The number of figures should be reduced. We recommend merging Figures 1 and 2 (generation of AlbCre/+;Smn2B/F7 mouse line and validation) and Figures 3 and 4 (liver function). Figures 5 through 9 may be supplemental figures instead.
We thank the reviewer for the suggestions. We merged Figures 1 and 2, and Figures 3 and 4, as requested. However, we would prefer to keep the other figures within the main results as they assess the impact of liver-specific depletion of SMN on other pathologies within the mouse model.
Standardize the use of asterisks and reporting p-values in Figure 2. All other figures in the manuscript utilize asterisks, but Figures 2C', 2D' and 2E' use p-values across comparisons.
P-values were included only when they approached statistical significance, providing additional clarity to the results.
It is unclear what the white arrow in Figure 7A indicates.
It is meant to point out the absence of an innervating axon. Please see Figure 5 legend, lines 801-802.
Note spelling errors in Figures 8B and 8C: 'Muscle flber'.
Thank you for catching this. We have corrected the typo to indicate muscle fiber instead.
Please clarify if muscle fiber size should be indicated as µm2 instead of µ2 in Figures 8B and 8C, as written in Materials and Methods under line 394.
Thank you for catching this. We corrected the typo to indicate µm2 instead.
www.biorxiv.org
Author response:
The following is the authors’ response to the original reviews.
Reviewer 1:
(1) The overall conclusion, as summarized in the abstract as "Together, our study documents the diversification of locomotor and oculomotor adaptations among hunting teleost larvae" is not that compelling. What would be much more interesting would be to directly relate these differences to different ecological niches (e.g. different types of natural prey, visual scene conditions, height in water column etc), and/or differences in neural circuit mechanisms. While I appreciate that this paper provides a first step on this path, by itself it seems on the verge of stamp collecting, i.e. collecting and cataloging observations without a clear, overarching hypothesis or theoretical framework.
There are limited studies on the prey capture behaviors of larval fishes, and ours is the first to compare multiple species systematically using a common analysis framework. Our analysis approach could have uncovered a common set of swim kinematics and capture strategies shared by all species; but instead, we found that medaka used a monocular strategy rather than the binocular strategy of cichlids and zebrafish. Our analysis similarly could have revealed first-feeding larvae of all species go through a “bout” stage, which was previously proposed as important for sensorimotor decision making (Bahl et al., 2019), but instead we found that medaka and some cichlids have more continuous swimming from an early life stage. Finally, the rate at which prey capture kinematics evolves is not known. Our approach could have revealed rapid diversification of feeding strategies in cichlids (similarly to how adult feeding behavior evolves), but instead we found smaller differences within cichlids than between cichlids and medaka.
(2) The data to support some of the claims is either weak or lacking entirely.
Highlighted timestamps in the videos, new statistics in Fig. 1H and Fig. 2, and updated supplementary figures now provide additional support for our claims.
- It would be helpful to include previously published data from zebrafish for comparison.
We appreciate the suggestion. Mearns et al. (2020) provided a comprehensive account of prey capture in zebrafish larvae in an almost identical setup with similar analyses. We do not feel it is necessary to recount all the findings in that paper here. There are many studies on prey capture in zebrafish from the past 20 years, and reproducing these here would not add anything to that extensive pre-existing literature.
- Justification is required for why it is meaningful to compare hunting strategies when both fish species and prey species are being varied. For instance, artemia and paramecia are different sizes and have different movement statistics.
We added text explaining why different food was chosen for medaka and cichlids. There is no easy way to stage-match fishes as evolutionarily diverged as cichlids, medaka, and zebrafish. Size is a reasonable metric within a species, but there is no guarantee that size-matched larvae of two different species are at the same level of maturity. Therefore, we thought the most appropriate stage to address is when larvae first start feeding, as this enables us to study innate prey capture behavior before any learning or experience-dependent changes have taken place. Given that zebrafish, medaka and cichlid larvae are different sizes when they first start feeding, it was necessary to study their hunting behavior with different prey items.
- It would be helpful in Figure 1A to add the abbreviations used elsewhere in the paper. I found it slightly distracting that the authors switch back and forth in the paper between using "OL" and "medaka" to refer to the same species: please pick one and then remain consistent.
Medaka is the common name for the Japanese rice fish, O. latipes. Cichlids do not have common names and are only referred to by their scientific names. Since readers are more likely to be familiar with the common name, medaka, we now use medaka (OL) throughout the manuscript, which we hope makes the text clearer.
- The conceptual meaning of behavioral segmentation is somewhat unclear. For zebrafish, the bouts already come temporally segmented. However in medaka for instance, swimming is more continuous, and the segmentation is presumably more in terms of "behavioral syllables" as have been discussed for example mouse or drosophila behavior (in the last row of Figure S1 it is not at all obvious why some of the boundaries were placed at their specific locations). It's not clear whether it's meaningful to make an equivalence between syllables and bouts, and so whether for instance Figure 1H is making an apples-to-apples comparison.
We clarified the text to say we are comparing syllables, rather than bouts.
- The interpretation of 1H is that "medaka exhibited significantly longer swims than cichlids"; however this is not supported by the appropriate statistical test. The KS test only says that two probability distributions are different; to say that one quantity is larger than another requires a comparison of means.
We updated Fig. 1H: we now use a bootstrap test (difference of medians) and have replotted the data as violin plots.
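The response names a bootstrap test on the difference of medians but does not describe the procedure. The following is a minimal sketch of one common way to run such a test, not the authors' implementation. The function name `bootstrap_median_diff`, the number of resamples, the percentile interval, and the shifted-null p-value are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_median_diff(a, b, n_boot=5000, seed=0):
    """Bootstrap median(a) - median(b) (hypothetical helper, not from the paper).

    Returns the observed difference, a 95% percentile interval, and a
    two-sided p-value obtained by resampling under a shifted null.
    """
    a, b = list(a), list(b)
    rng = random.Random(seed)
    med_a, med_b = statistics.median(a), statistics.median(b)
    observed = med_a - med_b

    # Percentile interval: resample each group with replacement.
    diffs = []
    for _ in range(n_boot):
        ra = rng.choices(a, k=len(a))
        rb = rng.choices(b, k=len(b))
        diffs.append(statistics.median(ra) - statistics.median(rb))
    diffs.sort()
    ci_low = diffs[int(0.025 * n_boot)]
    ci_high = diffs[min(n_boot - 1, int(0.975 * n_boot))]

    # Null distribution: shift both groups so they share the pooled median,
    # then count resampled differences at least as extreme as the observed one.
    pooled = statistics.median(a + b)
    a0 = [x - med_a + pooled for x in a]
    b0 = [x - med_b + pooled for x in b]
    extreme = 0
    for _ in range(n_boot):
        d = (statistics.median(rng.choices(a0, k=len(a0)))
             - statistics.median(rng.choices(b0, k=len(b0))))
        if abs(d) >= abs(observed):
            extreme += 1
    p_value = (extreme + 1) / (n_boot + 1)

    return {"diff": observed, "ci_low": ci_low,
            "ci_high": ci_high, "p_value": p_value}
```

Unlike the KS test, which only reports that two distributions differ, the sign of `diff` together with an interval that excludes zero supports a directional statement such as "longer swims".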
- I think the evidence that there are qualitatively different patterns of eye convergence between species is weak. In Figure 2A I admire the authors addressing this using BIC, and the distributions are clearly separated in LA (the Hartigan dip test could be a useful additional test here). However for LO, NM, and AB the distributions only have one peak, and it's therefore unclear why it's better to fit them with two Gaussians rather than e.g. a gamma distribution. Indeed the latter has fewer parameters than a two-gaussian model, so it would be worthwhile to use BIC to make that comparison. The positions of the two Gaussians for LO, NM, and AB are separated by only a handful of degrees (cf LA, where the separation is ~20 degrees), which further supports the idea that there aren't really two qualitatively different convergence states here.
Added explanation to text.
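For reference, the model comparison the reviewer proposes can be written out as below. This is a sketch of the standard BIC definition only; the parameter counts assume a univariate two-component Gaussian mixture and a two-parameter (shape, scale) gamma distribution, which the source does not specify.

```latex
\mathrm{BIC} = k \ln n - 2 \ln \hat{L}
\qquad
\begin{aligned}
k_{\text{two Gaussians}} &= 5 \quad (\mu_1, \mu_2, \sigma_1, \sigma_2, \pi_1) \\
k_{\text{gamma}} &= 2 \quad (\text{shape}, \text{scale})
\end{aligned}
```

Here $n$ is the number of eye convergence measurements and $\hat{L}$ is the maximized likelihood of each model. The model with the lower BIC is preferred, so the two-Gaussian fit must improve the log-likelihood by more than $\tfrac{3}{2}\ln n$ to justify its three extra parameters.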
- Figure S2 is unfortunately misleading in this regard. I don't claim the authors aimed to mislead, but they have made the well-known error of using colors with very different luminances in a plot where size matters (see e.g. https://www.r-project.org/conferences/DSC2003/Proceedings/Ihaka.pdf).
Thus, to the eye, it appears there's a big valley between the red and blue regions, but actually, that valley is full of points: it's really just one big continuous blob.
Kernel density estimates of eye convergence angles were added to Figure S2. The point we wish to make is that there is higher density when both eyes are rotated inwards (converged) in cichlids, but not medaka (O. latipes). The valley between converged and unconverged states being full of points is due to (1) slight variation in the placement of key points in SLEAP, which blurs the boundary between states, and (2) the fact that the eye convergence angle must pass through the valley in order to become converged, so necessarily there are points in between the two extremes of eye convergence.
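To illustrate how a kernel density estimate separates "many points" from "high density", here is a minimal one-dimensional sketch using a Gaussian kernel. It is not the authors' analysis: the function name `gaussian_kde`, the bandwidth, and the example angles are assumptions, and Figure S2 may use a different kernel or a two-dimensional estimate.

```python
import math


def gaussian_kde(samples, grid, bandwidth):
    """Evaluate a 1D Gaussian kernel density estimate at each grid point.

    Hypothetical helper for illustration; `bandwidth` is the kernel
    standard deviation, in the same units as the samples (degrees here).
    """
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    density = []
    for x in grid:
        total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                    for s in samples)
        density.append(norm * total)
    return density


# Made-up convergence angles (degrees): one cluster of unconverged frames
# near 25 and one cluster of converged frames near 50.
angles = [22, 24, 25, 26, 28, 48, 50, 51, 52, 54]
grid = [i * 0.5 for i in range(161)]  # 0 to 80 degrees in 0.5 degree steps
density = gaussian_kde(angles, grid, bandwidth=3.0)
```

A valley in `density` between the two clusters indicates lower density there, even when a scatter plot of the same data shows points in that region.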
- In Figure 2D please could the authors double-check the significance of the difference between LO and NM: they certainly don't look different in the plot.
Thank you for flagging this. We realize the way we previously reported the stats was open to misinterpretation. We have updated figure 2C, D and F to use letters to indicate statistical groupings, which hopefully makes it clearer which species are statistically different from each other.
- In Figure 2G it's not clear why AB is not included. It is mentioned that the artemia was hard to track in the AB videos, but the supplementary videos provided do not support this.
The contrast of the artemia in the AB videos is sufficiently different from the other cichlid videos that our pre-trained YOLO model fails. Retraining the model would be a lot of extra work, and we feel that a comparison of three species is sufficient to address the sensorimotor transformations that occur over the course of prey capture in cichlids.
- The statement "Zebrafish larvae have a unique swim repertoire during prey capture, which is distinct from exploratory swim bouts" is not supported by the work of others or indeed the authors' own work. In Figure 4F all types of bouts can occur at any time, it's just the probability at which they occur that varies during prey capture versus other times (see also Mearns et al (2020) Figure S4B).
The point is well taken that there probably is not a hard separation between spontaneous and prey capture swims based on tail kinematics alone, which is also shown in Marques et al. (2018). However, we think that figure 2I of Mearns et al., which plots the probability of swims being drawn from different parts of the behavior space during prey capture (eyes converged) or not (eyes unconverged), shows that the repertoire of swims during the two states is substantially different. Points are blue or red; there are very few pale blue/pale red points in that figure panel. Figure S4B is showing clustered data, and clustering is a notoriously challenging problem for which there exists no perfect solution (Kleinberg, 2002). The clusters in Mearns et al. incorporated information about transition structure, as this was necessary for obtaining interpretable clusters for subsequent analyses. However, a different clustering approach could have yielded different boundaries, which may have shown more (or less) separation of bout types during prey capture/exploratory swimming. Therefore, we have updated the text to say that zebrafish preferentially perform different swim types during prey capture and exploration, and re-interpreted the behavior of cichlids similarly.
- More discussion is warranted of the large variation in the number of behavioral clusters found between species (11-32). First, how much is this variation really to be trusted? I appreciate the affinity propagation parameters were the same in all cases, but what parameters "make sense" is somewhat dependent on the particular data set. Second, if one does believe this represents real variation, then why? This is really the key question, and it's unsatisfying to merely document it without trying to interpret it.
Extended paragraph with more interpretation.
- What is the purpose of "hovers"? Why not stay motionless? Could it be a way of reducing the latency of a subsequent movement? Is this an example of the scallop theorem?
Added a couple of sentences speculating on function.
- I'm not sure "spring-loaded" is a good term here: the tension force of a coiled tail is fairly negligible since there's little internal force actively trying to straighten it.
Rewrote this part to highlight that fish spring toward the prey, without the implication that tension forces in the tail are responsible for the movement. However, we are not aware of any literature measuring passive forces within the tail of fishes. Presumably the notochord is relatively stiff and may provide an internal force trying to straighten the tail.
- There are now several statements for which no direct evidence is presented. We shouldn't have to rely on the author's qualitative impressions of what they observed: show us quantitative analysis.
* "often hover"
* "cichlids often alternate between approaches and hover swims"
* "over many hundreds of milliseconds"
* "we have also observed suction captures and ram-like attacks"
* "may swim backwards"
* "may expel prey from their mouth"
* "cichlid captures often occur in two phases"
Added references to supplementary videos with timestamps to highlight these behaviors.
- I don't find it plausible that sated fish continue hunting prey that they know they're not going to eat just for the practice.
Removed the speculation.
- In Figure 3 is it not possible to include medaka, based on the hand-tracked paramecia?
The videos are recorded at a high frame rate, so it would be a lot of additional work to track these manually. Furthermore, earlier in prey capture it is very difficult to tell by watching videos which prey the medaka are tracking, especially as single paramecia can drift in and out of focus in the videos. Since there is no eye convergence, it is very difficult to determine with certainty when tracking of a given prey begins. In Fig 4, it was only possible to track paramecia by hand since it is immediately prior to the strike and from the video it is possible to see which paramecium the fish targeted. Our analysis of heading changes was performed over the 200 ms prior to a strike, which we think is a conservative enough cutoff to say that fish were probably pursuing prey in this window (it is shorter than the average behavioral syllable duration in medaka).
- Figure 3 (particularly 3D) suggests the interesting finding that LA essentially only hunt prey that is directly in front of them (unlike LO and NM, the distribution of prey azimuth actually seems to broaden slightly over the duration of hunting events).
This is worthy of discussion.
We offer a suggestion for the many instances of prey capture being initiated in the central visual field in LA later in the manuscript when we discuss spitting behavior. We have added text to make this point earlier in the manuscript. The increase in azimuthal range at the end of prey capture may be due to abort swims (e.g. supp. vid. 1, 00:21). The widening of azimuthal angles is present in LO and NM also and is not unique to LA.
- The reference Ding et al (2016) is not in the reference list.
Wrong paper was referenced. Should be Ding 2019, which has been added to bibliography.
- I am not convinced that medaka exhibit a unique side-swing behavior. I agree there is this tendency in the example movie, however, the results of the quantification (Figure 4) are underwhelming. First, cluster 5 in 4K appears to include a proportion of cases from LA and AB. These proportions may be small, but anything above zero means this is not unique to medaka. Second, the heading angle (4N) starts at 4 degrees for LA and 8 degrees for medaka. This difference is genuine but very small, much smaller than what's drawn in the schematic (4M). I'm not sure it's justifiable to call a difference of 4 degrees a qualitatively different strategy.
We have changed the text to highlight that side swing is highly enriched in medaka. Comparing 4J to 3B we would argue that there is a qualitative difference in the strategy used to capture prey in the cichlid larvae we study here and medaka. We agree that further work is required to understand distance estimation behaviors in different species. In this manuscript, we use heading angle as a proxy for how prey position might change on the retina over a hunting sequence. But as the heading and distance are changing over time, the actual change in angle on the retina for prey may be much larger than the ~8 degree shift reported here. The actual position of the prey is also important here, which, for reasons mentioned above, we could not track. Given the final location of prey in the visual field prior to the strike (Fig 4J), the most parsimonious explanation of the data is that the prey is always in the monocular visual field. In cichlids, the prey is more-or-less centered in the 200 ms preceding the strike. While it is true that the absolute difference in heading is 4 degrees, when converted to an angular velocity (4N, right), the medaka (OL) effectively rotate twice as fast as LA (40 deg/s vs 20 deg/s), which we think is a substantial difference and evidence of a different targeting strategy.
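As a rough illustration of the conversion from heading change to angular velocity, the sketch below divides each heading change by the 200 ms pre-strike window. The 4 and 8 degree values are the approximate starting headings quoted in the reviewer's comment, and the function name is hypothetical; the actual computation behind panel 4N may differ.

```python
def mean_angular_velocity(heading_change_deg: float, window_s: float) -> float:
    """Average angular velocity (deg/s) for a heading change over a time window."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    return heading_change_deg / window_s


WINDOW_S = 0.2  # the 200 ms window preceding a strike

# Approximate heading changes taken from the reviewer's reading of Fig. 4N
# (assumed values, used here for illustration only).
la_velocity = mean_angular_velocity(4.0, WINDOW_S)  # cichlid (LA)
ol_velocity = mean_angular_velocity(8.0, WINDOW_S)  # medaka (OL)

print(f"LA: {la_velocity:.0f} deg/s, OL: {ol_velocity:.0f} deg/s")
```

Under these assumed inputs, a 4 degree difference in heading corresponds to a twofold difference in average rotation rate over the same window.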
- 4K: This is referred to in the caption as a confusion matrix, which it's not.
Fixed.
- 4N right panel: how many fish contributed to the points shown?
Added to figure legend (n=113, LA; n=36, OL). Same data in left and right panels.
- In the Discussion it is hypothesized that medaka use their lateral line in hunting more than in other species. Testing this hypothesis (even just compared to one other species) would be fairly straightforward, and would add significant interest to the paper overall.
We agree that this is an interesting experiment for follow up studies, but it is beyond the scope of the current manuscript as we do not have the appropriate animal license for this experiment.
Reviewer 2:
The paper is rather descriptive in nature, although more context is provided in the discussion. Most figures are great, but I think the authors could add a couple of visual aids in certain places to explain how certain components were measured.
Added new supplemental figure (Supp Fig 2)
Figure 1B- it could be useful to add zebrafish and medaka to the scientific names (I realize it's already in Figure A but I found myself going back and forth a couple of times, mostly trying to confirm that O. latipes is medaka).
Added common names to 1B, sprinkled reminders of OL/medaka throughout text.
Figure 1G. I wasn't sure how to interpret the eye angle relative to the midline. Can they rotate their eyes or is this due to curvature in the 'upper' body of the fish? Adding a schematic figure or something like that could help a reader who is not familiar with these methods. Related to this, I was a bit confused by Figure 2A. After reading the methods section, I think I understand - but a little cartoon to describe this would help. It also reminds the reader (especially if they don't work with fish) that fish eyes can rotate. I also wanted to note that initially, I thought convergence was a measure of how the two eyes were positioned relative to the prey given the emphasis given on binocular vision, and only after reading certain sections again did I realize convergence was a measure of eye rotation/movement.
New supplemental figure explaining how eye tracking is performed
Figure 3. It was not immediately clear to me what onset, middle, and end represented - although it is explained in the caption. I think what tripped me up is the 'eye convergence' title in the top right corner of Figure 3A.
Updated figure with schematic illustrating that time is measured relative to eye convergence onset and end.
The result section about attack swim, S-strike, capture spring, etc. was a bit confusing to read and could benefit from a couple of concise descriptions of these behaviors. For example, I am not familiar with the S strike but a couple of paragraphs into this section, the reader learns more about the difference between S strike vs. attack swim. This can be mentioned in the first paragraph when these distinct behaviors are mentioned.
Added description of behavior earlier in text.
Figure 4. Presents lots of interesting data! I wonder if using Figure 1E could help the reader better understand how these measurements were taken.
New supplemental figure added, explaining how tail tracking is performed.
I probably overlooked this, but I wonder why so many panels are just focused on one species.
Added explanation to the text.
Is the S-shaped capture strategy the same as an S strike?
Clarified in text to say "S-strike-like". This is a description of prey capture from adult largemouth bass in New et al. (2002). From the still frames shown in that paper, the kinematics look similar to an S-strike or capture spring. The important point we wish to make is that the tail is coiled in an S-shape prior to a strike, which indicates that a kinematically similar behavior exists in fishes beyond just larval cichlids and zebrafish.
At the end of the page, when continuous swimming versus interrupted swimming is discussed, please remind the reader that medaka shows more continuous swimming (longer bouts).
Added "while medaka swim continuously with longer bouts ("gliding")".
After reading the discussion, it looks like many findings are unique. For example, given that medaka is such a popular model species in biology, it strikes me that nobody has ever looked into their hunting movements before. If their findings are novel, perhaps they should state so it is clear that the authors are not ignoring the literature.
We have highlighted what we believe to be the novelty of our findings (first description of prey capture in larval cichlids and medaka). To our knowledge, we are the first to describe hunting in medaka; but there is an extensive literature on medaka dating back to the early 20th century, some of which is only published in Japanese. We have done our best to review the literature, but we cannot rule out that there are papers that we missed. No English language article or review we found mentions literature on hunting behavior in medaka larvae.
Reviewer 3:
More evidence is needed to assess the types of visual monocular depth cues used by medaka fish to estimate prey location, but that is beyond the scope of this compelling paper. For example, medaka may estimate depth through knowledge of expected prey size, accommodation, defocus blur, ocular parallax, and/or other possible algorithms to complement cues from motion parallax.
Added sentence to discussion highlighting that other cues may also contribute to distance estimation in cichlids and medakas. Follow-up studies will require new animal license.
None. It's quite nice, timely, and thorough work! For future work, one could use 3D pose estimation of eye and prey kinematics to assess the dynamics of the 2D image (prey and background) cast onto the retina. This sort of representation could be useful to infer which monocular depth cues may be used by medaka during hunting.
Great suggestion for follow up studies. Bolton et al. and Mearns et al. both find changes in z associated with prey capture, and it would be interesting to see how other fish species use the full 3-dimensional water column during prey capture, especially considering the diversity of hunting strategies in adult cichlids (ranging from piscivorous species, like LA, to algal grazers).
In Figure 4N, you use "change in heading leading up to a strike as a proxy for the change in visual angle of the prey for cichlids and medaka." This proxy makes sense, but you also have the eye angles and (in some cases) the prey positions. One could estimate the actual change in visual angle from this information, which would also allow one to measure whether the fish are trying to stabilize the position of the prey on a high-acuity patch of the retina during the final moments of the hunt. This information may also shed light on which monocular depth cues are used.
As addressed in comment to reviewer 1, this would require actually manually tracking individual paramecia over hundreds of frames. It is not possible to determine exactly when hunting begins in medaka, and it is prone to errors if medaka switch between targets over the course of a hunting episode. This question is better addressed with psychophysics experiments in embedded animals where it is possible to precisely control the stimulus, but this requires new animal licenses and is beyond the scope of this paper.
In Figure 5, you could place the prey object a little farther from the D. rerio fish for the S-strike diagram.
Fixed.
Figure 4F legend should read "...at the peak of each bout."
Fixed.
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Thank you for your constructive feedback and recognition of our work. We followed your suggestion and improved the accuracy of the language used to interpret some of our findings.
Summary:
The present study by Mikati et al demonstrates an improved method for in-vivo detection of enkephalin release and studies the impact of stress on the activation of enkephalin neurons and enkephalin release in the nucleus accumbens (NAc). The authors refine their pipeline to measure met and leu enkephalin using liquid chromatography and mass spectrometry. The authors subsequently measured met and leu enkephalin in the NAc during stress induced by handling, and fox urine, in addition to calcium activity of enkephalinergic cells using fiber photometry. The authors conclude that this improved tool for measuring enkephalin reveals experimenter handling stress-induced enkephalin release in the NAc that habituates and is dissociable from the calcium activity of these cells, whose activity doesn't habituate. The authors subsequently show that NAc enkephalin neuron calcium activity does habituate to fox urine exposure, is activated by a novel weigh boat, and that fox urine acutely causes increases in met-enk levels, in some animals, as assessed by microdialysis.
Strengths:
A new approach to monitoring two distinct enkephalins and a more robust analytical approach for more sensitive detection of neuropeptides. A pipeline that potentially could help for the detection of other neuropeptides.
Weaknesses:
Some of the interpretations are not fully supported by the existing data or would require further testing to draw those conclusions. This can be addressed by appropriately tempering interpretations and acknowledging other limitations, not covered by the authors, that arise from procedural differences between experiments.
We have taken time to go through the manuscript ensuring we are more detailed and precise with our interpretations as well as appropriately acknowledging limitations.
Reviewer #2 (Public Review):
Thank you for your constructive and thorough assessment of our work. In our revised manuscript, we adjusted the text to reflect the references you mentioned regarding the methionine oxidation procedure. Additionally, we expanded the methods section to include the key details of the statistical tests and procedures that you outlined.
Summary:
The authors aimed to improve the detection of enkephalins, opioid peptides involved in pain modulation, reward, and stress. They used optogenetics, microdialysis, and mass spectrometry to measure enkephalin release during acute stress in freely moving rodents. Their study provided better detection of enkephalins due to the implementation of previously reported derivatization reaction combined with improved sample collection and offered insights into the dynamics and relationship between Met- and Leu-Enkephalin in the Nucleus Accumbens shell during stress.
Strengths:
A strength of this work is the enhanced opioid peptide detection resulting from an improved microdialysis technique coupled with an established derivatization approach and sensitive and quantitative nLC-MS measurements. These improvements allowed basal and stimulated peptide release with higher temporal resolution, lower detection thresholds, and native-state endogenous peptide measurement.
Weaknesses:
The draft incorrectly credits itself for the development of an oxidation method for the stabilization of Met- and Leu-Enk peptides. The use of hydrogen peroxide reaction for the oxidation of Met-Enk in various biological samples, including brain regions, has been reported previously, although the protocols may slightly vary. Specifically, the manuscript writes about "a critical discovery in the stabilization of enkephalin detection" and that they have "developed a method of methionine stabilization." Those statements are incorrect and the preceding papers that relied on hydrogen peroxide reaction for oxidation of Met-Enk and HPLC for quantification of oxidized Enk forms should be cited. One suggested example is Finn A, Agren G, Bjellerup P, Vedin I, Lundeberg T. Production and characterization of antibodies for the specific determination of the opioid peptide Met5-Enkephalin-Arg6-Phe7. Scand J Clin Lab Invest. 2004;64(1):49-56. doi: 10.1080/00365510410004119. PMID: 15025428.
Thank you for highlighting this. It was not our intention to imply that we developed the oxidation method, rather that we were able to improve the detection of Met-enkephalin by oxidation of the methionine without compromising the detection resolution of Leu-enkephalin, enabling the simultaneous detection of both peptides. We have addressed this in the manuscript and included the suggested citation.
Another suggestion for this draft is to make the method section more comprehensive by adding information on specific tools and parameters used for statistical analysis:
(1) Need to define "proteomics data" and explain whether calculations were performed on EIC for each m/z corresponding to specific peptides or as a batch processing for all detected peptides, from which only select findings are reported here. What type of data normalization was used, and other relevant details of data handling? Explain how Met- and Leu-Enk were identified from DIA data, and what tools were used.
Thank you for pointing out this source of confusion. We believe it is because we use a different DIA method than is typically used in other literature. Briefly, we use a DIA method with a targeted inclusion list to ensure MS2 triggering as opposed to using large isolation widths to capture all precursors for fragmentation, as is typically done with MS1 features. For our method, MS2 is triggered based on the 4 selected m/z values (heavy and light versions of Leu and Met-Enkephalin peptides) at specific retention time windows with an isolation width of 2 Da, regardless of the MS1 intensity of the peptides.
(2) Simple Linear Regression Analysis: The text mentions that simple linear regression analysis was performed on forward and reverse curves, and line equations were reported, but it lacks details such as the specific variables being regressed (although figures have labels) and any associated statistical parameters (e.g., R-squared values).
Additional detail about the linear regression process was added to the methods section, please see lines 614-618. The R squared values are also now shown on the figure.
‘For the forward curves, the regression was applied to the measured concentration of the light standard as the theoretical concentration was increased. For plotting purposes, we show the measured peak area ratios for the light standards in the forward curves. For the reverse curves, the regression was applied to the measured concentration of the heavy standard, as the theoretical concentration was varied.’
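A minimal sketch of a simple linear regression with its R-squared value is given below, assuming ordinary least squares of measured against theoretical concentration. The data points and the function name `fit_line` are hypothetical and chosen only for illustration; they are not values from the manuscript.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus R^2."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical forward-curve points: theoretical vs measured concentration
# of the light standard (arbitrary units).
theoretical = [1.0, 5.0, 10.0, 50.0, 100.0]
measured = [1.2, 4.7, 10.5, 48.9, 101.3]

slope, intercept, r_squared = fit_line(theoretical, measured)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r_squared:.4f}")
```

The same function could be applied to a reverse curve by passing the heavy-standard values instead.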
(3) Violin Plots: The proteomics data is represented as violin plots with quartiles and median lines. This visual representation is mentioned, but there is no detail regarding the software/tools used for creating these plots.
We used GraphPad Prism to create these plots. This detail has been added to the statistical analysis section. See line 630.
(4) Log Transformation: The text states that the data was log-transformed to reduce skewness, which is a common data preprocessing step. However, it does not specify the base of the logarithm used or any information about the distribution before and after transformation.
We have added the requested details about the log transformation, and how the data looked before and after, into the statistical analysis section. We followed the convention that log refers to base 10 unless otherwise specified as the natural log (base e) or a different base. See lines 622-625.
‘The data was log10 transformed to reduce the skewness of the dataset caused by the variable range of concentrations measured across experiments/animals. Prior to log transformation, the measurements failed normality testing for a Gaussian distribution. After the log transformation, the data passed normality testing, which provided the rationale for the use of statistical analyses that assume normality.’
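The effect of a log10 transformation on skewness can be illustrated with a small sketch. The concentrations below are hypothetical, and the sketch only compares skewness before and after the transformation; it does not reproduce the normality testing described above, whose specific test is not stated here.

```python
import math
from statistics import mean


def skewness(values):
    """Biased (population) skewness: third central moment over sigma cubed."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical concentrations spanning a wide range across animals.
concentrations = [1.0, 2.0, 3.0, 5.0, 8.0, 20.0, 60.0, 150.0]
log_concentrations = [math.log10(c) for c in concentrations]

skew_raw = skewness(concentrations)
skew_log = skewness(log_concentrations)
print(f"skewness before: {skew_raw:.2f}, after log10: {skew_log:.2f}")
```

For right-skewed data like this example, the log-transformed values are much closer to symmetric.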
(5) Two-Way ANOVA: Two-way ANOVA was conducted with peptide and treatment as independent variables. This analysis is described, but there is no information regarding the software or statistical tests used, p-values, post-hoc tests, or any results of this analysis.
Information about the two-way ANOVA analysis has been added to the statistical analysis section. Additionally, more detailed information has been added to the figure legends about the statistical results. Please see lines 625-628.
‘Two-way ANOVA testing was performed with peptide (Met-Enk or Leu-Enk) and treatment (buffer or stress for example) as the two independent variables. Post-hoc testing was done using Šídák's multiple comparisons test and the p values for each of these analyses are shown in the figures (Figs. 1F, 2A).’
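For readers unfamiliar with the Šídák correction, the sketch below shows only the adjustment formula applied to a family of unadjusted p values. It does not reproduce the full post-hoc procedure of a statistics package (which also depends on the ANOVA error term), and the input p values are hypothetical.

```python
def sidak_adjust(p_values):
    """Šídák-adjusted p values: p_adj = 1 - (1 - p) ** m for m comparisons."""
    m = len(p_values)
    return [1.0 - (1.0 - p) ** m for p in p_values]


# Hypothetical unadjusted p values for two comparisons
# (e.g. one per peptide, buffer vs treatment).
raw_p = [0.01, 0.04]
adjusted_p = sidak_adjust(raw_p)
print([round(p, 4) for p in adjusted_p])
```

With a single comparison the adjusted value equals the raw value, and the adjustment grows with the number of comparisons in the family.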
(6) Paired T-Test: A paired t-test was performed on predator odor proteomic data before and after treatment. This step is mentioned, but specific details like sample sizes, and the hypothesis being tested are not provided.
The sample size is included in the figure legend to which we have included a reference. We have also included the following text to highlight the purpose of this test. See lines 628-630
‘A paired t-test was performed on the predator odor proteomic data before and after odor exposure to test the hypothesis that Met-Enk increases following exposure to predator odor (Fig. 3F). These analyses were conducted using GraphPad Prism.’
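The paired t statistic underlying such a test can be sketched as follows. The before and after values are hypothetical, the function name is illustrative, and the sketch stops at the statistic and degrees of freedom; obtaining a p value would additionally require the t distribution, which a statistics package provides.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for paired samples, using after - before."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    # t = mean difference divided by its standard error
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical per-animal values before and after odor exposure.
before = [10.0, 12.0, 9.0, 11.0]
after = [13.0, 15.0, 10.0, 15.0]

t_stat, dof = paired_t_statistic(before, after)
print(f"t = {t_stat:.3f}, df = {dof}")
```

A positive t here indicates that values were higher after exposure, matching the direction of the stated hypothesis.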
(7) Correlation Analysis: The text mentions a simple linear regression analysis to correlate the levels of Met-Enk and Leu-Enk and reports the slopes. However, details such as correlation coefficients, and p-values are missing.
We apologize for the use of the word correlation as we think it may have caused some confusion and have adjusted the language accordingly. Since this was a linear regression analysis, there is no correlation coefficient. The slope of the fitted line is reported on the figures to show the fitted values of Met-Enk to Leu-Enk.
(8) Fiber Photometry Data: Z-scores were calculated for fiber photometry data, and a reference to a cited source is provided. This section lacks details about the calculation of z-scores, and their use in the analysis.
These details have been added to the statistical analysis section. See lines 634-637
‘For the fiber photometry data, the z-scores were calculated using GuPPy, which is an open-source Python toolbox for fiber photometry analysis. The z-score equation used in GuPPy is z = (DF/F - mean of DF/F) / (standard deviation of DF/F), where F refers to fluorescence of the GCaMP6s signal.’
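The z-score definition above can be written out as a short sketch. This is not GuPPy's code: the function name and the example trace are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which form is used.

```python
from statistics import mean, pstdev


def zscore_trace(dff):
    """z = (dF/F - mean(dF/F)) / std(dF/F), computed over the whole trace."""
    mu = mean(dff)
    sigma = pstdev(dff)  # population standard deviation (assumed)
    if sigma == 0:
        raise ValueError("trace has zero variance; z-score is undefined")
    return [(v - mu) / sigma for v in dff]


# Hypothetical dF/F samples from a single recording.
dff_trace = [0.5, 0.7, 0.6, 2.1, 1.8, 0.9, 0.6]
z_trace = zscore_trace(dff_trace)
print([round(z, 2) for z in z_trace])
```

By construction the resulting trace has zero mean and unit standard deviation, which puts recordings from different animals on a common scale.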
(9) Averaged Plots: Z-scores from individual animals were averaged and represented with SEM. It is briefly described, but more details about the number of animals, the purpose of averaging, and the significance of SEM are needed.
We have added additional information about the averaging process in the statistical analysis section. See lines 639-643.
‘The purpose of the averaged traces is to show the extent of concordance of the response to experimenter handling and predator odor stress among animals with the SEM demonstrating that variability. The heatmaps depict the individual responses of each animal. The heatmaps were plotted using Seaborn in Python and mean traces were plotted using Matplotlib in Python.’
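A minimal sketch of how a mean trace and its SEM could be computed across animals is shown below. The traces and the function name are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumption; the plotting step with Matplotlib or Seaborn is omitted.

```python
import math
from statistics import mean, stdev


def mean_and_sem(traces):
    """Pointwise mean and SEM across animals.

    traces: list of per-animal z-score traces, all of the same length.
    """
    n = len(traces)
    if n < 2:
        raise ValueError("need at least two animals to compute SEM")
    if len({len(t) for t in traces}) != 1:
        raise ValueError("all traces must have the same length")
    mean_trace, sem_trace = [], []
    for samples in zip(*traces):  # one tuple per time point
        mean_trace.append(mean(samples))
        sem_trace.append(stdev(samples) / math.sqrt(n))
    return mean_trace, sem_trace


# Hypothetical z-score traces from three animals (four time points each).
animal_traces = [
    [0.0, 0.5, 2.0, 1.0],
    [0.2, 0.7, 1.6, 0.8],
    [-0.1, 0.3, 2.4, 1.2],
]
avg, sem = mean_and_sem(animal_traces)
print([round(v, 2) for v in avg], [round(v, 2) for v in sem])
```

A narrow SEM band around the mean trace indicates that the animals responded similarly at that time point.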
A more comprehensive and objective interpretation of results could enhance the overall quality of the paper.
We have taken this opportunity to improve our manuscript following comments from all the reviewers that we hope has resulted in a manuscript with a more objective interpretation of results.
Reviewer #3 (Public Review):
Thank you for your thoughtful review of our work. To clarify some of the points you raised, we revised the manuscript to include more detail on how we distinguish between the oxidized endogenous and standard signal, as well as to refine the language concerning the spatial resolution. We also edited the manuscript regarding the concentration measurements. We did conduct technical replicates, so we appreciate you raising this point and have clarified that in the main text.
Summary:
This important paper describes improvements to the measurement of enkephalins in vivo using microdialysis and LC-MS. The key improvement is the oxidation of met- to prevent having a mix of reduced and oxidized methionine in the sample which makes quantification more difficult. It then shows measurements of enkephalins in the nucleus accumbens in two different stress situations - handling and exposure to predator odor. It also reports the ratio of released met- and leu-enkephalin matching what is expected from the digestion of proenkephalin. Measurements are also made by photometry of Ca2+ changes for the fox odor stressor. Some key takeaways are the reliable measurement of met-enkephalin, the significance of directly measuring peptides as opposed to proxy measurements, and the opening of a new avenue into the research of enkephalins due to stress based on these direct measurements.
Strengths:
-Improved methods for measurement of enkephalins in vivo.
-Compelling examples of using this method.
-Opening a new area of looking at stress responses through the lens of enkephalin concentrations.
Weaknesses:
(1) It is not clear if oxidized met-enk is endogenous or not and this method eliminates being able to discern that.
We clarified our wording in the text copied below to provide an explanation on how we distinguish between the two. Even after oxidation, the standard signal has a higher m/z ratio due to the presence of the Carbon and Nitrogen isotopes as described in the Chemicals section of the methods ‘For Met Enkephalin, a fully labeled L-Phenylalanine (<sup>13</sup>C<sub>9</sub>, <sup>15</sup>N) was added (YGGFM). The resulting mass shift between the endogenous (light) and heavy isotope-labeled peptide are 7Da and 10Da, respectively.’, so they can still be differentiated from the endogenous signal. We have clarified the language in the results section. See lines 82-87.
‘After each sample collection, we add a consistent known concentration of isotopically labeled internal standard of Met-Enk and Leu-Enk of 40 amol/sample to the collected ISF for the accurate identification and quantification of endogenous peptide. These internal standards have a different mass/charge (m/z) ratio than endogenous Met- and Leu-Enk. Thus, we can identify true endogenous signal for Met-Enk and Leu-Enk (Suppl Fig. 1A,C) versus noise, interfering signals, and standard signal (Suppl. Fig. 1B,D).’
(2) It is not clear if the spatial resolution is really better as claimed since other probes of similar dimensions have been used.
Apologies for any confusion here. To clarify, we primarily state that our approach improves temporal resolution, and in a few cases refer to improved spatiotemporal resolution, which we believe we show. The dimensions of the microdialysis probe used in these experiments allow us to target the nucleus accumbens shell, and the probe is smaller than a fiber photometry probe, especially at the membrane level.
(3) Claims of having the first concentration measurement are not quite accurate.
Thank you for your feedback. To clarify, we do not claim that we have the first concentration measurements, rather we are the first to quantify the ratio of Met-Enk to Leu-Enk in vivo in freely behaving animals in the NAcSh.
(4) Without a report of technical replicates, the reliability of the method is not as well-evaluated as might be expected.
We have added these details in the methods section, please see lines 521-530.
‘Each sample was run in two technical replicates and the peak area ratio was averaged before concentration calculations of the peptides were conducted. Several quality control steps were conducted prior to running the in vivo samples. 1) Two technical replicates of a known concentration were injected and analyzed – an example table from 4 random experiments included in this manuscript is shown below. 2) The buffers used on the day of the experiment (aCSF and high K+ buffer) were also tested for any contaminating Met-Enk or Leu-Enk signals by injecting two technical replicates for each buffer. Once these two criteria were met, the experiment was analyzed through the system. If either step failed, which happened a few times, the samples were frozen and the machines were cleaned and restarted until the quality control measures were met.’
Recommendations For The Authors:
Reviewer #1 (Recommendations For The Authors):
• The authors should provide appropriate citations of a study that has validated the Enkephalin-Cre mouse line in the nucleus accumbens or provide verification experiments if they have any available.
Thank you for your comment. We have added a reference validating the Enk-Cre mouse line in the nucleus accumbens to the methods section and copied it here.
D.C. Castro, C.S. Oswell, E.T. Zhang, C.E. Pedersen, S.C. Piantadosi, M.A. Rossi, A.C. Hunker, A. Guglin, J.A. Morón, L.S. Zweifel, G.D. Stuber, M.R. Bruchas, An endogenous opioid circuit determines state-dependent reward consumption, Nature 598 (2021) 646–651. https://doi.org/10.1038/s41586-021-04013-0.
• Better definition of the labels y1,y2,b3 in Figures 1 and S1 would be useful. I may have missed it but it wasn't described in methods, results, or legends.
Thank you for this comment. We have added this information to the Fig. 1 legend: ‘Y1, y2, b3 refer to the different elution fragments resulting from Met-Enk during LC-MS.’
• It is interesting that the ratio of KCl-evoked release is what changes differentially for Met- vs Leu. Leu enk increases to the range of met-enk. There is non-detectable or approaching being non-detectable leu-enk (below the 40 amol / sample limit of quantification) in most of the subjects that become apparent and approach basal levels of met-enkephalin. This suggests that the K+ evoked response may be more pronounced for leu-enk. This is something that should be considered for further analysis and should be discussed.
Thank you for this astute observation, and you make a great point. We have added some discussion of this finding in the results and discussion sections; see lines 111-112 and lines 253-257.
‘Interestingly, Leu-Enk showed a greater fold change compared to baseline than did Met-Enk with the fold changes being 28 and 7 respectively based on the data in Fig.1F.’
‘We also noted that Leu-Enk showed a greater fold increase relative to baseline after depolarization with high K+ buffer as compared to Met-Enk. This may be due to increased Leu-Enk packaging in dense core vesicles compared to Met-Enk or due to the fact that there are two distinct precursor sources for Leu-Enk, namely both proenkephalin and prodynorphin while Met-Enk is mostly cleaved from proenkephalin (see Table 1 [48]).’
• For example in 2E, it would be helpful to label in the graph axis what samples correspond to the manipulation and also in the text provide the reader with the sample numbers. The authors interpret the relationship between the last two samples of baseline and posthandling stress as the following in the figure legend "the concentration released in later samples is affected; such influence suggests that there is regulation of the maximum amount of peptide to be released in NAcSh. E. The negative correlation in panel d is reversed by using a high K+ buffer to evoke Met-Enk release, suggesting that the limited release observed in D is due to modulation of peptide release rather than depletion of reserves." However, the correlations are similar between 2D and E and it appears that two mice are mediating the difference between the two groups. The appropriate statistical analysis would be to compare the regressions of the two groups. Statistics for the high K+ (and all other graphs where appropriate) need to be reported, including the r2 and p-value.
Thank you for your constructive critique. To elucidate the effect of high K+, we have plotted the regression line and reported the slope for Fig. 2E. Notably, the slope is reduced by a factor of 2 and appears to be driven by a large subset of the animals. The statistics for the high K+ graph are shown on the figure (Fig 1F) which test the hypothesis of whether high K+ leads to the release of Leu-Enk and Met-Enk respectively compared to baseline with aCSF. We have added the test statistics to the figure legend for additional clarity. Fig. 1G has no statistics because it is only there to elucidate the ratio between Met-Enk and Leu-Enk in the same samples. We did not test any hypotheses related to whether there are differences between their levels as that is not relevant to our question. The correlation on the same data is depicted in Fig. 1H, and we have added the R<sup>2</sup> value per your request.
• The interpretation that handling stress induces enkephalin release from microdialysis experiments is also confounded by other factors. For instance, from the methods, it appears that mice were connected and sample collection started 30 min after surgery, therefore recovery from anesthesia is also a confounding variable, among other technical aspects, such as equilibration of the interstitial fluid to the aCSF running through the probe that is acting as a transmitter and extracellular molecule "sink". Did the authors try to handle the mice post hookup similar to what was done with photometry to have a more direct comparison to photometry experiments? This procedural difference, recording from recently surgerized animals (microdialysis) vs well-recovered animals with photometry should be mentioned in addition to the other caveats the authors mention.
Thank you for your comment. We are aware of this technical limitation, and it is largely why we sought to conduct the fiber photometry experiments to get at the same question. As you requested, we have included additional language in the discussion to acknowledge this limitation and how we chose to address it by measuring calcium activity in the enkephalinergic neurons, which would presumably be the same cell population whose release we are quantifying using microdialysis. See lines 262-273.
‘Our findings showed a robust increase in peptide release at the beginning of experiments, which we interpreted as due to experimenter handling stress that directly precedes microdialysis collections. However, there are other technical limitations to consider such as the fact that we were collecting samples from mice that were recently operated on. Another consideration is that the circulation of aCSF through the probe may cause a sudden shift in oncotic and hydrostatic forces, leading to increased peptide release to the extracellular space. As such, we wanted to examine our findings using a different technique, so we chose to record calcium activity from enkephalinergic neurons - the same cell population leading to peptide release. Using fiber photometry, we showed that enkephalinergic neurons are activated by stress exposure, both experimenter handling and fox odor, thereby adding more evidence to suggest that enkephalinergic neurons are activated by stress exposure which could explain the heightened peptide levels at the beginning of microdialysis experiments.’
• The authors should provide more details on handling stress manipulation during photometry. For photometry what was the duration of the handling bout, what was the interval between handling events, and can the authors provide a description of what handling entailed? Were mice habituated to handling days before doing photometry recording experiments?
Thank you for your suggestion. We have addressed all of your points in the methods section. See lines 564-570.
‘The handling bout which mimicked traditional scruffing lasted about 3-5 seconds. The mouse was then let go and the handling was repeated another two times in a single session with a minimum of 1-2 minutes between handling bouts. Mice were habituated to this manipulation by being attached to the fiber photometry rig, for 3-5 consecutive days prior to the experimental recording. Additionally, the same maneuver was employed when attaching/detaching the fiber photometry cord, so the mice were subjected to the same process several times.’
• For the novel weigh boat experiments, the authors should explicitly state when these experiments were done in relation to the fox urine, was it a different session or the same session? Were they the same animals? Statements like the following (line 251) imply it was done in the same animals in the same session but it should be clarified in the methods "We also showed using fiber photometry that the novelty of the introduction of a foreign object to the cage, before adding fox odor, was sufficient to activate enkephalinergic neurons."
As shown in supplementary figure 4, individual animal data is shown for both water and fox urine exposure (overlaid) to depict whether there were differences in their responses to each manipulation – in the same animal. And yes, you are correct, the animals were first exposed to water 3 times in the recording session and then exposed to fox urine 3 times in the same session. We have added that to the methods section describing in vivo fiber photometry. See lines 575-576.
• Statistical testing would be needed to affirm the conclusions the authors draw from the fox urine and novel weigh boat experiments. For example, it shows stats that the response attenuates, that it is not different between fox urine and novel (it looks like the response is stronger to the fox urine when looking at the individual animals), etc. These data look clear but stats are formally needed. Formal statistics are also missing in other parts of the manuscript where conclusions are drawn from the data but direct statistical comparisons are not included (e.g. Fig 2.G-I).
The photometry data is shown as z-scores which is a formal statistical analysis. ANOVA would be inappropriate to run to compare z-scores. We understand that this is erroneously done in fiber photometry literature, however, it remains incorrect. The z-scores alone provide all the information needed about the deviation from baseline. We understand that this is not immediately clear to readers, and we thank you for allowing us to explain why this is the case. We have added test statistics to figure legends where hypothesis testing was done and p-values were reported.
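As background for the z-score discussion above, the transformation itself can be sketched as follows. This assumes the z-score is referenced to a pre-stimulus baseline window, which is a common convention but is not spelled out in the response; the trace values, the window length, and the function name `baseline_zscore` are hypothetical.

```python
from statistics import mean, stdev

def baseline_zscore(trace, baseline_n):
    """Z-score every sample against the mean and SD of the first baseline_n samples."""
    baseline = trace[:baseline_n]
    mu = mean(baseline)
    sd = stdev(baseline)  # sample SD; assumes the baseline is not constant
    return [(v - mu) / sd for v in trace]

# Hypothetical dF/F values: 4 baseline samples followed by a response.
trace = [1.0, 2.0, 3.0, 2.0, 6.0, 8.0]
z = baseline_zscore(trace, baseline_n=4)
print([round(v, 2) for v in z])
```

Each value then reads as "how many baseline standard deviations away from the baseline mean", which is the deviation-from-baseline information the response refers to.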
• Did the authors try to present the animals with repeated fox urine exposure to see if this habituates like the photometry?
No, we did not do that experiment due to the constrained timing within which we had to run our microdialysis/LC-MS timeline, but it is a great point for future exploration.
• It would be useful to present the time course of the odor experiment for the microdialysis experiment.
The timeline is shown in Fig.1a and Fig.3e. To reiterate, each sample is 13 minutes long.
• Can the authors determine if differences in behavior (e.g. excessive avoidance in animals with one type of response) or microdialysis probe location dictate whether animals fall into categories of increased release, no release, or no-detection? From the breakdown, it looks like it is almost equally split into three parts but the authors' descriptions of this split are somewhat misleading (line 210): "The response to predator odor varies appreciably: although most animals show increased Met-Enk release after fox odor exposure, some show continued release with no elevation in Met-Enk levels, and a minority show no detectable release".
Thank you for your constructive feedback. We do not believe the difference in behavior is correlated with probe placement. The hit map can be found in suppl. Fig 3 and shows that all mice included in the manuscript had probes in the NAcSh. We purposely did not distinguish between dorsal and ventral because our 1 mm membrane would make it hard to presume exclusive sampling from one subregion. That is a great point though, and we have thought about it extensively for future studies. We have edited the language to reflect the almost even split of responses for Met-Enk and appreciate you pointing that out.
• Overall, given the inconsistencies in experimental design and overall caveats associated, I think the authors are unable to draw reasonable conclusions from the repeated stressor experiments and something they should either consider is not trying to draw strong conclusions from these observations or perform additional experiments that provide the grounds to derive those conclusions.
We have included additional language on the caveats of our study, and our use of a dual approach using fiber photometry and microdialysis was largely driven by a desire to offer additional support of our conclusions. We expected pushback about our conclusions, so we wanted to offer a secondary analysis using a different technique to test our hypothesis. To be honest, the tone and content of this comment are not particularly constructive (especially for trainees), nor does it offer a space to realistically address anything. This work took multiple years to optimize, it was led by a graduate student, and required a multidisciplinary team. As highlighted, we believe it offers an important contribution to the literature and pushes the field of peptide detection forward.
Reviewer #2 (Recommendations For The Authors):
A more comprehensive and objective interpretation of results could enhance the overall quality of the paper. The manuscript contains statements like "we are the first to confirm," which can be challenging to substantiate and may not significantly enhance the paper. It's essential to ensure that novelty statements are well-founded. For example, the release of enkephalins from other brain regions after stress exposure is well-documented but not addressed in the paper. Similarly, the role of the NA shell in stress has been extensively studied but lacks coverage in this manuscript.
We have edited the language to reflect your feedback. We have also included relevant literature expanding on the demonstrated roles of enkephalins in the literature. We would like to note that most studies have focused on chronic stress, and we were particularly interested in acute stress. See lines 129-134.
‘These studies have included regions such as the locus coeruleus, the ventral medulla, the basolateral nucleus of the amygdala, and the nucleus accumbens core and shell. Studies using global knockout of enkephalins have shown varying responses to chronic stress interventions where male knockout mice showed resistance to chronic mild stress in one study, while another study showed that enkephalin-knockout mice showed delayed termination of corticosteroid release. [33,34]’
Finally, not a weakness but a clarification suggestion: the method description mentions the use of 1% FA in the sample reconstitution solution and LC solvents, which is an unusually high concentration of acid. If this concentration is intentional for maintaining the peptides' oxidation state, it would be beneficial to mention this in the text to assist readers who might want to replicate the method.
This is correct and has been clarified in the methods section.
Reviewer #3 (Recommendations For The Authors):
-The Abstract should state the critical improvements that are made. Also, quantify the improvements in spatiotemporal resolution.
Thank you for your comment. We have edited the abstract to reflect this.
- The use of "amol/sample" as concentration is less informative than an SI unit (e.g., pM concentration) and should be changed. Especially since the volume used was the same for in vivo sampling experiments.
Thank you for your comment. We chose to report amol/sample because we are measuring such a small concentration and wanted to account for any slight errors in volume that can make drastic differences on reported concentrations especially since samples are dried and resuspended.
-Please check this sentence: "After each collection, the samples were spiked with 2 µL of 12.5 fM isotopically labeled Met-Enkephalin and Leu-Enkephalin" This dilution would yield a concentration of ~2 fM. In a 12 uL sample, that would be ~0.02 amol, well below the detection limit. (Note that fM would be femtomolar concentration and fmol would be femtomoles added.)
-"liquid chromatography/mass spectrometry (LC-MS) [9-12]"... Reference 9 is a RIA analysis paper, not LC-MS as stated.
Thank you for catching these. We have corrected the unit and citation.
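The reviewer's arithmetic on the spike-in can be checked with a short calculation. This sketch takes the numbers exactly as quoted in the comment (2 µL of 12.5 fM added to a 12 µL sample) and assumes the final volume is simply the sum of the two volumes; it does not represent the corrected unit, which is not stated in the response.

```python
# Unit prefixes expressed in base SI units (mol, L, mol/L).
FEMTO = 1e-15
ATTO = 1e-18
MICRO = 1e-6

spike_volume_l = 2 * MICRO     # 2 uL of labeled standard
spike_conc_m = 12.5 * FEMTO    # 12.5 fM, as written in the quoted sentence
sample_volume_l = 12 * MICRO   # 12 uL dialysate sample

# Amount of standard added (mol) and its concentration after mixing (mol/L).
spike_amount_mol = spike_conc_m * spike_volume_l
final_conc_m = spike_amount_mol / (sample_volume_l + spike_volume_l)

spike_amount_amol = spike_amount_mol / ATTO
final_conc_fm = final_conc_m / FEMTO
print(f"amount added: {spike_amount_amol:.3f} amol")
print(f"final concentration: {final_conc_fm:.2f} fM")
```

Under these assumptions the added amount is a few hundredths of an attomole and the final concentration is close to 2 fM, which matches the orders of magnitude given in the comment.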
-Given that improvements in temporal resolution are claimed, the lack of time course data with a time axis is surprising. Rather, data for baseline and during treatment appear to be combined in different plots. Time course plots of individuals and group averages would be informative.
Due to the expected variability between individual animal time course data, where for example, we measure detectable levels in one sample followed by no detection, it was very difficult to combine data across time. Therefore, to maximize data inclusion from all animals that showed baseline measurements and responses to individual manipulations, we opted to report snapshot data. Our improvement in temporal resolution refers to the duration of each sample rather than continuous sampling, so those two are unrelated. Thank you for your feedback and allowing us to clarify this.
- I do not understand this claim "We use custom-made microdialysis probes, intentionally modified so they are similar in size to commonly used fiber photometry probes to avoid extensive tissue damage caused by traditional microdialysis probes (Fig. 1B)." The probes used are 320 um OD and 1 mm long. This is not an uncommon size of microdialysis probes and indeed many are smaller, so is their probe really causing less damage than traditional probes?
Thank you for your comment. We are only trying to make the point that the tissue damage from these probes is comparable to commonly used fiber photometry probes. We only point that out because tissue damage is used as a point to dissuade the usage of microdialysis in some literature, and we just wanted to disambiguate that. We have clarified the statement you pointed out.
-The oxidation procedure is a good idea, as mentioned above. It would be interesting to compare met-enk with and without the oxidation procedure to see how much it affects the result (I would not say this is necessary though). It is not uncommon to add antioxidants to avoid losses like this. Also, it should be acknowledged that the treatment does prevent the detection of any in vivo oxidation, perhaps that is important in met-enk metabolism?
The comparison between oxidized and unoxidized Met-Enk detection is in figure 1C.
-It would be a best practice to report the standard deviation of signal for technical replicates (say near in vivo concentrations) of standards and repeated analysis of a dialysate sample to be able to understand the variability associated with this method. Similarly, an averaged basal concentration from all rats.
Thank you for your comment. We have included a table showing example quality control standard injections from 4 randomly selected experiments included in the manuscript that were run before and after each experiment and descriptive statistics associated with these technical replicates. We also added some detail to the methods section to describe how quality control is done. See lines 521-530.
‘Each sample was run in two technical replicates and the peak area ratio was averaged before concentration calculations of the peptides were conducted. Several quality control steps were conducted prior to running the in vivo samples. 1) Two technical replicates of a known concentration were injected and analyzed – an example table from 4 random experiments included in this manuscript is shown below. 2) The buffers used on the day of the experiment (aCSF and high K+ buffer) were also tested for any contaminating Met-Enk or Leu-Enk signals by injecting two technical replicates for each buffer. Once these two criteria were met, the experiment was analyzed through the system. If either step failed, which happened a few times, the samples were frozen and the machines were cleaned and restarted until the quality control measures were met.’
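The replicate handling described above (averaging the peak area ratio of two technical replicates and summarizing their variability) can be sketched as follows. The ratio values and the helper name `summarize_replicates` are hypothetical; the acceptance criteria the authors applied are not given in the response, so none are encoded here.

```python
from statistics import mean, stdev

def summarize_replicates(ratios):
    """Return mean, sample SD, and coefficient of variation (%) of replicate ratios."""
    avg = mean(ratios)
    sd = stdev(ratios)  # needs at least two replicates
    cv_percent = 100 * sd / avg
    return avg, sd, cv_percent

# Hypothetical peak area ratios (analyte / isotopically labeled standard)
# from two technical replicates of one quality control injection.
qc_ratios = [0.98, 1.02]

avg, sd, cv = summarize_replicates(qc_ratios)
print(f"mean ratio={avg:.3f}, SD={sd:.4f}, CV={cv:.2f}%")
```

The averaged ratio is the value that would then be carried into the concentration calculation, while the CV gives a compact measure of technical variability.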
EDITORS NOTE
Should you choose to revise your manuscript, please include full statistical reporting including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals. These should be reported for all key questions and not only when the p-value is less than 0.05.
Thank you for your suggestion. We have included more detail about statistical analysis in the figure legends per this comment and reviewer comments.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
The propagation of electrical signals within neuronal circuits is tightly regulated by the physical and molecular properties of neurons. Since neurons vary in size across species, the question arises whether propagation speed also varies to compensate for it. The present article compares numerous speed-related properties in human and rat neurons. They found that the larger size of human neurons seems to be compensated by a faster propagation within dendrites but not the axons of these neurons. The faster dendritic signal propagation was found to arise from wider dendritic diameters and greater conductance load in human neurons. In addition, the article provides a careful characterization of human dendrites and axons, as the field has only recently begun to characterize post-operative human cells. There are only a few studies reporting dendritic properties and these are not all consistent, hence there is the added value of reporting these findings, particularly given that the characterization is condensed in a compartmental model.
Strengths:
The study was performed with great care using standard techniques in slice electrophysiology (pharmacological manipulation with somatic patch-clamp) as well as some challenging ones (axonal and dendritic patch-clamp). Modeling was used to parse out the role of different features in regulating dendritic propagation speed. The finding that propagation speed varies across species is novel as previous studies did not find a large change in membrane time constant or axonal diameters (a significant parameter affecting speed). A number of possible, yet less likely factors were carefully tested (Ih, membrane capacitance). The main features outlined here are well-known to regulate speed in neuronal processes. The modeling was also carefully done to verify that the magnitude of the effects is consistent with the difference in biophysical properties. Hence, the findings appear very solid to me.
Weaknesses:
The role of diameter in regulating propagation speed is well-known in the axon literature.
We thank the reviewer for this comment. This is indeed true. The paper does not claim that this is new; we simply referred to Waxman’s book to acknowledge this established effect. Our main emphasis is on the impact of dendritic (rather than axonal) diameter: we highlight that EPSP speed is faster near the input synapse and converges to a steady-state value further away from the soma, and we use this to explore the impact of differences in dendritic diameter between rat and human on EPSP latency and velocity. We have now made this point clearer in the revised text.
Reviewer #2 (Public Review):
Summary:
In this paper, Oláh and colleagues introduce new research data on the cellular and biophysical elements involved in transmission within the pyramidal circuits of the human neocortex. They gathered a comprehensive set of patch-clamp recordings from human and rat pyramidal neurons to compare how the temporal aspect of neuronal processing is maintained in the larger human neocortex. A broad range of experimental, theoretical, and computational methods are used, including two-photon guided dual whole-cell recordings, electron microscopy, and computational simulations of reconstructed neurons.
Recordings from synaptically connected pyramidal neurons revealed longer intercellular path lengths within the human neocortex. Further, by using dual whole-cell recordings from soma-dendrite and soma-axon locations, they found that short latencies from soma to soma can be partly attributed to an increased propagation speed for synaptic potentials, but not for the propagation of action potentials along the axon.
Next, in a series of extensive computational modeling studies focusing on the synaptic potentials, the authors observe that the short-latency within large human pyramidal neural circuits may have a passive origin. For a wide array of local synaptic input sites, the authors show that the conductance load of the dendrites, electrically coupled to a large diameter apical dendrite, affects the cable properties. The result is a relatively faster propagation of EPSPs in the human neuron.
The manuscript is well-written and the physiological experiments and biophysical arguments are very well explained. I appreciated the in-depth theoretical steps for the simulations. That passive cable properties of the dendrites are causing a higher velocity in human dendrites is interesting but there is a disconnect between the experimental findings and the model simulations. Based on the present data the contribution of active membrane properties cannot be dismissed and deserves further experiments.
See our response below
Strengths:
The authors present state-of-the-art 2P-guided dual whole-cell recordings in human neurons. In combination with detailed reconstructions, these approaches represent the next steps in unravelling the information processing in human circuits.
The computational modeling based on cable theory and experimentally constrained simulations provides an excellent integrated view of the passive membrane properties.
Weaknesses:
There are smaller and larger issues with the statistical analyses of the experimental data which muddles the interim conclusions.
That the cable properties alone are the main explanation for speeding the electrical signaling in human pyramidal neurons appears inconsistent with the experimental data.
This is an excellent point. We indeed performed the analysis only on passive cases, highlighting (and now also ranking) the impact of the various morpho-electrical properties of the neurons on the differences in signal latency in human vs. rat. We did explore (not shown) the effect of active channels in the dendrites (including the h-current); as expected, the results strongly depend on channel density and their spatial distribution over the dendritic tree. As we do not know these parameters for the modelled cells, we decided to remain focused on the impact of passive/morphological parameters. We also note that the experimental results (pages 4-5 in the manuscript) show a minor contribution of the h-current, emphasizing that the passive properties have the main role in differentiating human and rat.
Some of the electrophysiological experiments require further control experiments to make robust conclusions.
Reviewer #3 (Public Review):
Summary:
This study indicates that connections across human cortical pyramidal cells have identical latencies despite a larger mean dendritic and axonal length between somas in the human cortex. A precise demonstration combining detailed electrophysiology and modeling indicates that this property is due to faster propagation of signals in proximal human dendrites. This faster propagation is itself due to a slightly thicker dendrite, a larger capacitive load, and stronger hyperpolarizing currents. Hence, the biophysical properties of human pyramidal cells are adapted such that they do not compromise information transfer speed.
Strengths:
The manuscript is clear and very detailed. The authors have experimentally verified a large number of aspects that could affect propagation speed and have pinpointed the most important one. This paper provides an excellent comparison of biophysical properties between rat and human pyramidal cells. Thanks to this approach a comprehensive description of the mechanisms underlying the acceleration of propagation in human dendrite is provided.
Weaknesses:
Several aspects having an impact on propagation speed are highlighted (dendritic diameter, ionic channels, capacitive load) and there is no clear ranking of their impact on signal propagation speed. It seems that the capacitive load plays a major role, much more than dendritic diameter for which only a 10% increase is observed across species. Both aspects actually indicate that there is an increase in passive signal propagation speed with bigger cells at least close to the soma. This suggests that bigger cells are mechanically more rapid. An intuitive reason why capacitive load increases speed would also help the reader follow the demonstration.
We thank the referee for both these excellent points. In response to them:
(i) We have now performed a new comprehensive statistical analysis and show the ranking of the effect of the different morphological/cable factors on EPSP propagation. This analysis appears in Supp. Tables 5 & 6 and Fig. S16, and also in the main text as follows:
To rank the impact of the various factors affecting EPSP propagation latency in human and rat neurons, we conducted a comprehensive statistical analysis using two complementary approaches: the generalized linear model (GLM) (Kiebel & Holmes, 2007) as well as SHAP (SHapley Additive exPlanations) (Lundberg & Lee, 2017) based on fitting a Gradient Tree Boosting (Friedman, 2002) model. We began by fitting a GLM without interaction terms among the factors affecting EPSP latency (Suppl. Table 5). This enables us to quantify the primary individual factors affecting EPSP propagation. Our analysis revealed the following ranking order: 1) physical distance of synapses from soma had the strongest effect; 2) species differences; 3) conductance load, as demonstrated by our “hybrid cells” manipulation; 4) radii of the apical dendrite, affecting the cables’ space constant, λ; and 5) the specific cable parameters, whose effect, as revealed when using per-cell fitted parameters versus uniform cable parameters, was minimal. We next performed GLM analysis with interaction terms showing that, as expected, there are significant interactions between the factors affecting EPSP latency (Suppl. Table 6). To further validate the above ranking while incorporating the interactions between the various factors affecting EPSP latency, we performed a SHAP analysis. Notably, even with interactions included, the ranking of the factors affecting signal propagation is aligned with the results from the analysis based on the GLM without interaction terms (see Fig. S16).
(ii) As for the intuitive explanation requested by the referee, we added the following paragraph in the Discussion:
The intuitive reason for this enhancement is that the large conductance load (the “leaky end” boundary conditions) more effectively “steals” the synaptic (axial) current (like water pouring faster into a large pool). The more mathematical intuition is that the large soma (sink) adds fast time constants to the system (see also related explanation in Fig. 4 in Eyal et al., 2014).
We thank the editors for considering and revising our manuscript for publication in eLife. We appreciate the positive appreciation of the work and the critical points raised by the reviewers. We have responded in detail to all the excellent comments from all reviewers. We believe that these revisions have significantly improved the quality of our study.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
There are two points that could improve the reading experience of this nice manuscript. These should be easily addressed with minor re-phrasing.
Credit to conduction velocity literature. Less widely known in the dendrite literature, in the axon literature, the relationship between propagation speed and process diameter is well established. I thought the two articles cited (Jack Noble Tsien and Agmon-Snir & Segev) were not as direct in the treatment of this relationship. The work of Stephen Waxman, for instance, made clear how axon diameter tightly controls propagation speed (see for instance the Scholarpedia entry by Swadlow and Waxman). In my opinion, this is a widely known piece of work, that is part of some introductory books to neuroscience. While the article does not claim they found this relationship, parts of the presentation are better understood if we ignore this well-known fact. I am referring to the abstract, intro, and the beginning of results where 'larger' is presented as synonymous with 'slower'. For instance 'to compensate for the increase neurons' size' (abstract) or 'the increase in size of dendrites and axons might come with a cost of longer signal propagation times' only makes sense if 'size' refers to spatial extent, not diameter.
We thank the reviewer for this valid point; leaving out axon diameter references was not intentional. We have now added the suggested reference to our manuscript. In the size comparisons, we have only pointed out the obvious size differences between the body and the dendritic processes. We have reworded sentences with size comparisons.
In Abstract (lines 1-6):
Human-specific cognitive abilities depend on information processing in the cerebral cortex, where neurons are significantly larger and their processes are longer and sparser compared to rodents. We found that, in synaptically-connected layer 2/3 pyramidal cells (L2/3 PCs), soma-to-soma signal propagation delay is similar in humans and rodents. Thus, to compensate for the neurons’ longer processes, membrane potential changes must propagate faster in human axons and/or dendrites.
In section “Effect of dendritic thickness” in Results we have modified it as follows:
The relationship between conduction velocity and axon diameter is well known for small myelinated and unmyelinated axons (Waxman and Bennett, 1972). Anatomical features of dendrites also have a major influence on signal propagation properties 5,19, thus …
Waxman, S. G. and Bennett, M. V. L. Relative conduction velocity of small myelinated and nonmyelinated fibres in the central nervous system. Nature New Biol., 238: 217-219, 1972.
Two or four dendritic factors? The study identifies two major dendritic factors influencing the propagation speed (diameter and load), however the end of the results highlights four factors. I did not understand how factor 2 was different than factor 1. Neither did I understand how factor 4 was different from the other factors. There seemed to be a little redundancy here that could be streamlined.
We thank the reviewer for pointing this out. We have now changed the respective text and added the ranking statistics (see above) to assess the effect of the different parameters on signal propagation in dendrites.
Microcircuits? The study found that the changes in speed arise from the dendrites rather than the axons, as such it seems it would be more precise to replace 'microcircuits' with 'dendrites'.
We are thankful for this suggestion. We have changed the title to "Accelerated signal propagation speed in human neocortical dendrites".
Typos
P3 line 24 'find significant difference the propagation'.
P6 line 35 'how morphological differences' it would be useful to specify which morphological difference here.
Corrected.
Reviewer #2 (Recommendations For The Authors):
(1) The statistical analyses should be changed. T-testing populations and comparing visual differences of differences ("human minus rats") is a common but egregious error in the field of neurosciences (see doi:10.1038/nn.2886). The conclusion that HCN channels "... do not by themselves explained the differences between the two species" (lines 174-176) is not compelling. The design of the experiments presented in Figure 3 is paired recordings and the addition of a blocker (ZD7288 or TTX cocktail). These are classic 2 x 2 factorial designs (species x drug). The authors will need to perform a repeated-measures analysis of variance (RM-ANOVA) and provide information on the interaction significance. Please revise the figures and improve statistical reporting. Post-hoc comparisons of the velocity populations are required to support the idea of whether h-channels are explaining the observed differences.
Thank you for drawing our attention to this error. The statistical analysis of the pharmacological experiments was re-performed as suggested. After the 2-way ANOVA with repeated measures and Bonferroni post-hoc correction, we can indeed find significant differences only in the control group, namely that the propagation speed of bAPs in human dendrites was significantly higher. The implementation of the proposed statistical analysis demonstrates that the administration of ZD has no statistically significant effect on the propagation speed of human or rat dendrites. The treatment with TTX cocktail resulted in a significant difference in signal propagation in humans but not in rodents. However, the trend is discernible in rodents as well, and the P value of 0.0588 is close to the widely accepted 0.05 threshold. After the TTX cocktail treatment, the speed of signal propagation did not differ significantly between the two species. However, on average, the human dendrites remained faster. These alterations in P-values do not affect our primary conclusions. The MS text has been modified accordingly.
(2) Although ZD7288, in my opinion, influences the bAP (see point #1) the authors subsequently leave the h-current unblocked in the experiments in Figures 3D, E. Here, they use sodium, potassium, and calcium currents as well as synaptic conductances. I am puzzled why (in line 188) they claim the dendrites are "passive" although the data show h-currents are contributing to the shape of the bAP in human neurons. In line 196 they conclude voltage-gated conductances have a "minor" contribution and passive properties a main role. Please revise conclusions or provide better experimental support.
Thank you for this point. We meant to refer to the state in which no action potential can be generated; because the word 'passive' might be misleading in this context, we have rephrased these sentences in the MS accordingly.
(3) A major concern is the injection of an AP in voltage-clamp mode. Although this is the right choice and I'm in support of the experiment, it is technically challenging to space clamp the soma and fully recapitulate the speed and amplitude of a 100 mV depolarization. The voltage drop in peak amplitude as well as the increased delay between the baseline AP (current clamp) and AP in blocker conditions (voltage clamp) could be fully explained by switching between current- and voltage-clamp modes. In additional control experiments, the authors should add a second voltage follower electrode (CC) at the soma showing whether the authors can preserve the original AP (from CC) in VC/blocker condition. It may well be they need to adjust the injection protocol.
Our experiments were designed to replicate the work of Stuart et al. (1994), in which they compared the attenuation of active and passive backpropagating signals. When they blocked Na+ channels with TTX they injected simulated action potentials in voltage-clamp mode. They concluded that TTX-sensitive Na+ channels cause somatic action potential entry into the dendritic compartment. They found a comparable attenuation of the backward propagating action potential in the dendrites in control conditions (~70 %).
We performed control recordings based on the reviewer’s suggestion (Author response image 1).
Author response image 1.
Injection of the previously recorded AP (blue) in VC mode produced a nearly identical somatic AP in CC mode (orange). The slight temporal delay between the two signals is caused by the different positions of the pipettes on the cell body. The right panel plots the two peak-aligned APs against each other; the points lie close to the blue ‘equality’ line. We concluded that the original AP is well preserved in the VC/blocker condition.
(5) From the paragraph entitled "Modeling EPSP propagation in dendrites" and onwards the authors make countless conclusions based on theory and modelling results but without any statistical support. Multiple neurons are used thus it is rather straightforward to provide numerical support for the assertions. For example, but this is not an exhaustive list, how should we interpret that latency ranges are different (line 240, line 253) etc.? Or were the estimated Cm values of human and rat neurons (0.6 versus 1.1) significantly different? And if so, how does this align with the Cm estimates in the nucleated patch experiments?
We thank the referee for this comment and have now added a set of statistical analyses. The results now appear throughout the theoretical part of the revised article, in particular with respect to Figs. 6&7, where we now show that our various manipulations (e.g., hybrid vs. original cells) as well as the cable parameters (Cm, Rm) are significantly different between human and rat, whereas the membrane time constant is not. As for Cm, our limited sample shows a significant difference between human and rat. Yet the range of Cm values that we found in our modeling study does fall within the experimental range reported in the present study.
Minor
Line 44. The "simulated EPSP" example in Figure 2C is not a command waveform for an EPSC. Line 526 in the methods states that also ramp currents were used. Please revise to clarify the main text.
Thank you for bringing this discrepancy to our attention. In the experiments, we used ramp injections. We have made this clear in the main text as follows: ”... we tested orthodromic or forward propagating signal propagation velocity by injecting short-duration current ramps to simulate EPSP (sEPSP) signals in the dendrites and recorded the resultant subthreshold voltage response in the soma”
Line 522. The authors state the recordings were all carried out "in current clamp mode" but detailed VC method information is lacking. Did they use series resistance compensation?
We did not use series resistance compensation.
Line 479. From which region(s) were the human "neocortical slices" sampled? Please add this information.
We have added regions of origin to the Methods section: frontal (n = 21), temporal (n = 20), parietal (n = 20), and occipital (n = 1).
Please show higher temporal resolution example traces, for example in Figure 3. Differences are at the micrometer scale, but APs are shown at the millisecond scale. Hard to judge the quality of the data. Showing the command potentials (inset Figure 3D, E) is misleading (see major point #3).
In response to the reviewer's request, we have redrawn the example traces in Figure 3.
Please check the labeling of figures. There is information missing. For example, in Figure 5 A to C I am missing information and the units of the axes.
In the black plots on the right side of panels B and C, the y-axis shows the thickness measurements for the given dendrite stacked on top of each other and the x-axis shows the measurement values, the units for the x-axis are µm as mentioned in the figure legend.
Line 981 "scalebars" should read "scale bars".
Line 986 "bootstraped" should read "bootstrapped".
Done.
Are the dendritic diameters increased for all basal and apical higher-order branches? It is unclear how the model simulations were built on diameters of primary and higher-order branches.
In our modelling study we took the actual diameter of the reconstructed PCs in both proximal and higher order branches. We did compare per-distance differences in diameter – but it is automatically incorporated into the computation of the basal load (“equivalent cables” in Figs 6&8).
The velocity calculation for axonal propagation (yielding a ~0.9 m/s conduction velocity, Figure 2B) is incorrect. Using the peak of the action potentials between soma and axon misses the fact that action potentials start earlier and spatially distally from the soma in the axon. Please revise the calculation to include the temporal delay and actual distance travelled by the forward propagating action potential.
Thank you for this question. We are aware that the AP is generated at the AIS, which is located between the two recording electrodes, and that we have to take into account that the signal also propagates from the AIS to the soma, which may shorten the delay in the system. To the best of our knowledge, there is no experimental evidence on the location of the AP generation site on the AIS in layer 2-3 pyramidal cells of the human neocortex, so we assumed that it is located 35 microns from the soma and that the propagation speed from the AIS is the same in both directions. Consequently, we have corrected our propagation velocity values as follows:
“For the axon bleb recordings we assumed that the axon initial segment (AIS) of the cells is 35 µm from the axon hillock, and that the APs propagate forward (to the bleb) and backward (to the soma) at the same speed. For the correction of the AIS we used the following formula: (2)
where vcorr is the propagation speed corrected for AIS position, l is the axonal distance between the soma and the axon bleb, t is the latency between the two measuring points, and ais is the assumed position of the AIS along the axon (35 µm).”
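The geometry behind such a correction can be sketched from the stated assumptions alone. If the AP starts at a distance ais from the soma and travels at the same speed in both directions, it reaches the soma after ais / v and the bleb after (l - ais) / v, so the measured latency is t = (l - 2 * ais) / v, which gives v = (l - 2 * ais) / t. This is a derivation under those assumptions and not necessarily the exact expression used in the manuscript; the function name and the example distance and latency are hypothetical.

```python
def corrected_velocity(l_um, t_ms, ais_um=35.0):
    """Propagation speed (m/s) corrected for an AP initiation site on the axon.

    l_um   -- axonal distance between soma and bleb, in micrometres
    t_ms   -- latency between the somatic and bleb AP peaks, in milliseconds
    ais_um -- assumed distance of the initiation site from the soma (35 um)

    Assumes equal forward and backward speed, so the latency reflects the
    path difference (l - ais) - ais = l - 2 * ais.
    """
    if t_ms <= 0:
        raise ValueError("latency must be positive")
    path_difference_um = l_um - 2.0 * ais_um
    if path_difference_um <= 0:
        raise ValueError("bleb must lie beyond twice the AIS distance")
    return path_difference_um / t_ms / 1000.0  # um/ms -> m/s


# Hypothetical recording: bleb 200 um from the soma, 0.2 ms latency.
uncorrected = 200.0 / 0.2 / 1000.0
print(f"uncorrected {uncorrected:.2f} m/s, corrected {corrected_velocity(200.0, 0.2):.2f} m/s")
```

Under this geometry the correction always lowers the estimated speed, because the soma-to-bleb distance overstates the path difference between the two recording sites.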
What explains the strongly attenuated axonal action potential at the bleb? Is this representative?
The strongly attenuated axonal action potential at the bleb can be explained by a few key factors:
(1) Membrane Integrity: Bleb formation often indicates some level of membrane damage or alteration. This can disrupt the normal ionic gradients across the membrane, leading to a failure in generating or propagating action potentials effectively.
(2) Current Leakage: Bleb formation may create additional pathways for ion leakage, which can dissipate the electrical current that would normally propagate the action potential. This leakage reduces the overall amplitude of the action potential.
Line 275 "To our delight", please rephrase.
Corrected.
Reviewer #3 (Recommendations For The Authors):
- In Figure 1, the number of cells used to assess intersomatic distance is quite low. A larger number of neuron pairs should be analyzed to be more representative. Or at least an explanation of why such a low sampling can be conclusive.
We appreciate the reviewer’s concerns on sample sizes of the first set of experiments, where the anatomical pathways were measured through the synapses of coupled cells with electrophysiological recordings. We acknowledge that this is a limitation of our study. However, in this series of experiments, we simply wanted to experimentally confirm already known results which consisted of two parts: first, that in humans the dendrites and axons of neurons are longer, and second, that they have the same time delay in terms of synaptic latency.
The reported similarity in synaptic latencies is consistent with the results of a recent study by Campagnola et al. (2022) showing that EPSP latencies of local connections between layer 2/3 pyramidal cells are in the same range in humans and mice (human median latency = 1.73 ms vs. mouse median latency = 1.49 ms). We came to the same conclusion in our previous work, where we compared synaptically coupled pyramidal cell-basket cell pairs in human and rat (Molnár et al. 2016).
On the other hand, we report interspecific differences in cable pathways from soma to soma, again consistent with the literature suggesting that the length of pyramidal neural processes is longer in humans than in rodents (see Supplementary Figure 1 and e.g. Berg et al. 2021).
From a practical point of view, collecting the data for this hard-won experiment is particularly difficult. The first step is the electrophysiological recording of a connected pair with appropriate pre- and postsynaptic series resistance, in a setting where human tissue samples are limited. To obtain information about the path of the signals between pre- and postsynaptic cells, an anatomical reconstruction is then required. This requires a) a high-quality recovery of postsynaptic dendrites and presynaptic axons, and b) successful tracing of all potential contact points between presynaptic axons and postsynaptic dendrites back to the pre- and postsynaptic soma. The difficulty of the latter point in particular arises from the fact that parts of the presynaptic axonal arbor are myelinated and the success of biocytin-based tracing depends on the length of the myelinated axon branches. The success or failure of complete axonal tracing only becomes apparent at the end of these efforts.
- The author should provide an intuitive explanation of why capacitive load accelerates propagation in the dendrite.
See answer above
- The author should more clearly rank the contribution of each difference between rat and human neurons. The 10% increase in dendritic diameter which affects velocity only via a square root seems a very weak contribution. This should be clarified.
We have now added a set of statistical methods to perform such a ranking in the theoretical part of this study, as described above (and in a new paragraph, attached above) in the revised article.
References
Eyal, G., Mansvelder, H. D., de Kock, C. P. J., & Segev, I. (2014). Dendrites impact the encoding capabilities of the axon. Journal of Neuroscience, 34(24), 8063–8071. https://doi.org/10.1523/JNEUROSCI.5431-13.2014
Friedman, J. H. (2002). Stochastic gradient boosting. In Computational Statistics & Data Analysis (Vol. 38). www.elsevier.com/locate/csda
Kiebel, S. J., & Holmes, A. P. (2007). The General Linear Model. In K. Friston, J. Ashburner, S. Kiebel, T. Nichols, & P. William (Eds.), Statistical Parametric Mapping (pp. 101–125). Academic Press.
Lundberg, S. M., & Lee, S.-I. (2017). A unified approach to interpreting model predictions. Proceedings of the 31st International Conference on Neural Information Processing Systems, 4768–4777.
Author response:
The following is the authors’ response to the current reviews.
Responses to Reviewer #1:
We thank the reviewer for these additional comments, and more generally for their extensive engagement with our work, which is greatly appreciated. Here, we respond to the three points in their latest review in turn.
The results of these experiments support a modest but important conclusion: If sub-optimal methods are used to collect retrospective reports, such as simple yes/no questions, inattentional blindness (IB) rates may be overestimated by up to ~8%.
It is true, of course, that we think the field has overstated the extent of IB, and we appreciate the reviewer characterizing our results as important along these lines. Nevertheless, we respectfully disagree with the framing and interpretation the reviewer attaches to them. As explained in our previous response, we think this interpretation — and the associated calculations of IB overestimation ‘rates’ — perpetuates a binary approach to perception and awareness which we regard as mistaken.
A graded approach to IB and visual awareness
Our sense is that many theorists interested in IB have conceived of perception and awareness as ‘all or nothing’: You either see a perfectly clear gorilla right in front of you, or you see nothing at all. This is implicit in the reviewer’s characterization of our results as simply indicating that fewer subjects fail to see the critical stimulus than previously assumed. To think that way is precisely to assume the orthodox binary position about perception, i.e., that any given subject can neatly be categorized into one of two boxes, saw or didn’t see.
Our perspective is different. We think there can be degraded forms of perception and awareness that fall neatly into neither of the categories “saw the stimulus perfectly clearly” or “saw nothing at all”. On this graded conception, the question is not: “What proportion of subjects saw the stimulus?” but: “What is the sensitivity of subjects to the stimulus?” This is why we prefer signal detection measures like d′ over % noticing and % correct. This powerful framework has been successful in essentially every domain to which it has been applied, and we think perception and visual awareness are no exception. We understand that the reviewer may not think the same way about this foundational issue, but since part of our goal is to promote a graded approach to perception, we are keen to highlight our disagreement here and so resist the reviewer’s interpretation of our results (even to the extent that it is a positive one!).
Finally, we note that given this perspective, we are correspondingly inclined to reject many of the summary figures following below in Point (1) by the reviewer. These calculations (given in terms of % noticing and not noticing) make sense on the binary conception of awareness, but not on the SDT-based approach we favor. We say more about this below.
(1) In experiment 1, data from 374 subjects were included in the analysis. As shown in figure 2b, 267 subjects reported noticing the critical stimulus and 107 subjects reported not noticing it. This translates to a 29% IB rate if we were to only consider the "did you notice anything unusual Y/N" question. As reported in the results text (and figure 2c), when asked to report the location of the critical stimulus (left/right), 63.6% of the "non-noticer" group answered correctly. In other words, 68 subjects were correct about the location while 39 subjects were incorrect. Importantly, because the location judgment was a 2-alternative-forced-choice, the assumption was that if 50% (or at least not statistically different than 50%) of the subjects answered the location question correctly, everyone was purely guessing. Therefore, we can estimate that ~39 of the subjects who answered correctly were simply guessing (because 39 guessed incorrectly), leaving 29 subjects from the non-noticer group who were correct on the 2AFC above and beyond the pure guess rate. If these 29 subjects are moved from the non-noticer to the noticer group, the corrected rate of IB for Experiment 1 is 20.86% instead of the original 28.61% rate that would have been obtained if only the Y/N question was used. In other words, relying only on the "Y/N did you notice anything" question led to an overestimate of IB rates by 7.75% in Experiment 1.
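The reviewer's guess-correction arithmetic for Experiment 1 can be reproduced with a short sketch. The counts are those quoted in the point above; the function name is hypothetical. The calculation treats each subject as either having seen the stimulus or having guessed, which is the binary framing the authors dispute in their reply.

```python
def corrected_ib_rate(n_total, n_non_noticers, n_correct_2afc):
    """Return (original, corrected) IB rates under a pure-guessing correction.

    Incorrect 2AFC answers are taken to be guesses, and an equal number of
    correct answers are assumed to be lucky guesses. Only the surplus of
    correct answers is reclassified as noticing.
    """
    n_incorrect = n_non_noticers - n_correct_2afc
    n_above_guess = max(n_correct_2afc - n_incorrect, 0)
    original = n_non_noticers / n_total
    corrected = (n_non_noticers - n_above_guess) / n_total
    return original, corrected


# Experiment 1 counts as quoted by the reviewer.
original, corrected = corrected_ib_rate(n_total=374, n_non_noticers=107, n_correct_2afc=68)
print(f"original {original:.2%}, corrected {corrected:.2%}, "
      f"difference {original - corrected:.2%}")
```

With these counts the surplus is 68 - 39 = 29 subjects, matching the 28.61% and 20.86% figures quoted above.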
In the revised version of their manuscript, the authors provided the data that was missing from the original submission, which allows this same exercise to be carried out on the other 4 experiments.
(To briefly interject: All of these data were provided in our public archive since our original submission and remain available at https://osf.io/fcrhu. The difference now is only that they are included in the manuscript itself.)
Using the same logic as above, i.e., calculating the pure-guess rate on the 2AFC, moving the number of subjects above this pure-guess rate to the noticer group, and then re-calculating a "corrected IB rate", the other experiments demonstrate the following:
Experiment 2: IB rates were overestimated by 4.74% (original IB rate based only on Y/N question = 27.73%; corrected IB rate that includes the 2AFC = 22.99%)
Experiment 3: IB rates were overestimated by 3.58% (original IB rate = 30.85%; corrected IB rate = 27.27%)
Experiment 4: IB rates were overestimated by ~8.19% (original IB rate = 57.32%; corrected IB rate for color* = 39.71%, corrected IB rate for shape = 52.61%, corrected IB rate for location = 55.07%)
Experiment 5: IB rates were overestimated by ~1.44% (original IB rate = 28.99%; corrected IB rate for color = 27.56%, corrected IB rate for shape = 26.43%, corrected IB rate for location = 28.65%)
*note: the highest overestimate of IB rates was from Experiment 4, color condition, but the authors admitted that there was a problem with 2AFC color guessing bias in this version of the experiment which was a main motivation for running experiment 5 which corrected for this bias.
Taken as a whole, this data clearly demonstrates that even with a conservative approach to analyzing the combination of Y/N and 2AFC data, inattentional blindness was evident in a sizeable portion of the subject populations. An important (albeit modest) overestimate of IB rates was demonstrated by incorporating these improved methods.
We appreciate the work the reviewer has put into making these calculations. However, as noted above, such calculations implicitly reflect the binary approach to perception and awareness that we reject.
Consider how we’d think about the single subject case where the task is 2afc detection of a low contrast stimulus in noise. Suppose that this subject achieves 70% correct. One way of thinking about this is that the subject fully and clearly sees the stimulus on 40% of trials (achieving 100% correct on those) and guesses completely blindly on the other 60% (achieving 50% correct on those) for a total of 40% + 30% = 70% overall. However, this is essentially a ‘high threshold’ approach to the problem, in contrast to an SDT approach. On an SDT approach — an approach with tremendous evidential support — on every trial the subject receives samples from probabilistic distributions corresponding to each interval (one noise and one signal + noise) and determines which is higher according to the 2afc decision rule. Thus, across trials, they have access to differentially graded information about the stimulus. Moreover, on some trials they may have significant information from the stimulus (perhaps, well above their single interval detection criterion) but still decide incorrectly because of high noise from the other spatial interval. From this perspective, there is no nonarbitrary way of saying whether the subject saw/did not see on a given trial. Instead, we must characterize the subject’s overall sensitivity to the stimulus/its visibility to them in terms of a parameter such as d′ (here, ~ 0.7).
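The contrast between the two readings of 70% correct can be made concrete with a minimal sketch. It assumes the standard equal-variance Gaussian model, in which 2afc sensitivity is d′ = √2 · z(proportion correct); the function names are hypothetical, and the sketch is an illustration of the example above, not the analysis code used for the experiments.

```python
from math import sqrt
from statistics import NormalDist


def high_threshold_p_see(p_correct):
    """Proportion of 'clearly seen' trials if every other trial is a blind guess.

    Solves p_correct = p_see * 1.0 + (1 - p_see) * 0.5 for p_see.
    """
    return 2.0 * p_correct - 1.0


def dprime_2afc(p_correct):
    """Equal-variance SDT sensitivity for a two-alternative forced choice."""
    return sqrt(2.0) * NormalDist().inv_cdf(p_correct)


p = 0.70
print(f"high-threshold reading: sees on {high_threshold_p_see(p):.0%} of trials")
print(f"SDT reading: d' = {dprime_2afc(p):.2f}")
```

The first function splits trials into seen and unseen, whereas the second summarises the same performance as a single graded sensitivity value of roughly 0.7.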
We take the same attitude to the subjects in our experiments (and specifically to our ‘super subject’). Instead of calculating the proportion of subjects who saw or failed to see the stimulus (with some characterized as aware and some as unaware), we think the best way to characterize our results is that, across subjects (and so trials also), there was differential graded access to information from the stimulus, and this is best represented in terms of the group-level sensitivity parameter d′. This is why we frame our results as demonstrating that subjects traditionally considered inattentionally blind exhibit significant residual visual sensitivity to the critical stimulus.
(2) One of the strongest pieces of evidence presented in this paper was the single data point in Figure 3e showing that in Experiment 3, even the super subject group that rated their non-noticing as "highly confident" had a d' score significantly above zero. Asking for confidence ratings is certainly an improvement over simple Y/N questions about noticing, and if this result were to hold, it could provide a key challenge to IB. However, this result can most likely be explained by measurement error.
In their revised paper, the authors reported data that was missing from their original submission: the confidence ratings on the 2AFC judgments that followed the initial Y/N question. The most striking indication that this data is likely due to measurement error comes from the number of subjects who indicated that they were highly confident that they didn't notice anything on the critical trial, but then when asked to guess the location of the stimulus, indicated that they were highly confident that the stimulus was on the left (or right). There were 18 subjects (8.82% of the high-confidence non-noticer group) who responded this way. To most readers, this combination of responses (high confidence in correctly judging a stimulus feature that one is highly confident in having not seen at all) indicates that a portion of subjects misunderstood the confidence scales (or just didn't read the questions carefully or made mistakes in their responses, which is common for experiments conducted online).
In the authors' rebuttal to the first round of peer review, they wrote, "it is perfectly rationally coherent to be very confident that one didn't see anything but also very confident that if there was anything to be seen, it was on the left." I respectfully disagree that such a combination of responses is rationally coherent. The more parsimonious interpretation is that a measurement error occurred, and it's questionable whether we should trust any responses from these 18 subjects.
In their rebuttal, the authors go on to note that 14 of the 18 subjects who rated their 2AFC with high confidence were correct in their location judgment. If these 14 subjects were removed from analysis (which seems like a reasonable analysis choice, given their contradictory responses), d' for the high-confidence non-noticer group would most likely fall to chance levels. In other words, we would see a data pattern similar to that plotted in Figure 3e, but with the first data point on the left moving down to zero d'. This corrected Figure 3e would then provide a very nice evidence-based justification for including confidence ratings along with Y/N questions in future inattentional blindness studies.
We appreciate the reviewer’s highlighting of this particular piece of evidence as amongst our strongest. (At the same time, we must resist its characterization as a “single data point”: it derives from a large pre-registered experiment involving some 7,000 subjects total, with over 200 subjects in the relevant bin — both figures being far larger than a typical IB experiment.) We also appreciate their raising the issue of measurement error.
Specifically, the reviewer contends that our finding that even highly confident non-noticers exhibit significant sensitivity is “most likely … explained by measurement error” due to subjects mistakenly inverting our confidence scale in giving their response. In our original reply, we gave two reasons for thinking this quite unlikely; the reviewer has not addressed these in this revised review. First, we explicitly labeled our confidence scale (with 0 labeled as ‘Not at all confident’ and 3 as ‘Highly confident’) so that subjects would be very unlikely simply to invert the scale. This is especially so as it is very counterintuitive to treat “0” as reflecting high confidence. More importantly, however, we reasoned that any measurement error due to inverting or misconstruing the confidence scale should be symmetric. That is: If subjects are liable to invert the confidence scale, they should do so just as often when they answer “yes” as when they answer “no” – after all the very same scale is being used in both cases. This allows us to explore evidence of measurement error in relation to the large number of high-confidence “yes” subjects (N = 2677), thus providing a robust indicator as to whether subjects are generally liable to misconstrue the confidence scale. Looking at the number of such high confidence noticers who subsequently respond to the 2afc question with low confidence (a pattern which might, though need not, suggest measurement error), we found that the number was tiny. Only 28/2677 (1.05%) of high-confidence noticers subsequently gave the lowest level of confidence on the 2afc question, and only 63/2677 (2.35%) subjects gave either of the two lower levels of confidence. For these reasons, we consider any measurement error due to misunderstanding the confidence scale to be extremely minimal.
The reviewer is correct to note that 18/204 (9%) subjects reported both being highly confident that they didn't notice anything and highly confident in their 2afc judgment, although only 14/18 were correct in this judgment. Should we exclude these 14? Perhaps, if we agree with the reviewer that such a pattern of responses is not “rationally coherent” and so must reflect a misconstrual of the scale. But such a pattern is in fact perfectly and straightforwardly intelligible. Specifically, in a 2afc task, two stimuli can individually fall well below a subject’s single interval detection criterion, leading to a high confidence judgment that nothing was presented in either interval. Quite consistent with this, the left-hand stimulus may produce a signal that is much higher than the right-hand stimulus, leading to a high confidence forced-choice judgment that, if something was presented, it was on the left. (By analogy, consider how a radiologist could look at a scan and say the following: “We’re 95% confident there’s no tumor. But even on the 5% chance that there is, our tests completely rule out that it’s a malignant one, so don’t worry.”)
(3) In most (if not all) IB experiments in the literature, a partial attention and/or full attention trial is administered after the critical trial. These control trials are very important for validating IB on the critical trial, as they must show that, when attended, the critical stimuli are very easy to see. If a subject cannot detect the critical stimulus on the control trial, one cannot conclude that they were inattentionally blind on the critical trial, e.g., perhaps the stimulus was just too difficult to see (e.g., too weak, too brief, too far in the periphery, too crowded by distractor stimuli, etc.), or perhaps they weren't paying enough attention overall or failed to follow instructions. In the aggregate data, rates of noticing the stimuli should increase substantially from the critical trial to the control trials. If noticing rates are equivalent on the critical and control trials, one cannot conclude that attention was manipulated in the first place.
In their rebuttal to the first round of peer review, the authors provided weak justification for not including such a control condition. They cite one paper that argues such control conditions are often used to exclude subjects from analysis (those who fail to notice the stimulus on the control trial are either removed from analysis or replaced with new subjects) and such exclusions/replacements can lead to underestimations of inattentional blindness rates. However, the inclusion of a partial or full attention condition as a control does not necessitate the extra step of excluding or replacing subjects. In the broadest sense, such a control condition simply validates the attention manipulation, i.e., one can easily compare the percent of subjects who answered "yes" or who got the 2AFC judgment correct during the critical trial versus the control trial. The subsequent choice about exclusion/replacement is separate, and researchers can always report the data with and without such exclusions/replacements to remain more neutral on this practice.
If anyone were to follow-up on this study, I highly recommend including a partial or full attention control condition, especially given the online nature of data collection. It's important to know the percent of online subjects who answer yes and who get the 2AFC question correct when the critical stimulus is attended, because that is the baseline (in this case, the "ceiling level" of performance) to which the IB rates on the critical trial can be compared.
We agree with the reviewer that future studies could benefit from including a partial or full attention condition. They are surely right that we might learn something additional from such conditions.
Where we differ from the reviewer is in thinking of these conditions as “controls” appropriate to our research question. This is why we offered the justification we did in our earlier response. When these conditions are used as controls, they are used to exclude subjects in ways that serve to inflate the biases we are concerned with in our work. For our question, the absence of these conditions does not impact the significance of the findings, since such conditions are designed to answer a question which is not the one at the heart of our paper. Our key claim is that subjects who deny noticing an unexpected stimulus in a standard inattentional blindness paradigm nonetheless exhibit significant residual sensitivity (as well as a conservative bias in their response to the noticing question); the presence or absence of partial- or full-attention conditions is orthogonal to that question.
Moreover, we note that our tasks were precisely chosen to be classic tasks widely used in the literature to manipulate attention. Thus, by common consensus in the field, they are effective means to soak up attention, and have in effect been tested in partial- and full-attention control settings in a huge number of studies. Second, we think it very doubtful that subjects in a full-attention trial would not overwhelmingly have detected our critical stimuli. The reviewer worries that they might have been “too weak, too brief, too far in the periphery, too crowded by distractor stimuli, etc.” But consider E5 where the stimulus was a highly salient orange or green shape, present on the screen for 5 seconds. The reviewer also suggests that subjects in the full-attention control might not have detected the stimulus because they “weren't paying enough attention overall”. But evidently if they weren’t paying attention even in the full-attention trial this would be reason for thinking that there was inattentional blindness even in this condition (a point made by White et al. 2018) and certainly not a reason for thinking there was not an attentional effect in the critical trial. Lastly, the reviewer suggests that a full-attention condition would have helped ensure that subjects were following instructions. But we ensured this already by (as per our pre-registration) excluding subjects who performed poorly in the relevant primary tasks.
Thus, both in principle and in practice, we do not see the absence of such conditions as impacting the interpretation of our findings, even as we agree that future work posing a different research question could certainly learn something from including such conditions.
Responses to Reviewer #2:
We note that this report is unchanged from an earlier round of review, and not a response to our significantly revised manuscript. We believe our latest version fully addresses all the issues which the reviewer originally raised. The interested reader can see our original response below. We again thank the reviewer for their previous report which was extremely helpful.
---
The following is the authors’ response to the original reviews.
eLife Assessment
This study presents valuable findings to the field interested in inattentional blindness (IB), reporting that participants indicating no awareness of unexpected stimuli through yes/no questions, still show above-chance sensitivity to specific properties of these stimuli through follow-up forced-choice questions (e.g., its color). The results suggest that this is because participants are conservative and biased to report not noticing in IB. The authors conclude that these results provide evidence for residual perceptual awareness of inattentionally blind stimuli and that therefore these findings cast doubt on the claim that awareness requires attention. Although the samples are large and the analysis protocol novel, the evidence supporting this interpretation is still incomplete, because effect sizes are rather small, the experimental design could be improved and alternative explanations have not been ruled out.
We are encouraged to hear that eLife found our work “valuable”. We also understand, having closely looked at the reviews, why the assessment also includes an evaluation of “incomplete”. We gave considerable attention to this latter aspect of the assessment in our revision. In addition to providing additional data and analyses that we believe strengthen our case, we also include a much more substantial review and critique of existing methods in the IB literature to make clear exactly the gap our work fills and the advance it makes. (Indeed, if it is appropriate to say this here, we believe one key aspect of our work that is missing from the assessment is our inclusion of ‘absent’ trials, which is what allows us to make the crucial claims about conservative reporting of awareness in IB for the first time.) Moreover, we refocus our discussion on only our most central claims, and weaken several of our secondary claims so that the data we’ve collected are better aligned with the conclusions we draw, to ensure that the case we now make is in fact complete. Specifically, our two core claims are (1) that there is residual sensitivity to visual features for subjects who would ordinarily be classified as inattentionally blind (whether this sensitivity is conscious or not), and (2) that there is a tendency to respond conservatively on yes/no questions in the context of IB. We believe we have very compelling support for these two core claims, as we explain in detail below and also through revisions to our manuscript.
Given the combination of strengthened and clarified case, as well as the weakening of any conclusions that may not have been fully supported, we believe and hope that these efforts make our contribution “solid”, “convincing”, or even “compelling” (especially because the “compelling” assessment characterizes contributions that are “more rigorous than the current state-of-the-art”, which we believe to be the case given the issues that have plagued this literature and that we make progress on).
Reviewer #1 (Public review):
Summary:
In the abstract and throughout the paper, the authors boldly claim that their evidence, from the largest set of data ever collected on inattentional blindness, supports the views that "inattentionally blind participants can successfully report the location, color, and shape of stimuli they deny noticing", "subjects retain awareness of stimuli they fail to report", and "these data...cast doubt on claims that awareness requires attention." If their results were to support these claims, this study would overturn 25+ years of research on inattentional blindness, resolve the rich vs. sparse debate in consciousness research, and critically challenge the current majority view in cognitive science that attention is necessary for awareness.
Unfortunately, these extraordinary claims are not supported by extraordinary (or even moderately convincing) evidence. At best, the results support the more modest conclusion: If sub-optimal methods are used to collect retrospective reports, inattentional blindness rates will be overestimated by up to ~8% (details provided below in comment #1). This evidence-based conclusion means that the phenomenon of inattentional blindness is alive and well as it is even robust to experiments that were specifically aimed at falsifying it. Thankfully, improved methods already exist for correcting the ~8% overestimation of IB rates that this study successfully identified.
We appreciate here the reviewer’s recognition of the importance of work on inattentional blindness, and the centrality of inattentional blindness to a range of major questions. We also recognize their concerns with what they see as a gap between our data and the claims made on their basis. We address this in detail below (as well as, of course, in our revised manuscript). However, from the outset we are keen to clarify that our central claim is only the first one the reviewer mentions — and the one which appears in our title — namely that, as a group, participants can successfully report the location, color, and shape of stimuli they deny noticing, and thus that there is “Sensitivity to visual features in inattentional blindness”. This is the claim that we believe is strongly supported by our data, and all the more so after revising the manuscript in light of the helpful comments we’ve received.
By contrast, the other claims the reviewer mentions, concerning awareness (as opposed to residual sensitivity–which might be conscious or unconscious) were intended as both secondary and tentative. We agree with the referee that these are not as strongly supported by our data (and indeed we say so in our manuscript), whereas we do think our data strongly support the more modest — and, to us central — claim that, as a group, inattentionally blind participants can successfully report the location, color, and shape of stimuli they deny noticing.
We also feel compelled to resist somewhat the reviewer’s summary of our claims. For example, the reviewer attributes to us the claim that “subjects retain awareness of stimuli they fail to report”; but while that phrase does appear in our abstract, what we in fact say is that our data are “consistent with an alternative hypothesis about IB, namely that subjects retain awareness of stimuli they fail to report”. We do in fact believe that our data are consistent with that hypothesis, whereas earlier investigations seemed not to be. We mention this only because we had used that careful phrasing precisely for this sort of reason, so that we wouldn’t be read as saying that our results unequivocally support that alternative.
Still, looking back, we see how we may have given more emphasis than we intended to some of these more secondary claims. So, we’ve now gone through and revised our manuscript throughout to emphasize that our main claim is about residual sensitivity, and to make clear that our claims about awareness are secondary and tentative. Indeed, we now say precisely this, that although we favor an interpretation of “our results in terms of residual conscious vision in IB … this claim is tentative and secondary to our primary finding”. We also weaken the statements in the abstract that the reviewer mentions, to better reflect our key claims.
Finally, we note one further point: Dialectically, inattentional blindness has been used to argue (e.g.) that attention is required for awareness. We think that our data concerning residual sensitivity at least push back on the use of IB to make this claim, even if (as we agree) they do not provide decisive evidence that awareness survives inattention. In other words, we think our data call that claim into question, such that it’s now genuinely unclear whether awareness does or does not survive inattention. We have adjusted our claims on this point accordingly as well.
Comments:
(1) In experiment 1, data from 374 subjects were included in the analysis. As shown in figure 2b, 267 subjects reported noticing the critical stimulus and 107 subjects reported not noticing it. This translates to a 29% IB rate, if we were to only consider the "did you notice anything unusual Y/N" question. As reported in the results text (and figure 2c), when asked to report the location of the critical stimulus (left/right), 63.6% of the "non-noticer" group answered correctly. In other words, 68 subjects were correct about the location while 39 subjects were incorrect. Importantly, because the location judgment was a 2-alternative-forced-choice, the assumption was that if 50% (or at least not statistically different than 50%) of the subjects answered the location question correctly, everyone was purely guessing. Therefore, we can estimate that ~39 of the subjects who answered correctly were simply guessing (because 39 guessed incorrectly), leaving 29 subjects from the non-noticer group who may have indeed actually seen the location of the stimulus. If these 29 subjects are moved to the noticer group, the corrected rate of IB for experiment 1 is 21% instead of 29%. In other words, relying only on the "Y/N did you notice anything" question leads to an overestimate of IB rates by 8%. This modest level of inaccuracy in estimating IB rates is insufficient for concluding that "subjects retain awareness of stimuli they fail to report", i.e. that inattentional blindness does not exist.
In addition, this 8% inaccuracy in IB rates only considers one side of the story. Given the data reported for experiment 1, one can also calculate the number of subjects who answered "yes, I did notice something unusual" but then reported the incorrect location of the critical stimulus. This turned out to be 8 subjects (or 3% of the "noticer" group). Some would argue that it's reasonable to consider these subjects as inattentionally blind, since they couldn't even report where the critical stimulus they apparently noticed was located. If we move these 8 subjects to the non-noticer group, the 8% overestimation of IB rates is reduced to 6%.
The same exercise can and should be carried out on the other 4 experiments, however, the authors do not report the subject numbers for any of the other experiments, i.e., how many subjects answered Y/N to the noticing question and how many in each group correctly answered the stimulus feature question. From the limited data reported (only total subject numbers and d' values), the effect sizes in experiments 2-5 were all smaller than in experiment 1 (d' for the non-noticer group was lower in all of these follow-up experiments), so it can be safely assumed that the ~6-8% overestimation of IB rates was smaller in these other four experiments. In a revision, the authors should consider reporting these subject numbers for all 5 experiments.
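The reviewer's Experiment 1 arithmetic can be reproduced with a short sketch. It only illustrates the guessing correction described in the comment above, using the subject counts quoted there; the function and variable names are ours and hypothetical, and the correction assumes (as the reviewer does) that every incorrect answer is matched by one lucky correct guess.

```python
def corrected_ib_rate(n_total, n_non_noticers, n_correct, n_noticers_wrong=0):
    """Guessing-corrected IB rate under a high-threshold assumption.

    Each incorrect forced-choice answer is assumed to be matched by one
    correct guess; the remaining correct responders are moved to the
    noticer group. Optionally, noticers who got the location wrong are
    moved the other way, into the non-noticer group.
    """
    n_incorrect = n_non_noticers - n_correct
    n_presumed_saw = n_correct - n_incorrect
    n_blind = n_non_noticers - n_presumed_saw + n_noticers_wrong
    return n_blind / n_total


# Counts quoted in the reviewer's comment for Experiment 1.
raw_rate = 107 / 374
one_sided = corrected_ib_rate(374, 107, 68)
two_sided = corrected_ib_rate(374, 107, 68, n_noticers_wrong=8)

print(f"raw IB rate:        {raw_rate:.1%}")
print(f"after moving 29:    {one_sided:.1%}")
print(f"after also moving 8: {two_sided:.1%}")
```

Rounded to whole percentages, these three values correspond to the 29%, 21% and 23% figures behind the reviewer's "8%" and "6%" overestimates.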
We now report, as requested, all these subject numbers in our supplementary data (see Supplementary Tables 1 and 2 in our Supplementary Materials).
However, we wish to address the larger question the reviewer has raised: Do our data only support a relatively modest reduction in IB rates? Even if they did, we still believe that this would be a consequential result, suggesting a significant overestimation of IB rates in classic paradigms. However, part of our purpose in writing this paper is to push back against a certain binary way of thinking about seeing/awareness. Our sense is that the field has conceived of awareness as “all or nothing”: You either see a perfectly clear gorilla right in front of you, or you see nothing at all. Our perspective is different: We think there can be degraded forms of awareness that fall into neither of those categories. For that reason, we are disinclined to see our results in the way that the reviewer suggests, namely as simply indicating that fewer subjects fail to see the stimulus than previously assumed. To think that way is, in our view, to assume the orthodox binary position about awareness. If, instead, one conceives of awareness as we do (and as we believe the framework of signal detection theory should compel us to), then it isn’t quite right to think of the proportion of subjects who were aware, but rather (e.g.) the sensitivity of subjects to the relevant stimulus. This is why we prefer measures like d′ over % noticing and % correct. We understand that the reviewer may not think the same way about this issue as we do, but part of our goal is to promote that way of thinking in general, and so some of our comments below reflect that perspective and approach.
For example, consider how we’d think about the single subject case where the task is 2afc detection of a low contrast stimulus in noise. Suppose that this subject achieves 70% correct. One way of thinking about that is that the subject sees the stimulus on 40% of trials (achieving 100% correct on those) and guesses blindly on the other 60% (achieving 50% correct on those) for a total of 40% + 30% = 70% overall. However, this is essentially a “high threshold” approach to the problem, in contrast to an SDT approach. On an SDT approach (an approach with tremendous evidential support), on every trial the subject receives samples from probabilistic distributions corresponding to each interval (one noise and one signal + noise) and determines which is higher according to the 2afc decision rule. Thus, across trials they have access to differentially graded information about the stimulus. Moreover, on some trials they may have significant information from the stimulus (perhaps, well above their single interval detection criterion) but still decide incorrectly because of high noise from the other spatial interval. From this perspective, there is no non-arbitrary way of saying whether the subject saw/did not see on a given trial. Instead, we must characterize the subject’s overall sensitivity to the stimulus/its visibility to them in terms of a parameter such as d′ (here, ~ 0.7).
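The contrast between the two readings of "70% correct" can be made concrete with a minimal sketch. It assumes the standard equal-variance Gaussian model, under which 2afc sensitivity is d′ = √2 · z(proportion correct); the function names are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def high_threshold_p_see(prop_correct):
    """High-threshold reading: the stimulus is 'seen' on a fraction p of
    trials (always correct) and guessed at 50% on the rest, so
    prop_correct = p + (1 - p) / 2, i.e. p = 2 * prop_correct - 1."""
    return 2 * prop_correct - 1


def dprime_2afc(prop_correct):
    """SDT reading, assuming equal-variance Gaussian noise:
    d' = sqrt(2) * z(prop_correct) for a 2afc task."""
    return sqrt(2) * NormalDist().inv_cdf(prop_correct)


pc = 0.70
print(f"high-threshold 'seen' fraction: {high_threshold_p_see(pc):.2f}")
print(f"2afc d-prime:                   {dprime_2afc(pc):.2f}")
```

The first number recovers the 40% "seen" fraction of the high-threshold account; the second is the graded sensitivity parameter that the SDT account reports instead.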
We take the same attitude to our super subject. Instead of saying that some subjects saw/failed to see the stimuli, we suggest that the best way to characterize our results is that across subjects (and so trials also) there was differential graded access to information from the stimulus, best represented in terms of the group-level sensitivity parameter d′.
We acknowledge that (despite ourselves) we occasionally fell into an all-too-natural binary/high threshold way of thinking, as when we suggested that our data show that “inattentionally blind subjects consciously perceive these stimuli after all” and “the inattentionally blind can see after all." (p.17) We have removed such problematic phrasing as well as other problematic phrasing as noted below.
(2) Because classic IB paradigms involve only one critical trial per subject, the authors used a "super subject" approach to estimate sensitivity (d') and response criterion (c) according to signal detection theory (SDT). Some readers may have issues with this super subject approach, but my main concern is with the lack of precision used by the authors when interpreting the results from this super subject analysis.
Only the super subject had above-chance sensitivity (and it was quite modest, with d' values between 0.07 and 0.51), but the authors over-interpret these results as applying to every subject. The methods and analyses cannot determine if any individual subject could report the features above-chance. Therefore, the following list of quotes should be revised for accuracy or removed from the paper as they are misleading and are not supported by the super subject analysis:
"Altogether this approach reveals that subjects can report above-chance the features of stimuli (color, shape, and location) that they had claimed not to notice under traditional yes/no questioning" (p.6)
"In other words, nearly two-thirds of subjects who had just claimed not to have noticed any additional stimulus were then able to correctly report its location." (p.6)
"Even subjects who answer "no" under traditional questioning can still correctly report various features of the stimulus they just reported not having noticed, suggesting that they were at least partially aware of it after all." (p.8)
"Why, if subjects could succeed at our forced-response questions, did they claim not to have noticed anything?" (p.8)
"we found that observers could successfully report a variety of features of unattended stimuli, even when they claimed not to have noticed these stimuli." (p.14)
"our results point to an alternative (and perhaps more straightforward) explanation: that inattentionally blind subjects consciously perceive these stimuli after all... they show sensitivity to IB stimuli because they can see them." (p.16)
"In other words, the inattentionally blind can see after all." (p.17)
We thank the reviewer for pointing out how these quotations may be misleading as regards our central claim. We intended them all to be read generically as concerning the group, and not universally as claiming that all subjects could report above-chance/see the stimuli etc. We agree entirely that the latter universal claim would not be supported by our data. In contrast, we do contend that our super-subject analysis shows that, as a group, subjects traditionally considered inattentionally blind exhibit residual sensitivity to features of stimuli (color, shape, and location) that they had all claimed not to notice, and likewise that as a group they could succeed at our forced-choice questions.
To ensure this claim is clear throughout the paper, and that we are not interpreted as making an unsupported universal claim, we have revised the language in all of the quotations above, as follows, as well as in numerous other places in the paper.
“Altogether this approach reveals that subjects can report above-chance the features of stimuli (color, shape, and location) that they had claimed not to notice under traditional yes/no questioning” (p.6) => “Altogether this approach reveals that as a group subjects can report above-chance the features of stimuli (color, shape, and location) that they had all claimed not to notice under traditional yes/no questioning” (p.6)
“Even subjects who answer “no” under traditional questioning can still correctly report various features of the stimulus they just reported not having noticed, suggesting that they were at least partially aware of it after all.” (p.8) => “... even subjects who answer “no” under traditional questioning can, as a group, still correctly report various features of the stimuli they just reported not having noticed, indicating significant group-level sensitivity to visual features. Moreover, these results are even consistent with an alternative hypothesis about IB, that as a group, subjects who would traditionally be classified as inattentionally blind are in fact at least partially aware of the stimuli they deny noticing.” (p.8)
“Why, if subjects could succeed at our forced-response questions, did they claim not to have noticed anything?” (p.8) => “Why, if subjects could succeed at our forced-response questions as a group, did they all individually claim not to have noticed anything?” (p.8)
“we found that observers could successfully report a variety of features of unattended stimuli, even when they claimed not to have noticed these stimuli.” (p.14) => “we found that groups of observers could successfully report a variety of features of unattended stimuli, even when they all individually claimed not to have noticed those stimuli.” (p.14)
“our results point to an alternative (and perhaps more straightforward) explanation: that inattentionally blind subjects consciously perceive these stimuli after all... they show sensitivity to IB stimuli because they can see them.” (p.16) => “our results just as easily raise an alternative (and perhaps more straightforward) explanation: that inattentionally blind subjects may retain a degree of awareness of these stimuli after all.” (p.16) Here deleting: “they show sensitivity to IB stimuli because they can see them.”
“In other words, the inattentionally blind can see after all.” (p.17) => “In other words, as a group, the inattentionally blind enjoy at least some degraded or partial sensitivity to the location, color and shape of stimuli which they report not noticing.” (p.17)
In one case, we felt the sentence was correct as it stood, since it simply reported a fact about our data:
“In other words, nearly two-thirds of subjects who had just claimed not to have noticed any additional stimulus were then able to correctly report its location.” (p.6)
After all, if subjects were entirely blind and simply guessed, it would be true to say that 50% of subjects would be able to correctly report the stimulus location (by guessing).
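To make the guessing baseline concrete, the sketch below computes how often pure guessing would produce at least 68 correct location reports out of 107, the counts quoted in the reviewer's comment (1). This is an illustration only: it is an exact one-sided binomial tail, not necessarily the test used in the paper, and the function name is hypothetical.

```python
from math import comb


def binom_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_non_noticers = 107   # subjects who denied noticing (Experiment 1)
n_correct = 68         # of those, correct on the left/right question

print(f"observed proportion correct: {n_correct / n_non_noticers:.1%}")
print(f"P(at least {n_correct} correct by guessing): "
      f"{binom_upper_tail(n_correct, n_non_noticers):.4f}")
```

Under guessing, about half of the subjects would still report the location correctly, which is why the sentence above is true as a description of the data; the tail probability indicates how far the observed count sits from that baseline.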
In addition to these and numerous other changes, we also added the following explicit statement early in the paper to head off any confusion on this point: “Note that all analyses reported here relate to this super subject as opposed to individual subjects”.
(3) In addition to the d' values for the super subject being slightly above zero, the authors attempted an analysis of response bias to further question the existence of IB. By including in some of their experiments critical trials in which no critical stimulus was presented, but asking subjects the standard Y/N IB question anyway, the authors obtained false alarm and correct rejection rates. When these FA/CR rates are taken into account along with hit/miss rates when critical stimuli were presented, the authors could calculate c (response criterion) for the super subject. Here, the authors report that response criteria are biased towards saying "no, I didn't notice anything". However, the validity of applying SDT to classic Y/N IB questioning is questionable.
For example, with the subject numbers provided in Box 1 (the 2x2 table of hits/misses/FA/CR), one can ask, 'how many subjects would have needed to answer "yes, I noticed something unusual" when nothing was presented on the screen in order to obtain a non-biased criterion estimate, i.e., c = 0?' The answer turns out to be 800 subjects (out of the 2761 total subjects in the stimulus-absent condition), or 29% of subjects in this condition.
In the context of these IB paradigms, it is difficult to imagine 29% of subjects claiming to have seen something unusual when nothing was presented. Here, it seems that we may have reached the limits of extending SDT to IB paradigms, which are very different than what SDT was designed for. For example, in classic psychophysical paradigms, the subject is asked to report Y/N as to whether they think a threshold-level stimulus was presented on the screen, i.e., to detect a faint signal in the noise. Subjects complete many trials and know in advance that there will often be stimuli presented and the stimuli will be very difficult to see. In those cases, it seems more reasonable to incorrectly answer "yes" 29% of the time, as you are trying to detect something very subtle that is out there in the world of noise. In IB paradigms, the stimuli are intentionally designed to be highly salient (and unusual), such that with a tiny bit of attention they can be easily seen. When no stimulus is presented and subjects are asked about their own noticing (especially of something unusual), it seems highly unlikely that 29% of them would answer "yes", which is the rate of FAs that would be needed to support the null hypothesis here, i.e., of a non-biased criterion. For these reasons, the analysis of response bias in the current context is questionable and the results claiming to demonstrate a biased criterion do not provide convincing evidence against IB.
We are grateful to the reviewer for highlighting this aspect of our data. We agree with several of these points. For example, it is indeed striking that — given the corresponding hit rate — a false alarm rate of 29% would be needed to obtain an unbiased criterion. At the same time, we would respectfully push back on other points above. In our first experiment that uses the super-subject analysis, for example, d′ is 0.51 and highly significant; to describe that figure, as the reviewer does, as “slightly above zero” seemed not quite right to us (and all the more so given that these experiments involve very large samples and preregistered analysis plans).
We also respectfully disagree that our data call into question the validity of applying SDT to classic yes/no IB questioning. The mathematical foundations of SDT are rock solid, and have been applied far more broadly than we have applied them here. In fact, in a way we would suggest that exactly the opposite attitude is appropriate: rather than thinking that IB challenges an immensely well-supported, rigorously tested and broadly applicable mathematical model of perception, we think that the conflict between our SDT-based model of IB and the standard interpretation constitutes strong reason to disfavor the standard interpretation. Several points are worth making here.
First, it is already surprising that 11.03% of our subjects in E2 (46/417) and 7.24% of our subjects in E5 (200/2761) reported noticing a stimulus when no stimulus was present. But while this may have seemed unlikely in advance of inquiry, this is in fact what the data show and forms the basis of our criterion calculations. Thus, our criterion calculations already factor in a surprising but empirically verified high false alarm rate of subjects answering “yes” when no stimulus was presented and they were asked about their noticing. (We also note that the only paper we know of to report a false alarm rate in an IB paradigm, though not one used to calculate a response criterion, found a very consistent false alarm rate of 10.4%. See Devue et al. 2009.)
Second, while the reviewer is of course correct that a common psychophysical paradigm involves detection of a “threshold-level”/faint stimulus in noise, it is widely recognized that SDT has an extremely broad application, being applicable to any situation in which two kinds of event are to be discriminated (Pastore & Scheirer 1975) and being “almost universally accepted as a theoretical account of decision making in research on perceptual detection and recognition and in numerous extensions to applied domains” quite generally (Estes 2002, see also: Wixted 2020). Indeed, cases abound in which SDT has been successfully applied to situations which do not involve near threshold stimuli in noise. To pick two examples at random, SDT has been used in studying acceptability judgments in linguistics (Huang and Ferreira 2020) and the assessment of physical aggression in child-student interactions (Lerman et al. 2010; for more general discussion of practical applications, see Swets et al. 2000). Given that the framework of SDT is so widely applied and well supported, and that we see no special reason to make an exception, we believe it can be relied on in the present context.
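The relationship between these false alarm rates and the response criterion can be sketched as follows, assuming the standard equal-variance SDT formulas d′ = z(H) − z(F) and c = −(z(H) + z(F))/2. The false alarm counts are the ones quoted above. The hit rate is NOT taken from the paper: it is a placeholder back-inferred from the reviewer's statement that a 29% false alarm rate would give c = 0, and all names in the code are hypothetical.

```python
from statistics import NormalDist

z = NormalDist().inv_cdf  # inverse of the standard normal CDF


def dprime(hit_rate, fa_rate):
    """Equal-variance yes/no sensitivity: d' = z(H) - z(F)."""
    return z(hit_rate) - z(fa_rate)


def criterion(hit_rate, fa_rate):
    """Response criterion c = -(z(H) + z(F)) / 2; c > 0 is conservative."""
    return -0.5 * (z(hit_rate) + z(fa_rate))


def unbiased_fa_rate(hit_rate):
    """False alarm rate at which c = 0: z(F) = -z(H), so F = 1 - H."""
    return 1 - hit_rate


# False alarm rates reported in the response (stimulus-absent trials).
fa_e2 = 46 / 417
fa_e5 = 200 / 2761

# ASSUMPTION: illustrative hit rate implied by the reviewer's 29% figure,
# not a value reported in the paper.
ASSUMED_HIT_RATE = 0.71

print(f"E2 false alarm rate: {fa_e2:.2%}")
print(f"E5 false alarm rate: {fa_e5:.2%}")
print(f"FA rate needed for c = 0: {unbiased_fa_rate(ASSUMED_HIT_RATE):.0%}")
print(f"c at the E5 FA rate: {criterion(ASSUMED_HIT_RATE, fa_e5):.2f}")
```

Because the observed false alarm rate falls well below the rate needed for c = 0, the computed criterion is positive under this placeholder hit rate, which is the sense in which responding counts as conservative.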
Finally, we note that inattentional blindness can in many ways be considered analogous to “near threshold” detection since inattention is precisely thought to degrade or even abolish awareness of stimuli, meaning that our stimuli can be construed as near threshold in the relevant sense. Indeed, our relatively modest d′ values suggest that under inattention stimuli are indeed hard to detect. Thus, even were SDT more limited in its application, we think it still would be appropriate to apply to the case of IB.
(4) One of the strongest pieces of evidence presented in the entire paper is the single data point in Figure 3e showing that in Experiment 3, even the super subject group that rated their non-noticing as "highly confident" had a d' score significantly above zero. Asking for confidence ratings is certainly an improvement over simple Y/N questions about noticing, and if this result were to hold, it could provide a key challenge to IB. However, this result hinges on a single data point, it was not replicated in any of the other 4 experiments, and it can be explained by methodological limitations. I strongly encourage the authors (and other readers) to follow up on this result, in an in-person experiment, with improved questioning procedures.
We agree that our finding that even the super-subject group that rated their non-noticing as “highly confident” had a d' score significantly above zero is an especially strong piece of evidence, and we thank the reviewer for highlighting that here. At the same time, we note that while the finding is represented by a single marker in Figure 3e, it seemed not quite right to call this a “single data point”, as the reviewer does, given that it derives from a large pre-registered experiment involving some 7,000 subjects total, with over 200 subjects in the relevant bin, both figures being far larger than in a typical IB experiment. It would of course be tremendous to follow up on this result, and we certainly hope our work inspires various follow-up studies. That said, we note that recruiting the necessary numbers of in-person subjects would be an absolutely enormous, career-level undertaking: it would involve bringing more than the entire undergraduate population at our own institution, Johns Hopkins, into our laboratory! While those results would obviously be extremely valuable, we wouldn’t want to read the reviewer’s comments as implying that only an experiment of that magnitude (requiring thousands upon thousands of in-person subjects) could make progress on these issues. Indeed, because every subject can only contribute one critical trial in IB, it has long been recognized as an extremely challenging paradigm to study in a sufficiently well-powered and psychophysically rigorous way. We believe that our large preregistered online approach represents a major leap forward here, even if it involves certain trade-offs.
In the current Experiment 3, the authors asked the standard Y/N IB question, and then asked how confident subjects were in their answer. Asking back-to-back questions, the second one with a scale that pertains to the first one (including a tricky inversion, e.g., "yes, I am confident in my answer of no"), may be asking too much of some subjects, especially subjects paying half-attention in online experiments. This procedure is likely to introduce a sizeable degree of measurement error.
An easy fix in a follow-up study would be to ask subjects to rate their confidence in having noticed something with a single question using an unambiguous scale:
On the last trial, did you notice anything besides the cross?
(1): I am highly confident I didn't notice anything else
(2): I am confident I didn't notice anything else
(3): I am somewhat confident I didn't notice anything else
(4): I am unsure whether I noticed anything else
(5): I am somewhat confident I noticed something else
(6): I am confident I noticed something else
(7): I am highly confident I noticed something else
If we were to re-run this same experiment, in the lab where we can better control the stimuli and the questioning procedure, we would most likely find a d' of zero for subjects who were confident or highly confident (1-2 on the improved scale above) that they didn't notice anything. From there on, the d' values would gradually increase, tracking along with the confidence scale (from 3-7 on the scale). In other words, we would likely find a data pattern similar to that plotted in Figure 3e, but with the first data point on the left moving down to zero d'. In the current online study with the successive (and potentially confusing) retrospective questioning, a handful of subjects could have easily misinterpreted the confidence scale (e.g., inverting the scale) which would lead to a mixture of genuine high-confidence ratings and mistaken ratings, which would result in a super subject d' that falls between zero and the other extreme of the scale (which is exactly what the data in Fig 3e shows).
One way to check on this potential measurement error using the existing dataset would be to conduct additional analyses that incorporate the confidence ratings from the 2AFC location judgment task. For example, were there any subjects who reported being confident or highly confident that they didn't see anything, but then reported being confident or highly confident in judging the location of the thing they didn't see? If so, how many? In other words, how internally (in)consistent were subjects' confidence ratings across the IB and location questions? Such an analysis could help screen-out subjects who made a mistake on the first question and corrected themselves on the second, as well as subjects who weren't reading the questions carefully enough.
As far as I could tell, the confidence rating data from the 2AFC location task were not reported anywhere in the main paper or supplement.
We are grateful to the reviewer for raising this issue and for requesting that we report the confidence rating data from our 2afc location task in Experiment 3. We now report all this data in our Supplementary Materials (see Supplementary Table 3).
We of course agree with the reviewer’s concern about measurement error, which is a concern in all experiments. What, then, of the particular concern that some subjects might have misunderstood our confidence question? It is surely impossible in principle to rule out this possibility; however, several factors bear on the plausibility of this interpretation. First, we explicitly labeled our confidence scale (with 0 labeled as ‘Not at all confident’ and 3 as ‘Highly confident’) so that subjects would be very unlikely simply to invert the scale. This is especially so as it is very counterintuitive to treat “0” as reflecting high confidence. However, we accept that it is a possibility that certain subjects might nonetheless have been confused in some other way.
So, we also took a second approach. We examined the confidence ratings on the 2afc question of subjects who reported being highly confident that they didn't notice anything.
Reassuringly, the large majority of these high confidence “no” subjects (~80%) reported low confidence of 0 or 1 on the 2afc question, and the majority (51%) reported the lowest confidence of 0. Only 18/204 (9%) subjects reported high confidence on both questions.
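The consistency check described above amounts to a cross-tabulation of confidence on the yes/no question against confidence on the 2afc question. A minimal Python sketch of that tabulation follows. It is not the analysis code used for the paper: the records are invented for illustration, the function name is hypothetical, and the 0-3 scale for the 2afc confidence rating is assumed to match the scale described for the yes/no question.

```python
from collections import Counter

# Hypothetical records: (yes/no answer, yes/no confidence 0-3, 2afc confidence 0-3).
# These rows are invented for illustration and are NOT the study's data.
responses = [
    ("no", 3, 0), ("no", 3, 1), ("no", 3, 3), ("no", 2, 0),
    ("yes", 3, 3), ("yes", 3, 2), ("yes", 3, 0), ("no", 3, 0),
]


def tabulate_2afc_confidence(rows, answer, confidence=3):
    """Count 2afc confidence levels among subjects who gave `answer`
    on the yes/no question at the given confidence level."""
    return Counter(c2 for a, c1, c2 in rows if a == answer and c1 == confidence)


high_conf_no = tabulate_2afc_confidence(responses, "no")
high_conf_yes = tabulate_2afc_confidence(responses, "yes")

for level in range(4):
    print(level, high_conf_no[level], high_conf_yes[level])
```

Tabulating both groups with the same function keeps the comparison between high-confidence "no" and high-confidence "yes" subjects symmetric.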
Still, the numbers of subjects here are small and so may not be reliable. This led us to take a third approach. We reasoned that any measurement error due to inverting or misconstruing the confidence scale should be symmetric. That is: If subjects are liable to invert the confidence scale, they should do so just as often when they answer “yes” as when they answer “no” – after all the very same scale is being used in both cases. This allows us to explore evidence of measurement error in relation to the much larger number of high-confidence “yes” subjects (N = 2677), thus providing a much more robust indicator as to whether subjects are generally liable to misconstrue the confidence scale. Looking at the number of such high-confidence noticers who subsequently respond to the 2afc question with low confidence, we found that the number was tiny. Only 28/2677 (1.05%) of high-confidence noticers subsequently gave the lowest level of confidence on the 2afc question, and only 63/2677 (2.35%) subjects gave either of the two lower levels of confidence. In this light, we consider any measurement error due to misunderstanding the confidence scale to be extremely minimal.
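The proportions quoted above can be rechecked directly from the reported counts. The following is a minimal Python sketch for that purpose, not the analysis code used for the paper. Only the counts come from the text; the function and variable names are illustrative assumptions.

```python
# Illustrative recheck of the reported proportions. Names are hypothetical;
# the counts are those quoted in the response.

def percent(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to two decimals."""
    return round(100 * count / total, 2)


# High-confidence "yes" subjects (N = 2677) and their 2afc confidence.
lowest_level = percent(28, 2677)       # gave the lowest 2afc confidence level
two_lower_levels = percent(63, 2677)   # gave either of the two lower levels

# High-confidence "no" subjects (N = 204) who were also highly confident on the 2afc.
high_on_both = percent(18, 204)

print(lowest_level, two_lower_levels, high_on_both)
```

The last figure comes out slightly below the rounded 9% given in the text, which is consistent with rounding to the nearest whole percent.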
What should we make of the 18 subjects who were highly confident non-noticers but then also highly confident on the 2afc question? Importantly, we do not think that these 18 subjects necessarily made a mistake on the first question and so should be excluded. There is no a priori reason why one’s confidence criterion in a yes/no question should carry over to a 2afc question. After all, it is perfectly rationally coherent to be very confident that one didn’t see anything but also very confident that if there was anything to be seen, it was on the left. Moreover, these 18 subjects were not all correct on the 2afc question despite their high confidence (4/18 or 22% getting the wrong answer).
Nonetheless, and again reassuringly, we found that the above-chance patterns in our data remained the same even excluding these 18 subjects. We did observe a slight reduction in percent correct and d′ but this is absolutely what one should expect since excluding the most confident performers in any task will almost inevitably reduce performance.
In this light, we consider it unlikely that measurement error fully explains the residual sensitivity found even amongst highly confident non-noticers. That said, we appreciate this concern. We now raise the issue and the analysis of high confidence noticers which addresses it in our revised manuscript. We also thank the reviewer for pressing us to think harder about this issue, which led directly to these new analyses that we believe have strengthened the paper.
(5) In most (if not all) IB experiments in the literature, a partial attention and/or full attention trial (or set of trials) is administered after the critical trial. These control trials are very important for validating IB on the critical trial, as they must show that, when attended, the critical stimuli are very easy to see. If a subject cannot detect the critical stimulus on the control trial, one cannot conclude that they were inattentionally blind on the critical trial, e.g., perhaps the stimulus was just too difficult to see (e.g., too weak, too brief, too far in the periphery, too crowded by distractor stimuli, etc.), or perhaps they weren't paying enough attention overall or failed to follow instructions. In the aggregate data, rates of noticing the stimuli should increase substantially from the critical trial to the control trials. If noticing rates are equivalent on the critical and control trials one cannot conclude that attention was manipulated.
It is puzzling why the authors decided not to include any control trials with partial or full attention in their five experiments, especially given their online data collection procedures where stimulus size, intensity, eccentricity, etc. were uncontrolled and variable across subjects. Including such trials could have actually helped them achieve their goal of challenging the IB hypothesis, e.g., excluding subjects who failed to see the stimulus on the control trials might have reduced the inattentional blindness rates further. This design decision should at least be acknowledged and justified (or noted as a limitation) in a revision of this paper.
We acknowledge that other studies in the literature include divided and full attention trials, and that they could have been included in our work as well. However, we deliberately decided not to include such control trials for an important reason. As the referee comments, the main role of such trials in previous work has been to exclude from analysis subjects who failed to report the unexpected stimulus on the divided and/or full attention control trials.
(For example, as Most et al. 2001 write: “Because observers should have seen the object in the full-attention trial (Mack & Rock, 1998), we used this trial as a control … Accordingly, 3 observers who failed to see the cross on this trial were replaced, and their data were excluded from the analyses.") As the reviewer points out, excluding such subjects would very likely have ‘helped' us. However, the practice is controversial. Indeed, in a review of 128 experiments, White et al. 2018 argue that the practice has “problematic consequences” and “may lead researchers to understate the pervasiveness of inattentional blindness". Since we wanted to offer as simple and demanding a test of residual sensitivity in IB as possible, we thus decided not to use any such exclusions, and for that reason decided not to include divided/full attention trials.
As recommended, we discuss this decision not to include divided/full attention trials and our logic for not doing so in the manuscript. As we explain, not having those conditions makes it more impressive, not less impressive, that we observed the results we in fact did — it makes our results more interpretable, not less interpretable, and so absence of such conditions from our manuscript should not (in our view) be considered any kind of weakness.
(6) In the discussion section, the authors devote a short paragraph to considering an alternative explanation of their non-zero d' results in their super subject analyses: perhaps the critical stimuli were processed unconsciously and left a trace such that when later forced to guess a feature of the stimuli, subjects were able to draw upon this unconscious trace to guide their 2AFC decision. In the subsequent paragraph, the authors relate these results to above-chance forced-choice guessing in blindsight subjects, but reject the analogy based on claims of parsimony.
First, the authors dismiss the comparison of IB and blindsight too quickly. In particular, the results from experiment 3, in which some subjects adamantly (confidently) deny seeing the critical stimulus but guess a feature at above-chance levels (at least at the super subject level and assuming the online subjects interpreted and used the confidence scale correctly), seem highly analogous to blindsight. Importantly, the analogy is strengthened if the subjects who were confident in not seeing anything also reported not being confident in their forced-choice judgments, but as mentioned above this data was not reported.
Second, the authors fail to mention an even more straightforward explanation of these results, which is that ~8% of subjects misinterpreted the "unusual" part of the standard IB question used in experiments 1-3. After all, colored lines and shapes are pretty "usual" for psychology experiments and were present in the distractor stimuli everyone attended to. It seems quite reasonable that some subjects answered this first question, "no, I didn't see anything unusual", but then when told that there was a critical stimulus and asked to judge one of its features, adjusted their response by reconsidering, "oh, ok, if that's the unusual thing you were asking about, of course I saw that extra line flash on the left of the screen". This seems like a more parsimonious alternative compared to either of the two interpretations considered by the authors: (1) IB does not exist, (2) super-subject d' is driven by unconscious processing. Why not also consider: (3) a small percentage of subjects misinterpreted the Y/N question about noticing something unusual. In experiments 4-5, they dropped the term "unusual" but do not analyze whether this made a difference nor do they report enough of the data (subject numbers for the Y/N question and 2AFC) for readers to determine if this helped reduce the ~8% overestimate of IB rates.
Our primary ambition in the paper was to establish, as our title suggests, residual sensitivity in IB. The ambition is quite neutral as to whether the sensitivity reflects conscious or unconscious processing (i.e. is akin to blindsight as traditionally conceived). We were evidently not clear about this, however, leading to two referees coming away with an impression of our claims that is different than we intended. We have revised our manuscript throughout to address this. But we also want to emphasize here that we take our data primarily to support the more modest claim that there is residual sensitivity (conscious or unconscious) in the group of subjects who are traditionally classified as inattentionally blind. We believe that this claim has solid support in our data.
We do in the discussion section offer one reason for believing that there is residual awareness in the group of subjects who are traditionally classified as inattentionally blind. However, we acknowledge that this is controversial and now emphasize in the manuscript that this claim “is tentative and secondary to our primary finding”. We also emphasize that part of our point is dialectical: Inattentional blindness has been used to argue (e.g.) that attention is required for awareness. We think that our data concerning residual sensitivity at least push back on the use of IB to make this claim, even if they do not provide decisive evidence (as we agree) that awareness survives inattention. (Cf. here, Hirschhorn et al. 2024 who take up a common suggestion in the field that awareness is best assessed by using both subjective and objective measures, with claims about lack of awareness ideally being supported by both; our data suggest at a minimum that in IB objective measures do not neatly line up with subjective measures.)
We hope this addresses the referee’s concern that we dismiss “the comparison of IB and blindsight too quickly”. We do not intend to dismiss that comparison at all; indeed, we raise it because we consider it a serious hypothesis. Our aim is simply to raise one possible consideration against it. But, again, our main claim is quite consistent with sensitivity in IB being akin to “blindsight”.
We also agree with the referee that a possible explanation of why some subjects say they do not notice something unusual in IB paradigms is not that they didn’t notice anything but that they didn’t consider the unexpected stimulus sufficiently unusual. However, the reviewer is incorrect that we did not mention this interpretation; to the contrary, it was precisely the kind of concern which led us to be dissatisfied with standard IB methods and so motivated our approach. As we wrote in our main text: “However, yes/no questions of this sort are inherently and notoriously subject to bias… For example, observers might be under-confident whether they saw anything (or whether what they saw counted as unusual); this might lead them to respond “no” out of an excess of caution.” On our view, this is exactly the kind of reason (among other reasons) that one cannot rely on yes/no reports of noticing unusual stimuli, even though the field has relied on just these sorts of questions in just this way.
We do not, however, think that this explanation accounts for why all subjects fail to report noticing, nor do we think that it accounts for our finding of above-chance sensitivity amongst non-noticers. This is for two critical reasons. First, whereas the word “unusual” did appear in the yes/no question in our Experiments 1-3, it did not appear in our Experiments 4 and 5 on dynamic IB. (In both cases, we used the exact wording of such questions in the experiments we were basing our work on.) And, of course, we still found significant residual sensitivity amongst non-noticers in Experiments 4 and 5. Second, in relation to our confidence experiment, we think it unlikely that subjects who were highly confident that they did not notice anything unusual only said that because they thought what they had seen was insufficiently unusual. Yet even in this group of subjects who were maximally confident that they did not notice anything unusual, we still found residual sensitivity.
(7) The authors use sub-optimal questioning procedures to challenge the existence of the phenomenon this questioning is intended to demonstrate. A more neutral interpretation of this study is that it is a critique on methods in IB research, not a critique on IB as a manipulation or phenomenon. The authors neglect to mention the dozens of modern IB experiments that have improved upon the simple Y/N IB questioning methods. For example, in Michael Cohen's IB experiments (e.g., Cohen et al., 2011; Cohen et al., 2020; Cohen et al., 2021), he uses a carefully crafted set of probing questions to conservatively ensure that subjects who happened to notice the critical stimuli have every possible opportunity to report seeing them. In other experiments (e.g., Hirschhorn et al., 2024; Pitts et al., 2012), researchers not only ask the Y/N question but then follow this up by presenting examples of the critical stimuli so subjects can see exactly what they are being asked about (recognition-style instead of free recall, which is more sensitive). These follow-up questions include foil stimuli that were never presented (similar to the stimulus-absent trials here), and ask for confidence ratings of all stimuli. Conservative, pre-defined exclusion criteria are employed to improve the accuracy of their IB-rate estimates. In these and other studies, researchers are very cautious about trusting what subjects report seeing, and in all cases, still find substantial IB rates, even to highly salient stimuli. The authors should consider at least mentioning these improved methods, and perhaps consider using some of them in their future experiments.
The concern that we do not sufficiently discuss the range of “improved” methods in IB studies is well-taken. A similar concern is raised by Reviewer #2 (Dr. Cohen). To address the concern, we have added to our manuscript a substantial new discussion of such improved methods. However, although we do agree that these methods can be helpful and may well address some of the methodological concerns which our paper raises, we do not think that they are a panacea. Thus, our discussion of these methods also includes a substantial discussion of the problems and pitfalls with such methods which led us to favor our own simple forced-response and 2afc questions, combined with SDT analysis. We think this approach is superior both to the classic approach in IB studies and to the approach raised by the reviewers.
In particular, we have four main concerns about the follow up questions now commonly used in the field:
First, many follow up questions are used not to exclude people from the IB group but to include people in the IB group. Thus, Most et al. 2001 asked follow up questions but used these to increase their IB group, only excluding subjects from the IB group if they both reported seeing and answered their follow ups incorrectly: “Observers were regarded as having seen the unexpected object if they answered 'yes' when asked if they had seen anything on the critical trial that had not been present before and if they were able to describe its color, motion, or shape." This means that subjects who saw the object but failed to see its color, say, would be treated as inattentionally blind. This has the purpose of inflating IB rates, in exactly the way our paper is intended to critique. So, in our view this isn’t an improvement but rather part of the approach we take issue with.
Second, many follow up questions remain yes/no questions or nearby variants, all of which are subject to response bias. For example, in Cohen’s studies which the reviewer mentions, it is certainly true that “he uses a carefully crafted set of probing questions to conservatively ensure that subjects who happened to notice the critical stimuli have every possible opportunity to report seeing them.” We agree that this improves over a simple yes/no question in some ways. However, such follow up probes nonetheless remain yes/no questions, subject to response bias, e.g.:
(1) “Did you notice anything strange or different about that last trial?”
(2) “If I were to tell you that we did something odd on the last trial, would you have a guess as to what we did?”
(3) “If I were to tell you we did something different in the second half of the last trial, would you have a guess as to what we did?”
(4) “Did you notice anything different about the colors in the last scene?”
Indeed, follow up questions of this kind can be especially susceptible to bias, since subjects may be reluctant to “take back” their earlier answers and so be conservative in responding positively to avoid inconsistency or acknowledgement of earlier error. This may explain why such follow up questions produce remarkable consistency despite their rather different wording. Thus, Simons and Chabris (1999) report: “Although we asked a series of questions escalating in specificity to determine whether observers had noticed the unexpected event, only one observer who failed to report the event in response to the first question (“did you notice anything unusual?”) reported the event in response to any of the next three questions (which culminated in “did you see a ... walk across the screen?”). Thus, since the responses were nearly always consistent across all four questions, we will present the results in terms of overall rates of noticing.” Thus, while there are undoubtedly merits to these follow ups, they do not resolve problems of bias.
This same basic issue affects the follow up question used in Pitts et al. 2012 which the reviewer mentions. Pitts et al. write: “If a participant reported not seeing any patterns and rated their confidence in seeing the square pattern (once shown the sample) as a 3 or less (1 = least confident, 5 = most confident), she or he was placed in Group 1 and was considered to be inattentionally blind to the square patterns.” The confidence rating follow-up question here remains subject to bias. Moreover, and strikingly, the inclusion criterion used means that subjects who were moderately confident that they saw the square pattern when shown (i.e. answered 3) were counted as inattentionally blind (!). We do not think this is an appropriate inclusion criterion.
The third problem is that follow up questions are often free/open-response. For instance, Most et al. (2005) ask the follow up question: "If you did see something on the last trial that had not been present during the first two trials, what color was it? If you did not see something, please guess." This is a much more difficult and to that extent less sensitive question than our binary forced-response/2afc questions. For this reason, we believe our follow up questions are more suitable for ascertaining low levels of sensitivity.
The fourth and final issue is that whereas 2afc questions are criterion free (in that they naturally have an unbiased decision rule), this is in fact not true of n-afc questions in general, nor is it true in general of delayed n-alternative match to sample designs. Thus, even when limited response options are given, they are not immune to response biases and so require SDT analysis. Moreover, some such tasks can involve decision spaces which are often poorly understood or difficult to analyze without making substantial assumptions about observer strategy.
This last point (as well as the first) is relevant to Hirschhorn et al. 2024. Hirschhorn et al. write that they “used two awareness measures. Firstly, participants were asked to rate stimulus visibility on the Perceptual Awareness Scale (PAS, a subjective measure of awareness: Ramsøy & Overgaard, 2004), and then they were asked to select the stimulus image from an array of four images (an objective measure: Jakel & Wichmann, 2006).”
While certainly an improvement on simple yes/no questioning, the PAS remains subject to response bias. On the other hand, we applaud Hirschhorn et al.’s use of objective measures in the context of IB which of course our design implements. However, while Hirschhorn et al. 2024 suggest that their task is a spatial 4afc following the recommendation of this design by Jakel & Wichmann (2006), it is strictly a 4-alternative delayed match to sample task, so it is doubtful if it can be considered a preferred psychophysical task for the reasons Jakel & Wichmann offer. Regardless, the more crucial point is that observers in such a task might be biased towards one alternative as opposed to another. Thus, use of d′ (as opposed to percent correct as in Hirschhorn et al. 2024) is crucial in assessing performance in such tasks.
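To illustrate why percent correct can mislead when observers are biased, here is a minimal Python sketch based on the standard equal-variance Gaussian signal detection model. It is an illustration under stated assumptions, not the analysis from either paper: the sensitivity and criterion values are made up, the function names are hypothetical, and for simplicity the sketch uses a two-response task rather than the four-alternative design discussed above.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def rates_from_model(d: float, c: float) -> tuple[float, float]:
    """Hit and false-alarm rates implied by sensitivity d and criterion c
    (c = 0 is unbiased; c > 0 is a conservative observer)."""
    hit = _STD_NORMAL.cdf(d / 2 - c)
    false_alarm = _STD_NORMAL.cdf(-d / 2 - c)
    return hit, false_alarm


def d_prime(hit: float, false_alarm: float) -> float:
    """Sensitivity estimate: d' = z(hit rate) - z(false-alarm rate)."""
    return _STD_NORMAL.inv_cdf(hit) - _STD_NORMAL.inv_cdf(false_alarm)


def percent_correct(hit: float, false_alarm: float) -> float:
    """Proportion correct, assuming equal numbers of signal and noise trials."""
    return 0.5 * (hit + (1.0 - false_alarm))


# Two hypothetical observers with identical sensitivity but different bias.
unbiased = rates_from_model(d=1.0, c=0.0)
conservative = rates_from_model(d=1.0, c=1.0)

for label, (h, fa) in (("unbiased", unbiased), ("conservative", conservative)):
    print(f"{label}: d'={d_prime(h, fa):.2f}, percent correct={percent_correct(h, fa):.2f}")
```

Both observers recover the same d′, yet the conservative observer scores a lower percent correct (roughly 0.62 versus 0.69 under these assumed values), which is the sense in which percent correct confounds sensitivity with bias.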
For all these reasons, then, while we agree that the field has taken significant steps to move beyond the simple yes/no question traditionally used in IB studies (and we have revised our manuscript to make this clear), we do not think it has resolved the methodological issues which our paper seeks to highlight and address, and we believe that our approach contributes something additional that is not yet present in the literature. We have now revised our manuscript to make these points much more clearly, and we thank the reviewer for prompting these improvements.
Reviewer #2 (Public review):
In this study, Nartker et al. examine how much observers are conscious of using variations of classic inattentional blindness studies. The key idea is that rather than simply asking observers if they noticed a critical object with one yes/no question, the authors also ask follow-up questions to determine if observers are aware of more than the yes/no questions suggest. Specifically, by having observers make forced choice guesses about the critical object, the authors find that many observers who initially said "no" they did not see the object can still "guess" above chance about the critical object's location, color, etc. Thus, the authors claim that prior claims of inattentional blindness are mistaken and that using such simple methods has led numerous researchers to overestimate how little observers see in the world. To quote the authors themselves, these results imply that "inattentionally blind subjects consciously perceive these stimuli after all... they show sensitivity to IB stimuli because they can see them."
Before getting to a few issues I have with the paper, I do want to make sure to explicitly compliment the researchers for many aspects of their work. Getting massive amounts of data, using signal detection measures, and the novel use of a "super subject" are all important contributions to the literature that I hope are employed more in the future.
We really appreciate this comment and that the reviewer found our work to make these important contributions to the literature. We wrote this paper expecting not everyone to accept our conclusions, but hoping that readers would see the work as making a valuable contribution to the literature promoting an underexplored alternative in a compelling way. Given that this reviewer goes on to express some skepticism about our claims, it is especially encouraging to see this positive feedback up top!
Main point 1: My primary issue with this work is that I believe the authors are misrepresenting the way people often perform inattentional blindness studies. In effect, the authors are saying, "People do the studies 'incorrectly' and report that people see very little. We perform the studies 'correctly' and report that people see much more than previously thought." But the way previous studies are conducted is not accurately described in this paper. The authors describe previous studies as follows on page 3:
"Crucially, however, this interpretation of IB and the many implications that follow from it rest on a measure that psychophysics has long recognized to be problematic: simply asking participants whether they noticed anything unusual. In IB studies, awareness of the unexpected stimulus (the novel shape, the parading gorilla, etc.) is retroactively probed with a yes/no question, standardly, "Did you notice anything unusual on the last trial which wasn't there on previous trials?". Any subject who answers "no" is assumed not to have any awareness of the unexpected stimulus."
If this quote were true, the authors would have a point. Unfortunately, I do not believe it is true. This is simply not how many inattentional blindness studies are run. Some of the most famous studies in the inattentional blindness literature do not simply ask observers a yes/no question (e.g., the invisible gorilla (Simons et al. 1999), the classic door study where the person changes (Simons and Levin, 1998), the study where observers do not notice a fight happening a few feet from them (Chabris et al., 2011)). Instead, these papers consistently ask a series of follow-up questions and even tell the observers what just occurred to confirm that observers did not notice that critical event (e.g., "If I were to tell you we just did XYZ, did you notice that?"). In fact, after a brief search on Google Scholar, I was able to relatively quickly find over a dozen papers that do not just use a yes/no procedure, and instead ask a series of multiple questions to determine if someone is inattentionally blind. In no particular order, some of these papers are (full disclosure: including my own):
(1) Most et al. (2005) Psych Review
(2) Drew et al. (2013) Psych Science
(3) Drew et al. (2016) Journal of Vision
(4) Simons et al. (1999) Perception
(5) Simons and Levin (1998) Perception
(6) Chabris et al. (2011) iPerception
(7) Ward & Scholl (2015) Psych Bulletin and Review
(8) Most et al. (2001) Psych Science
(9) Todd & Marois (2005) Psych Science
(10) Fougnie & Marois (2007) Psych Bulletin and Review
(11) New and German (2015) Evolution and Human Behaviour
(12) Jackson-Nielsen (2017) Consciousness and cognition
(13) Mack et al. (2016) Consciousness and cognition
(14) Devue et al. (2009) Perception
(15) Memmert (2014) Cognitive Development
(16) Moore & Egeth (1997) JEP:HPP
(17) Cohen et al. (2020) Proc Natl Acad Sci
(18) Cohen et al. (2011) Psych Science
This is a critical point. The authors' key idea is that when you ask more than just a simple yes/no question, you find that other studies have overestimated the effects of inattentional blindness. But none of the studies listed above only asked simple yes/no questions. Thus, I believe the authors are mis-representing the field. Moreover, many of the studies that do much more than ask a simple yes/no question are cited by the authors themselves! Furthermore, as far as I can tell, the authors believe that if researchers do these extra steps and ask more follow-ups, then the results are valid. But since so many of these prior studies do those extra steps, I am not exactly sure what is being criticized.
To make sure this point is clear, I'd like to use a paper of mine as an example. In this study (Cohen et al., 2020, Proc Natl Acad Sci USA) we used gaze-contingent virtual reality to examine how much color people see in the world. On the critical trial, the part of the scene they fixated on was in color, but the periphery was entirely in black and white. As soon as the trial ended, we asked participants a series of questions to determine what they noticed. The list of questions included:
(1) "Did you notice anything strange or different about that last trial?"
(2) "If I were to tell you that we did something odd on the last trial, would you have a guess as to what we did?"
(3) "If I were to tell you we did something different in the second half of the last trial, would you have a guess as to what we did?"
(4) "Did you notice anything different about the colors in the last scene?"
(5) We then showed observers the previous trial again and drew their attention to the effect and confirmed that they did not notice that previously.
In a situation like this, when the observers are asked so many questions, do the authors believe that "the inattentionally blind can see after all?" I believe they would not say that and the reason they would not say that is because of the follow-up questions after the initial yes/no question. But since so many previous studies use similar follow-up questions, I do not think you can state that the field is broadly overestimating inattentional blindness. This is why it seems to me to be a bit of a strawman: most people do not just use the yes/no method.
We appreciate this reviewer raising this issue. As he (Dr. Cohen) states, his “primary issue” concerns our discussion of the broader literature (which he worries understates recent improvements made to the IB methodology), rather than, e.g., the experiments we’ve run. We take this concern very seriously and address it comprehensively here.
A very similar issue is identified by Reviewer #1, comment (7). To review some of what we say in reply to them: To address the concern we have added to our manuscript a substantial new discussion of such improved methods. However, although we do agree that these methods can be helpful and may well address some of the methodological concerns which our paper raises, we do not think that they are a panacea. Thus, our discussion of these methods also includes a substantial discussion of the problems and pitfalls with such methods which led us to favor our own simple forced-response and 2afc questions, combined with SDT analysis. We think this approach is superior both to the classic approach in IB studies and to the approach raised by the reviewers.
In particular, we have three main concerns about the follow up questions now commonly used in the field:
First, many follow up questions are used not to exclude subjects from the IB group but to include subjects in the IB group. Thus, Most et al. (2001) asked follow up questions but used these to increase their IB group, only excluding subjects from the IB group if they both reported seeing and answered their follow ups correctly: “Observers were regarded as having seen the unexpected object if they answered ‘yes’ when asked if they had seen anything on the critical trial that had not been present before and if they were able to describe its color, motion, or shape.” This means that subjects who saw the object but failed to describe it in these respects would be treated as inattentionally blind. This is problematic since failure to describe a feature (e.g., color, shape) does not imply a complete lack of information concerning that feature; and even if a subject did lack all information concerning these features of an object, this would not imply a complete failure to see the object. Similarly, Pitts et al. (2012) asked subjects to rate their confidence in their initial yes/no response from 1 = least confident to 5 = most confident, and used these ratings to include in the IB group those who rated their confidence in seeing at 3 or less. This is evidently problematic, since there is a large gap between being underconfident that one saw something and being completely blind to it. More generally, using follow ups to inflate IB rates in such ways raises precisely the kinds of issues our paper is intended to critique. So in our view this isn’t an improvement but rather part of the approach we take issue with.
Second, many follow up questions remain yes/no questions or nearby variants, all of which are subject to response bias. For example, in the reviewer’s own studies (Cohen et al. 2020, 2011; see also: Simons et al., 1999; Most et al., 2001, 2005; Drew et al., 2013; Memmert, 2014) a series of follow up questions are used to try and ensure that subjects who noticed the critical stimuli are given the maximum opportunity to report doing so, e.g.:
(1) “Did you notice anything strange or different about that last trial?”
(2) “If I were to tell you that we did something odd on the last trial, would you have a guess as to what we did?”
(3) “If I were to tell you we did something different in the second half of the last trial, would you have a guess as to what we did?”
(4) “Did you notice anything different about the colors in the last scene?”
We certainly agree that such follow up questions improve over a simple yes/no question in some ways. However, such follow up probes nonetheless remain yes/no questions, intrinsically subject to response bias. Indeed, follow up questions of this kind can be especially susceptible to bias, since subjects may be reluctant to “take back” their earlier answers and so be conservative in responding positively to avoid inconsistency or acknowledgement of earlier error. This may explain why such follow up questions produce remarkable consistency despite their rather different wording. Thus, Simons and Chabris (1999) report: “Although we asked a series of questions escalating in specificity to determine whether observers had noticed the unexpected event, only one observer who failed to report the event in response to the first question (‘did you notice anything unusual?’) reported the event in response to any of the next three questions (which culminated in ‘did you see a ... walk across the screen?’). Thus, since the responses were nearly always consistent across all four questions, we will present the results in terms of overall rates of noticing.” Thus, while there are undoubtedly merits to these follow ups, they do not resolve problems of bias.
It is also important to recognize that whereas 2afc questions are criterion free (in that they naturally have an unbiased decision rule), this is not true of n-afc or delayed n-alternative match-to-sample designs in general. Performance in such tasks thus requires SDT analysis – which itself may be problematic if the decision space is not properly understood or requires making substantial assumptions about observer strategy.
Third, and finally, many follow up questions are insufficiently sensitive (especially with small sample sizes). For instance, Todd, Fougnie & Marois (2005) used a 12-alternative match-to-sample task (see similarly: Fougnie & Marois, 2007; Devue et al., 2009). And Most et al. (2005) asked an open-response follow-up: “If you did see something on the last trial that had not been present during the first two trials, what color was it? If you did not see something, please guess.” These questions are more difficult and to that extent less sensitive than binary forced-response/2afc questions of the sort we use in our own studies – a difference which may be critical in uncovering degraded perceptual sensitivity.
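To illustrate why many-alternative questions are harder at a fixed level of underlying sensitivity, here is a minimal sketch. It assumes an equal-variance Gaussian model and an unbiased observer who simply picks the alternative with the strongest evidence; both are simplifying assumptions for illustration, not the analysis used in any of the cited studies. The function name `p_correct_mafc` and the value d′ = 0.5 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution (mean 0, sd 1)


def p_correct_mafc(d_prime, m, lo=-8.0, hi=8.0, steps=4000):
    """Expected accuracy of an unbiased observer in an m-alternative task.

    Assumed model: the target yields evidence ~ N(d', 1), each of the m - 1
    foils yields evidence ~ N(0, 1), and the observer picks the maximum.
    P(correct) = integral of pdf(x - d') * cdf(x)**(m - 1) dx, which is
    approximated here with the trapezoid rule on [lo, hi].
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end points
        total += weight * STD.pdf(x - d_prime) * STD.cdf(x) ** (m - 1)
    return total * h


if __name__ == "__main__":
    # Hypothetical sensitivity, chosen only for illustration.
    for m in (2, 4, 8, 12):
        print(m, round(p_correct_mafc(0.5, m), 3), "chance:", round(1 / m, 3))
```

Under these assumptions the expected proportion correct falls as the number of alternatives grows, so with a small sample a weak signal is harder to distinguish from chance. The sketch also leaves out response bias among alternatives, which is exactly what a fuller SDT analysis of such tasks would need to model.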
For all these reasons, then, while we agree that the field has taken significant steps to move beyond the simple yes/no question traditionally used in IB studies (and we have revised our manuscript to make this clear), we do not think it has resolved the methodological issues which our paper seeks to highlight and address, and we believe that our approach of using 2afc or forced-response questions combined with signal detection analysis is an important improvement on prior methods and contributes something additional that is not yet present in the literature. We have now revised our manuscript to make these points much clearer.
Other studies that improve on the standard methodology
This reviewer adds something else, however: A very helpful list of 18 papers which include follow ups and that he believes overcome many of the issues we raise in our paper. To just state our reaction bluntly: We are familiar with every one of these papers (indeed, one of them is a paper by one of us!), and while we think these are all very valuable contributions to the literature, it is our view that none of these 18 papers resolves the worries that led us to conduct our work.
Here we briefly comment on the relevant pitfalls in each case. We hope this serves to underscore the importance of our methodological approach.
(1) Most et al. (2005) Psych Review
Either a 2-item or 5-item questionnaire was used. The 2-item questionnaire ran as follows:
(1) On the last trial, did you see anything other than the 4 circles and the 4 squares (anything that had not been present on the original two trials)? Yes No
(2) If you did see something on the last trial that had not been present during the original two trials, please describe it in as much detail as possible.
This clearly does not substantially improve on the traditional simple yes/no question. Moreover, the second question (as well as being open-ended) was used to include additional subjects in the IB group, in that participants were counted as having seen the object only if they responded “yes” to Q1 and in addition “were able to report at least one accurate detail” in response to Q2. In other words, either a subject says “no” (and is treated as unaware), or says “yes” and then is asked to prove their awareness, as it were. If anything, this intensifies the concerns we raise, by inflating IB rates.
The 5-item questionnaire looked like this:
(1) On the last trial, did you see anything other than the black and white L’s and T’s (anything that had not been present on the first two trials)?
(2) If you did see something on the last trial that had not been present during the first two trials, please describe it.
(3) If you did see something on the last trial that had not been present during the first two trials, what color was it? If you did not see something, please guess. (Please indicate whether you did see something or are guessing)
(4) If you did see something during the last trial that had not been present in the first two trials, please draw an arrow on the “screen” below showing the direction in which it was moving. If you did not see something, please guess. (Please indicate whether you did see something or are guessing)
(5) If you did see something during the last trial that had not been present during the first two trials, please circle the shape of the object below [4 shapes are presented to choose from]. If you did not see anything, please guess. (Please indicate whether you did see something or are guessing)
Q5 was not used for analysis purposes. (It suffers from the second issue raised above.) Q1 is the traditional y/n question. Qs 2&3 are open ended. It is unclear how responses to Q4 were analyzed (at the limit it could be considered a helpful, forced-choice question – though it again would suffer from the second issue raised above). However, as noted with respect to the 2-item questionnaire, these responses were not used to exclude people from the IB group but to include people in it. So again, this approach does not in any way address the issues we are concerned about, and if anything, only makes them worse.
(2) Drew et al. (2013) Psych Science
All follow ups were yes/no: “we asked a series of questions to determine whether they noticed the gorilla: ‘Did the final trial seem any different than any of the other trials?’, ‘Did you notice anything unusual on the final trial?’, and, finally, ‘Did you see a gorilla on the final trial?’”. So, this paper essentially implements the standard methodology we mention (and criticize).
(3) Drew et al. (2016) Journal of Vision
Follow up questions were used, but the reported procedure does not provide sufficient details to evaluate them (we are only told: “After the final trial, they were asked: ‘On that last trial of the task, did you notice anything that was not there on previous trials?’ They then answered questions about the features of the unexpected stimulus on a separate screen (color, shape, movement, and direction of movement).”). It is not clear that these follow ups were used to exclude any subjects from the analysis. Finally, given that the unexpected object could be the same color as the targets/distractors, it is clear that biases would have been introduced which would need to be considered (but which were not).
(4) Simons & Chabris (1999) Perception
All follow ups were yes/no: “observers were … asked to provide answers to a surprise series of additional questions. (i) While you were doing the counting, did you notice anything unusual on the video? (ii) Did you notice anything other than the six players? (iii) Did you see anyone else (besides the six players) appear on the video? (iv) Did you see a gorilla [woman carrying an umbrella] walk across the screen? After any ‘yes’ response, observers were asked to provide details of what they noticed. If at any point an observer mentioned the unexpected event, the remaining questions were skipped.” As noted previously, the analyses in fact did not use these questions to exclude subjects since answers were so consistent.
(5) Simons and Levin (1998) Perception
This is a change detection paradigm, not a study of inattentional blindness. And in any case, one yes/no follow up was used: “Did you notice that I'm not the same person who approached you to ask for directions?”
(6) Chabris et al. (2011) iPerception
Two yes/no questions were asked: “we asked whether the subjects had seen anything unusual along the route, and then whether they had seen anyone fighting.” It seems that follow up questions (a request to describe the fight) were asked only of those who said yes.
This is in fact a common procedure – follow up questions only being asked of the “yes” group. As discussed, it is sometimes used to increase rates of IB, compounding the problem we identify in our paper. So this is another example of a follow-up question that makes the problem we identify worse, not better.
(7) Ward & Scholl (2015) Psych Bulletin and Review
Two yes/no questions were used: “...observers were asked whether they noticed ‘anything … that was different from the first three trials’ — and if so, to describe what was different. They were then shown the gray cross and asked if they had noticed it—and if so, to describe where it was and how it moved. Only observers who explicitly reported not noticing the cross were counted as ‘nonnoticers’ to be included in the final sample (N = 100).” In each case, combining the traditional noticing question with a request to describe and identify may have induced conservative response biases in the noticing question, since a subject might consider being able to describe or identify the unexpected stimulus a precondition of giving a positive answer to the noticing question.
(8) Most et al. (2001) Psych Science
The same 5-item questionnaire discussed above in relation to Most et al. (2005) was used:
(1) On the last trial, did you see anything other than the black and white L’s and T’s (anything that had not been present on the first two trials)?
(2) If you did see something on the last trial that had not been present during the first two trials, please describe it.
(3) If you did see something on the last trial that had not been present during the first two trials, what color was it? If you did not see something, please guess. (Please indicate whether you did see something or are guessing)
(4) If you did see something during the last trial that had not been present in the first two trials, please draw an arrow on the “screen” below showing the direction in which it was moving. If you did not see something, please guess. (Please indicate whether you did see something or are guessing)
(5) If you did see something during the last trial that had not been present during the first two trials, please circle the shape of the object below [4 shapes are presented to choose from]. If you did not see anything, please guess. (Please indicate whether you did see something or are guessing)
Q5 was not used for analysis purposes. (It suffers from the second issue raised above.) Q1 is the traditional yes/no question. Qs 2&3 are open ended. It is unclear how responses to Q4 were analyzed (at the limit it could be considered a helpful, forced-choice question – though it again would suffer from the second issue raised above). However, as noted with respect to the 2-item questionnaire in Most et al. (2005), these responses were not used to exclude people from the IB group but to include people in it. So again, this approach does not in any way address the issues we are concerned about, and if anything, only makes them worse.
(9) Todd, Fougnie & Marois (2005) Psych Science
“participants were probed with three questions to determine whether they had detected the critical stimulus ... The first question assessed whether subjects had seen anything unusual during the trial; they responded ‘yes’ or ‘no’ by pressing the appropriate key on the keyboard. The second question asked participants to select which stimulus they might have seen among 12 possible objects and symbols selected from MacIntosh font databases. The third question asked participants to select the quadrant in which the critical stimulus may have appeared by pressing one of four keys, each of which corresponded to one of the quadrants.”
These follow ups were used to include people in the IB group: “In keeping with previous studies (Most et al., 2001), participants were considered to have detected the critical stimulus successfully if they (a) reported seeing an unexpected stimulus and (b) correctly selected its quadrant location.” In line with our third point about sensitivity, the object identity test transpired to be “too difficult even under full-attention conditions … Thus, performance with this question was not analyzed further.”
(10) Fougnie & Marois (2007) Psych Bulletin and Review
Exactly the same methods and problems as with Todd, Fougnie & Marois (2005) Psych Science, just discussed.
(11) New and German (2015) Evolution and Human Behaviour
“After the fourth trial containing the additional experimental stimulus, the participant was asked, “Did you see anything in addition to the cross on that trial?” and which quadrant the additional stimulus appeared in. They were then asked to identify the stimulus in an array which in Experiment 1 included two variants chosen randomly from the spider stimuli and the two needle stimuli. Participants in Experiment 2 picked from all eight stimuli used in that experiment.”
Our second concern about response biases and the need for appropriate SDT analysis of the 4- and 8-alternative tasks applies to all these questions. We also note that analyses were only performed on groups separately (those who detected/failed to detect, those who located/failed to locate, and those who identified/failed to identify) and on the group which did all three/failed to do any one of the three. Especially in light of the fact that some subjects could clearly detect the stimulus without being able to identify it, the most stringent test given our concerns (which were not obviously New and German’s comparative concerns) would be to consider the group which could not detect, identify or localize.
(12) Jackson-Nielsen (2017) Consciousness and Cognition
This is a very interesting example of a follow-up which used a 3-AFC recognition test:
“participants were immediately asked, ‘which display looks most like what you just saw?’ from 3 alternatives”. However, though such an objective test is definitely to be preferred in our view to an open-ended series of probes, the 3-AFC test administered clearly had issues with response biases, as discussed, and actually yielded significantly below chance performance in one of the experiments.
(13) Mack et al. (2016) Consciousness and Cognition
The follow ups here were essentially yes/no combined with an assessment of surprise. Participants were asked to enter letters into a box, and if they did so “were immediately asked by the experimenter whether they had noticed anything different about the array on this last trial and if they did not, they were told that there had been no letters and their responses to that news were recorded. Clearly, if they expressed surprise, this would be compelling evidence that they were unaware of the absence of the letters. Those observers who did not enter letters and realized there were no letters present were considered aware of the absence.” So, this again has all of the same problems we identify, considering subjects unaware because they expressed surprise.
(14) Devue et al. (2009) Perception
An 8-alternative task was used. The authors were primarily interested in a comparative analysis and so did not use this task to exclude subjects. We note that an 8-alternative task is very demanding – compare the 12-alternative task used in Todd, Fougnie & Marois (2005). There was an attempt to investigate biases in a separate bias trial; however, SDT measures were not used.
(15) Memmert (2014) Cognitive Development
“After watching the video and stating the number of passes, participants answered four questions (following Simons & Chabris, 1999): (1) While you were counting, did you perceive anything unusual on the video? (2) Did you perceive anything other than the six players? (3) Did you see anyone else (besides the six players) appear on the video? (4) Did you notice a gorilla walk across the screen? After any “yes” reply, children were asked to provide details of what they noticed. If at any point a child mentioned the unexpected event, the remaining questions were omitted.” All of these follow-up questions are yes/no judgments, used to determine awareness in exactly the way we critique as problematic.
(16) Moore & Egeth (1997) JEP:HPP
This study (which includes one of us, Egeth, as author) did use forced choice questions. In one case, the question was 2-alternative, in the other it was 4-alternative. In the latter case, SDT would have been appropriate but was not used. In the former case, it may have been that a larger sample would have revealed evidence of sensitivity to the background pattern (as it stood, 55% answered the 2-alternative question correctly). Although these results have been replicated, unfortunately the replication in Wood and Simons (2019) used a 6-alternative recognition task and this was not analyzed using SDT. We also note that the task is rather difficult in this study. Wood and Simons report: “Exclusion rates were much higher than anticipated, primarily due to exclusions when subjects failed to correctly report the pattern on the full-attention trial; we excluded 361 subjects, or 58% of our sample.”
(17) Cohen et al. (2020) Proc Natl Acad Sci
While this paper improves over a simple yes/no question in some ways, especially in that it used the follow up questions to exclude subjects from the unaware (IB) group, the follow up probes nonetheless remain yes/no questions, subject to response bias, e.g.:
(1) “Did you notice anything strange or different about that last trial?”
(2) “If I were to tell you that we did something odd on the last trial, would you have a guess as to what we did?”
(3) “If I were to tell you we did something different in the second half of the last trial, would you have a guess as to what we did?”
(4) “Did you notice anything different about the colors in the last scene?”
Follow up questions of this kind can be especially susceptible to bias, since subjects may be reluctant to “take back” their earlier answers and so be conservative in responding positively to avoid inconsistency or acknowledgement of earlier error. This may explain why such follow up questions can produce remarkable consistency despite their rather different wording.
(18) Cohen et al. (2011) Psych Science
Here are the probes used in this study:
(1) Did you notice anything different on that trial?
(2) Did you notice something different about the background stream of images?
(3) Did you notice that a different type of image was presented in the background that was unique in some particular way?
(4) Did you see an actual photograph of a natural scene in that stream?
(5) If I were to tell you that there was a photograph in that stream, can you tell me what it was a photograph of?
Qs 1-4 are yes/no. Q5 is yes/no with an open-ended response. After this, a 5- or 6-alternative recognition test was administered. So again, this faces the same issues, since yes/no questions are subject to bias in the way we have described, and many-alternative tests are more problematic than 2afc tests.
In summary
We really appreciate the care that went into compiling this list, and we agree that these papers and the improved methods they contain are relevant. But as hopefully made clear above, the approaches in each of these papers simply don’t solve the foundational issues our critique is aimed at (though they may address other issues). This is why we felt our new approach was necessary. And we continue to feel this way even after reading and incorporating these comments from Dr. Cohen.
Nevertheless, there is clearly lots for us to do in light of these comments. And so as noted earlier we have now added a very substantial new section to our discussion section to more fairly and completely portray the state of the art in this literature. This is really to our benefit in the end, since we now not only better acknowledge the diverse approaches present, but also set ourselves up to make our novel contribution exceedingly clear.
Main point 2: Let's imagine for a second that every study did just ask a yes/no question and then would stop. So, the criticism the authors are bringing up is valid (even though I believe it is not). I am not entirely sure that above chance performance on a forced choice task proves that the inattentionally blind can see after all. Could it just be a form of subliminal priming? Could there be a significant number of participants who basically would say something like, "No I did not see anything, and I feel like I am just guessing, but if you want me to say whether the thing was to the left or right, I will just 100% guess"? I know the literature on priming from things like change and inattentional blindness is a bit unclear, but this seems like maybe what is going on. In fact, maybe the authors are getting some of the best priming from inattentional blindness because of their large sample size, which previous studies do not use.
I'm curious how the authors would relate their studies to masked priming. In masked priming studies, observers say they did not see the target (like in this study) but still are above chance when forced to guess (like in this study). Do the researchers here think that that is evidence of "masked stimuli are truly seen" even if a participant openly says they are guessing?
We’re grateful to the reviewer for raising this question. As we say in response to Reviewer #1, our primary ambition in the paper is to establish, as our title suggests, residual sensitivity in IB. The ambition is quite neutral as to whether the sensitivity reflects conscious or unconscious processing (i.e. is akin to blindsight as traditionally conceived, or what the reviewer here suggests may be happening in masked priming). Since we were evidently insufficiently clear about this we have revised our manuscript in several places to clarify that we take our data primarily to support the more modest claim that there is residual sensitivity (conscious or unconscious) in the group of subjects who are traditionally classified as inattentionally blind. We believe that this claim has much more solid support in our data than our secondary and tentative suggestion about awareness.
This said, we do consider masked priming studies to be susceptible to the critique that performance may reflect degraded conscious awareness which is unreported because of conservative response criteria. There is good evidence that response criteria tend to be conservative near threshold (Björkman et al. 1993; see also: Railo et al. 2020), including specifically in masked priming studies (Sand 2016, cited in Phillips 2021). So, we consider it a perfectly reasonable hypothesis that subjects who say they feel they are guessing in fact have conscious access to a degraded signal which is insufficient to reach a conservative response criterion but nonetheless sufficient to perform above chance in 2afc detection. Of course, we appreciate that this hypothesis is controversial, so it is not one we argue for in our paper (though we are happy to share our feelings about it here).
Main point 3: My last question is about how the authors interpret a variety of inattentional blindness findings. Previous work has found that observers fail to notice a gorilla in a CT scan (Drew et al., 2013), a fight occurring right in front of them (Chabris et al., 2011), a plane on a runway that pilots crash into (Haines, 1991), and so forth. In a situation like this, do the authors believe that many participants are truly aware of these items but simply failed to answer a yes/no question correctly? For example, imagine the researchers made participants choose if the gorilla was in the left or right lung and some participants who initially said they did not notice the gorilla were still able to correctly say if it was in the left or right lung. Would the authors claim "that participant actually did see the gorilla in the lung"? I ask because it is difficult to understand what it means to be aware of something as salient as a gorilla in a CT scan, but say "no" you didn't notice it when asked a yes/no question. What does it mean to be aware of such important, ecologically relevant stimuli, but not act in response to them and openly say "no" you did not notice them?
Our view is that in such cases, observers may well have a “degraded” percept of the relevant feature (gorilla, plane, fight etc.). But crucially we do not suggest that this percept is sufficient for observers to recognize the object/event as a gorilla, plane, fight etc. Our claim is only that, in our studies at least, observers (as a group) do have enough information about the unexpected stimuli to locate them, and discriminate certain low level features better than chance. Crudely, it may be that subjects see the gorilla simply as a smudge or the plane as a shadowy patch etc. (One of us who is familiar with the gorilla CT scan stimuli notes that the gorilla is in fact rather hard to see even when you know which slide it is on, suggesting that it is not as “salient” as the reviewer suggests!)
More precisely, in the paper we write that in our view perhaps “...unattended stimuli are encoded in a partial or degraded way. Here we see a variety of promising options for future work to investigate. One is that unattended stimuli are only encoded as part of ensemble representations or summary scene statistics (Rosenholtz, 2011; Cohen et al., 2016). Another is that only certain basic “low-level” or “preattentive” features (see Wolfe & Utochkin, 2019 for discussion) can enter awareness without attention. A final possibility consistent with the present data is that observers can in principle be aware of individual objects and higher-level features under inattention but that the precision of the corresponding representations is severely reduced. Our central aim here is to provide evidence that awareness in inattentional blindness is not abolished. Further work is needed to characterize the exact nature of that awareness.” We hope this sheds light on our perspective while still being appropriately cautious not to go too far beyond our data.
Overall: I believe there are many aspects of this set of studies that are innovative and I hope the methods will be used more broadly in the literature. However, I believe the authors misrepresent the field and overstate what can be interpreted from their results. While I am sure there are cases where more nuanced questions might reveal inattentional blindness is somewhat overestimated, claims like "the inattentionally blind can see after all" or "Inattentionally blind subjects consciously perceive these stimuli after all" seem to be incorrect (or at least not at all proven by this data).
Once again, we would like to thank this reviewer for his feedback, which obviously comes from a place of tremendous expertise on these issues. We appreciate his assessment that our studies are innovative and that our methodological advances will be of use more broadly. We also hear the reviewer loud and clear about the passages in question, which on reflection we agree are not as central to our case as the other claims we make (regarding residual sensitivity and conservative responding), and so we have now edited them accordingly to refocus our discussion on only those claims that are central and supported. Thank you for making our paper stronger!
Reviewer #3 (Public review):
Summary:
Authors try to challenge the mainstream scientific as well as popularly held view that Inattentional Blindness (IB) signifies subjects having no conscious awareness of what they report not seeing (after being exposed to unexpected stimuli). They show that even when subjects indicate NOT having seen the unexpected stimulus, they are at above chance level for reporting features such as location, color or movement of these stimuli. Also, they show that 'not seen' responses are in part due to a conservative bias of subjects, i.e. they tend to say no more than yes, regardless of actual visibility. Their conclusion is that IB may not (always) be blindness, but possibly amnesia, uncertainty etc.
We just thought to say that we felt this was a very accurate summary of our claims, and in some ways underscores the modesty we had hoped to convey. This is especially true of the reviewer’s final sentence: “Their conclusion is that IB may not (always) be blindness, but possibly amnesia, uncertainty etc.”; as we noted in response to other reviewers, our claim is not that IB doesn’t exist, that subjects are always conscious of the stimulus, etc.; it is only that the cohort of IB subjects show sensitivity to the unattended stimulus in ways that suggest they are not as blind as traditionally conceived. Thank you for reading us as intended!
Strengths:
A huge pool of (25.000) subjects is used. They perform several versions of the IB experiments, both with briefly presented stimuli (as the classic Mack and Rock paradigm), as well as with prolonged stimuli moving over the screen for 5 seconds (a bit like the famous gorilla version), and all these versions show similar results, pointing in the same direction: above chance detection of unseen features, as well as conservative bias towards saying not seen.
We’re delighted that the reviewer appreciated these strengths in our manuscript!
Weaknesses:
Results are all significant but effects are not very strong, typically a bit above chance. Also, it is unclear what to compare these effects to, as there are no control experiments showing what performance would have been in a dual task version where subjects have to also report features etc for stimuli that they know will appear in some trials
The backdrop to the experiments reported here is the “consensus view” (Noah & Mangun, 2020) according to which inattention completely abolishes perception, such that subjects undergoing IB “have no awareness at all of the stimulus object” (Rock et al., 1992) and that “one can have one’s eyes focused on an object or event … without seeing it at all” (Carruthers, 2015). In this context, we think our findings of significant above-chance sensitivity (e.g., d′ = 0.51 for location in Experiment 1; chance, of course, would be d′ = 0 here) are striking and constitute strong evidence against the consensus view. We of course agree that the residual sensitivity is far lower than amongst subjects who noticed the stimulus. For this reason, we certainly believe that inattention has a dramatic impact on perception. To that extent, our data speak in favor of a “middle ground” view on which inattention substantially degrades but crucially does not abolish perception/explicit encoding. We see this as an importantly neglected option in a literature which has overly focused on seen/not seen binaries (see our section ‘Visual awareness as graded’).
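For readers less familiar with signal detection measures, the following minimal sketch shows how sensitivity (d′) and response criterion (c) are conventionally computed from hit and false-alarm rates in a yes/no task. It uses the textbook equal-variance Gaussian formulas. The function name and the example rates are illustrative assumptions, not values or code from the manuscript, and the manuscript's own estimator (for instance, for the 2AFC questions) may differ.

```python
from statistics import NormalDist


def dprime_and_criterion(hit_rate, fa_rate):
    """Return (d', c) under the equal-variance Gaussian model.

    d' = z(H) - z(F) measures sensitivity; d' = 0 is chance.
    c = -(z(H) + z(F)) / 2 measures bias; c > 0 means a conservative
    tendency to answer "no" regardless of the stimulus.
    Rates must lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit, z_fa = z(hit_rate), z(fa_rate)
    return z_hit - z_fa, -0.5 * (z_hit + z_fa)


if __name__ == "__main__":
    # Hypothetical rates chosen only to illustrate a conservative observer.
    d, c = dprime_and_criterion(0.3, 0.1)
    print(f"d' = {d:.2f}, c = {c:.2f}")
```

An observer with equal hit and false-alarm rates has d′ = 0, which is the chance baseline referred to above; a positive c alongside a positive d′ corresponds to the combination of residual sensitivity and conservative responding.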
Regarding the absence of a control condition, we think those conditions wouldn’t have played the same role in our experiments as they typically play in other experiments. As Reviewer #1 comments, the main role of such trials in previous work has been to exclude from analysis subjects who failed to report the unexpected stimulus on the divided and/or full attention control trials. As Reviewer #1 points out, excluding such subjects would very likely have ‘helped’ us. However, the practice is controversial. Indeed, in a review of 128 experiments, White et al. 2018 argue that the practice has “problematic consequences” and “may lead researchers to understate the pervasiveness of inattentional blindness". Since we wanted to offer as simple and demanding a test of residual sensitivity in IB as possible, we thus decided not to use any exclusions, and for that reason decided not to include divided/full attention trials.
As recommended, we discuss this decision not to include divided/full attention trials and our logic for not doing so in the manuscript. As we explain, not having those conditions makes it more impressive, not less impressive, that we observed the results we in fact did — it makes our results more interpretable, not less interpretable, and so absence of such conditions from our manuscript should not (in our view) be considered any kind of weakness.
There are quite some studies showing that during IB, neural processing of visual stimuli continues up to high visual levels, for example, Vandenbroucke et al 2014 doi:10.1162/jocn_a_00530 showed preserved processing of perceptual inference (i.e. seeing a kanizsa illusion) during IB. Scholte et al 2006 doi: 10.1016/j.brainres.2005.10.051 showed preserved scene segmentation signals during IB. Compared to the strength of these neural signatures, the reported effects may be considered not all that surprising, or even weak.
We agree that such evidence of neural processing in IB is relevant to — and perhaps indeed consistent with — our picture, and we’re grateful to the reviewer for pointing out further studies along those lines. Previously, we mentioned a study from Pitts et al., 2012 in which, as we wrote, “unexpected line patterns have been found to elicit the same Nd1 ERP component in both noticers and inattentionally blind subjects (Pitts et al., 2012).” We have added references to both the studies which the reviewer mentions – as well as an additional relevant study – to our manuscript in this context. Thank you for the helpful addition.
We do however think that our studies are importantly different to this previous work. Our question is whether processing under IB yields representations which are available for explicit report and so would constitute clear evidence of seeing, and perhaps even conscious experience. As we discuss, evidence for this kind of processing remains wanting: “A handful of prior studies have explored the possibility that inattentionally blind subjects may retain some visual sensitivity to features of IB stimuli (e.g., Schnuerch et al., 2016; see also Kreitz et al., 2020, Nobre et al., 2020). However, a recent meta-analysis of this literature (Nobre et al., 2022) argues that such work is problematic along a number of dimensions, including underpowered samples and evidence of publication bias that, when corrected for, eliminates effects revealed by earlier approaches, concluding “that more evidence, particularly from well-powered pre-registered experiments, is needed before solid conclusions can be drawn regarding implicit processing during inattentional blindness” (Nobre et al., 2022).” Our paper is aimed at addressing this question which evidence of neural processing can only speak to indirectly.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
(1) Please report all of the data, especially the number of subjects in each experiment that answered Y/N and the numbers of subjects in each of the Y and N groups that guessed a feature correctly/incorrectly on the 2AFC tasks. And also the confidence ratings for the 2AFC task (for comparison with the confidence ratings on the Y/N questions).
We now report all this data in our (revised) Supplementary Materials. We agree that this information will be helpful to readers.
(2) Consider adding a control condition with partial attention (dual task) or full attention (single task) to estimate the rates of seeing the critical stimulus when it's expected.
This is the only recommendation we have chosen not to implement. The reason, as we explain in detail above (especially in response to Reviewer #1 comment 5), is that this would not in fact be a “control condition” in our studies, and indeed would only inflate the biases we are concerned with in our work. As the referee comments, the main role of such trials in previous work has been to exclude from analysis subjects who failed to report the unexpected stimulus on the divided and/or full attention control trials. And the practice is controversial: Indeed, in a review of 128 experiments, White et al. 2018 argue that the practice has “problematic consequences” and “may lead researchers to understate the pervasiveness of inattentional blindness" (emphasis added). So, our choice not to have such conditions ensures an especially stringent test of our central claim. Not having those conditions (and their accompanying exclusions) makes our results more interpretable, not less interpretable, and so the absence of such conditions from our manuscript should not (in our view) be considered any kind of weakness.
We have added a paragraph to our “Design and analytical approach” section explaining the logic behind our deliberate decision not to include divided or full attention trials in our experiments. (For even fuller discussion, see our response to Reviewer #1’s comment 5 above.)
(3) Consider revising the interpretations to be more precise about the distinction between the super subject being above chance versus each individual subject who cannot be at chance or above chance because there was only a single trial per subject.
We have now done this throughout the manuscript, as discussed above. We have also added a substantive additional discussion to our “Design and analytical approach” section discussing what should be said about individual subjects in light of our group level data.
This was a very helpful point, and greatly clarifies the claims we wish to make in the paper. Thank you for this comment, which has certainly made our paper stronger.
Reviewer #2 (Recommendations for the authors):
I would be curious to hear the authors' response to two points:
(1) What do they have to say about prior studies that do more than just ask yes/no questions (and ask several follow-ups)? Are those studies "valid"?
A very substantial new discussion of this important point has been added. As you will see above, we comment on every one of the 18 papers this reviewer raised (as well as the general argument made); we contend that while many of these papers improve on past methodology in various ways, most in fact do “just ask yes/no questions”, and none of them makes the methodological advance we offer in our manuscript. However, this discussion has helped us clarify that very advance, and so working through this issue has really helped us improve our paper and make its relation to existing literature that much clearer. Thank you for raising this crucial point.
(2) Do the authors think it is possible that in many cases, people are just guessing about a critical item's location or color and this is at least in part a form of priming?
We have clarified our discussion in numerous places to further emphasize that our main point concerns above-chance sensitivity, not awareness. Given this, we take very seriously the hypothesis that something like priming of a kind sometimes proposed to occur in cases of blindsight or other putative cases of unconscious perception could be what is driving the responses in non-noticers.
Reviewer #3 (Recommendations for the authors):
(1) Control dual task version with expected stimuli would be nice
We have added a paragraph to our “Design and analytical approach” section explaining the logic behind our deliberate decision not to include divided or full attention trials, which would not in fact be a “control” task in our experiments. For full discussion, see our response to Reviewer 3 above, as well as our summary here in the Recommendations for Authors section in responding to Reviewer 1, recommendation (2).
(2) Please do a better job in discussing and introducing experiments about neural signatures during IB.
A discussion of Vandenbroucke et al. 2014 and Scholte et al. 2006 has been added to our discussion of neural signatures in IB, as well as an additional reference to an important early study of semantic processing in IB (Rees et al., 1999). Thank you for these very helpful suggestions!
-
-
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer 1:
(1) The notion of a “root” causal gene - which the authors define based on a graph theoretic notion of topologically sorting graphs - requires a graph that is directed and acyclic. It is the latter that constitutes an important weakness here - it simply is a large simplification of human biology to draw out a DAG including hundreds of genes and a phenotype Y and to claim that the true graph contains no cycles.
We agree that real causal graphs in biology often contain cycles. We now include additional experimental results with cyclic directed graphs in the Supplementary Materials. RCSP outperformed the other algorithms even in this setting, but we caution the reader that the theoretical interpretation of the RCS score may not coincide with a root causal effect when cycles exist:
“We also evaluated the algorithms on directed graphs with cycles. We generated a linear SEM over p + 1 = 1000 variables in
. We sampled the coefficient matrix β from a Bernoulli (1/(p − 1)) distribution but did not restrict the non-zero coefficients to the upper triangular portion of the matrix. We then proceeded to permute the variable ordering and weight each entry as in the Methods for the DAG. We repeated this procedure 30 times and report the results in Supplementary Figure 3.
RCSP again outperformed all other algorithms even in the cyclic case. The results suggest that conditioning on the surrogate ancestors also estimates the RCS well even in the cyclic case. However, we caution that an error term E<sub>i</sub> can affect the ancestors of
when cycles exist. As a result, the RCS may not isolate the causal effect of the error term and thus not truly coincide with the notion of a root causal effect in cyclic causal graphs.”
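The difference between the acyclic and cyclic simulations comes down to whether non-zero coefficients are confined to the upper triangle of the coefficient matrix. The sketch below, offered only as an illustration, samples the 0/1 support of such a matrix with a given edge probability and checks for directed cycles with Kahn's algorithm. The function names, the exclusion of self-loops, and the omission of edge weights and of the variable permutation are assumptions made for brevity; this is not the authors' simulation code.

```python
import random
from collections import deque


def sample_support(n_vars, edge_prob, restrict_upper, rng):
    """Sample a 0/1 matrix B where B[i][j] == 1 encodes an edge i -> j.

    With restrict_upper=True only entries with i < j can be non-zero,
    which guarantees a DAG. With restrict_upper=False any off-diagonal
    entry can be non-zero, so cycles are possible.
    """
    B = [[0] * n_vars for _ in range(n_vars)]
    for i in range(n_vars):
        for j in range(n_vars):
            if i == j:
                continue  # assumption: no self-loops
            if restrict_upper and j < i:
                continue
            if rng.random() < edge_prob:
                B[i][j] = 1
    return B


def has_cycle(B):
    """Kahn's algorithm: a directed graph has a cycle exactly when
    some vertex can never be removed as a zero-in-degree vertex."""
    n = len(B)
    indegree = [sum(B[i][j] for i in range(n)) for j in range(n)]
    queue = deque(v for v in range(n) if indegree[v] == 0)
    removed = 0
    while queue:
        v = queue.popleft()
        removed += 1
        for j in range(n):
            if B[v][j]:
                indegree[j] -= 1
                if indegree[j] == 0:
                    queue.append(j)
    return removed < n


if __name__ == "__main__":
    p = 199  # smaller than in the response, to keep the demo fast
    rng = random.Random(0)
    dag = sample_support(p + 1, 1 / (p - 1), True, rng)
    free = sample_support(p + 1, 1 / (p - 1), False, rng)
    print("upper-triangular support has a cycle:", has_cycle(dag))
    print("unrestricted support has a cycle:", has_cycle(free))
```

The upper-triangular support is always acyclic, whereas the unrestricted support may or may not contain a cycle in any single draw.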
(2) I also encourage the authors to consider more carefully when graph structure learned from Perturb-seq can be ported over to bulk RNA-seq. Presumably this structure is not exactly correct - to what extent is the RCSP algorithm sensitive to false edges in this graph? This leap - from cell line to primary human cells - is also not modeled in the simulation. Although challenging - it would be ideal for the RCSP to model or reflect the challenges in correctly identifying the regulatory structure.
We now include additional experimental results, where we gradually increased the incongruence between the DAG modeling the Perturb-seq and the DAG modeling the bulk RNA-seq using a mixture of graphs. The performance of RCSP degraded gradually, rather than abruptly, with increasing incongruence. We therefore conclude that RCSP is robust to differences between the causal graphs representing Perturb-seq and bulk RNA-seq:
“We next assessed the performance of RCSP when the DAG underlying the Perturb-seq data differs from the DAG underlying the bulk RNA-seq data. We considered a mixture of two random DAGs in bulk RNA-seq, where one of the DAGs coincided with the Perturb-seq DAG and the second, alternate DAG did not. We instantiated and simulated samples from each DAG as per the previous subsection. We generated 0%, 25%, 50%, 75%, and 100% of the bulk RNA-seq samples from the alternate DAG, and the rest from the Perturb-seq DAG. We ideally would like to see the performance of RCSP degrade gracefully, as opposed to abruptly, as the percent of samples derived from the alternate DAG increases.
We summarize results in Supplementary Figure 4. As expected, RCSP performed the best when we drew all samples from the same underlying DAG for Perturb-seq and bulk RNA-seq. However, the performance of RCSP also degraded slowly as the percent of samples increased from the alternate DAG. We conclude that RCSP can accommodate some differences between the underlying DAGs in Perturb-seq and bulk RNA-seq with only a mild degradation in performance.”
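The mixture design described above can be sketched as follows. The example splits a fixed number of bulk samples between an alternate source and the Perturb-seq-congruent source according to a mixing fraction. The two sampler callables stand in for "simulate one sample from a DAG" and are hypothetical placeholders, as are the function names; the sample size of 200 and the five fractions are taken from the response.

```python
import random


def mixture_counts(n_samples, alt_fraction):
    """Return (n_alt, n_main): how many samples come from each source."""
    n_alt = round(n_samples * alt_fraction)
    return n_alt, n_samples - n_alt


def draw_mixture(n_samples, alt_fraction, draw_main, draw_alt, rng):
    """Draw a shuffled sample set from a two-component mixture.

    draw_main and draw_alt are callables taking the random generator
    and returning one sample each (placeholders for DAG simulators).
    """
    n_alt, n_main = mixture_counts(n_samples, alt_fraction)
    samples = [draw_alt(rng) for _ in range(n_alt)]
    samples += [draw_main(rng) for _ in range(n_main)]
    rng.shuffle(samples)  # remove any ordering by source
    return samples


if __name__ == "__main__":
    for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
        n_alt, n_main = mixture_counts(200, fraction)
        print(f"{fraction:.0%} alternate: {n_alt} alternate, {n_main} congruent")
```

Sweeping the fraction in this way is what allows performance to be plotted against the degree of incongruence rather than compared at a single setting.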
(3) It should also be noted that in most Perturb-seq experiments, the entire genome is not perturbed, and frequently important TFs (that presumably are very far “upstream” and thus candidate “root” causal genes) are not expressed highly enough to be detected with scRNA-seq. In that context - perhaps slightly modifying the language regarding RCSP’s capabilities might be helpful for the manuscript - perhaps it would be better to describe it as an algorithm for causal discovery among a set of genes that were perturbed and measured, rather than a truly complete search for causal factors. Perhaps more broadly it would also benefit the manuscript to devote slightly more text to describing the kinds of scenarios where RCSP (and similar ideas) would be most appropriately applied - perhaps a well-powered, phenotype annotated Perturb-seq dataset performed in a disease relevant primary cell.
We now clarify that Perturb-seq can only identify root causal genes among the perturbed set of genes in the Discussion:
“Modern genome-wide Perturb-seq datasets also adequately perturb and measure only a few thousand, rather than all, gene expression levels. RCSP can only identify root causal genes within this perturbed and measured subset.”
We now also describe the scenario where RCSP can identify root causal genes well in the Introduction:
“Experiments demonstrate marked improvements in performance, when investigators have access to a large bulk RNA-seq dataset and a genome-wide Perturb-seq dataset from a cell line of a disease-relevant tissue.”
Reviewer 2:
(1) The process from health-to-disease is not linear most of the time with many checks along the way that aim to prevent the disease phenotype. This leads to a non-deterministic nature of the path from health-to-disease. In other words, with the same root gene perturbations, and depending on other factors outside of gene expression, someone may develop a phenotype in a year, another in 10 years and someone else never. Claiming that this information is included in the error terms might not be sufficient to address this issue. The authors should discuss this limitation.
The proposed approach accommodates the above non-deterministic nature. The error terms of
model factors that are outside of gene expression. We model the relation from gene expression to Y as probabilistic rather than deterministic because
, where E<sub>Y</sub> introduces stochasticity. Thus, two individuals with the same instantiations of the root causes may develop disease differently. We now clarify this in Methods:
“The error terms model root causes that are outside of gene expression, such as genetic variation or environmental factors. Moreover, the relation from gene expression to Y is stochastic because
, where E<sub>Y</sub> introduces the stochasticity. Two individuals may therefore have the exact same error term values over
but different instantiations of Y.”
(2) The paper assumes that the network connectivity will remain the same after perturbation. This is not always true due to backup mechanisms in the cells. For example, suppose that a cell wants to create product P and it can do it through two alternative paths: Path #1: A → B → P, Path #2: A → C → P. Now suppose that path #1 is more efficient, so when B can be produced, path #2 is inactive. Once the perturbation blocks element B from being produced, the graph connectivity changes by activation of path #2. I did not see the authors taking this into consideration, which seems to be a major limitation in using Perturb-seq results to infer connectivities.
We agree that backup mechanisms can exist and therefore now include additional experimental results, where we gradually increased the incongruence between the DAG modeling the Perturb-seq and the DAG modeling the bulk RNA-seq using a mixture of graphs. The performance of RCSP degraded gradually, rather than abruptly, with increasing incongruence. We therefore conclude that RCSP is robust to differences between the causal graphs representing Perturb-seq and bulk RNA-seq:
“We next assessed the performance of RCSP when the DAG underlying the Perturb-seq data differs from the DAG underlying the bulk RNA-seq data. We considered a mixture of two random DAGs in bulk RNA-seq, where one of the DAGs coincided with the Perturb-seq DAG and the second, alternate DAG did not. We generated 0%, 25%, 50%, 75%, and 100% of the bulk RNA-seq samples from the alternate DAG, and the rest from the Perturb-seq DAG. We ideally would like to see the performance of RCSP degrade gracefully, as opposed to abruptly, as the percent of samples derived from the alternate DAG increases.
We summarize results in Supplementary Figure 4. As expected, RCSP performed the best when we drew all samples from the same underlying DAG for Perturb-seq and bulk RNA-seq. However, the performance of RCSP also degraded slowly as the percent of samples increased from the alternate DAG. We conclude that RCSP can accommodate some differences between the underlying DAGs in Perturb-seq and bulk RNA-seq with only a mild degradation in performance.”
(3) There is substantial system heterogeneity that may cause the same phenotype. This goes beyond the authors claim that although the initial gene causes of a disease may differ from person to person, at some point they will all converge to changes in the same set of “root genes.” This is not true for many diseases, which are defined based on symptoms and lab tests at the patient level. You may have two completely different molecular pathologies that lead to the development of the same symptoms and test results. Breast cancer with its subtypes is a prime example of that. In theory, this issue could be addressed if there is infinite sample size. However, this assumption is largely violated in all existing biological datasets.
The proposed method accommodates the above heterogeneity. We do not assume that the root causes affect the same set of root causal genes. Instead the root causes and root causal genes may vary from person to person. We write in the Introduction:
“The problem is further complicated by the existence of complex disease, where a patient may have multiple root causal genes that differ from other patients even within the same diagnostic category... We thus also seek to identify patient-specific root causal genes in order to classify patients into meaningful biological subgroups each hopefully dictated by only a small group of genes.”
The root causal genes may further affect different downstream genes at the patient-specific level. However root causal genes tend to have many downstream effects so that virtually every gene expression level becomes correlated with Y. We now clarify this by describing the omnigenic root causal model in the Introduction as follows:
“Finally, application of the algorithm to two complex diseases with disparate pathogeneses recovers an omnigenic root causal model, where a small set of root causal genes drive pathogenesis but impact many downstream genes within each patient. As a result, nearly all gene expression levels are correlated with the diagnosis at the population level.”
(4) Were the values of the synthetic variables Z-scored?
Yes, all variables were z-scored. We now clarify this in Methods:
“We also standardized all variables before running the regressions to prevent gaming of the marginal variances in causal discovery (Reisach et al., 2021; Ng et al., 2024).”
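Standardization here means rescaling each variable to zero mean and unit variance, so that differences in marginal variance carry no information about causal order. A minimal sketch is given below; the function name is illustrative, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the response does not specify which was used.

```python
from statistics import mean, pstdev


def zscore(values):
    """Rescale a list of numbers to zero mean and unit (population) SD."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - mu) / sigma for v in values]


if __name__ == "__main__":
    # Hypothetical expression values for one gene across four samples.
    print(zscore([1.0, 2.0, 3.0, 4.0]))
```

Applied to every variable before regression, this puts all variables on the same scale regardless of where they sit in the graph.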
(5) The algorithm seems to require both RNA-seq and Perturb-seq data (Algorithm 1, page 14). Can it function with RNA-seq data only? What will be different in this case?
The algorithm cannot function with observational bulk RNA-seq data only. We included Perturb-seq because causal discovery with observational RNA-seq data alone tends to be inaccurate and unstable, as highlighted by the results of CausalCell. We further emphasize that we do not rely on d-separation faithfulness in Methods, which is typically required for causal discovery from observational data alone:
“We can also claim the backward direction under d-separation faithfulness. We however avoid making this additional assumption because real biological data may not arise from distributions obeying d-separation faithfulness in practice.”
(6) Synthetic data generation: how many different graphs (SEMs) did they start from? (30?) How many samples per graph? Did they test different sample sizes?
We now clarify that we generate 30 random SEMs, each associated with a DAG. We used 200 samples for the bulk RNA-seq to mimic a relatively large but common sample size. We also drew 200 samples for each perturbation or control in the Perturb-seq data. We did not consider multiple sample sizes due to the time required to complete each run. Instead, we focused on a typical scenario where investigators would apply RCSP. We now write the following in the Methods:
“We drew 200 samples for the bulk RNA-seq data to mimic a large but common dataset size. We introduced knockdown perturbations in Perturb-seq by subtracting an offset of two in the softplus function:
. We finally drew 200 samples for the control and each perturbation condition to generate the Perturb-seq data. We repeated the above procedure 30 times.” We also include the following in Results:
“We obtained 200 cell samples from each perturbation, and another 200 controls without perturbations. We therefore generated a total of 2501 × 200 = 500,200 single cell samples for each Perturb-seq dataset. We simulated 200 bulk RNA-seq samples.”
(7) The presentation of comparative results (Supplementary Figures 4 and 7) is not clear. No details are given on how these results were generated. (what does it mean “The first column denotes the standard deviation of the outputs for each algorithm?”) Why all other methods have higher SD differences than RCSP? Is it a matter of scaling? Shouldn’t they have at least some values near zero since the authors “added the minimum value so that all histograms begin at zero?”
Each of these supplementary figures contains a 6 by 3 table of figures. By the first column, we mean column one (with rows 1 through 6) of each figure. The D-RCS and D-SD scores represent standard deviations of the RCS and SD scores from zero of each gene, respectively. We can similarly compute the standard deviation of the outputs of the algorithms. We now clarify this in the Supplementary Materials:
“The figure contains 6 rows and 3 columns. Similar to the D-RCS, we can compute the standard deviation of the output of each algorithm from zero for each gene. The first column in Supplementary Figure 7 denotes the histograms of these standard deviations across the genes.”
Many histograms do not appear to start at zero because the bars are too small to be visible. We now clarify this in the Supplementary Materials as well:
“Note that the bars at zero are not visible for many algorithms, since only a few genes attained standard deviations near the minimum.”
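One reading of "standard deviation from zero" is the root mean square of a gene's scores across samples, that is, the spread measured around zero rather than around the sample mean. The sketch below illustrates that reading only; the function name and the toy scores are assumptions, and the exact computation used for the supplementary figures is not given in the response.

```python
import math


def sd_from_zero(scores):
    """Root mean square of the scores: spread measured around zero."""
    return math.sqrt(sum(s * s for s in scores) / len(scores))


if __name__ == "__main__":
    # Hypothetical per-sample scores for two genes.
    per_gene_scores = {
        "gene_a": [0.0, 0.1, -0.1, 0.0],  # scores concentrated near zero
        "gene_b": [3.0, -3.0, 3.0, -3.0],  # consistently large scores
    }
    for gene, scores in per_gene_scores.items():
        print(gene, round(sd_from_zero(scores), 3))
```

Under this reading, a histogram of the per-gene values with most mass near zero and a long right tail indicates that only a few genes receive large scores.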
(8) Why RCSP results are more like a negative binomial distribution and every other is kind of normal?
All other methods have higher standard deviations than RCSP because they fail to compute an accurate measure of the root causal effect. Recall that, just like a machine has a few root causal problems, only a few root causal genes have large root causal effects under the omnigenic root causal model. The results of RCSP look more like a negative binomial distribution because most RCS scores are concentrated around zero and only a few RCS scores are large – consistent with the omnigenic root causal model. The other algorithms fail to properly control for the upstream genes and thus attain large standard deviations for nearly all genes. We now clarify these points in the Supplementary Materials as follows:
“If an algorithm accurately identifies root causal genes, then it should only identify a few genes with large conditional root causal effects under the omnigenic root causal model. The RCSP algorithm had a histogram with large probability mass centered around zero with a long tail to the right. The standard deviations of the outputs of the other algorithms attained large values for nearly all genes. Incorporating feature selection and causal discovery with CausalCell introduced more outliers in the histogram of ANM. We conclude that only RCSP detected an omnigenic root causal model.”
(9) What is the significance of genes changing expression “from left to right” in a UMAP plot? (e.g., Fig. 3h and 3g)
The first UMAP dimension captured the variability of the RCS scores for most root causal genes. As a result, we could focus our analysis on the black cluster in Figure 3 (g) with large RCS scores in the subsequent pathway enrichment analysis summarized in Figure 3 (j). If two dimensions were involved, then we would need to analyze at least two clusters (e.g., black and pink), but this was not the case. We now clarify this in Results:
“The RCS scores of most of the top genes exhibited a clear gradation increasing only from the left to the right hand side of the UMAP embedding; we plot an example in Figure 3 (h). We found three exceptions to this rule among the top 30 genes (example in Figure 3 (i) and see Supplementary Materials). RCSP thus detected genes with large RCS scores primarily in the black cluster of Figure 3 (g). Pathway enrichment analysis within this cluster alone yielded supra-significant results on the same pathway detected in the global analysis...”
(10) The authors somewhat overstate the novelty of their algorithm. Representation of GRNs as causal graphs dates back to 2000 with the work of Nir Friedman in yeast. Other methods were developed more recently that look at regulatory network changes at the single sample level, of which the authors do not seem to be aware (e.g., Ellington et al, NeurIPS 2023 workshop GenBio and Buschur et al, 2019, Bioinformatics are two such examples). The methods they mention are for single cell data and they are not designed to connect single sample-level changes to a person’s phenotype. The RCS method needs to be put in the right background context in order to bring up what is really novel about it.
We agree that many methods already exist for uncovering associational, predictive (Markov, neighborhood) and causal gene regulatory networks. We now cite the above papers. However, the novelty in our manuscript is not causal graph discovery, but rather estimation of root causal effects, detection of root causal genes, and the proposal of the omnigenic root causal model. We now clarify this in the Introduction:
“Many algorithms focus on discovering associational or predictive relations, sometimes visually represented as gene regulatory networks (Costa et al., 2017; Ellington et al., 2023). Other methods even identify causal relations (Friedman et al., 2000; Wang et al., 2023; Wen et al., 2000; Buschur et al., 2000), but none pinpoint the first gene expression levels that ultimately generate the vast majority of pathogenesis. Simply learning a causal graph does not resolve the issue because causal graphs do not summarize the effects of unobserved root causes, such as unmeasured environmental changes or variants, that are needed to identify all root causal genes. We therefore define the Root Causal Strength (RCS) score...”
Reviewer 3:
(1) Several assumptions of the method are problematic. The most concerning is that the observational expression changes are all causally upstream of disease. There is work using Mendelian randomization (MR) showing that the opposite is more likely to be true: most differential expression in disease cohorts is a consequence rather than a cause of disease (Porcu et al., 2021). Indeed, the oxidative stress of AMD has known cellular responses including the upregulation of p53. The authors need to think carefully about how this impacts their framework. Can the theory say anything in this light? Simulations could also be designed to address robustness.
Strictly speaking, we believe that differential expression in disease most likely has a cyclic causal structure: gene expression causes a diagnosis or symptom severity, and a diagnosis or symptom severity leads to treatments and other behavioral changes that perturb gene expression. For example, revTWMR in Porcu et al. (2021) uses trans-variants that are less likely to directly cause gene expression and instead directly cause a phenotype. However, TWMR as proposed in Porcu et al. (2019) instead uses cis-eQTLs and finds many putative causal relations from gene expression to phenotype. Thus, both causal directions likely hold.
RCSP uses disease-relevant tissue believed to harbor gene expression levels that cause disease. However, RCSP theoretically cannot handle the scenario where Y is a non-sink vertex and is a parent of a gene expression level because modern Perturb-seq datasets usually do not perturb or measure Y. We therefore empirically investigated the degree of error by running experiments, where we set Y to a non-sink vertex, so that it can cause gene expression. We find that the performance of RCSP degrades considerably for gene expression levels that contain Y as a parent. Thus RCSP is sensitive to violations of the sink target assumption:
“We finally considered the scenario where Y is a non-sink (or non-terminal) vertex. If Y is a parent of a gene expression level, then we cannot properly condition on the parents because modern Perturb-seq datasets usually do not intervene on Y or measure Y. We therefore empirically investigated the degradation in performance resulting from a non-sink target Y, in particular for gene expression levels where Y is a parent. We again simulated 200 samples from bulk RNA-seq and each condition of Perturb-seq with a DAG over 1000 vertices, an expected neighborhood size of 2 and a non-sink target Y. We then removed the outgoing edges from Y and resampled the DAG with a sink target. We compare the results of RCSP for both DAGs in gene expression levels where Y is a parent. We plot the results in Supplementary Figure 5. As expected, we observe a degradation in performance when Y is not terminal, where the mean RMSE increased from 0.045 to 0.342. We conclude that RCSP is sensitive to violations of the sink target assumption.”
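The graph manipulation in this experiment can be illustrated with a minimal sketch. It is not the authors' simulation code: the function names, the smaller vertex count, the random seed and the rule for picking Y are all assumptions made for illustration, and the sketch only removes the outgoing edges of Y rather than resampling data from the resulting DAG.

```python
import random

def sample_dag(n_vertices, expected_neighbors, rng):
    """Sample a DAG as a set of edges (i, j) with i < j, so no cycle can form.

    Each possible edge is kept with probability expected_neighbors / (n - 1),
    which gives every vertex the requested expected neighborhood size.
    """
    prob = expected_neighbors / (n_vertices - 1)
    return {(i, j)
            for i in range(n_vertices)
            for j in range(i + 1, n_vertices)
            if rng.random() < prob}

def parents(edges, vertex):
    return sorted(i for (i, j) in edges if j == vertex)

def children(edges, vertex):
    return sorted(j for (i, j) in edges if i == vertex)

def make_sink(edges, target):
    """Remove every outgoing edge of `target`, turning it into a sink vertex."""
    return {(i, j) for (i, j) in edges if i != target}

rng = random.Random(0)   # hypothetical seed
n_vertices = 200         # smaller than the 1000 vertices in the text, for speed
nonsink_dag = sample_dag(n_vertices, 2, rng)

# Hypothetical choice of Y: the first vertex with at least one parent and one child.
target = next(v for v in range(n_vertices)
              if parents(nonsink_dag, v) and children(nonsink_dag, v))

sink_dag = make_sink(nonsink_dag, target)
# Genes that have Y as a parent in the non-sink graph; these are the genes
# on which the two settings are compared in the text.
affected_genes = children(nonsink_dag, target)
```

The comparison in the quoted passage is then restricted to `affected_genes`, since those are the only vertices whose parent set contains the unmeasured Y.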
(2) A closely related issue is the DAG assumption of no cycles. This assumption is brought to bear because it is required for much classical causal machinery, but is unrealistic in biology where feedback is pervasive. How robust is RCSP to (mild) violations of this assumption? Simulations would be a straightforward way to address this.
We agree that real causal graphs in biology often contain cycles. We now include additional experimental results with cyclic directed graphs in the Supplementary Materials. RCSP outperformed the other algorithms even in this setting, but we caution the reader that the theoretical interpretation of the RCS score may not coincide with a root causal effect when cycles exist:
“We also evaluated the algorithms on directed graphs with cycles. We generated a linear SEM over p + 1 = 1000 variables in
. We sampled the coefficient matrix β from a Bernoulli (1/(p − 1)) distribution but did not restrict the non-zero coefficients to the upper triangular portion of the matrix. We then proceeded to permute the variable ordering and weight each entry as in the Methods for the DAG. We repeated this procedure 30 times and report the results in Supplementary Figure 3.
RCSP again outperformed all other algorithms even in the cyclic case. The results suggest that conditioning on the surrogate ancestors also estimates the RCS well even in the cyclic case. However, we caution that an error term E<sub>i</sub> can affect the ancestors of
, when cycles exist. As a result, the RCS may not isolate the causal effect of the error term and thus not truly coincide with the notion of a root causal effect in cyclic causal graphs.”
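As a rough illustration of the cyclic setting described above, the following sketch samples a coefficient matrix without the upper-triangular restriction and solves the resulting linear SEM. It is an assumption-laden toy, not the authors' procedure: the weight range, the exclusion of self-loops, the rescaling step that guarantees a stable equilibrium, the small p and all function names are hypothetical.

```python
import random

def sample_cyclic_coefficients(p, rng):
    """Sample a p x p coefficient matrix; beta[i][j] != 0 means X_i -> X_j.

    Off-diagonal entries are non-zero with probability 1/(p - 1) anywhere in
    the matrix (no upper-triangular restriction), so directed cycles can occur.
    """
    prob = 1.0 / (p - 1)
    beta = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i != j and rng.random() < prob:
                # Assumed weight range; the text defers to the Methods here.
                beta[i][j] = rng.uniform(0.25, 1.0) * rng.choice([-1.0, 1.0])
    return beta

def rescale_for_stability(beta, bound=0.9):
    """Shrink all weights so incoming absolute weights sum to <= bound per vertex.

    Illustrative assumption: this makes X <- beta^T X + E a contraction, so the
    cyclic system has a unique equilibrium that simple iteration reaches.
    """
    p = len(beta)
    worst = max(sum(abs(beta[i][j]) for i in range(p)) for j in range(p))
    if worst > bound:
        scale = bound / worst
        beta = [[w * scale for w in row] for row in beta]
    return beta

def solve_sem(beta, errors, n_iter=200):
    """Iterate X_j <- E_j + sum_i beta[i][j] * X_i until (near) equilibrium."""
    p = len(beta)
    x = list(errors)
    for _ in range(n_iter):
        x = [errors[j] + sum(beta[i][j] * x[i] for i in range(p))
             for j in range(p)]
    return x

rng = random.Random(1)   # hypothetical seed
p = 50                   # far smaller than in the text, for speed
beta = rescale_for_stability(sample_cyclic_coefficients(p, rng))
errors = [rng.gauss(0.0, 1.0) for _ in range(p)]
x = solve_sem(beta, errors)

# Residual of the structural equations at the returned solution.
residual = max(abs(x[j] - errors[j] - sum(beta[i][j] * x[i] for i in range(p)))
               for j in range(p))
```

In such a graph an error term can feed back into the ancestors of its own variable, which is the reason given above for interpreting the RCS with caution when cycles exist.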
(3) The authors spend considerable effort arguing that technical sampling noise in X can effectively be ignored (at least in bulk). While the mathematical arguments here are reasonable, they miss the bigger picture point that the measured gene expression X can only ever be a noisy/biased proxy for the expression changes that caused disease: 1) Those events happened before the disease manifested, possibly early in development for some conditions like neurodevelopmental disorders. 2) bulk RNA-seq gives only an average across cell-types, whereas specific cell-types are likely “causal.” 3) only a small sample, at a single time point, is typically available. Expression in other parts of the tissue and at different times will be variable.
We agree that many other sources of error exist. The causal model of RNA-expression in Methods corresponds to a single snapshot in time for each sample. We now clarify this in the Methods as follows:
“We represent a snapshot of a biological causal process using an SEM over
obeying Equation (3).”
We thus only detect the root causal genes in a single snapshot in time for each sample in bulk RNA-seq. If we cannot detect the root causal effect in a gene due to the signal washing out over time as in (1), or if the root causal effect in different cell types cancel each other out to exactly zero in bulk as in (2), then we cannot detect those root causal genes even with an infinite sample size.
(4) While there are connections to the omnigenic model, the latter is somewhat misrepresented. The authors refer to the “core genes” of the omnigenic model as being at the end (longitudinal) of pathogenesis. The omnigenic model makes no statements about temporal ordering: in causal inference terminology the core genes are simply the direct causes of disease.
We now clarify that we use the word pathogenesis to mean the causal cascade from root causes to the diagnosis. In this case, the direct causes of the diagnosis correspond to the end of pathogenesis, while the root causes correspond to the beginning. For example, if X<sub>1</sub> → X<sub>2</sub> → Y, with Y a diagnosis, then X<sub>1</sub> is a root causal gene while X<sub>2</sub> is a core (direct causal) gene. We now clarify this in the Introduction:
“Root causes of disease correspond to the most upstream causes of a diagnosis with strong causal effects on the diagnosis. Pathogenesis refers to the causal cascade from root causes to the diagnosis. Genetic and non-genetic factors may act as root causes and affect gene expression as an intermediate step during pathogenesis. We introduce root causal gene expression levels – or root causal genes for short – that correspond to the initial changes to gene expression induced by genetic and non-genetic root causes that have large causal effects on a downstream diagnosis (Figure 1 (a)). Root causal genes differ from core genes that directly cause the diagnosis and thus lie at the end, rather than at the beginning, of pathogenesis (Boyle et al., 2017).”
(5) A key observation underlying the omnigenic model is that genetic heritability is spread throughout the genome (and somewhat concentrated near genes expressed in disease relevant cell types). This implies that (almost) all expressed genes, or their associated (e)SNPs, are “root causes”.
We now clarify that genetic heritability can be spread throughout the genome in the omnigenic root causal model as well in the Discussion:
“Further, each causal genetic variant tends to have only a small effect on disease risk in complex disease because the variant can directly cause Y or directly cause any causal gene including those with small root causal effects on Y; thus, all error terms that cause Y can model genetic effects on Y. However, the root causal model further elaborates that genetic and non-genetic factors often combine to produce a few root causal genes with large root causal effects, where non-genetic factors typically account for the majority of the large effects in complex disease. Many variants may therefore cause many genes in diseases with only a few root causal genes.”
We finally add Figure 5 into the Discussion as a concrete example illustrating the omnigenic root causal model:
(6) The claim that root causal genes would be good therapeutic targets feels unfounded. If these are highly variable across individuals then the choice of treatment becomes challenging. By contrast the causal effects may converge on core genes before impacting disease, so that intervening on the core genes might be preferable. The jury is still out on these questions, so the claim should at least be made hypothetical.
We clarify that we do not claim that root causal genes are better treatment targets than core genes in terms of magnitudes of causal effects on the phenotype. For example, in the common cold with a virus as the root cause, giving a patient an antiviral will eliminate fever and congestion, but so will giving a decongestant and an antipyretic. We only claim that treating root causal genes can eliminate disease near its pathogenic onset, just like giving an antiviral can eliminate the viral load and stop pathogenesis. We write the following in the Introduction:
“Treating root causal genes can modify disease pathogenesis in its entirety, whereas targeting other causes may only provide symptomatic relief... Identifying root causal genes is therefore critical for developing treatments that eliminate disease near its pathogenic onset.”
We also further clarify in the Discussion that root causal genes account for deleterious causal effects not captured by the diagnosis Y:
“We finally emphasize that the root causal model accounts for all deleterious effects of the root causal genes, whereas the core gene model only captures the deleterious effects captured by the diagnosis Y. For example, the disease of diabetes causes retinopathy, but retinopathy is not a part of the diagnostic criteria of diabetes. As a result, the gene expression levels that cause retinopathy but not the diagnosis of diabetes are not core genes, even though they are affected by the root causal genes.”
We do agree that root causal genes may differ substantially between patients, although it is unclear if the heterogeneity is too great to develop treatments.
(7) The closest thing to a gold standard I believe we have for “root causal genes” is integration of molecular QTLs and GWAS, specifically coloc/MR. Here the “E” of RCSP are explicitly represented as SNPs. I don’t know if there is good data for AMD but there certainly is for MS. The authors should assess the overlap with their results. Another orthogonal avenue would be to check whether the root causal genes change early in disease progression.
Colocalization and Mendelian randomization unfortunately cannot identify root causal effects because they all attempt, either heuristically (colocalization) or rigorously (MR), to identify variants that cause each gene expression level rather than variants that directly cause each gene expression level and thus make up the error terms. We therefore need new methods that can identify direct causal variants in order to assess overlap.
We checked whether root causal genes change early in disease progression using knowledge of pathogenesis. In particular, oxidative stress induces pathogenesis in AMD, and RCSP identified root causal genes involved in oxidative stress in AMD:
“The pathogenesis of AMD involves the loss of RPE cells. The RPE absorbs light in the back of the retina, but the combination of light and oxygen induces oxidative stress, and then a cascade of events such as immune cell activation, cellular senescence, drusen accumulation, neovascularization and ultimately fibrosis (Barouch et al., 2007). We therefore expect the root causal genes of AMD to include genes involved in oxidative stress during early pathogenesis. The gene MIPEP with the highest D-RCS score in Figure 3 (d) indeed promotes the maturation of oxidative phosphorylation-related proteins (Shi et al., 2011). The second gene SLC7A5 is a solute carrier that activates mTORC1 whose hyperactivation increases oxidative stress via lipid peroxidation (Nachef et al., 2021; Go et al., 2020). The gene HEATR1 is involved in ribosome biogenesis that is downregulated by oxidative stress (Turi et al., 2018). The top genes discovered by RCSP thus identify pathways known to be involved in oxidative stress.”
Similarly, T cell infiltration across the blood brain barrier initiates pathogenesis in MS, and RCSP identified root causal genes involved in this infiltration:
“Genes with the highest D-RCS scores included MNT, CERCAM and HERPUD2 (Figure 4 (d)). MNT is a MYC antagonist that modulates the proliferative and pro-survival signals of T cells after engagement of the T cell receptor (Gnanaprakasam et al., 2017). Similarly, CERCAM is an adhesion molecule expressed at high levels in microvessels of the brain that increases leukocyte transmigration across the blood brain barrier (Starzyk et al., 2000). HERPUD2 is involved in the endoplasmic-reticulum associated degradation of unfolded proteins (Kokame et al., 2000). Genes with the highest D-RCS scores thus serve key roles in known pathogenic pathways of MS.”
(8) The available Perturb-seq datasets have limitations beyond the control of the authors. 1) The set of genes that are perturbed. The authors address this by simply sub-setting their analysis to the intersection of genes represented in the perturbation and observational data. However, this may mean that a true ancestor of X is not modeled/perturbed, limiting the formal claims that can be made. Additionally, some proportion of genes that are nominally perturbed show little to no actual perturbation effect (for example, due to poor guide RNA choice) which will also lead to missing ancestors.
We now clarify that Perturb-seq can only identify root causal genes among the adequately perturbed set of genes in the Discussion:
“Modern genome-wide Perturb-seq datasets also only adequately perturb and measure a few thousand, rather than all, gene expression levels. RCSP can only identify root causal genes within this perturbed and measured subset.”
(9) The authors provide no mechanism for statistical inference/significance for their results at either the individual or aggregated level. While I am a proponent of using effect sizes more than p-values, there is still value in understanding how much signal is present relative to a reasonable null.
We now explain that RCSP does not perform statistical inference in Methods because it is not clear how to define the appropriate cut-off for the RCS score under the null distribution:
“We focus on statistical estimation rather than statistical inference because Φ<sub>i</sub> > 0 when E<sub>i</sub> causes Y under mild conditions, so we reject the null hypothesis that Φ<sub>i</sub> = 0 for many genes if many gene expression levels cause Y. However, just like a machine typically breaks down due to only one or a few root causal problems, we hypothesize that only a few genes have large RCS scores Φ<sub>i</sub> ≫ 0 even in complex disease.”
(10) I agree with the authors that age coming out as a “root cause” is potentially encouraging. However, it is also quite different in nature to expression, including being “measured” exactly. Will RCSP be biased towards variables that have lower measurement error?
We tested the above hypothesis by plotting sequencing depth against the D-RCS scores of each gene. We observed a small negative correlation between sequencing depth and D-RCS scores, indicating the D-RCS scores are slightly biased upwards with low sequencing depth. However, genes with the largest D-RCS scores exhibited a wide variety of sequencing depths in both MS and AMD, suggesting that sequencing depth has minimal effect on the largest D-RCS scores. We now explain these results for AMD in the Supplementary Materials:
“Theorem 1 states that RCS scores may exhibit bias with insufficient sequencing depth. The genes with large D-RCS scores may therefore simply have low sequencing depths. To test this hypothesis, we plotted sequencing depth against D-RCS scores. Consistent with Theorem 1, we observed a small negative correlation between D-RCS and sequencing depth (ρ = −0.16, p = 2.04E-13), and D-RCS scores exhibited greater variability at the lowest sequencing depths (Supplementary Figure 8). However, genes with the largest D-RCS scores had mean sequencing depths interspersed between 20 and 3000. We conclude that genes with the largest D-RCS scores had a variety of sequencing depths ranging from low to high.”
We also report the results for MS:
“We plot sequencing depth against the D-RCS scores of each gene similar to the AMD dataset. We again observed a small negative correlation (ρ = −0.136, p < 2.2E-16), indicating that genes with low sequencing depths had slightly higher D-RCS scores on average (Supplementary Figure 12). However, genes with the largest D-RCS scores again had a variety of sequencing depths. We conclude that sequencing depth has minimal correlation with the largest D-RCS scores.”
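The kind of check described in the two passages above can be sketched as follows. The response does not state whether ρ is a Pearson or a rank correlation; the sketch assumes a Spearman rank correlation, and the depth and score values are invented toy numbers, not data from the study.

```python
def ranks(values):
    """Return 1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            result[k] = mean_rank
        pos = end + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical toy values: mean sequencing depth and D-RCS score per gene.
depth = [20, 55, 140, 600, 1500, 3000]
d_rcs = [0.30, 0.05, 0.22, 0.02, 0.18, 0.01]

rho = spearman(depth, d_rcs)

# Depths of the three genes with the largest scores, to see how spread out they are.
top_genes = sorted(range(len(d_rcs)), key=lambda k: d_rcs[k], reverse=True)[:3]
top_depths = [depth[k] for k in top_genes]
```

In this toy example the correlation is negative overall, yet the highest-scoring genes span low and high depths, which mirrors the pattern the authors report qualitatively.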
(11) Finally, it’s a stretch to call K562 cells “lymphoblasts.” They are more myeloid than lymphoid.
We now clarify that K562 cells are undifferentiated blast cells that can be induced to differentiate into lymphoblasts in Results:
“We next ran RCSP on 137 samples collected from CD4+ T cells of multiple sclerosis (MS; GSE137143) as well as Perturb-seq data of 1,989,578 undifferentiated blast cells that can be induced to differentiate into lymphoblasts, or the precursors of T cells and other lymphocytes.”
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
In this manuscript, Dong et al. study the directed cell migration of tracheal stem cells in Drosophila pupae. The migration of these cells which are found in two nearby groups of cells normally happens unidirectionally along the dorsal trunk towards the posterior. Here, the authors study how this directionality is regulated. They show that inter-organ communication between the tracheal stem cells and the nearby fat body plays a role. They provide compelling evidence that Upd2 production in the fat body and JAK/STAT activation in the tracheal stem cells play a role. Moreover, they show that JAK/STAT signalling might induce the expression of apicobasal and planar cell polarity genes in the tracheal stem cells which appear to be needed to ensure unidirectional migration. Finally, the authors suggest that trafficking and vesicular transport of Upd2 from the fat body towards the tracheal cells might be important.
Strengths:
The manuscript is well written. This novel work demonstrates a likely link between Upd2-JAK/STAT signalling in the fat body and tracheal stem cells and the control of unidirectional cell migration of tracheal stem cells. The authors show that hid+rpr or Upd2 RNAi expression in the fat body or Dome RNAi, Hop RNAi, or STAT92E RNAi expression in tracheal stem cells results in aberrant migration of some of the tracheal stem cells towards the anterior. Using ChIP-seq as well as analysis of GFP-protein trap lines of planar cell polarity genes in combination with RNAi experiments, the authors show that STAT92E likely regulates the transcription of planar cell polarity genes and some apicobasal cell polarity genes in tracheal stem cells which appear to be needed for unidirectional migration. Moreover, the authors hypothesise that extracellular vesicle transport of Upd2 might be involved in this Upd2-JAK/STAT signalling in the fat body and tracheal stem cells, which, if true, would be quite interesting and novel.
Overall, the work presented here provides some novel insights into the mechanism that ensures unidirectional migration of tracheal stem cells that prevents bidirectional migration. This might have important implications for other types of directed cell migration in invertebrates or vertebrates including cancer cell migration.
Weaknesses:
It remains unclear to what extent Upd2-JAK/STAT signalling regulates unidirectional migration. While there seems to be a consistent phenotype upon genetic manipulation of Upd2-JAK/STAT signalling and planar cell polarity genes, as in the aberrant anterior migration of a fraction of the cells, the phenotype seems to be rather mild, with the majority of cells migrating towards the posterior.
We agree that the phenotype is mild, as perturbing JAK/STAT signaling in the progenitors specifically affects the coordinated migration of the cells rather than alters their direction or completely blocks migration. Our data indicate that inter-organ communication ensures coordinated behavior of the progenitor cells, although the differential responses exhibited by individual cells represent an interesting unresolved issue that awaits future in-depth investigation.
While I am not an expert on extracellular vesicle transport, the data presented here regarding Upd2 being transported in extracellular vesicles do not appear to be very convincing.
We performed additional PLA experiments which support the interaction between Upd2 and the core components of extracellular vesicles (revised Figure 8). Furthermore, we performed electron microscopy to visualize the Lbm-containing vesicles in fat body (Figure 8-figure supplement 1D).
These data are now provided in the revised manuscript.
Major comments:
(1) The graphs showing the quantification of anterior (and in some cases also posterior migration) are quite confusing. E.g. Figure 1F (and 5E and all others): These graphs are difficult to read because the quantification for the different conditions is not shown separately. E.g. what is the migration distance for Fj RNAi anterior at 3h in Fig5E? Around -205micron (green plus all the other colors) or around -70micron (just green, even though the green bar goes to -205micron). If it's -205micron, then the images in C' or D' do not seem to show this strong phenotype. If it's around -70, then the way the graph shows it is misleading, because some readers will interpret the result as -205. Moreover, it's also not clear what exactly was quantified and how it was quantified. The details are also not described in the methods. It would be useful, to mark with two arrowheads in the image (e.g. 5 A' -D') where the migration distance is measured (anterior margin and point zero).
Overall, it would be better, if the graph showed the different conditions separately. Also, n numbers should be shown in the figure legend for all graphs.
We apologize for the inappropriate presentation and insufficient description, and thank you for kindly pointing them out. We used different colors to represent different genotypes, and the columns were superimposed. In the revised figures, we show the quantification for the different conditions separately. The anterior migration distance for Fj RNAi is around 70 µm.
We now provide a detailed description in the revised Methods. To measure migration distance, we took snapshots at 0 h, 1 h, 2 h and 3 h, and measured the distance from the starting point (the junction of TC and DT) to the leading edge of the progenitor clusters. Velocity was calculated as v = d (µm) / t (min). As you kindly suggested, we indicated the anterior margin and point zero in the corresponding panels. We have added n numbers to the legends.
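The velocity calculation described here can be written out as a short sketch. Only the snapshot times and the approximate 70 µm distance at 3 h come from the response; the intermediate distances and the function name are hypothetical.

```python
def migration_velocity(distance_um, elapsed_min):
    """Velocity v = d / t, with d in micrometres and t in minutes."""
    if elapsed_min <= 0:
        raise ValueError("elapsed time must be positive")
    return distance_um / elapsed_min

# Snapshot times in hours, as stated in the response.
snapshot_hours = [0, 1, 2, 3]
# Distance from the TC/DT junction to the leading edge at each snapshot.
# The 1 h and 2 h values are invented; 70 µm at 3 h follows the Fj RNAi example.
leading_edge_um = [0.0, 25.0, 48.0, 70.0]

# Velocity relative to the starting point for every snapshot after 0 h.
velocities_um_per_min = [
    migration_velocity(d - leading_edge_um[0], hours * 60)
    for hours, d in zip(snapshot_hours[1:], leading_edge_um[1:])
]
```

With these numbers the 3 h velocity is 70 µm over 180 min, which is roughly 0.39 µm/min.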
(2) Figure 2-figure supplement 1: C-L and M: From these images and graph it appears that Upd2 RNAi results in no aberrant anterior migration. Why is this result different from Figures 2D-F where it does?
The fat body-expressing lsp2-Gal4 was used in Figure 2-figure supplement 1C-L and Figure 2D-F, while trachea specific btl-Gal4 was used in Figure 2-figure supplement 1K-L. The lsp2-Gal4-driven but not btl-Gal4-driven upd2RNAi causes aberrant anterior migration, suggesting that fat body-derived Upd2 plays a role. We have further clarified this in the text.
(3) Figure 5F: The data on the localisation of planar cell polarity proteins in the tracheal stem cell group is rather weak. Figure 5G and J should at least be quantified for several animals of the same age for each genotype. Is there overall more Ft-GFP in the cells on the posterior end of the cell group than on the opposite side? Or is there a more classic planar cell polarity in each cell with Ft-GFP facing to the posterior side of the cell in each cell? Maybe it would be more convincing if the authors assessed what the subcellular localisation of Ft is through the expression of Ft-GFP in clones to figure out whether it localises posteriorly or anteriorly in individual cells.
We staged the animals, measured several animals for each genotype and provided the quantifications in the revised manuscript. The level of Ft-GFP is higher in the cells at the frontal edge. We tried to examine the expression of Ft-GFP at the single-cell level. However, this turned out to be technically difficult because the tracheal stem cells are not regularly arranged like epithelial cells and the proximal-distal axis of the tracheal stem cells remains unclear. We thus decided to measure the fluorescence signal of groups of stem cells along the DT regardless of the polarity within individual cells.
(4) Regarding the trafficking of Upd2 in the fat body, is it known, whether Grasp65, Lbm, Rab5, and 7 are specifically needed for extracellular vesicle trafficking rather than general intracellular trafficking? What is the evidence for this?
In our experiments, knocking down rab5, rab7, grasp65 or lbm in trachea using btl-Gal4 did not cause abnormality in the disciplined migration, which excludes their intracellular contribution in the trachea (Figure 7-figure supplement 1). Perturbation of Grasp65 or Lbm in fat body increased intracellular upd2-containing vesicles, indicating that intracellular production is functional (Figure 6J). Grasp65 is specifically required for Upd2 production. Lbm, Rab5 and Rab7 are important for vesicle trafficking. Our conclusion does not pertain specifically to either the extracellular or the intracellular compartment.
(5) Figure 8A-B: The data on the proximity of Rab5 and 7 to the Upd2 blobs are not very convincing.
The confocal images indicate the proximity of Rab5 and Rab7 to the Upd2 vesicles. We interpret the proximity together with the results from Co-IP and PLA data (Figure 8E-K).
(6) The authors should clarify whether or not their work has shown that "vesicle-mediated transport of ligands is essential for JAK/STAT signaling". In its current form, this manuscript does not appear to provide enough evidence for extracellular vesicle transport of Upd2.
Lbm belongs to the tetraspanin protein family that contains four transmembrane domains, which are the principal components of extracellular vesicles. We show that Lbm interacts with Upd2. The JAK/STAT signaling depends on the Upd2 in the fat body as well as vesicle trafficking machinery. Furthermore, we performed electron microscopy and show the presence of Lbm-containing vesicles in fat body (Figure 8-figure supplement 1D).
(7) What is the long-term effect of the various genetic manipulations on migration? The authors don't show what the phenotype at later time points would be, regarding the longer-term migration behaviour (e.g. at 10h APF when the cells should normally reach the posterior end of the pupa). And what is the overall effect of the aberrant bidirectional migration phenotype on tracheal remodelling?
We observed that the integrity of the tracheal network, especially the dorsal trunk, was impaired, which may be due to incomplete regeneration (Figure 3-figure supplement 1E-I).
(8) The RNAi experiments in this manuscript are generally done using a single RNAi line. To rule out off-target effects, it would be important to use two non-overlapping RNAi lines for each gene.
We validated the phenotype using several independent RNAi alleles.
Reviewer #2 (Public review):
Summary:
This work by Dong and colleagues investigates the directed migration of tracheal stem cells in Drosophila pupae, essential for tissue homeostasis. These cells, found in two nearby groups, migrate unidirectionally along the dorsal trunk towards the posterior to replenish degenerating branches that disperse the FGF mitogen. The authors show that inter-organ communication between tracheal stem cells and the neighboring fat body controls this directionality. They propose that the fat body-derived cytokine Upd2 induces JAK/STAT signaling in tracheal progenitors, maintaining their directional migration. Disruption of Upd2 production or JAK/STAT signaling results in erratic, bidirectional migration. Additionally, JAK/STAT signaling promotes the expression of planar cell polarity genes, leading to asymmetric localization of Fat in progenitor cells. The study also indicates that Upd2 transport depends on Rab5- and Rab7-mediated endocytic sorting and Lbm-dependent vesicle trafficking. This research addresses inter-organ communication and vesicular transport in the disciplined migration of tracheal progenitors.
Strengths:
This manuscript presents extensive and varied experimental data to show a link between Upd2-JAK/STAT signaling and tracheal progenitor cell migration. The authors provide convincing evidence that the fat body, located near the trachea, secretes vesicles containing the Upd2 cytokine. These vesicles reach tracheal progenitors and activate the JAK-STAT pathway, which is necessary for their polarized migration. Using ChIP-seq, GFP-protein trap lines of planar cell polarity genes, and RNAi experiments, the authors demonstrate that STAT92E likely regulates the transcription of planar cell polarity genes and some apicobasal cell polarity genes in tracheal stem cells, which seem to be necessary for unidirectional migration.
Weaknesses:
Directional migration of tracheal progenitors is only partially compromised, with some cells migrating anteriorly and others maintaining their posterior migration.
Our results suggest that Upd2-JAK/STAT signaling is required for the consistency of disciplined migration. Although only a few tracheal progenitors display anterior migration, these cells lose the commitment of directional movement. We acknowledge that the phenotype is moderate.
Additionally, the authors do not examine the potential phenotypic consequences of this defective migration.
We examined the long-term effects of the aberrant migration and observed an impairment of tracheal integrity and melanized tracheal branches (Figure 3-figure supplement 1E-I).
It is not clear whether the number of tracheal progenitors remains unchanged in the different genetic conditions. If there are more cells, this could affect their localization rather than migration and may change the proposed interpretation of the data.
We examined the progenitor cell number in bidirectional movement samples and control group. The results show that cell number does not exhibit a significant difference between control and bidirectional movement groups (Figure 3-figure supplement 1).
Upd2 transport by vesicles is not convincingly shown.
We performed additional PLA experiments to further support the interaction between Upd2 and the core components of extracellular vesicles. Furthermore, we performed electron microscopy and show the presence of Lbm-containing vesicles in fat body (Figure 8-figure supplement 1D). Additional experiments such as colocalization and Co-IP assays and better quantification are provided in the revised manuscript (see revised Figure 8).
Data presentation is confusing and incomplete.
We used different colors to represent different genotypes, and the columns were superimposed. We changed the graphs to show the quantification for the different conditions separately, and revised the data presentation to avoid confusion.
Reviewer #3 (Public review):
Summary:
Dong et al tackle the mechanism leading to polarized migration of tracheal progenitors during Drosophila metamorphosis. This work fits in the stem cell research field and its crucial role in growth and regeneration. While it has been previously reported by others that tracheal progenitors migrate in response to FGF and Insulin signals emanating from the fat body in order to regenerate tracheal branches, the authors identified an additional mechanism involved in the communication of the fat body and tracheal progenitors.
Strengths:
The data presented were obtained using a wide range of complementary techniques combining genetics, molecular biology, quantitative, and live imaging techniques. The authors provide convincing evidence that the fat body, found in close proximity to the trachea, secretes vesicles containing the Upd2 cytokine that reach tracheal progenitors leading to JAK-STAT pathway activation, which is required for their polarized migration. In addition, the authors show that genes regulating planar cell polarity are also involved in this inter-organ communication.
Weaknesses:
(1) Affecting this inter-organ communication leads to a quite discrete phenotype where polarized migration of tracheal progenitors is partially compromised. The study lacks data showing the consequences of this phenotype on the final trachea morphology, function, and/or regeneration capacities at later pupal and adult stages. This could potentially increase the significance of the findings.
Following your kind suggestion, we examined the long-term effects of the aberrant migration and observed impaired tracheal integrity and melanized tracheal branches (Figure 3-figure supplement 1E-I).
(2) The conclusions of this paper are mostly well supported by data, but some aspects of data acquisition and analysis need to be clarified and corrected, such as recurrent errors in plotting of tracheal progenitor migration distance that mislead the reader regarding the severity of the phenotype.
In the original graphs, we used different colors to represent different genotypes, and the columns were superimposed. We have changed the graphs to show the quantification for each condition separately. We thank you for kindly pointing this out.
(3) The number of tracheal progenitors should be assessed since they seem to be found in excess in some genetic conditions that affect their behavior. A change in progenitor number could lead to crowding, thus affecting their localization rather than migration capacities, thereby changing the proposed interpretation. In addition, the authors show data suggesting a reduced progenitor migration speed when the fat body is affected, which would also be consistent with a crowding of progenitors.
We examined cell number and cell proliferation in the bidirectional movement samples and the control group, and observed no significant difference between them (Figure 3-figure supplement 2).
(4) The authors claim that tracheal progenitors display a polarized distribution of PCP proteins that is controlled by JAK-STAT signaling. However, this conclusion is made from a single experiment that is not quantified and for which there is no explanation of how the plot profile measurements were performed. It also seems that this experiment was done only once. Altogether, this is insufficient to support the claim. Finally, a quantification of the number of posterior edges presenting filopodia rather than the number of filopodia at the anterior and posterior leading edges would be more appropriate.
We staged the animals, measured several animals for each genotype, and provided the quantifications in the revised manuscript. The level of Ft-GFP is higher in the cells at the frontal edge. We tried to examine the expression of Ft-GFP at the single-cell level. However, this turned out to be difficult because the tracheal stem cells are not as regularly patterned as epithelial cells and their proximal-distal axis is not well defined. We thus decided to measure the fluorescence signal of groups of stem cells along the DT regardless of their individual polarity.
(5) The authors demonstrate that Upd2 is transported through vesicles from the fat body to the tracheal progenitors where they propose they are internalized. Since the Upd2 receptor Dome ligand binding sites are exposed to the extracellular environment, it is difficult to envision in the proposed model how Upd2 would be released from vesicles to bind Dome extracellularly and activate the JAK-STAT pathway. Moreover, data regarding the mechanism of the vesicular transport of Upd2 are not fully convincing since the PLA experiments between Upd2 and Rab5, Rab7, and Lbm are not supported by proper positive and negative controls and co-immunoprecipitation data in the main figure do not always correlate to the raw data.
We used molecular modeling to show that Upd2 and Lbm intermingle and that Upd2 is not entirely encapsulated in vesicles (Figure 8-supplement 1E). We performed PLA experiments using animals not expressing upd2-Cherry as a negative control (Figure 8 E-J). We corrected the Co-IP panel and apologize for this error.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Minor comments:
(1) Figure 1-figure supplement 1: E: How was the migration velocity assessed? By live imaging individual cells or following the cell front of the group? Over what time period? Do the data points in the graph correspond to individual cells or the cell group? It would be important to show confocal images that go along with this quantification.
We took snapshots of pupae at 0, 1, 2, and 3 hr, and measured the distance covered by the migrating progenitor cells from the starting point (the junction of TC and DT) to the leading edge of the progenitor group. We then calculated the migration rate as v = d/t, with d in micrometers and t in minutes. Because the progenitor cells revolve around and migrate along the DT, tracking a single tracheoblast through the intact cuticle is technically challenging. We have therefore measured the leading edge as a proxy for the whole cell group. We agree with you that time-lapse imaging is preferable for the analysis of migration.
(2) Figure 1-figure supplement 1: F: Why is there Gal80ts in the genotype? (and in Figure 1H). Also, what pupal age was used for this quantification?
Expression of hid and rpr at the L3 stage impaired fat body integrity and adipocyte abundance, and caused lethality. Gal80ts was therefore used to control the expression of rpr.hid. Pupae at 0 hr APF were used in the EdU experiment.
(3) Figure 2C: what is shown in the 6 columns (why 3 each for control and rpr/hid)?
We conducted 3 replicates of each group for control and rpr.hid.
(4) In the methods, several Drosophila stocks are listed as 'source:" from a particular person (e.g. Dr Ma). Please list the real source of this stock, e.g. Bloomington stock number, or the lab and publication in which the stock was originally made.
We provide the information on these stocks in the revised methods.
(5) The SKOV3 carcinoma cell and S2 cell work is not described in the methods.
We added a detailed description of this experiment in the revised Methods, under "Cell culture and transfection".
(6) Figure 6 (F) 'Bar graph plots the abundance of Upd2-mCherry-containing vesicles in progenitors.' What does abundance mean? What was quantified, the number of vesicles, or the mean intensity? This is also not mentioned in the methods.
We counted the number of Upd2-mCherry-containing vesicles in fat body cells and tracheal progenitors, and added a description of the measurement to the Methods.
(7) There are a few language mistakes throughout the manuscript. E.g.
(a) Line 117 and other places: Language: 'fat body' should be 'the fat body'.
We thank you for pointing out these errors and have corrected them accordingly.
(b) Line 1276 Language mistakes: 'Video 1 3D-view of confocal image stacks of tracheal progenitors and fat body. Scale bar: 100 μm. Genotypes: UAS-mCD8-GFP/+;lsp2-Gal4,P[B123]-RFP-moe/+.' :stacks and genotypes should be singular.
We fixed these errors and thank you for kindly pointing them out. We also proofread the entire manuscript to assure accuracy.
(8) In general, it is hard to figure out the exact genotypes used in experiments. This is mostly not written very clearly in the figure legends. E.g. Figure 2: genotype for A-C missing in figure legend (is B from control animals?)
We added genotypes to the figure legends. For Figure 2A and C, the genotypes are lsp2-Gal4,P[B123]-RFP-moe/+ for control and UAS-rpr-hid/+;Gal80ts/+;lsp2-Gal4,P[B123]-RFP-moe/+ for rpr.hid. Figure 2B is from control animals.
Reviewer #2 (Recommendations for the authors):
Major comments:
(1) The phenotype resulting from Upd2 downregulation by RNAi is subtle and shown by unconvincing images. In addition, these phenotypes are analyzed using only one RNAi line.
We used two independent upd2 RNAi lines from THFC (THU1288 and THU1331) and observed a similar phenotype. For RNAi experiments, we always use multiple independent lines.
(2) The authors should analyze the phenotypic consequences of directional migration changes. Is there an effect on tracheal remodeling?
We observed that the integrity of the tracheal network, especially the dorsal trunk, was impaired and that melanized tracheal branches were present, which may be due to incomplete regeneration (Figure 3-figure supplement 1E-I).
(3) The number of tracheal progenitors should be quantified, as some genetic conditions may affect cell numbers, as is apparent in some panels.
We examined cell number and cell proliferation and observed no significant difference between the control and bidirectional movement groups (Figure 3-figure supplement 1).
(4) The data on PCP protein distribution are unconvincing, unquantified, and insufficient to support one of the main conclusions of the study, which is stated in the abstract: "JAK/STAT signaling promotes the expression of genes involved in planar cell polarity, leading to asymmetric localization of Fat in progenitor cells."
We staged the animals, measured several animals for each genotype, and provided the quantifications in the revised manuscript. The level of Ft-GFP is higher in the cells at the frontal edge. We tried to examine the expression of Ft-GFP at the single-cell level. However, this turned out to be difficult because the tracheal stem cells are not as regularly patterned as epithelial cells and their proximal-distal axis is not well defined. We thus decided to measure the fluorescence signal of groups of stem cells along the DT regardless of their individual polarity.
Minor comments:
(1) Language should be revised. In many places in the manuscript, starting in line 113, "fat body" should be "the fat body".
Thank you for pointing out this error. We corrected it accordingly.
(2) Genotypes used in experiments should be described.
We added all the genotypes. We proofread the entire manuscript to complete the figure legends for genotypes.
(3) Line 67, the reference to "The progenitor cells reside in Tr4 and Tr5 metameres and start to move along the tracheal branch" should include (Chen and Krasnow, Science 2014).
We added the reference in the manuscript.
(4) Line 1081, Figure 7 Legend. "Bar graph plots the abundance of Upd2-mCherry-containing vesicles" Abundance is the number of vesicles? The graph displays the average number of vesicles? Please explain and describe the quantification.
The bar graph represents the number of Upd2-mCherry-containing vesicles in different conditions. We quantified the number of vesicles per area.
(5) Figure 1 (I-J) What is shown on the panels? Progenitors marked with? This information is not present in the figure or figure legend. Same for Figure 2 (D-E).
Figure 1I-J show the vector of migrating progenitors. We added the information in the legends. The tracheal cells were labeled by nls-mCherry in Figure 1I-J. In Figure 2D-E, the progenitors were marked with P[B123]-RFP-moe.
(6) Figure 3 Q, Stat92E-GFP values in the graph are not well-explained. What do the numbers in the y-axis refer to?
The y-axis represents the intensity of Stat92E-GFP normalized to control. We have changed the y-axis label to ‘normalized Stat92E-GFP intensity’ in the legends.
(7) In general, figures and figure legends must be revised. Sometimes stainings are not well-defined, some scale bars are missing and plots do not say what the values are.
We apologize for the inadequate information and have revised the figures and legends accordingly.
Reviewer #3 (Recommendations for the authors):
Several points should be addressed by the authors in order to improve their manuscript.
Major points:
(1) The phenotype obtained from decreasing the inter-organ signaling is quite discrete. It is further weakened by the fact that the images chosen to illustrate the measures are not really convincing. No image at 1h APF shows any clear anterior migration. Based on the scale, most of the images at 3h APF do not show a striking difference compared to the control, and in any case, stronger phenotypes would be missed anteriorly since they would thus be out of frame. In addition, at 3h APF, progenitors migrating anteriorly from Tr5 position get mixed with those migrating posteriorly from Tr4 so it is not clear how measurements were made. Given that most phenotypes are observed upon the use of RNAis, it is possible that phenotypes are weak due to persistent gene expression. Using null clones for dome, hop, or stat in progenitors could therefore aggravate the phenotypes and support further the significance of the study. Finally, assessing the consequences of compromised fat body-tracheal communication on trachea morphology, function, and regeneration later in pupal development and on adult flies would also help strengthen the importance of the findings.
We agree with you that anteriorly migrating Tr5 progenitors adjoining Tr4 progenitors hinder measurements, and that mutants may give a stronger phenotype than RNAi lines. When assessing anterior migration, we measured only Tr4 progenitors (not Tr5). We also performed experiments using mutant alleles, which gave aberrant migration of tracheal progenitors (Figure 3-figure supplement 1A-D). We can now show that the integrity of the tracheal network, especially the dorsal trunk, was impaired, which may be due to incomplete regeneration (Figure 3-figure supplement 1E-I).
(2) Although the authors did not observe defects in tracheal progenitor proliferation, progenitors seem to be present in excess in some key genetic background (e.g, upon expression of rpr.hid, statRNAi, Rab-RNAi or in the presence of BFA). This excess could be the result of another mechanism than proliferation (recruitment of extra progenitors since it is not clear how they originate, defect in apoptosis...) and could impact the localization of progenitors, those being pushed anteriorly as a consequence of crowding. A proper characterization of tracheal progenitor number would thus help to discriminate between defects in migration or crowding. This point could also be addressed by performing individual tracking of tracheal progenitors, to find out whether each progenitor is indeed migrating in the wrong direction or if the movement assessed by the global tracking method that is used is just a consequence of progenitor excess.
We examined the cell number in the bidirectional movement samples and the control group. The results show no significant difference between the control and bidirectional movement groups (Figure 3-figure supplement 1). We also tried to follow every progenitor, but were unable to obtain convincing results with P[B123]-RFP-moe, as tracking a single tracheoblast through the intact cuticle is technically challenging.
(3) Regarding the ChIP-seq experiment, an explanation of why choosing the "establishment of planar polarity" family should be provided since data indicate a quite low GeneRatio. Indeed, the "cell adhesion" family seems a more obvious candidate, which would be further supported by the fact that the JAK-STAT pathway has been shown to affect cell adhesion components such as ECadherin and FAK (Silver and Montell 2001, Mallart et al 2024). Also, have these known targets of JAK-STAT signaling been found in the ChIP-seq data? Since filopodia polarization is affected in tracheal progenitors when JAK-STAT signaling is decreased, the same question also applies to enabled, which is involved in filopodia formation and has been recently identified as a target of JAK-STAT signaling.
As you kindly suggested, we tested a number of cell adhesion-related genes such as E-Cadherin (shg), fak, robo2 and enabled (ena). We did not observe an apparent aberrancy in the migration of tracheal progenitors (Figure 5-supplement 1J).
(4) Data investigating PCP protein distribution is not convincing, not quantified, and not sufficient to draw one of the main conclusions of the study, which is even written in the abstract "JAK/STAT signaling promotes the expression of genes involved in planar cell polarity leading to asymmetric localization of Fat in progenitor cells."
We improved the quantification of Ft abundance in the progenitors at the frontal edge and in those lagging behind. The traces in the figures plot multiple replicates. The level of Ft-GFP is higher in the cells at the frontal edge.
(5) Overall, the figures together with their caption and/or the material and methods section lack some important information for the reader to fully understand the data. In addition, some errors are found in multiple plots throughout the article and must be corrected. Here are some examples:
Following your suggestion, we revised the legends and the Methods section to include sufficient information.
(a) Migration distance plots from Figure 3E do not match the data presented in the source data file. It seems that, when creating the plot, instead of superimposing the bars, bars were stacked. This should be corrected for all migration distance plots from Figure 3E onward, including in supplementary figures.
We apologize for the misleading representation. We have revised it accordingly and now show the quantification for each condition separately.
(b) The number of analyzed flies and/or clusters of tracheal progenitors from different flies should be stated for all quantification or observations made on images. This information is lacking for all migration distance plots, for progenitor migration tracking (Figure 1 I, J), for DIPF reporter in Figure 2J, for plot profiles (Figure 5G, J), for Upd2-Rab5/Rab7/Lbm co-detections, PLA, CoIP, and lbm-pHluorin experiments. This also applies to RNA seq, ChIP seq, and surface proteomics, for which the number of pupae and number of replicates is not indicated.
We changed the graphs to show the quantification and the sample size (n) for each condition separately.
We also added the number of replicates to the Methods.
(c) How quantifications were performed is not sufficiently explained. For example, the reference point for migration distance measurement is not defined, and neither is whether the measures were made on fixed or live imaging samples. In fluorescence intensity measurements and Upd2 vesicle counting, information on whether measures were made on a single z slice or on a projection of several z slices should be stated together with what ROI and which FIJI tool for quantification were used. For plot profiles, the same information regarding z slices misses together with how the orientation, the thickness, and the length of the line were chosen, and again the number of times the experiment was conducted should be mentioned and error bars should appear on graphs.
We thank this reviewer for the suggestions which help clarify the methodology of our experiments and improve presentation of our data. We have made the changes according to the suggestions and modified our methods section and the related figures to incorporate these changes.
To measure the migration distance of tracheal progenitors, we took snapshots of living pupae at 0, 1, 2, and 3 hr APF, and measured the distance from the starting point (the junction of TC and DT) to the leading edge of the progenitor group.
To measure the fluorescence intensity of Stat92E-GFP and DIPF, we took z-stack confocal images of the samples and quantified the fluorescence intensity using FIJI. Specifically, intensity was quantified for regions of interest, using the Analysis and Measurement tools. To quantify Upd2-mCherry vesicles, z-stack confocal images of the fat body were taken and the cell counting function of FIJI was used to measure the vesicle number.
To quantify the fluorescence intensity of in vivo tagged Ds, Ft and Fj proteins, a single z slice was used. The expression level of each protein was assessed as the integrated fluorescence intensity normalized to area.
To measure the Ft-GFP distribution, a single z slice of the progenitors immediately proximal to the DT was imaged. An arbitrary line was drawn along the migration direction from the starting TC-DT junction to the leading front (the length of the line corresponds to the distribution range of the tracheal stem cell clusters). The fluorescence intensity along the line was then calculated automatically with the built-in measurement function of the Zeiss confocal software.
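The quantities named in this response (a migration rate v = d/t from timed snapshots, an integrated intensity normalized to area, and a signal normalized to the control group) can be illustrated with a minimal Python sketch. This is not the FIJI or Zeiss workflow used in the study: the function names, the nested-list image representation, the choice of the control mean as the normalization reference, and all numbers are assumptions made for illustration only.

```python
from statistics import mean


def migration_rate(distances_um, times_min):
    """Average rate v = d / t (µm/min) between the first and last snapshot."""
    elapsed = times_min[-1] - times_min[0]
    if elapsed <= 0:
        raise ValueError("snapshots must span a positive time interval")
    return (distances_um[-1] - distances_um[0]) / elapsed


def area_normalized_intensity(image, roi_mask):
    """Integrated intensity inside the ROI divided by the ROI area in pixels."""
    total = 0.0
    area = 0
    for img_row, mask_row in zip(image, roi_mask):
        for pixel, inside in zip(img_row, mask_row):
            if inside:
                total += pixel
                area += 1
    if area == 0:
        raise ValueError("ROI is empty")
    return total / area


def normalize_to_control(values, control_values):
    """Express each value relative to the mean of the control group (assumed reference)."""
    control_mean = mean(control_values)
    return [v / control_mean for v in values]


# Hypothetical leading-edge distances (µm) at 0, 1, 2 and 3 hr APF.
rate = migration_rate([0.0, 40.0, 85.0, 120.0], [0, 60, 120, 180])
print(f"migration rate: {rate:.3f} µm/min")
```

Under the assumed convention of dividing by the control mean, the control group averages exactly 1, while individual control samples can still fall above or below 1.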
Minor points:
(1) In several instances, the authors generalize that stem cells migrate to leave their niche, but this is not the case for all stem cells.
The phenomenon that stem cells leave their niche when they are activated is commonly observed. We interpreted the general mechanism from our system of tracheal stem cells. We fully agree with you that it may not be the case for all stem cells. We modified the text accordingly.
(2) Line 122 -a reference paper or an image showing the expression pattern of the lsp2-Gal4 driver is missing.
We added the reference in the manuscript.
(3) Line 136 - The term "traces of individual progenitors" is overstated and should be reformulated as the method used does not seem to be individual cell tracking.
We rephrased accordingly in the revised manuscript.
(4) Line 146 - Fat body and tracheal progenitors are qualified as interdependent organs, in which aspect do tracheal progenitors affect the fat body?
Current knowledge suggests close inter-organ crosstalk between the trachea and the fat body. The fly trachea provides oxygen to the body and influences whole-body oxidation and metabolism. When the trachea is perturbed, the body becomes hypoxic, which causes an inflammatory response in adipose tissue, an important immune organ (Shin et al., 2024).
(5) Line 163 - Not all the genes tested are cytokines, so the sentence should be reformulated. In addition, in supplementary Fig2-1 C-J, the KD of hh seems to abolish completely tracheal progenitor migration, which is not commented on.
According to your suggestion, we revised the description on information of the genes tested. We added comments in the revised manuscript regarding phenotypes of hh knockdown.
(6) Line 180 - Conclusion is made on Dome expression while using a dome-Gal4 construct, which does not necessarily recapitulate the endogenous pattern of dome expression, so it should be reformulated. Ideally, dome expression should be assessed in another way. Also, it is not clear whether GFP is present only in progenitors since images are zoomed.
We revised the statement and provided a larger view of dome>GFP that shows enriched expression in the tracheal progenitors (Figure 2-figure supplement 2E), an expression pattern that is consistent with FlyBase.
(7) Line 199 - Is it upd-Gal4 or upd2-Gal4 that is used? Since the conclusion of the experiment is made on upd2, the use of upd-gal4 would not be relevant. If upd2-gal4 is used, it should be corrected. In general, the provenance of the Gal4 lines should be provided. In addition, a strong GFP signal in the trachea is visible on the image in Supplementary Figure 2-2F but not commented on and seems contradictory with the conclusion mentioning that fat body and gut are the main source of Upd2 production.
We removed data obtained from the use of this irrelevant upd-Gal4 line.
(8) Figures:
- Figure 1 G, H - Scale bar is missing.
We added it accordingly.
- Figure 1 I, J - The information on the staining is missing.
We added it in the revised manuscript.
- Figure 2A - Providing explanations of the terms "Count" and "Gene ratio" in the caption would be helpful for readers who are not used to this kind of data. In addition, the color code is confusing since the same color is used for the selected gene family and for high p-values (the same applies to other similar graphs).
Gene ratio refers to the proportion of genes in a dataset that are associated with a particular biological process, function, or pathway. Count indicates the number of genes from the input gene list that are associated with a specific GO term. Redder colors indicate a smaller p-value and a higher significance.
- Figure 2 B, C - What does the color scale represent? What do the columns in C correspond to, different time points, different replicates?
The color scale represents the normalized expression. The columns in C correspond to different replicates of control and rpr.hid.
- Figure 2 F - The error bars on the 3h APF posterior bars are missing.
We added error bars accordingly.
- Figure 2 G - The legend "Down-Stable-Up" is in comparison to what?
The control group was generated from the reaction without H2O2. The comparison was relative to the control group.
- Figure 2 J - The specificity of the DIPF tool that has been created should be validated in other tissues displaying known JAK-STAT activity and/or in conditions of decreased JAK-STAT signaling. In addition, the added value of the tool as compared to the JAK-STAT activity reporter used later, which has been well characterized, is not obvious.
We added the signal of DIPF in fat body and salivary gland, both of which harbor active JAK/STAT signaling (Figure 2-figure supplement 2F-H). As opposed to the well characterized Stat92E-GFP reporter that assays the downstream transcription activity, the DIPF reporter measures the upstream event of receptor dimerization.
- Figure 3 I-P - Reporter tool validation in Images I-L could be moved to supplementary data. In images M-P, staining of nuclei and/or membranes would be useful to assess cell integrity.
We revised the figures accordingly.
- Figure 3Q and similar plots in the following figures do not explain the normalization performed and how it can be higher than 1 in control conditions.
In these figures, we normalized the signal relative to control groups; e.g., the value of Stat92E-GFP in the btl-GFP control group was set to 1 in the previous Figure 3Q (revised Figure 3-supplementary Figure B-J).
- Figure 4C - These representations lack explanations to be fully understood by a broad audience.
The figure shows that Stat92E binding was detected in the promoters and intronic regions (the orange peaks) of genes functioning in distal-to-proximal signaling, such as ds, fj, fz, stan, Vang and fat2. We added this information to the figure legend according to your suggestion.
- Figure 5 K,L - What is the x-axis missing, together with the method of tracking used?
The x-axis refers to the recording time of a time-lapse (t-stack) series with a time interval of 5 min. We revised the Methods section and now provide a detailed procedure for this experiment.
- Figures 6 and 8- The overall figures lack a wider view of the cells/tissues/organs and/or additional staining to understand what is presented.
We showed a preparation of the fat body and used high magnification to resolve the vesicles. We have now added wider views of the tissues under investigation (e.g. Figure 6-figure supplement 1).
- Figure 6 D,E - The scale bar is missing.
We added it accordingly.
- Figure 8 O-S - What is the blue staining?
The blue staining shows DAPI-stained nuclei. We have added the information in the legend.
- PLA experiments can give a lot of non-specific background. What kind of controls have been used in Figure 8 F-J? Negative controls should be done on cells that do not express upd2-mCherry using both antibodies to detect non-specific background, which does not usually appear completely black.
If possible, a positive control using a known protein interacting with Rab5-GFP should be included.
In the previous Figure 8, we used control samples lacking one of the primary antibodies. In the revised Figure 8, we conducted the experiment as you suggested, with controls that do not express upd2-mCherry (Figure 8 E-J).
- Co-IP experiments - The raw data file for blots is quite hard to read through. Some legends are not facing the right lane and some blots presented in the main figure are difficult to track since several blots are presented in the raw data file. e.g.
(a) Raw blot for Figure 8 K: the band for mCherry in the IP anti-GFP blot (lane one in K) is not convincing, it is not distinguishable from other aspecific bands. On the reverse IP presented only in raw data, on the input from blot IB anti-mCherry, both lanes present exactly the same bands at 72kb when one of the lanes corresponds to extract from flies not expressing upd2-mCherry.
We thank you for pointing out the incorrect labels. We apologize for the errors and have corrected them accordingly.
(b) Raw blot for Figure 8 L: on the input blot IB anti-GFP, there is a band corresponding to Rab7-GFP in the lane of the extract from flies not expressing Rab7-GFP.
We corrected it.
(c) Raw data for Figure 8 M: on the last blot, legends are missing above the input Ib anti-GFP blot.
We added the missing legends in the figure.
Shin, M., Chang, E., Lee, D., Kim, N., Cho, B., Cha, N., Koranteng, F., Song, J.J., and Shim, J. (2024). Drosophila immune cells transport oxygen through PPO2 protein phase transition. Nature 631, 350-359.
www.biorxiv.org
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
(1) Summary:
In this manuscript, the model's capacity to capture epistatic interactions through multi-point mutations and its success in finding the global optimum within the protein fitness landscape highlights the strength of deep learning methods over traditional approaches.
We thank the reviewer for his/her recognition of our model’s potential and advantages.
(2) Strengths:
It is impressive that the authors used AI combined with limited experimental validation to achieve such significant enhancements in protein performance. Besides, the successful application of the designed antibody in industrial settings demonstrates the practical and economic relevance of the study. Overall, this work has broad implications for future AI-guided protein engineering efforts.
We thank the reviewer for appreciating our work, and especially for acknowledging the practical application of our model.
(3) Weaknesses:
However, the authors should conduct a more thorough computational analysis to complement their manuscript. While the identification of improved multi-point mutants is commendable, the manuscript lacks a detailed investigation into the mechanisms by which these mutations enhance protein properties. The authors briefly mention that some physicochemical characteristics of the mutants are unusual, but they do not delve into why these mutations result in improved performance. Could computational techniques, such as molecular dynamics simulations, be employed to explore the effects of these mutations?
We thank the reviewer for this good question, which allows us to provide a deeper investigation into the mechanisms by which the mutations significantly enhance the alkali resistance of the protein. Following the reviewer’s suggestion, we have expanded our analysis by incorporating molecular dynamics (MD) simulations to understand the impact of the mutations. As an example, we focused on the representative alkali-resistant mutant, A57D;P29T, and examined its MD simulation results. As shown in Figure S4A, the two-point mutant A57D;P29T has a Tm increase of around 8 ℃ and a much stronger binding affinity than the WT. Our analysis of the MD trajectories indicates that the A57D;P29T mutant has a more rigid structure than the WT, as reflected by its lower root mean squared deviation (RMSD) (Figure S4B). Furthermore, we calculated the root mean squared fluctuation (RMSF) for each residue and found that the mutant displayed less fluctuation at residue 29 but similar flexibility at residue 57. Interestingly, residues at positions 10, 108 and 118, which are spatially distant from residues 29 and 57, exhibited remarkably weaker fluctuations in the mutant than in the WT (Figure S4C), implying that a more rigid structure contributes to the mutant's improved resistance to high temperature and strong alkalinity. However, Figure S4D shows that the AlphaFold3-predicted structures of the WT and the mutant are quite similar.
To uncover the origin of the change in structural flexibility, we computed the intramolecular interactions, such as salt bridges and hydrogen bonds, for both the WT and the mutant. We observed that the mutations increased the number of hydrogen bonds between the mutation sites and the rest of the protein (Figure S4E). However, the overall structure of the mutant did not show significant changes, which is also evident from the solvent-accessible surface area (SASA) analysis (Figure S4F). We also analyzed changes in salt bridges and found that although residue 57 mutated to Histidine, no new salt bridges were formed. Additionally, RMSF results showed that residues 10, 108, and 118 became more rigid, but further analysis revealed no significant change in hydrogen bonds or other interactions in these regions. Overall, the MD results suggest that the additional hydrogen bonds introduced by the A57D;P29T mutations stabilize the protein, leading to the enhanced alkali resistance observed in the mutant. These results are now presented in Figure S4 and discussed in detail in the revised manuscript.
Specifically, we have added the following discussion in the main text:
“To gain deeper insight into the mechanisms by which the identified mutations enhance protein properties, we performed molecular dynamics (MD) simulations on the best alkali-resistant mutant. The simulation results revealed several key observations that help explain the observed improvements in protein stability and alkali resistance. As shown in Figure S4A, the A57D;P29T double mutant has a Tm increase of around 8 ℃ and a much stronger binding affinity than the WT. Our analysis of the MD trajectories indicates that the A57D;P29T mutant is more rigid than the WT, as reflected by its lower root mean squared deviation (RMSD) (Figure S4B). Furthermore, we calculated the root mean squared fluctuation (RMSF) of each residue and found that the mutant displayed less fluctuation at residue 29 but similar flexibility at residue 57. Interestingly, residues 10, 108 and 118, which are spatially distant from residues 29 and 57, exhibited markedly weaker fluctuations in the mutant than in the WT (Figure S4C), implying that a more rigid structure contributes to the mutant’s improved resistance to high temperature and strong alkalinity. However, Figure S4D shows that the AlphaFold3-predicted structures of the WT and the mutant are quite similar. To identify the origin of the change in structural flexibility, we computed intramolecular interactions, such as salt bridges and hydrogen bonds, for both the WT and the mutant. We observed that the mutations increased the number of hydrogen bonds between the mutation sites and the rest of the protein (Figure S4E). However, the overall structure of the mutant did not change significantly, which is also evident from the solvent-accessible surface area (SASA) analysis (Figure S4F). We also analyzed changes in salt bridges and found that, although residue 57 was mutated to aspartate (A57D), no new salt bridges were formed.
Additionally, RMSF results showed that residues 10, 108, and 118 became more rigid, but further analysis revealed that there were no significant changes in hydrogen bonds or other interactions in these regions. Taken together, these findings suggest that the enhanced alkali resistance of the mutant is likely due to an overall increase in protein stability, rather than a dramatic change in its structural conformation. The MD simulation results, which are detailed in Figure S4, provide a deeper understanding of how specific mutations can improve protein properties and offer valuable insights for future protein engineering applications.”
And we also included the following content in the SI:
“Molecular Dynamics (MD) simulations
The initial structures for molecular dynamics (MD) simulations of both the wild type and the mutant were predicted using AlphaFold3. To simulate experimental conditions, each protein was placed in a cubic water box containing 0.1 M NaCl. The CHARMM27 force field and the TIP4P water model were applied throughout the simulations. After an initial energy minimization of 50,000 steps, the systems were heated and equilibrated for 1 ns in the NVT ensemble at 300 K followed by an additional 1 ns in the NPT ensemble at 1 atm. The production phase then involved 200-ns simulations with periodic boundary conditions, using a 2 fs integration time step. The LINCS algorithm was used to constrain covalent bonds involving hydrogen atoms, while Lennard-Jones interactions were cut off at 10 Å. Electrostatic interactions were computed with the particle mesh Ewald method, using a 10 Å cutoff and a grid spacing of approximately 1.6 Å with a fourth-order spline. Temperature and pressure were regulated by the velocity rescaling thermostat and Parrinello-Rahman algorithm, respectively. All simulations were performed using GROMACS 2020.4 software packages. Both systems have reached equilibrium according to the analyses of root mean squared deviation (RMSD).”
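The RMSD and RMSF quantities used in the analysis above can be illustrated with a minimal sketch. It is not the authors' analysis pipeline (GROMACS provides its own tools, such as `gmx rms` and `gmx rmsf`); the function names and the toy coordinates are hypothetical, and the sketch assumes that all frames have already been superposed onto the reference structure.

```python
from math import sqrt

def rmsd(frame, reference):
    """RMSD between two conformations, each a list of (x, y, z) tuples.

    Assumes the frame is already fitted (superposed) onto the reference.
    """
    squared = sum((a - b) ** 2
                  for p, q in zip(frame, reference)
                  for a, b in zip(p, q))
    return sqrt(squared / len(frame))

def rmsf(trajectory):
    """Per-atom RMSF about each atom's time-averaged position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Mean position of atom i over all frames.
        mean = [sum(frame[i][d] for frame in trajectory) / n_frames
                for d in range(3)]
        # Mean squared displacement from that mean position.
        msf = sum((frame[i][d] - mean[d]) ** 2
                  for frame in trajectory
                  for d in range(3)) / n_frames
        result.append(sqrt(msf))
    return result

# Hypothetical two-atom, two-frame trajectory: atom 0 is fixed, atom 1 moves.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
print(rmsd(toy_trajectory[1], toy_trajectory[0]))
print(rmsf(toy_trajectory))
```

A lower RMSD over the trajectory and lower per-residue RMSF values are the two signals read above as increased rigidity of the mutant.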
(4) Additionally, the authors claim that their method is efficient. However, the selected VHH is relatively short (<150 AA), resulting in lower computational costs. It remains unclear whether the computational cost of this approach would still be acceptable when designing larger proteins (>1000 AA). Besides, the design process involves a large number of prediction tasks, including the properties of both single-site saturation and multi-point mutants. The computational load is closely tied to the protein length and the number of mutation sites. Could the authors analyze the model's capability boundaries in this regard and discuss how scalable their approach is when dealing with larger proteins or more complex mutation tasks?
In our prior work, we have demonstrated that our method is applicable to larger proteins as well [Jiang et al., Sci. Adv. 10, eadr2641 (2024)]. For instance, when engineering a protein with 1000 amino acids, inferring the fitness of one million mutants using the model on a single 4090 GPU takes approximately 20 hours. However, it remains infeasible to explore all possible mutations when designing multi-point mutants due to the vast space. To address this challenge, we propose the design of a reliable mutant library. In the first round of experiments, we used the model to score all single-point mutations, and then constructed the multi-point mutant library by combining experimentally tested single-point mutations. In this way, even when designing five-point mutants, we only need to score on the order of millions of mutants, making the inference process time-efficient and fully acceptable. As a result, the number of single-point mutations selected for combination into the multi-point mutant library becomes a crucial parameter that affects both inference time and scope. We limited the number of single-point mutations to between 30 and 50 to strike a balance between efficiency and accuracy.
These results are discussed in the revised manuscript. Specifically, we have added the following discussion to Section 2.2 of the main text:
“Although the model inference is fast, it is not feasible to explore all possible mutations when designing multi-point mutants due to the exponential increase in the number of potential combinations. To manage this challenge, we constructed a mutant library based on a two-stage design process. In the first stage, we scored all single-point mutations using the model, and in the second stage, we combined experimentally validated single-point mutations to create the multi-point mutant library. This approach ensures that even when designing multi-point mutants (e.g., five-point mutants), the number of mutants to score remains in the millions, which is computationally efficient and practical. The number of single-point mutations selected for the multi-point mutant library is a key factor influencing both the computational load and the scope of the design space. To maintain a balance between efficiency and accuracy, we limited the number of single-point mutations to between 30 and 50. This strategic approach allows us to achieve both scalability and precision in our protein engineering tasks.”
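The scale argument above can be checked with a short calculation. This is a rough sketch rather than the authors' procedure: `library_size` and `inference_hours` are hypothetical helpers, the count assumes that every selected single-point mutation sits at a distinct site (so any subset can be combined), and the throughput of 20 hours per million mutants is the figure quoted in the response for a 1000-residue protein on a single GPU.

```python
from math import comb

def library_size(n_singles: int, max_order: int) -> int:
    """Count multi-point mutants formed by combining 2..max_order of the
    selected single-point mutations (assumes all sites are distinct)."""
    return sum(comb(n_singles, k) for k in range(2, max_order + 1))

def inference_hours(n_mutants: int, hours_per_million: float = 20.0) -> float:
    """Estimated scoring time, assuming throughput scales linearly."""
    return n_mutants / 1_000_000 * hours_per_million

for n in (30, 40, 50):
    size = library_size(n, 5)
    print(f"{n} singles -> {size} mutants, ~{inference_hours(size):.1f} h")
```

With 50 single-point mutations the library up to five-point mutants contains roughly 2.4 million candidates, whereas 30 mutations give fewer than 0.2 million, which is consistent with the statement that the number of mutants to score stays in the millions.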
Reviewer #2 (Public review):
In this paper, the authors aim to explore whether an AI model trained on natural protein data can aid in designing proteins that are resistant to extreme environments. While this is an interesting attempt, the study's computational contributions are weak, and the design of the computational experiments appears arbitrary.
The reviewer’s comments give us an opportunity to further state the novelty of this study. Although the AI model was reported in our previous work [Sci. Adv. 10, eadr2641 (2024)], unnatural physicochemical properties of proteins have, to the best of our knowledge, never been predicted using AI models. Our preceding work [Sci. Adv. 10, eadr2641 (2024)] demonstrated that the large language model can predict the performance of mutants in terms of thermostability, catalytic activity, binding affinity, etc. However, whether AI models can evaluate unnatural properties of mutants remained unexplored. Our work shows that AI models trained on natural proteins can be used to design mutants that resist extreme conditions, such as strong alkalinity, substantially expanding the application of AI in bioengineering. Moreover, our design of the computational experiments was driven by the nature of the task and the availability of experimental data. We employed different strategies for designing single-point and multi-point mutants: a zero-shot approach for single-point mutations, to overcome the scarcity of data, and fine-tuning of the model for multi-point mutations, to leverage the experimental data on single-point mutations.
(1) The writing throughout the paper is poor. This leaves the reader confused.
The manuscript has been revised accordingly, and we would be glad to address any points that remain unclear.
(2) The main technical issue the authors address is whether AI can identify protein mutations that adapt to extreme environments based solely on natural protein data. However, the introduction could be more concise and focused on the key points to better clarify the significance of this question.
We thank the reviewer for this comment. We have revised the manuscript, particularly the introduction, where we focused on the research questions, methods, and main findings, while removing excessive background information to improve the manuscript’s conciseness and clarity.
“Protein engineering, situated at the nexus of molecular biology, bioinformatics, and biotechnology, focuses on the design of proteins to introduce novel functionalities or enhance existing attributes[1-3]. With the exponential growth of biological data and computational power, protein engineering has experienced a significant shift towards advanced computational methodologies, particularly deep learning, to expedite the design process and unravel complex protein-function relationships[4-9]. However, a significant challenge in industrial protein engineering is designing proteins with inherent resistance to extreme conditions, such as high temperature and extreme pH environments (acidic or alkaline)[17, 18]. Unlike proteins in natural ecosystems, those used in industrial processes often encounter harsh physical and chemical conditions, necessitating exceptional resilience to maintain functionality[19, 20]. Previous efforts to enhance protein resistance have often relied on rational design and mutant library screening. These methods are typically labor-intensive, inefficient, and yield limited improvements[23-26]. Consequently, the industrial demand for proteins resilient to harsh environments poses a notable absence within the training datasets of Artificial Intelligence (AI) models. Exploring whether AI can achieve the evolution of protein resistance to extreme environments is crucial for broadening protein applications and improving modification efficiency.
Recent advances in large-scale protein language models (LLMs) have enabled zero-shot predictions of protein mutants based on self-supervised learning from natural protein sequences. Although AI-guided protein design has been applied to predict mutants with greater thermostability and higher activity[34-36], it remains unexplored whether these models, which are based on natural protein information, can find mutants adapted to unnatural extreme environments, such as alkaline solutions with a pH higher than 13.
Here, we employed an LLM (large language model) developed by our group, the Pro-PRIME model[27], to predict dozens of mutants of a nano-antibody against growth hormone (a VHH antibody), and examined their fitness, including alkali resistance and thermostability, to evaluate their performance under extreme environments.
We utilized the Pro-PRIME model to score saturated single-point mutations of the VHH in a zero-shot setting, and selected the top 45 mutants for experimental testing. Some mutants exhibited improved alkali resistance, while others demonstrated higher thermal stability or affinity. Subsequently, we fine-tuned the Pro-PRIME model to predict dozens of multi-point mutations. As a result, we obtained three multi-point mutants with enhanced alkali resistance, higher thermostability, and strong affinity to the target protein. In addition, the dynamic binding capacity of the selected mutant did not decline significantly after more than 100 cycles, making it suitable for practical application in industrial production. The selected mutant has been used in practical production and has lowered costs by more than one million dollars per year. To the best of our knowledge, this is the first protein product developed by an LLM that has been successfully applied in mass production. Because the Pro-PRIME model can make precise predictions of multi-point mutations from a small amount of experimental data, our two-round design process required experimental validation of only 65 mutants in two months, demonstrating remarkably high efficiency. Furthermore, we performed a systematic analysis of these findings and determined that the model can yield more valuable predictive outcomes while remaining consistent with rational design principles. Specifically, within the framework of multi-point combinations, the model’s incorporation of negative single-point mutations into the combinatorial space led to exceptional results, showcasing its capacity to capture epistatic interactions. Notably, in striving for the global optimum, deep learning methods offer distinct advantages over traditional rational design approaches.”
(3) The authors did not develop a new model but instead used their previously developed Pro-PRIME model. This significantly weakens the novelty and contribution of this work.
While it is true that the Pro-PRIME model was previously developed, the contribution of this work lies in its novel application to the design of proteins with properties that are absent or rare in nature. In our original work, the Pro-PRIME model was used to optimize proteins for existing, well-established properties, such as thermal stability, enzymatic activity, and affinity. In this study, however, we extended the model’s capabilities to design proteins that are resilient to extreme environments, such as high pH, a property that is not inherently present in most natural proteins. To our knowledge, no existing model has addressed the challenge of engineering alkali-resistant proteins, nor is a relevant dataset available for training such models.
This shift from optimizing existing characteristics to engineering entirely new properties represents a significant step forward in the field of protein design. By focusing on the design of proteins that can survive and function in harsh, unnatural environments, we have demonstrated the broader applicability of the Pro-PRIME model beyond its initial scope. This expansion of the model's application is a novel contribution that has the potential to accelerate the development of proteins for industrial, agricultural, and biotechnological applications.
Thus, while the Pro-PRIME model itself is not new, its application to the new challenge of engineering proteins with alkali resistance and other novel properties significantly enhances the impact and novelty of this work. Moreover, this work is groundbreaking not only in terms of the model’s novel application but also because no previous studies have specifically targeted alkali resistance or provided data for training models on such extreme properties. Therefore, our approach is unique, marking a new direction in protein engineering.
We have made the following revisions to the conclusions section of the manuscript:
“Through two rounds of evolution, we successfully designed a VHH antibody with strong resistance to extreme environments and enhanced affinity using the Pro-PRIME model. Although few proteins in our pre-training dataset can tolerate extreme pH and saline conditions, the Pro-PRIME model showed impressive performance after supervised learning with limited data, especially in capturing epistatic effects. The analysis of these 65 mutants revealed that the Pro-PRIME model is adept at exploring the large space of protein fitness, is less susceptible to local optima, and has greater potential to find the global optimum. Our efficient method for designing mutants with improvements in multiple properties holds promise for the industrial application of proteins. Specifically, the VHH antibody has been deployed in practical production and has significantly enhanced the efficiency of the entire production line. While the Pro-PRIME model itself has been reported, this work demonstrates its first application to the challenge of designing proteins with alkali resistance and other extreme properties that are not found in natural proteins; previous studies have neither addressed such applications nor provided data for them. This shift from optimizing existing protein properties to engineering entirely new, unnatural traits is a significant advance in the field. This study shows that AI models such as Pro-PRIME can not only guide the evolution of protein thermal stability, enzymatic activity, ligand affinity, etc., but also enable the development of mutants adapted to harsh unnatural environments, such as extreme pH and concentrated salt, greatly expanding their application. The novelty of this work lies in the ability to design and engineer proteins with novel properties, specifically alkali resistance, which is an unprecedented achievement in AI-assisted protein engineering.
The great potential of AI models is expected to significantly accelerate the development of proteins for diverse applications in medicine, agriculture, bioengineering, etc.”
(4) The computational experiments are not well-justified. For instance, the authors used a zero-shot setting for single-point mutation experiments but opted for fine-tuning in multiple-point mutation experiments. There is no clear explanation for this discrepancy. How does the model perform in zero-shot settings for multiple-point mutations? How would fine-tuning affect single-point mutation results? The choice of these strategies seems arbitrary and lacks sufficient discussion.
We appreciate the reviewer’s comment regarding the use of zero-shot and fine-tuning settings for single-point and multi-point mutation experiments, and we are grateful for the opportunity to further clarify this aspect of our work.
In the first round of design, we used the zero-shot approach for single-point mutations because the number of possible single-point mutations is limited, and no prior experimental data was available. In the absence of relevant data, the zero-shot approach allows the model to make predictions based on the learned sequence patterns from the pre-trained protein language model. Given that single-point mutations are relatively fewer in number and computationally feasible to evaluate, the zero-shot approach was deemed appropriate for this task.
However, when it comes to designing multi-point mutants, the number of potential combinations increases exponentially, making it computationally impractical to explore all possible mutations in a reasonable timeframe. Furthermore, since we had already obtained some experimental data for single-point mutations in the first round, we fine-tuned the model with this data in the second round to improve the accuracy of predictions for multi-point mutants. Fine-tuning helps the model better capture the specific features that contribute to protein functionality, which are critical when dealing with multi-point mutations where multiple residues interact. This allows the model to produce more reliable and targeted predictions for multi-point mutants, ultimately leading to better design outcomes.
Regarding the model's performance in zero-shot settings for multi-point mutations, we tested this approach, and the results did not align well with the experimental data for multi-point mutants. Specifically, the Spearman correlation coefficient between the zero-shot predictions and experimental results was -0.71, indicating that zero-shot predictions for multi-point mutations were not as accurate as those from the fine-tuned model.
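For readers unfamiliar with the metric, the Spearman coefficient cited above is the Pearson correlation of the ranks of the two variables. The following is a minimal, dependency-free sketch; the helper names and the toy values are hypothetical and are not the authors' data. A value near -1 means the model's ranking is close to the reverse of the measured ranking.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical model scores and measured fitness for five mutants.
toy_scores = [0.9, 0.7, 0.5, 0.3, 0.1]
toy_measured = [1.0, 2.0, 3.0, 4.0, 5.0]
print(spearman(toy_scores, toy_measured))
```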
In summary, the choice of using zero-shot for single-point mutations and fine-tuning for multi-point mutations was driven by the nature of the task and the availability of experimental data. Fine-tuning the model improves its predictive performance, especially for more complex multi-point mutation tasks. We have now clarified these choices in the manuscript and have added further discussion on the trade-offs between zero-shot and fine-tuning approaches.
Specifically, we have added the following discussion to Section 2.2 of the main text:
“Note that we employed different strategies for designing single-point and multi-point mutants, specifically using a zero-shot approach for single-point mutations and fine-tuning the model for multi-point mutations. These choices were made based on the distinct characteristics of the two tasks and the availability of experimental data. For single-point mutations, the number of possible mutations is relatively limited, and at the outset, there were no experimental data available. In such cases, the zero-shot setting was chosen because it allows the model to predict the fitness of mutants based solely on the information learned during pre-training on a large protein sequence dataset. Since single-point mutations are computationally manageable, this approach was deemed appropriate to generate initial predictions for protein engineering. However, when designing multi-point mutants, the situation changes significantly. The potential combinations of mutations increase exponentially, and without prior data, it becomes computationally infeasible to evaluate every possible combination within a reasonable timeframe. Moreover, by the time we reached the multi-point mutation design stage, experimental data for several single-point mutations had already been obtained. This data enabled us to fine-tune the model to better capture the specific structural and functional features that contribute to protein stability and resistance, especially in the context of multiple interacting mutations. Fine-tuning improves the model’s accuracy by adjusting its parameters to align more closely with the experimental data, ensuring that the predicted multi-point mutants are more likely to meet the desired engineering goals. After the second round of design, the fitness of the mutants was further improved. 
For alkali resistance, experimental results showed that 15 of the 45 designed mutants exhibited positive responses, a success rate of approximately 33%, close to the 35% success rate achieved in the second round. Compared to the wild type, the best single-point mutant improved alkali resistance by approximately 44.7%, while the best multi-point mutant achieved a 67.7% increase. For thermal stability enhancement, the success rate in the first round was 77.8%, rising to 100% in the second round. The top single-point mutant exhibited a Tm increase of 6.37°C over the wild type, while the best multi-point mutant had a Tm increase of 10.02°C. We also tested the performance of the zero-shot approach for multi-point mutants, and the results showed that this method did not yield satisfactory predictions. The Spearman correlation coefficient between the zero-shot predictions and experimental results for multi-point mutants was -0.71, indicating a significant discrepancy. This further highlights the importance of fine-tuning the model for multi-point mutations, as the fine-tuned model provided more accurate and reliable results. In summary, the choice of zero-shot for single-point mutations and fine-tuning for multi-point mutations was driven by practical considerations regarding computational feasibility and the availability of experimental data. Fine-tuning the model significantly enhances its predictive performance, particularly for complex multi-point mutations where multiple residues interact. We believe this strategy strikes an optimal balance between computational efficiency and predictive accuracy, making it well-suited for practical protein engineering applications.”
Author response:
We would like to thank the reviewers and the editors for carefully reading and commenting on our manuscript; we plan to prepare a revised manuscript. In particular, we thank reviewer 2 for spotting a major oversight regarding the use of the TKO (TRiP-CRISPR knockout) and TOE (TRiP-CRISPR Over Expression) systems and the MiMIC alleles. As the reviewer pointed out, these lines were not used as intended; therefore, our results and conclusions regarding the genetic interactions between Pink1 and several of the genes that we attempted to target (PIG-A, Rab7, Ccz1, CG10646, Mon1, FASN2, CG17712) are incorrect and based on a technical mistake. These results need to be removed from the manuscript.
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
The authors assess the effectiveness of electroporating mRNA into male germ cells to rescue the expression of proteins required for spermatogenesis progression in individuals where these proteins are mutated or depleted. To set up the methodology, they first evaluated the expression of reporter proteins in wild-type mice, which showed expression in germ cells for over two weeks. Then, they attempted to recover fertility in a model of late spermatogenesis arrest that produces immotile sperm. By electroporating the mutated protein, the authors recovered the motility of ~5% of the sperm, although the sperm regenerated was not able to produce offspring using IVF.
We did not actually write that “sperm regenerated was not able to produce offspring using IVF” but rather that IVF was not attempted because the number of rescued sperm was too low. To address this important point, we tested the ability of the rescued sperm to produce embryos using two assisted reproduction technologies, IVF and ICSI. To increase the number of motile sperm for the IVF experiments, we injected both testes of the same male. We also conducted intracytoplasmic sperm injection (ICSI) experiments using only rescued sperm, identified as motile sperm with a normal flagellum. These new experiments demonstrated that the rescued ARMC2 sperm successfully fertilized eggs and produced two-cell embryos by IVF and blastocysts by ICSI. These outcomes are presented in Figure 12.
This is a comprehensive evaluation of the mRNA methodology with multiple strengths. First, the authors show that naked synthetic RNA, purchased from a commercial source or generated in the laboratory with simple methods, is enough to express exogenous proteins in testicular germ cells. The authors compared RNA to DNA electroporation and found that germ cells are efficiently electroporated with RNA, but not DNA. The differences between these constructs were evaluated using in vivo imaging to track the reporter signal in individual animals through time. To understand how the reporter proteins affect the results of the experiments, the authors used different reporters: two fluorescent (eGFP and mCherry) and one bioluminescent (Luciferase). Although they observed differences among reporters, in every case expression lasted for at least two weeks.
The authors used a relevant system to study the therapeutic potential of RNA electroporation. The ARMC2-deficient animals have impaired sperm motility phenotype that affects only the later stages of spermatogenesis. The authors showed that sperm motility was recovered to ~5%, which is remarkable due to the small fraction of germ cells electroporated with RNA with the current protocol. The 3D reconstruction of an electroporated testis using state-of-the-art methods to show the electroporated regions is compelling.
The main weakness of the manuscript is that although the authors manage to recover motility in a small fraction of the sperm population, it is unclear whether the increased sperm quality is substantial to improve assisted reproduction outcomes. The quality of the sperm was not systematically evaluated in the manuscript, with the endpoints being sperm morphology and sperm mobility.
We would like to thank the reviewers for their comments. As stated above, we performed additional rescue experiments and carried out CASA, morphological observation, IVF and ICSI with the rescued sperm. The rescued ARMC2 sperm exhibited normal morphology (new figure 11 and Supp Fig 8), motility (figure 11), and fecundity (figure 12). Whereas sperm from untreated KO males were unable to fertilize eggs by IVF, the rescued sperm fertilized eggs in vitro at a significant rate (mean 62%, n=5), demonstrating that our strategy improves sperm quality and the assisted reproduction outcome (from 0 to 62%).
Some key results, such as the 3D reconstruction of the testis and the recovery of sperm motility, are qualitative given the low replicate numbers or the small magnitude of the effects. The presentation of the sperm motility data could have been clearer as well. For example, on day 21 after Armc2-mRNA electroporation, only one animal out of the three tested showed increased sperm motility. However, it is unclear from Figure 11A what the percentage of sperm motility for this animal is since the graph shows a value of >5% and the reported aggregate motility is 4.5%. It would have been helpful to show all individual data points in Figure 11A.
Figure 11A now provides a scatter dot plot showing the percentage of rescued sperm for every animal. Moreover, we performed additional CASA experiments to analyze sperm motility in detail (Figure 11A2-A3). Individual CASA parameters for motile sperm cells were extracted, as requested by reviewer 3, and are shown in a new graph (Fig 11 A2).
The expression of the reporter genes is unambiguous; however, better figures could have been presented to show cell type specificity. The DAPI staining is diffused, and it is challenging to understand where the basement membranes of the tubules are. For example, in Figures 7B3 and 7E3, the spermatogonia seems to be in the middle of the seminiferous tubule. The imaging was better for Figure 8. Suboptimal staining appears to lead to mislabeling of some germ cell populations. For example, in Supplementary Figure 4A3, the round spermatid label appears to be labeling spermatocytes. Also, in some instances, the authors seem to be confusing, elongating spermatids with spermatozoa, such as in the case of Supplementary Figures 4D3 and D4.
Thank you for the comments. Some spermatogenic cells were indeed mislabeled, as you mentioned, and we have corrected the labeling accordingly. We also changed “spermatozoa” to “mature spermatids”. The sentence now reads: “At the cellular level, fluorescence was detectable in germ cells (B1-B3) including Spermatogonia (Sg), Spermatocytes (Scytes), round Spermatids (RStids), mature spermatids (m-Sptids) and Sertoli cells (SC)”. Moreover, to indicate the location of the basement membrane, we have also labelled myoid cells.
The characterization of Armc2 expression could have been improved as well. The authors show a convincing expression of ARMC2 in a few spermatids/sperm using a combination of an anti-ARMC2 antibody and tubules derived from ARMC2 KO animals. At the minimum, one would have liked to see at least one whole tubule of a relevant stage.
Thank you for the remark.
We now present new images showing transverse sections of seminiferous tubules, as requested (see Supp Fig 6). In this new figure, it is clear that Armc2 is expressed only in spermatids. We have also added to this figure an analysis of the RNA-seq database produced by Gan's team (Gan, Wen et al. 2013), confirming that Armc2 is predominantly expressed at the elongated spermatid stage. This point is now clearly indicated in the text.
Overall, the authors show that electroporating mRNA can improve spermatogenesis as demonstrated by the generation of motile sperm in the ARMC2 KO mouse model.
Thank you
Reviewer #2 (Public Review):
Summary:
Here, the authors inject naked mRNAs and plasmids into the rete testes of mice to express exogenous proteins - GFP and later ARMC2. This approach has been taken before, as noted in the Discussion to rescue Dmc1 KO infertility. While the concept is exciting, multiple concerns reduce reviewer enthusiasm.
Strengths:
The approach, while not necessarily novel, is timely and interesting.
Weaknesses:
Overall, the writing and text can be improved and standardized - as an example, in some places in vivo is italicized, in others it's not; gene names are italicized in some places, others not; some places have spaces between a number and the units, others not. This lack of attention to detail in the preparation of the manuscript is a significant concern to this reviewer - the presentation of the experimental details does cast some reasonable concern with how the experiments might have been done. While this may be unfair, it is all the reviewers have to judge. Multiple typographical and grammatical errors are present, as are vague or misleading statements.
Thanks for the comment, we have revised the whole manuscript to remove all the mistakes. We have also added new experiments/figures to strengthen the message. Finally, we have substantially modified the discussion.
Reviewer #3 (Public Review):
Summary:
The authors used a novel technique to treat male infertility. In a proof-of-concept study, the authors were able to rescue the phenotype of a knockout mouse model with immotile sperm using this technique. This could also be a promising treatment option for infertile men.
Strengths:
In their proof-of-concept study, the authors were able to show that the novel technique rescues the infertility phenotype in vivo.
Weaknesses:
Some minor weaknesses, especially in the discussion section, could be addressed to further improve the quality of the manuscript.
We have substantially modified the discussion, following the remarks of the reviewers.
It is very convincing that the phenotype of Armc2 KO mice could (at least in part) be rescued by injection of Armc2 RNA. However, a central question remains about which testicular cell types have been targeted by the constructs. From the pictures presented in Figures 7 and 8, this issue is hard to assess. Given the more punctate staining of the DNA construct, a targeting of Sertoli cells is more likely, whereas the broader staining of seminiferous tubules using RNA constructs points toward germ cells. Further, the staining for up to 119 days (Figure 5) would point toward an integration of the DNA construct into the genome of early germ cells such as spermatogonia and/or possibly of Sertoli cells.
Thanks for the comment. We would like to recall the peculiar properties of the non-insertional Enhanced Episomes Vector (EEV) plasmid, which is a non-viral episome based on the Epstein-Barr virus (EBV). It allows the plasmid to persist for long periods of time without integration. Its maintenance within the cell is made possible by its ability to replicate in a synchronous manner with the host genome and to segregate into daughter cells. This is because EEV is composed of two distinct elements derived from EBV: an origin of replication (oriP) and an Epstein-Barr Nuclear Antigen 1 (EBNA1) expression cassette (Gil, Gallaher, and Berk, 2010). The oriP is a locus comprising two EBNA1-binding domains, designated as the Family of Repeats (FR) and Dyad Symmetry (DS). The FR is an array of approximately 20 EBNA1-binding sites (20 repeats of 30 bp) with high affinity, while the DS comprises four lower-affinity sites operating in tandem (Ehrhardt et al., 2008).
The 641-amino-acid EBNA1 protein contains numerous domains. The N-terminal domains are rich in glycines and alanines, which enable interaction with host chromosomes. The C-terminal region is responsible for binding to oriP (Hodin, Najrana, and Yates, 2013). The binding of EBNA1 to the DS element results in the recruitment of the origin of replication. This results in the synchronous initiation of extra-chromosomal EEV replication with host DNA at each S phase of the cell cycle (Düzgüneş, Cheung, and Konopka 2018). Furthermore, EBNA1 binding to the FR domain induces the formation of a bridge between metaphase chromosomes and the vector during mitosis. This binding is responsible for the segregation of the EEV episome in daughter cells (Düzgüneş, Cheung, and Konopka 2018). It is notable that EEV is maintained at a rate of 90-95% per cell division.
Because of the intrinsic properties of EEV described above, the presence of the reporter protein at 119 days after injection was likely due to the maintenance of the plasmid, mostly in Sertoli cells, and not to the integration of the plasmid DNA.
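The per-division maintenance figure quoted above can be turned into a rough expectation of how quickly an episome is lost from a dividing lineage. The following is only an illustrative sketch, not part of the reported analysis: it assumes a constant and independent retention probability at every division (a simplification), and the function name and the chosen numbers of divisions are hypothetical.

```python
def fraction_retaining_episome(retention_per_division: float, divisions: int) -> float:
    """Expected fraction of descendant cells that still carry the episome.

    Assumes (as a simplification) that retention is constant and independent
    at each division, so the retained fraction is retention ** divisions.
    """
    if not 0.0 <= retention_per_division <= 1.0:
        raise ValueError("retention_per_division must lie between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    return retention_per_division ** divisions


if __name__ == "__main__":
    # 0.90 and 0.95 are the bounds quoted in the text; the division counts
    # below are arbitrary illustrative values.
    for divisions in (0, 1, 5, 10, 20):
        low = fraction_retaining_episome(0.90, divisions)
        high = fraction_retaining_episome(0.95, divisions)
        print(f"{divisions:2d} divisions: {low:.2f} to {high:.2f} of cells retain the episome")
```

Under this simple model, a cell population that rarely divides loses the episome slowly, whereas a lineage that divides many times retains it in a progressively smaller fraction of its descendants.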
Of note, the specificity of EEV was already indicated in the introduction (lines 124-128 clean copy). Nevertheless, we have added more information about EEV to help the readers.
Given the expression after RNA transfection for up to 21 days (Figure 4) and the detection of motile sperm after 21 days (Figure 11), this would point to either round spermatids or spermatocytes. These aspects need to be discussed more carefully (discussion section: lines 549-574).
We added a sentence to highlight that spermatids are transfected and that the protein is synthesized at this stage. This question is discussed in detail (see lines 677-684 clean copy).
It would also be very interesting to know in which testicular cell type Armc2 is endogenously expressed (lines 575-591)
Thanks for the remarks. We now present new images showing the full seminiferous tubules as requested by reviewer 1 (see supp fig 6). In this new figure, it is clear that Armc2 is only expressed in spermatids. We have also added to this figure an analysis of the RNA-seq database produced by Gan's team (Gan, Wen et al. 2013), confirming that Armc2 is predominantly expressed at the elongated spermatid stage. This point is now clearly indicated in the text (lines 570-579 clean copy).
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
The article is well-structured and easy to read. Nonetheless, there are typos and mistakes in some places that are distracting to the reader, such as the capitalization of the word "Oligo-" in the title of the manuscript, the use of the word "Materiel" in the title of the Materials and methods and the presence of space holders "Schorr staining was obtained from Merck (XXX)".
Thank you, we corrected the misspelling of "Materials and Methods" and corrected our error: "obtained from Merck (Darmstadt, Germany)". We also carefully corrected the manuscript to remove typos and mistakes.
The discussion is too lengthy, with much repetition regarding the methods used and the results obtained. For example, these are two sentences from the discussion. "The vector was injected via the rete testis into the adult Armc2 KO mice. The testes were then electroporated." I would recommend shortening these passages.
Thanks for your comments, we removed the sentences and we have substantially modified the discussion, following the remarks of the reviewers.
The work is extensive, and many experiments have been done to prove the points made. However, a more in-depth analysis of critical experiments would have benefited the manuscript significantly. A more thorough analysis of sperm mobility and morphology using the CASA system would have been an initial step.
In response to the observations made, additional CASA experiments and sperm motility analysis were conducted, as illustrated in Figure 11 (A2-A3). Individual CASA parameters for motile sperm cells were extracted as suggested and represented in a new graph (Fig 11 A2). We have observed significant differences between WT and rescued sperm. In particular, the VSL and LIN parameters were lower for rescued sperm. Nevertheless, these differences were not sufficient to prevent IVF, maybe because the curvilinear velocity (VCL) was not modified.
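For readers less familiar with the CASA parameters named above, the sketch below shows how VCL, VSL and LIN are conventionally derived from a tracked sperm head trajectory: VCL is the point-to-point path length per unit time, VSL is the first-to-last straight-line distance per unit time, and LIN is the ratio VSL/VCL. This is a minimal illustration only; the function name, the example tracks and the frame interval are hypothetical, and commercial CASA systems apply additional smoothing and filtering that is not reproduced here.

```python
import math


def casa_kinematics(track, frame_interval_s):
    """Compute VCL, VSL and LIN from a track of (x, y) head positions.

    track: sequence of (x, y) positions in micrometres, one per frame.
    frame_interval_s: time between consecutive frames, in seconds.
    Returns a dict with VCL and VSL in micrometres per second and LIN as a ratio.
    """
    if len(track) < 2:
        raise ValueError("a track needs at least two positions")
    duration_s = frame_interval_s * (len(track) - 1)
    # Curvilinear path: sum of the frame-to-frame displacements.
    path_length = sum(math.dist(a, b) for a, b in zip(track, track[1:]))
    # Straight-line path: distance between first and last positions.
    straight_length = math.dist(track[0], track[-1])
    vcl = path_length / duration_s
    vsl = straight_length / duration_s
    lin = vsl / vcl if vcl > 0 else 0.0
    return {"VCL": vcl, "VSL": vsl, "LIN": lin}


if __name__ == "__main__":
    # Hypothetical tracks: a zigzag path and a straight path of equal net progress.
    zigzag = [(0, 0), (3, 4), (6, 0), (9, 4), (12, 0)]
    straight = [(0, 0), (3, 0), (6, 0), (9, 0), (12, 0)]
    for name, track in (("zigzag", zigzag), ("straight", straight)):
        print(name, casa_kinematics(track, frame_interval_s=0.1))
```

Because LIN is defined as VSL/VCL, a lower LIN at an unchanged VCL implies a lower VSL, which is the pattern described for the rescued sperm.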
In the case of ARMC2 localization, an analysis of the different stages of spermatogenesis to show when ARMC2 starts to be expressed.
Thanks for the remarks. This is an important point raised by all reviewers. As explained above, we have performed more experiments. We now present new images showing transverse sections of seminiferous tubules as requested (see supp fig 6). In this new figure, it is clear that Armc2 is only expressed in spermatid layers. We have also added to this figure an analysis of the RNA-seq database produced by Gan's team (Gan, Wen et al. 2013), confirming that Armc2 is predominantly expressed at the elongated spermatid stage. This point is now clearly indicated in the text (lines 575-579 clean copy).
Finally, exploring additional endpoints to understand the quality of the sperm generated, such as the efficiency of ICSI or sperm damage, could have helped understand the degree of the recovery.
This point was raised in the public review. We paste our answer here: “To address this important point, the ability of sperm to produce embryos was challenged by two different assisted reproduction technologies, namely IVF and ICSI. To increase the number of motile sperm for IVF experiments, we have injected both testes from one male. We also conducted intracytoplasmic sperm injection (ICSI) experiments, using only rescued sperm, identified as motile sperm with a normal flagellum. The results of these new experiments have demonstrated that the rescued ARMC2 sperm successfully fertilized eggs and produced embryos at the two-cell stage by IVF and blastocysts by ICSI. These outcomes are presented in Figure 12.”
Reviewer #2 (Recommendations For The Authors):
38,74 intracellular
Thanks, we changed it accordingly: "Intracytoplasmic sperm injection (ICSI) is required to treat such a condition, but it has limited efficacy and has been associated with a small increase in birth defects" and "such as intracytoplasmic sperm injection (ICSI)".
39 "limited efficacy" Versus what? And for what reason? "small increase in birth defects" - compared to what?
We changed to “… but it is associated with a small increase in birth defects in comparison with pregnancies not involving assisted conception.”
40 Just thinking through the logic of the argument thus far - the authors lay out that there are people with OAT (true), ICSI must be used (true), ICSI is bad (not convincing), and therefore a new strategy is needed... so is this an alternative to ICSI? And this is to restore fertility, not "restore spermatogenesis" - because ICSI doesn't restore spermatogenesis. This logic flow needs to be cleaned up some.
Thanks, we changed it accordingly: “restore fertility.”
45 "mostly"?
Thank you, we removed the word: “We show that mRNA-coded reporter proteins are detected for up to 3 weeks in germ cells, making the use of mRNA possible to treat infertility.”
65 Reference missing.
We added the following reference Kumar, N. and A. K. Singh (2015). "Trends of male factor infertility, an important cause of infertility: A review of literature." J Hum Reprod Sci 8(4): 191-196.
68 Would argue meiosis is not a reduction of the number of chromosomes - that happens at the ends of meiosis I and II - but the bulk of meiosis is doubling DNA and recombination; would re-word; replace "differentiation" with morphogenesis, which is much more commonly used:
Thank you, we have changed the sentence accordingly: "proliferation (mitosis of spermatogonia), reduction of the number of chromosomes (meiosis of spermatocytes), and morphogenesis of sperm (spermiogenesis)".
70 "almost exclusively" is an odd term, and a bit of an oxymoron - if not exclusively, then where else are they expressed? Can you provide some sense of scale rather than using vague words like "large", "almost", "several", "strongly" and "most...likely" - need some support for these claims by being more specific:
Thanks for the comment, we changed the sentence: "The whole process involves around two thousand genes, 60% of which are expressed exclusively in the testes."
73 "severe infertility" is redundant - if they are infertile, is there really any more or less about it? I think what is meant is patients with immotile sperm can be helped by ICSI - so just be more specific...
We changed the transition: “Among infertility disorders, oligo-astheno-teratozoospermia (OAT) is the most frequent (50 %; Thonneau, Marchand et al. 1991); it is likely to be of genetic origin. Spermatocytograms of OAT patients show a decrease in sperm concentration, multiple morphological defects and defective motility. Because of these combined defects, patients are infertile and can only conceive by IntraCytoplasmic Sperm Injection (ICSI). ICSI can efficiently overcome the problems faced. However, there are …”
75 "some" is vague - how many concerns, and who has them? Be specific!
Thanks for the comment, we removed the word.
76-7 Again, be specific - "real" has little meaning - what is the increased risk, in % or fold? This is likely a controversial point, so make sure you absolutely support your contention with data.
77 "these"? There was only one concern listed - increased birth defects; and "a number" is vague - what number, 1 or 1,000,000? A few (2-3), dozens, hundreds?
Thanks for the comment, we have reworded the sentence: “Nevertheless, concerns persist regarding the potential risks associated with this technique, including blastogenesis defect, cardiovascular defect, gastrointestinal defect, musculoskeletal defect, orofacial defect, leukemia, central nervous system tumors, and solid tumors. Statistical analyses of birth records have demonstrated an elevated risk of birth defects, with a 30–40% increased likelihood in cases involving ICSI, and a prevalence of birth defects between 1% and 4%.” We have added a list of references to support these claims.
79-81 So, basically transgenesis? Again, vague terms "widely" - I don't think it's all that widely used yet... and references are missing to support the statement that integration of DNA into patient genomes is widely used. Give specific numbers, and provide a reference to support the contention.
Thanks for the comment, we removed the word "widely" and added references.
81-5 Just finished talking about humans, but now it appears the authors have switched to talking about mice - got to let the readers know that! Unless you're talking about the Chinese group that deleted CCR5 in making transgenic humans?
Your feedback is greatly appreciated. In response to your comments, the sentence in question has been amended to make it clearer. Indeed, the text refers to experiments carried out in mice. The revised wording is as follows: “Given the genetic basis of male infertility, the first strategy, tested in mice, was to overcome spermatogenic failure associated with monogenic diseases by delivery of an intact gene to deficient germ cells (Usmani, Ganguli et al. 2013).”
84-5 "efficiently" and "high" - provide context so the reader can understand what is meant - do the authors mean the experiments work efficiently, or that a high percentage of cells are transfected? And give some numbers or range of numbers - you're asking the readers to take your word for things when you choose adjectives - instead, provide values and let the readers decide for themselves.
Thanks for the comment, we have reworded the sentence: “Gene therapy is effective in germ cells, as numerous publications have shown that conventional plasmids can be transferred into spermatogonia in several species with success, allowing their transcription in all cells of the germinal lineage (Usmani, Ganguli et al. 2013, Michaelis, Sobczak et al. 2014, Raina, Kumar et al. 2015, Wang, Liu et al. 2022).”
93 Reference at the end of the sentence "most countries"
Thanks, we changed the sentence and added the reference. The new sentence is: “… to avoid any eugenic deviations, transmissible changes in humans are illegal in 39 countries (Liu 2020)” (Liu, S. (2020). "Legal reflections on the case of genome-edited babies." Glob Health Res Policy 5: 24).
93-4 Odd to say "multiple" and then list only one.
Thanks for the comment, we have reworded the sentence: “Furthermore, the genetic modification of germ cell lines poses biological risks, including the induction of cancer, off-target effects, and cell mosaicism. Errors in editing may have adverse effects on future generations. It is exceedingly challenging to anticipate the consequences of genetic mosaicism, for instance, in a single individual. (Sadelain, Papapetrou et al. 2011, Ishii 2017).”
97 Is this really a "small" change? Again, would use adjectives carefully - to this reviewer, this is not a small change, but a significant one! And "should be" is not altogether convincing
Thanks for the comment, we have reworded the sentence: “Thanks to this change, the risk of genomic insertion is avoided, and thus there is no question of heritable alterations.”
What chance is there of retrotransposition? Is there any data in the literature for that, after injecting millions of copies of RNA one or more might be reverse transcribed and inserted into the genome?
This is certainly possible and is the putative origin for multiple intronless spermatid-expressed genes:
The reviewer poses an interesting question, but one that unfortunately remains unanswered at present. Most papers on mRNA therapy state that there is no risk of genomic integration, but no reference is given (for instance, see mRNA-based therapeutics: looking beyond COVID-19 vaccines. Lancet. 2024 doi: 10.1016/S0140-6736(23)02444-3). This is an important question that deserves to be evaluated, but it is beyond the scope of this manuscript. Nevertheless, the issue remains much debated (Igyarto and Qin 2024).
98 Odd to say "should be no risk" and then conclude with "there is no question" - so start the sentence with 'hedging', and then end with certainty - got to pick one or the other.
Thanks for the comment, we have reworded the sentence.
99 "Complete" - probably not, would delete:
We removed the word: “The first part of this study presents a characterization of the protein expression patterns obtained following transfection of naked mRNA coding for reporter genes into the testes of mice”
101-2 Reference missing, as are numbers - what % of cases?
Thank you, we changed the sentence and added the reference: “Among infertility disorders, oligo-astheno-teratozoospermia (OAT) is the most frequent (50 %; Thonneau, Marchand et al. 1991)”. Thonneau, P., S. Marchand, A. Tallec, M. L. Ferial, B. Ducot, J. Lansac, P. Lopes, J. M. Tabaste and A. Spira (1991). "Incidence and main causes of infertility in a resident population (1,850,000) of three French regions (1988-1989)." Hum Reprod 6(6): 811-816.
103 Once again, the reference is missing:
We have added these references: (Colpi, Francavilla et al. 2018) (Cavallini 2006)
104-5 Awkward transition.
Thanks, we changed the transition: “The first part of this study presents a characterization of the protein expression patterns obtained following transfection of naked mRNA coding for reporter genes into the testes of mice. The second part is to apply the protocol to a preclinical mouse model of OAT.”
105 Backslash is odd - never seen it used in that way before
Removed
108 "completely infertile" is redundant;
Thank you, we changed it accordingly: “Patients and mice carrying mutations in the ARMC2 gene present a canonical OAT phenotype and are infertile”.
and is a KO mouse really "preclinical"?
Preclinical research is defined as research involving the use of animals to ascertain the potential efficacy of a drug, procedure, or treatment. Preclinical studies are conducted prior to any testing in humans. Our KO mouse model has been shown to mimic human infertility. Indeed, Armc2-/- mice exhibit a phenotype that is identical to that observed in humans. Our study is in line with this definition. For this reason, we have decided to maintain our current position and to use the term "preclinical" in the article.
110 Delete "sperm".
Thank you, we changed it accordingly: “The preclinical Armc2 deficient (Armc2 KO) mouse model is therefore a valuable model to assess whether in vivo injection of naked mRNA combined with electroporation can restore spermatogenesis”
111 "Easy"? Really?
We changed it accordingly: “We chose this model for several reasons: first, Armc2 KO mice are sterile and all sperm exhibit short, thick or coiled flagella [13].”
112-3 "completely immobile" is redundant - either they are immobile or not.
Thank you, we changed it accordingly: “As a result, 100 % of sperm are immobile, thus it should be easy to determine the efficacy of the technique by measuring sperm motility with a CASA system.”
108-33 Condense this lengthy text into a coherent few sentences to give readers a sense of what you sought to accomplish, broadly how it was done, and what you found. This reads more like a Results section
Thanks for the comment, we shortened the text.
Materials and Methods
The sections appear to have been written by different scientists - the authors should standardize so that similar detail and formatting are used - e.g., in some parts the source is in parentheses with catalog number, in others not, some have city, state, country, others do not... the authors should check eLife mandates for this type of information and provide.
We are grateful for your feedback. We have standardized the text. If we have missed any items, we can finish formatting the article according to the guidelines on the eLife website once it has been accepted for publication, before sending the VOR.
134 Misspelling
We corrected the misspelling
142 Just reference, don't need to spell it out.
Thanks, we changed it accordingly: “and the Armc2 KO mouse strain obtained by CRISPR-Cas9 (Coutton, Martinez et al. 2019). Experiments”
150 What is XXX?
We would like to express our gratitude for bringing this error to our attention. We have duly rectified the issue: “obtained from Merck (Darmstadt, Germany).”
157-60 Are enough details provided for readers to repeat this if necessary? Doesn't seem so to this reviewer; if kits were followed, then can say "using manufacturer's protocol", or refer to another manuscript - but this is too vague.
Thanks, we changed it accordingly: “After expansion, plasmids were purified with a NucleoBond Xtra Midi kit (740410-50; Macherey-Nagel, Düren, Germany) using the manufacturer's protocol.”
165 Again, too few details - how was it purified? What liquid was it in?
Thanks for the comment. The EEV plasmids were purified like all other plasmids. We changed the text: “All plasmids, EEV CAGs-GFP-T2A-Luciferase (EEV604A-2; System Bioscience, Palo Alto, CA, USA), mCherry plasmid (given by Dr. Conti MD at UCSF, San Francisco, CA, USA) and EEV-Armc2-GFP plasmid (CUSTOM-S017188-R2-3, Trilink, San Diego, USA) were amplified by bacterial transformation”
170 Seems some words are missing - and will everyone know Dr. Conti by last name alone? Would spell out, and the details of the plasmid must either be provided or a reference given; how was amplification done? Purification? What was it resuspended in?
Thanks for the remark. The mCherry plasmids were purified like all other plasmids. We changed the text: “All plasmids, EEV CAGs-GFP-T2A-Luciferase (EEV604A-2; System Bioscience, Palo Alto, CA, USA), mCherry plasmid (given by Dr. Conti MD, UCSF, San Francisco, CA, USA) and EEV-Armc2-GFP plasmid (CUSTOM-S017188-R2-3, Trilink, San Diego, USA) were amplified by bacterial transformation”
175 Again, for this plasmid provide more information - catalog number, reference, etc; how amplified and purified, what resuspension buffer?
Thank you for the remark. As mentioned above, we added this sentence on plasmid preparation: “All plasmids, EEV CAGs-GFP-T2A-Luciferase (EEV604A-2; System Bioscience, Palo Alto, CA, USA), mCherry plasmid (given by Dr. Conti MD at UCSF, San Francisco, CA, USA) and EEV-Armc2-GFP plasmid (CUSTOM-S017188-R2-3, Trilink, San Diego, USA) were amplified by bacterial transformation”, and we added this sentence: “The EEV-Armc2-GFP plasmid used for in vivo testes microinjection and electroporation was synthesized and customized by Trilink (CUSTOM-S017188-R2-3, San Diego, USA).”
183 What sequence, or isoform was used? Mouse or human?
Thanks, we changed it accordingly: “This non-integrative episome contains the mouse cDNA sequence of Armc2 (ENSMUST00000095729.11)”
186-7 Provide sequence or catalog number; what was it resolubilized in?
Thanks, we changed it accordingly: “the final plasmid concentration was adjusted to 9 μg μL⁻¹ in water.” We provided the sequence of EEV-Armc2-GFP in supp data 6.
207-219 Much better, this is how the entire section needs to be written!
237-240 Font
Thanks for the comment, we changed it accordingly
246 Cauda, and sperm, not sperm cells
Thanks for the comment, we changed it accordingly
255-6 Which was done first? Would indicate clearly.
Thanks for the comment, we changed the sentence: “Adult mice were euthanized by cervical dislocation and then transcardiac perfused with 1X PBS”
281-2 Provide source for software - company, location, etc:
We changed it accordingly: “FIJI software (open-source software) was used to process and analyze images, and Imaris software (Oxford Instruments, Tubney Woods, Abingdon, Oxon OX13 5QX, UK) was used for the 3D reconstructions.”
323 um, not uM.
Thanks for the comment, we corrected our mistake: “After filtration (100 µm filter)”
Results
369 Weighed.
Thanks for the comment, we corrected our mistake: “the testes were measured and weighed”
371 No difference in what, specifically?
Thanks for the comment, we changed the sentence to: “No statistical differences in length and weight were observed between control and treated testes”
375 "was respected"? What does this mean?
Thanks for the comment, we changed the sentence to “The layered structure of germ cells was identical in all conditions”
378 This is highly unlikely to be true, as even epididymal sperm from WT animals are often defective - the authors are saying there were ZERO morphological defects? Or that there was no difference between control and treated? Only showing 2-3 sperm for control vs treatment is not sufficient.
You are right that epididymal spermatozoa from wild-type animals often exhibit defective morphology. The prevalence of these defects varies by strain, with an average incidence of 20% to 40% (Kawai, Hata et al., 2006; Fan, Liu et al., 2015). To provide a more comprehensive representation, we conducted a Harris-Shorr staining procedure and included a histogram of the percentage of normal sperm in each condition (new figure 2F4). Harris-Shorr staining of the epididymal sperm cells revealed no discernible increase in morphological defects when mRNA and EEV were used, in comparison with the control. We added the sentence “At last, Harris-Shorr staining of the epididymal sperm cells demonstrated that there were no increases in morphological defects when mRNA and EEV were used in comparison with the control”.
379 "safe" is not the right word - better to say "did not perturb spermatogenesis".
Thanks, we changed it accordingly: “these results suggest that in vivo microinjection and electroporation of EEV or mRNA did not perturb spermatogenesis”
382-3 This sentence needs attention, doesn't make sense as written:
Thanks for the remark, we changed the sentence to: “No testicular lesions were observed on the testes at any post injection time”
389 How long after injection?
Thanks for the comment, we changed the sentence to: “It is worth noting that both vectors induced GFP expression at one day post-injection”
390 Given the duration of mouse spermatogenesis (~35 days), for GFP to persist past that time suggests that it was maintained in SSCs? How can the authors explain how such a strong signal was maintained after such a long period of time? How stable are the episomally-maintained plasmids, are they maintained 100% for months? And if they are inherited by progeny of SSCs, shouldn't they be successively diluted over time? And if they are inherited by daughter cells such that they would still be expressed 49 days after injection, shouldn't all the cells originating from that SSC also be positive, instead of what appear to be small subsets as shown in Fig. 3H2? Overall, this reviewer is struggling to understand how a plasmid would be inherited and passed through spermatogenesis in the manner seen in these results.
Thanks for the comment.
This point was already underlined in public review. We paste here our answer: “The non-insertional Enhanced Episomes Vector (EEV) plasmid is a non-viral episome based on the Epstein-Barr virus (EBV: Epstein-Barr Virus). Its maintenance within the cell is made possible by its ability to replicate in a synchronous manner with the host genome and to segregate into daughter cells. This is due to the fact that EEV is composed of two distinct elements derived from EBV: an origin of replication (oriP) and an Epstein-Barr Nuclear Antigen 1 (EBNA1) expression cassette (Gil, Gallaher, and Berk, 2010). The oriP is a locus comprising two EBNA1-binding domains, designated as the Family of Repeats (FR) and Dyad Symmetry (DS). The FR is an array of approximately 20 EBNA1-binding sites (20 repeats of 30 bp) with high affinity, while the DS comprises four lower-affinity sites operating in tandem (Ehrhardt et al., 2008).
The 641-amino-acid EBNA1 protein contains numerous domains. The N-terminal domains are rich in glycines and alanines, which enable interaction with host chromosomes. The C-terminal region is responsible for binding to oriP (Hodin, Najrana, and Yates, 2013a). The binding of EBNA1 to the DS element results in the recruitment of the origin of replication. This results in the synchronous initiation of extra-chromosomal EEV replication with host DNA at each S phase of the cell cycle (Düzgüneş, Cheung, and Konopka 2018a). Furthermore, EBNA1 binding to the FR domain induces the formation of a bridge between metaphase chromosomes and the vector during mitosis. This binding is responsible for the segregation of the EEV episome in daughter cells (Düzgüneş, Cheung, and Konopka 2018b). It is notable that EEV is maintained at a rate of 90-95% per cell division.”
Because of the intrinsic properties of EEV described above, the presence of the reporter protein at 119 days after injection was likely due to the maintenance of the plasmid, mostly in Sertoli cells, and not to the integration of the plasmid DNA.
Of note, the specificity of EEV was already indicated in the introduction. Nevertheless, we have added more information about it to help the readers (lines 124-128 clean copy)
398 Which "cell types"?
Your feedback is greatly appreciated, and the sentence in question has been amended to make it clearer. The revised wording is as follows: “These results suggest that GFP-mRNA and EEV-GFP targeted different seminiferous cell types, such as Sertoli cells and all germline cells, or that there were differences in terms of transfection efficiency.”
409 Why is it important to inject similar copies of EEV and mRNA? Wouldn't the EEV be expected to generate many, many more copies of RNA per molecule than the mRNAs when injected directly??
We removed the word importantly.
415 How is an injected naked mRNA stably maintained for 3 weeks? What is the stability of this mRNA?? Wouldn't its residence in germ cells for 21 days make it more stable than even the most stable endogenous mRNAs? Even mRNAs for housekeeping genes such as actin, which are incredibly stable, have half-lives of 9-10 hours.
We appreciate your inquiry and agree that mRNA stability is limited. We believe the confusion arises because we injected mRNA coding for the GFP protein, rather than mRNA tagged with GFP. After a three-week observation period, we did not observe the mRNA itself, but rather the GFP protein expressed from it. To draw the reader's attention to this point, we have added the following sentence to the text: “It is important to underline that the signal measured is the fluorescence emitted by the GFP. This signal is dependent of both the half-lives of the plasmid/mRNA and the GFP. Therefore, the kinetic of the signal persistence (which is called here expression) is a combination of the persistence of the vector and the synthetized protein.” See lines 469-472 (clean copy).
That being said, it is difficult to compare the lifespan of a cellular mRNA with that of an mRNA that has been modified at different levels, including the 5’ cap, the mRNA body, and the poly(A) tail, modifications that increase both mRNA stability and translation (see The Pivotal Role of Chemical Modifications in mRNA Therapeutics (2022), https://doi.org/10.3389/fcell.2022.901510). This question is discussed in lines 687-698 (clean copy).
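To illustrate why a fluorescent signal can outlast the mRNA that produced it, here is a minimal sketch of a two-step first-order model (mRNA decays; protein is made from mRNA and decays on its own). The model, the function names, and all half-life values are illustrative assumptions, not measurements or methods from the manuscript.

```python
import math


def decay_rate(half_life_h):
    """First-order rate constant (per hour) for a half-life given in hours."""
    return math.log(2) / half_life_h


def mrna_level(t_h, mrna_half_life_h, m0=1.0):
    """Remaining mRNA at time t_h, assuming simple exponential decay."""
    return m0 * math.exp(-decay_rate(mrna_half_life_h) * t_h)


def protein_level(t_h, mrna_half_life_h, protein_half_life_h, m0=1.0, k_syn=1.0):
    """Protein amount at time t_h for the assumed model
    dm/dt = -k_m * m,  dp/dt = k_syn * m - k_p * p,  with p(0) = 0."""
    k_m = decay_rate(mrna_half_life_h)
    k_p = decay_rate(protein_half_life_h)
    if math.isclose(k_m, k_p):
        # Limiting case when both species decay at the same rate.
        return k_syn * m0 * t_h * math.exp(-k_m * t_h)
    return k_syn * m0 * (math.exp(-k_m * t_h) - math.exp(-k_p * t_h)) / (k_p - k_m)


if __name__ == "__main__":
    # Hypothetical half-lives: 10 h for the mRNA, 96 h for the reporter protein.
    for day in (1, 7, 21):
        t = 24.0 * day
        print(day, mrna_level(t, 10.0), protein_level(t, 10.0, 96.0))
```

Under these assumed values, the mRNA is essentially absent by day 21 while a fraction of the protein remains, which is the qualitative point made in the response; real kinetics would depend on the actual half-lives in testicular cells.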
467 "safely" should be deleted
Thanks, we removed the word: “To validate and confirm the capacity of naked mRNA to express proteins in the testes after injection and electroporation”
470 Except that apoptotic cells were clearly seen in Figure 2:
We would like to thank the reviewer for their comment. We agree that the staining of the provided sections was of heterogeneous quality. To address the remark, we carried out additional HE staining for all conditions, and we now present correctly stained testis sections obtained in the different conditions in Fig. 2 and Supp. 7. Our observations revealed that the number of apoptotic cells remained consistent across all conditions.
471 "remanence"?
We appreciate your feedback and have amended the sentence to provide clear meaning. The revised wording is as follows: “The assessment of the temporal persistence of testicular mCherry fluorescent protein expression revealed a robust red fluorescence from day 1 post-injection, which remained detectable for at least 15 days (Fig. Supp. 3 B2, C2, and D2).”
489 IF measures steady-state protein levels, not translation; should say you determined when ARMC2 was detectable.
Thanks for the remark, we changed the sentence to: “ By IF, we determined when ARMC2 protein was detectable during spermatogenesis.”
491 Flagella
Thanks for the comment, we corrected our mistake: “in the flagella of the elongated spermatids (Fig 9A)”
Discussion
The Discussion is largely a re-hashing of the Methods and Results, with additional background.
Message stability must be addressed - how is a naked mRNA maintained for 21 days?
As previously stated, we believe the confusion arises because we injected mRNA coding for the GFP protein, rather than mRNA tagged with GFP. After a three-week observation period, we did not observe the mRNA, but we observed the synthesized GFP protein. This point and the stability of proteins in the testis are now discussed in lines 677-684 (clean copy).
556 How do the authors define "safe"?
Thanks for the comment, we changed the sentence to be clearer: “Our results also showed that the combination of injection and electroporation did not perturb spermatogenesis when electric pulses are carefully controlled”
563 Synthesized
Thanks, we changed it accordingly
602 Again, this was not apparent, as there were more apoptotic cells in Fig. 2 - data must be provided to show "no effect".
As previously stated, we carried out additional HE staining for all conditions, as can be observed in Fig. 2. Our observations revealed that the number of apoptotic cells remained consistent across all conditions.
629-30 This directly contradicts the authors' contention in the Introduction that ICSI was unsafe - how is this procedure going to be an advancement over ICSI as proposed, if ICSI needs to be used?? Why not just skip all this and do ICSI then?? Perhaps if this technique was used to 'repair' defects in spermatogonia or spermatocytes, then that makes more sense. But if ICSI is required, then this is not an advancement when trying to rescue a sperm morphology/motility defect.
In light of the latest findings (Fig 12), we have revised this part of the discussion and this paragraph no longer exists.
Nevertheless, to address the reviewer’s remark specifically, we would like to underline that ICSI with sperm from a fertile donor is always more efficient than ICSI with sperm from patients suffering from OAT. Our strategy, by improving sperm quality, will improve the efficiency of ICSI and will ultimately increase the live birth rate resulting from the first fresh IVF cycle.
640-2 What is meant by "sperm organelles" And what examples are provided for sperm proteins being required at or after fertilization?
This paragraph was also strongly modified and the notion of protein persistence during spermatogenesis was discussed in the paragraph on fluorescent signal duration. See lines 698-705.
651 "Dong team"??
Thanks for the comment, we added the references.
Figure 2D2 - tubule treated with EEV-GFP appears to have considerably more apoptotic cells - this reviewer counted ~10 vs 0 in control; also, many of the spermatocytes appear abnormal in terms of their chromatin morphology - the authors must address this by staining for markers of apoptosis - not fair to conclude there was no difference when there's a very obvious difference!
We would like to thank the reviewer for their comment. This point was already addressed. As previously stated, we now provide new testis sections for all conditions (see Fig. 2). Our observations revealed that the number of apoptotic cells remained consistent across all conditions.
Figure 2D3 staining is quite different than D1-2, likely a technical issue - looks like no hematoxylin was added? Need to re-stain so results can be compared to the other 2 figures
As previously stated, we carried out additional HE staining for all conditions, and new images are provided, with similar staining.
Figure 3 - the fluorescent images lack any context of tubule structure so it is nearly impossible to get a sense of what cells express GFP, or whether they're in the basal vs adluminal compartment - can the authors outline them? Indicate where the BM and lumen are.
We would like to thank the reviewer for their comment. This figure actually provides a global view of green fluorescent protein (GFP) expression at the surface of the testis. The entire testis was placed under an inverted epifluorescence microscope, and a picture of the GFP signal was recorded. For this reason, it is impossible to delineate the BM and the lumen. It should be noted that the fluorescence likely originates from different seminiferous tubules.
Author response image 1.
So, for Figure 3 if the plasmid is being uptaken by cells and maintained as an episome, is it able to replicate? Likely not.
Yes! It is the intrinsic property of the episome; see the detailed explanation provided above about the EEV plasmid.
So, initially, it could be in spermatogonia, spermatocytes, and spermatids. As time progressed those initially positive spermatids and then spermatocytes would be lost - and finally, the only cells that should be positive would be the progeny of spermatogonia that were positive - but, as they proliferate shouldn't the GFP signal decline?
Because EEV is able to replicate in a synchronous manner with the host genome and to segregate into daughter cells at a level of 90% of the mother cell, the expected decline is very slow.
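As a rough worked example of why the decline would be slow, the sketch below compares a vector retained at 90% per division (the figure quoted above) with a non-replicating plasmid that is simply halved at each division. The purely multiplicative model, the function name, and the number of divisions are illustrative assumptions.

```python
def retained_fraction(divisions, retention_per_division=0.90):
    """Fraction of the initial per-cell vector level left after a number of
    cell divisions, assuming the same multiplicative loss at every division."""
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    return retention_per_division ** divisions


if __name__ == "__main__":
    for n in (1, 5, 10):
        episome = retained_fraction(n, 0.90)   # replicating episome (assumed 90%)
        diluted = retained_fraction(n, 0.50)   # non-replicating plasmid, halved
        print(n, round(episome, 4), round(diluted, 6))
```

With these assumptions, ten divisions leave roughly a third of the initial level for the replicating episome, against about a thousandth for passive dilution.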
And, since clones of germ cells are connected throughout their development, shouldn't the GFP diffuse through the intercellular bridges so entire clones are positive? Was this observed?
We did not perform IF experiments beyond 7 days after injection, a time too short to observe what the reviewer suggested. Moreover, whereas at 1 day after injection GFP synthesized from injected EEV was found in both germ cells and Sertoli cells (Fig 7), after one week the reporter proteins were only observable in Sertoli cells. This result suggests that EEV is maintained only in Sertoli cells, thus preventing the observation of stained clones.
Can these sections be stained for the ICB TEX14 so that clonality can be distinguished? Based on the apparent distance between cells, it appears some are clones, but many are not...
We thank the reviewer for this suggestion, but we are not able to perform testis sectioning and co-staining experiments because the PFA treatment bleaches the GFP signal. We also tested several GFP antibodies, but all failed.
Nevertheless, we were able to localize and identify transfected cells thanks to whole testis optical clearing, combined with a measure of GFP fluorescence and three-dimensional image reconstructions.
For Figure 4, with the mRNA-GFP, why does the 1-day image (which looks similar to the plasmid-transfected) look so different from days 7-21?
And why do days 7-21 look so different from those days in Fig 3?
Thank you for your feedback. It is an excellent question. Because of the low resolution of whole testis epifluorescence imaging and light penetration issues, we decided to carry out whole testis optical clearing and three-dimensional image reconstruction experiments, in order to gain insight into the transfection process. At day 1, GFP synthesized from EEV injection was found in spermatogonia, spermatocytes and Sertoli cells (Fig 7). After one week, the reporter protein synthesized from injected EEV was only observable in Sertoli cells.
In contrast, for mRNA, on day 1 and day 7 post-injection, the GFP fluorescent signal was associated with both Sertoli cells and germ cells. This explains why the mRNA-GFP and EEV-GFP patterns are similar at day 1 and different at day 7.
Why do the authors think the signal went from so strong at 21 to undetectable at 28? What changed so drastically over those 7 days?
What is the half-life of this mRNA supposed to be? It seems that 21 days is an unreasonably long time, but then to go to zero at 28 seems also odd... Please provide some explanation, and context for whether the residence of an exogenous mRNA for 21 days is expected.
As previously stated, it is our hypothesis that the source of the confusion lies in the fact that we injected mRNA coding for the GFP protein, rather than mRNA tagged with GFP. After a three-week observation period, we did not observe the mRNA, but we observed the GFP protein produced by the mRNA. The time of observation of the reporter proteins expressed by the respective mRNA molecules (mCherry, luciferase, or GFP) ranged from 15 to 21 days. Proteins have very different turnover rates, with half-lives ranging from minutes to days. Half-lives depend on proteins but also on tissues. As explained in the discussion, it has been demonstrated that proteins involved in spermatogenesis exhibit a markedly low turnover rate and this explains the duration of the fluorescent signal.
The authors should immunostain testis sections from controls and those with mRNA and plasmid and immunostain with established germ cell protein fate markers to show what specific germ cell types are GFP+
Thank you for your feedback. As previously mentioned, we were unable to perform testis sectioning and co-staining because the PFA treatment bleaches the GFP signal and because we were unable to reveal GFP with a GFP antibody, for unknown reasons.
For the GFP signal to be maintained past 35 days, the plasmid must have integrated into SSCs - and for that to happen, the plasmid would have to cross the blood-testis-barrier... is this expected?
We are grateful for your observation.
First, as explained above, we do not think that the plasmid has been integrated.
Concerning the blood-testis barrier, it bears noting that electroporation is a technique that is widely used in biotechnology and medicine for the delivery of drugs and the transfer of genes into living cells (Boussetta, Lebovka et al. 2009). This process entails the application of an electric current, which induces the formation of hydrophilic pores in the lipid bilayer of the plasma membrane (Kanduser, Miklavcic et al. 2009). The pores remain stable throughout the electroporation process and then close again once it is complete. Consequently, as electroporation destabilizes the cell membrane, it can also destabilize the gap junctions responsible for the blood-testis barrier. This was confirmed by several studies, which have observed plasmid transfection beyond the blood-testis barrier with injection into rete testis following electroporation (Muramatsu, Shibata et al. 1997, Kubota, Hayashi et al. 2005, Danner, Kirchhoff et al. 2009, Kanduser, Miklavcic et al. 2009, Michaelis, Sobczak et al. 2014).
Figure 9 - authors should show >1 cell - this is insufficient; also, it's stated it's only in the flagella, but it also appears to be in the head as well. And is this just the principal piece?? And are the authors sure those are elongating vs condensing spermatids? Need to show multiple tubules, at different stages, to make these claims
We have partly answered this question in the public review; we paste our answer here:
“We present now new images showing the full seminiferous tubules as requested (see supp fig 6). In this new figure, it is clear that Armc2 is only expressed in spermatids. We have also added in this figure an analysis of the RNA-seq database produced by Gan's team (Gan, Wen et al. 2013), confirming that ArmC2 expression is predominantly expressed at the elongated spermatid stage. This point is now clearly indicated in the text.”
Concerning the localization of the protein in the head, we confirm that the base of the manchette is stained but we have no explanation so far. This point is now indicated in the manuscript.
Figure 10B2 image - a better resolution is necessary
We are grateful for your feedback. We concede that the quality of the image was not optimal. Consequently, we have replaced it with an alternative.
Figure 11 - in control, need to show >1 sperm; and lower-mag images should be provided for all samples to show population-wide effects; showing 1 "normal" sperm per group (white arrows) is insufficient:
We are grateful for your feedback. We conducted further experiments and now provide additional images in Supp. figure 8.
Reviewer #3 (Recommendations For The Authors):
In this study, Vilpreux et al. developed a microinjection/electroporation method in order to transfect RNA into testicular cells. The authors studied several parameters of treated testis and compared the injection of DNA versus RNA. Using the injection of Armc2 RNA into mice with an Armc2 knockout the authors were able to (partly) rescue the fertility phenotype.
Minor points.
Figure 6 + lines 553+554: might it be that the staining pattern primarily on one side of the testis is due to the orientation of the scissor electrode during the electroporation procedure and the migration direction of negatively charged RNA molecules (Figure 6)?
Your input is greatly appreciated. We concur that the observed peripheral expression is due to both the electroporation and injection. Accordingly, we have amended the sentence as follows: "The peripheral expression observed was due to the close vicinity of cells to the electrodes, and to a peripheral dispersal of the injected solution, as shown by the distribution of the fluorescent i-particles NIRFiP-180."
Discussion of the safety aspect (lines 601-608): The authors state several times that there are no visible tissue changes after the electroporation procedure. However, in order to claim that this procedure is "safe", it is necessary to examine the offspring born after microinjection/electroporation.
Your input is greatly appreciated. Consequently, the term "safe" has been replaced with "did not perturb spermatogenesis" in accordance with the provided feedback. Your assertion is correct; an examination of the offspring born would be necessary to ascertain the safety of the procedure. Due to the quantity of motile sperm obtained, it was not possible to produce offspring through natural mating. However, novel Armc2-/--rescued sperm samples have been produced and in vitro fertilization (IVF) and intracytoplasmic sperm injection (ICSI) experiments have been conducted. The results demonstrate that the Armc2-/--rescued sperm can successfully fertilize eggs and produce two-cell embryos by IVF and blastocysts by ICSI. These outcomes are visually represented in Figure 12. The development of embryos up to the blastocyst stage is a step in the right direction.
The discussion section could be shortened. Lines 632-646 are largely a repetition of the introductory section. In addition, the Dong paper (ref. 25) may be interesting; however, this part could also be shortened (lines 647-676). This reviewer would prefer the authors to focus on the technique (different application sites and applied nucleotides) and proof of concept for (partial) phenotype rescue in the knockout mice.
Your contribution is highly valued. In light of your observations and the latest findings, we have substantially revised the discussion accordingly.
Line 63: oocytes rather than eggs.
We are grateful for your input, but we have decided to retain the term "eggs" rather than "oocytes" because an oocyte is defined as a female gametocyte or germ cell involved in reproduction. In other words, an oocyte is a germ cell inside the ovary, which becomes an egg after ovulation.
Boussetta, N., N. Lebovka, E. Vorobiev, H. Adenier, C. Bedel-Cloutour and J. L. Lanoiselle (2009). "Electrically assisted extraction of soluble matter from chardonnay grape skins for polyphenol recovery." J Agric Food Chem 57(4): 1491-1497.
Cavallini, G. (2006). "Male idiopathic oligoasthenoteratozoospermia." Asian J Androl 8(2): 143-157.
Colpi, G. M., S. Francavilla, G. Haidl, K. Link, H. M. Behre, D. G. Goulis, C. Krausz and A. Giwercman (2018). "European Academy of Andrology guideline Management of oligo-asthenoteratozoospermia." Andrology 6(4): 513-524.
Coutton, C., G. Martinez, Z. E. Kherraf, A. Amiri-Yekta, M. Boguenet, A. Saut, X. He, F. Zhang, M. Cristou-Kent, J. Escoffier, M. Bidart, V. Satre, B. Conne, S. Fourati Ben Mustapha, L. Halouani, O. Marrakchi, M. Makni, H. Latrous, M. Kharouf, K. Pernet-Gallay, M. Bonhivers, S. Hennebicq, N. Rives, E. Dulioust, A. Toure, H. Gourabi, Y. Cao, R. Zouari, S. H. Hosseini, S. Nef, N. Thierry-Mieg, C. Arnoult and P. F. Ray (2019). "Bi-allelic Mutations in ARMC2 Lead to Severe Astheno-Teratozoospermia Due to Sperm Flagellum Malformations in Humans and Mice." Am J Hum Genet 104(2): 331-340.
Danner, S., C. Kirchhoff and R. Ivell (2009). "Seminiferous tubule transfection in vitro to define postmeiotic gene regulation." Reprod Biol Endocrinol 7: 67.
Gan, H., L. Wen, S. Liao, X. Lin, T. Ma, J. Liu, C. X. Song, M. Wang, C. He, C. Han and F. Tang (2013). "Dynamics of 5-hydroxymethylcytosine during mouse spermatogenesis." Nat Commun 4: 1995.
Igyarto, B. Z. and Z. Qin (2024). "The mRNA-LNP vaccines - the good, the bad and the ugly?" Front Immunol 15: 1336906.
Ishii, T. (2017). "Germ line genome editing in clinics: the approaches, objectives and global society." Brief Funct Genomics 16(1): 46-56.
Kanduser, M., D. Miklavcic and M. Pavlin (2009). "Mechanisms involved in gene electrotransfer using high- and low-voltage pulses--an in vitro study." Bioelectrochemistry 74(2): 265-271.
Kubota, H., Y. Hayashi, Y. Kubota, K. Coward and J. Parrington (2005). "Comparison of two methods of in vivo gene transfer by electroporation." Fertil Steril 83 Suppl 1: 1310-1318.
Michaelis, M., A. Sobczak and J. M. Weitzel (2014). "In vivo microinjection and electroporation of mouse testis." J Vis Exp(90).
Muramatsu, T., O. Shibata, S. Ryoki, Y. Ohmori and J. Okumura (1997). "Foreign gene expression in the mouse testis by localized in vivo gene transfer." Biochem Biophys Res Commun 233(1): 45-49.
Raina, A., S. Kumar, R. Shrivastava and A. Mitra (2015). "Testis mediated gene transfer: in vitro transfection in goat testis by electroporation." Gene 554(1): 96-100.
Sadelain, M., E. P. Papapetrou and F. D. Bushman (2011). "Safe harbours for the integration of new DNA in the human genome." Nat Rev Cancer 12(1): 51-58.
Thonneau, P., S. Marchand, A. Tallec, M. L. Ferial, B. Ducot, J. Lansac, P. Lopes, J. M. Tabaste and A. Spira (1991). "Incidence and main causes of infertility in a resident population (1,850,000) of three French regions (1988-1989)." Hum Reprod 6(6): 811-816.
Usmani, A., N. Ganguli, H. Sarkar, S. Dhup, S. R. Batta, M. Vimal, N. Ganguli, S. Basu, P. Nagarajan and S. S. Majumdar (2013). "A non-surgical approach for male germ cell mediated gene transmission through transgenesis." Sci Rep 3: 3430.
Wang, L., C. Liu, H. Wei, Y. Ouyang, M. Dong, R. Zhang, L. Wang, Y. Chen, Y. Ma, M. Guo, Y. Yu, Q. Y. Sun and W. Li (2022). "Testis electroporation coupled with autophagy inhibitor to treat nonobstructive azoospermia." Mol Ther Nucleic Acids 30: 451-464.
Author response:
Reviewer 1:
Summary: This work presents an Interpretable protein-DNA Energy Associative (IDEA) model for predicting binding sites and affinities of DNA-binding proteins. Experimental results demonstrate that such an energy model can predict DNA recognition sites and their binding strengths across various protein families and can capture the absolute protein-DNA binding free energies.
We appreciate the reviewer’s careful assessment of the paper, and we thank the reviewer for the insightful suggestions and comments.
Strengths:
(1) The IDEA model integrates both structural and sequence information, although such an integration is not completely original. (2) The IDEA predictions seem to have agreement with experimental data such as ChIP-seq measurements.
We appreciate the reviewer’s comments on the strength of the paper.
Weaknesses:
(1) The authors claim that the binding free energy calculated by IDEA, trained using one MAX-DNA complex, correlates well with experimentally measured MAX-DNA binding free energy (Figure 2) based on the reported Pearson Correlation of 0.67. However, the scatter plot in Figure 2A exhibits distinct clustering of the points and thus the linear fit to the data (red line) may not be ideal. As such, the use of the Pearson correlation coefficient that measures linear correlation between two sets of data may not be appropriate and may provide misleading results for non-linear relationships.
We thank the reviewer for the insightful comments and agree that the linear fit between our predictions and the experimental data may not be ideal. The primary utility of the IDEA model is for assessing the relative binding affinities of different DNA sequences. To further support this, we plan to conduct additional statistical analyses that are independent of the linear correlation assumption but instead focus on the ranked order of DNA sequence binding affinities.
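One common rank-based statistic of the kind described here is the Spearman correlation, which is the Pearson correlation computed on ranks. The response does not name the statistic the authors will use, so the choice of Spearman, the function names, and the toy data below are assumptions made for illustration only.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Toy data: a monotonic but non-linear relationship (hypothetical values).
    predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
    measured = [1.0, 8.0, 27.0, 64.0, 125.0]
    print(pearson(predicted, measured), spearman(predicted, measured))
```

For a monotonic but non-linear relationship such as the toy data, the rank correlation is perfect while the linear correlation is not, which is why a rank-based measure suits a model intended to order sequences by relative affinity.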
(2) In the same vein, the linear Pearson Correlation analysis performed in Figure 5A and the conclusion drawn may be misleading.
We thank the reviewer for the insightful comments. We will perform the same analysis for Figure 5A as detailed in our response to the previous comments.
(3) The authors included the sequences of the protein and DNA residues that form close contacts in the structure in the training dataset, whereas a series of synthetic decoy sequences were generated by randomizing the contacting residues in both the protein and DNA sequences. In particular, synthetic decoy binders were generated by randomizing either the DNA (1000 sequences) or protein sequences (10,000 sequences) from the strong binders. However, the justification for such randomization and how it might impact the model’s generalizability and transferability remain unclear.
We thank the reviewer for the insightful comments. We will perform additional analyses to assess the robustness of our model predictions with respect to the number of randomized decoys. Additionally, we will examine how randomization would potentially affect the model’s generalizability and transferability.
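For readers unfamiliar with decoy sets, the following sketch shows one simple way to randomize a sequence: shuffling its residues so that composition is preserved while order is lost. Whether IDEA shuffles or resamples residues is not stated in this response, so the shuffling scheme, the function name, and the example sequence are assumptions.

```python
import random


def make_decoys(sequence, n_decoys, seed=0):
    """Return n_decoys shuffled copies of sequence (composition preserved).
    A fixed seed makes the decoy set reproducible."""
    rng = random.Random(seed)
    decoys = []
    for _ in range(n_decoys):
        residues = list(sequence)
        rng.shuffle(residues)
        decoys.append("".join(residues))
    return decoys


if __name__ == "__main__":
    # Hypothetical contacting DNA segment; vary n_decoys to probe robustness.
    for n in (10, 100, 1000):
        decoys = make_decoys("GTCACGTGAC", n)
        print(n, len(set(decoys)))
```

Repeating training with different values of `n_decoys` and different seeds would be one way to check how sensitive predictions are to the size and particular draw of the decoy set.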
(4) The authors performed Receiver Operating Characteristic (ROC) analysis and reported the Area Under the Curve (AUC) scores in order to quantitate the successful identification of the strong binders by IDEA. It would be beneficial to analyze the precision-recall (PR) curve and report the PRAUC metric which could be more robust.
We agree with Reviewer 1 that more statistical metrics should be used to evaluate our model’s performance. We will include a more robust approach, such as PRAUC, to evaluate our model.
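To make the difference between the two metrics concrete, here is a small self-contained sketch that computes the ROC AUC and the average precision (a common estimator of the area under the precision-recall curve) for the same scores. The function names and toy data are illustrative assumptions; ties between scores are handled in the simplest possible way.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a positive outscores a negative,
    counting ties as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive")
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


if __name__ == "__main__":
    # Toy imbalanced set: 2 strong binders among 10 sequences (hypothetical).
    labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.7, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.6]
    print(roc_auc(labels, scores), average_precision(labels, scores))
```

On imbalanced data such as this toy set, two highly ranked false positives lower the average precision more than the ROC AUC, which is the usual argument for reporting both.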
Reviewer 2:
Summary:
Zhang et al. present a methodology to model protein-DNA interactions via learning an optimizable energy model, taking into account a representative bound structure for the system and binding data. The methodology is sound and interesting. They apply this model for predicting binding affinity data and binding sites in vivo. However, the manuscript lacks discussion of/comparison with state-of-the-art and evidence of broad applicability. The interpretability aspect is weak, yet over-emphasized.
We appreciate the reviewer’s excellent summary of the paper, and we thank the reviewer for the insightful suggestions and comments.
Strengths:
The manuscript is well organized with good visualizations and is easy to follow. The methodology is discussed in detail. The IDEA energy model seems like an interesting way to study a protein-DNA system in the context of a given structure and binding data. The authors show that an IDEA model trained on one system can be transferred to other structurally similar systems. The authors show good performance in discriminating between binding-vs-decoy sequences for various systems, and binding affinity prediction. The authors also show evidence of the ability to predict genome-wide binding sites.
We appreciate the reviewer’s strong assessment of the strengths of this paper.
Weaknesses:
An energy-based model that needs to be optimized for specific systems is inherently an uncomfortable idea. Is this kind of energy model superior to something like Rosetta-based energy models, which are generally applicable? Or is it superior to family-specific knowledge-based models? It is not clear.
We thank the reviewer for the insightful comments. We will include predictions by generic protein-DNA energy models, such as the Rosetta-based energy model or family-specific knowledge-based model, to compare with our model performance.
Prediction of binding affinity is a well-studied domain and many competitors exist, some of which are well-used. However, no quantitative comparison to such methods is presented. To understand the scope of the presented method, IDEA, the authors should discuss/compare with such methods (e.g. PMID 35606422).
We thank the reviewer for the insightful comments. In our initial submission, Figure S5 presents a comparison between our model’s prediction and those of an existing method using 10-fold cross-validation. We agree a more comprehensive comparison with other methods is needed and will include a discussion and comparison of the IDEA model’s performance with additional state-of-the-art models.
The term “interpretable” has been used lavishly in the manuscript while providing little evidence on the matter. The only evidence shown is the family-specific residue-nucleotide interaction/energy matrix and speculations on how these values are biologically sensible. Recent works already present more biophysical, fine-grained, and sometimes family-independent interpretability (e.g. PMID 39103447, 36656856, 38352411, etc.). The authors should put into context the scope of the interpretability of IDEA among such works.
We agree that “interpretability” should be discussed in a relevant context. We will discuss the scope of IDEA interpretability within the context of recent works, including those suggested by the reviewer.
The manuscript disregards subtle yet important differences in commonly used terminology in the field. For example, the authors use the term ”specificity” and ”affinity” almost interchangeably (for example, the caption for Figure 3A uses ”specificity” although the Methods text describes the prediction as about ”affinity”). If the authors are looking to predict specificity, IDEA needs to be put in the context of the corresponding state-of-the-art (PMID 36123148, 39103447, 38867914, 36124796, etc).
We thank the reviewer for pointing out our conflation of “specificity” and “affinity” in the manuscript. To clarify, IDEA’s primary function is to predict the binding affinities of protein-DNA pairs in a sequence-specific manner. The acquired binding affinities of target DNA sequences can then be used to assess the specific binding motifs. We will revise our text to clarify this point.
It is not clear how much the learned energy model is dependent on the structural model used for a specific system/family. It would be interesting to see the differences in learned model based on different representative PDB structures used. Similarly, the supplementary figures show a lack of discriminative power for proteins like PDX1 (homeodomain family), POU, etc. Can the authors shed some light on why such different performances?
We thank the reviewer for the insightful comments and agree that the family-specific energy model could provide insight into the model predictions. We will examine different energy models based on the protein family, and especially investigate whether they can explain the lack of discriminative power for certain proteins.
It is also not clear if IDEA’s prediction for reverse complement sequences is the same for a given sequence. If so, how is this property being modelled? Either this description is lacking or I missed it.
We thank the reviewer for the insightful comments. The IDEA model treats reverse complementary sequences separately. We will provide additional details on how these sequences are modeled.
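For context, a sequence and its reverse complement describe the two strands of the same duplex, so a model that treats them separately can assign them different scores unless the sequence is palindromic. The helper below is a generic sketch with an assumed function name; it does not reflect how IDEA itself represents strands.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True when a sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq).upper()


if __name__ == "__main__":
    for site in ("CACGTG", "AACGTG"):
        print(site, reverse_complement(site), is_palindromic(site))
```

The E-box `CACGTG` is its own reverse complement, whereas `AACGTG` is not, so only the latter can receive two distinct predictions when strands are scored separately.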
Reviewer 3:
Summary:
Protein-DNA interactions and sequence readout represent a challenging and rapidly evolving field of study. Recognizing the complexity of this task, the authors have developed a compact and elegant model. They have applied well-established approaches to address a difficult problem, effectively enhancing the information extracted from sparse contact maps by integrating artificial sequences decoy set and available experimental data. This has resulted in the creation of a practical tool that can be adapted for use with other proteins.
We appreciate the reviewer’s excellent summary of the paper, and we thank the reviewer for the insightful suggestions and comments.
Strengths:
(1) The authors integrate sparse information with available experimental data to construct a model whose utility extends beyond the limited set of structures used for training. (2) A comprehensive methods section is included, ensuring that the work can be reproduced. Additionally, the authors have shared their model as a GitHub project, reflecting their commitment to transparency of research.
We appreciate the reviewer’s strong assessment of the strengths of this paper.
Weaknesses:
(1) The coarse-graining procedure appears artificial, if not confusing, given that full-atom crystal structures provide more detailed information about residue-residue contacts. While the selection procedure for distance threshold values is explained, the overall motivation for adopting this approach remains unclear. Furthermore, since this model is later employed as an empirical potential for molecular modeling, the use of P and C5 atoms raises concerns, as the interactions in 3SPN are modeled between Cα and the nucleic base, represented by its center of mass rather than P or C5 atoms.
We appreciate the reviewer’s insightful comments. The selection of P and C5 atoms will augment our model prediction, but the prediction is robust without this selection scheme. We will provide more details on the motivation behind this selection.
Regarding the simulation model, we acknowledge a potential disconnection between the coarse-grained level of the 3SPN model (3 coarse-grained sites per nucleotide) and the data-driven model (1 coarse-grained site per nucleotide). The selection of nucleic bases for molecular interactions in the 3SPN model follows the PI’s previous work [PMID: 34057467] and its code implementation. We will test the simulation model by incorporating interactions between Cα and P atoms. In the future, we will work on implementing IDEA model output for 1-bead-per-nucleotide DNA simulation models.
(2) Although the authors use a standard set of metrics to assess model quality and predictive power, some ∆∆G predictions compared to MITOMI-derived ∆∆G values appear nonlinear, which casts doubt on the interpretation of the correlation coefficient.
We thank the reviewer for the insightful comments and agree that the linear fit between our model’s prediction and the experimental data may not be ideal. The primary utility of the IDEA model is for assessing the relative binding affinities of different DNA sequences. To this end, we plan to perform additional statistical analyses that are independent of the linear correlation assumption but instead focus on the ranked order of DNA sequence binding affinities.
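As an illustration of what a rank-based comparison could look like, the sketch below computes a Spearman rank correlation next to the Pearson coefficient. It is not the authors' analysis: the function names and the example values are hypothetical, chosen only to show that a monotonic but nonlinear relationship keeps a perfect rank correlation while the linear coefficient drops.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical predicted and measured binding free-energy changes.
predicted = [0.1, 0.5, 1.2, 2.0, 3.5]
measured = [0.0, 0.2, 0.3, 0.35, 0.4]

print("Pearson:", round(pearson(predicted, measured), 3))
print("Spearman:", round(spearman(predicted, measured), 3))
```

Because the ranking of the hypothetical sequences is identical in both lists, the rank correlation is unaffected by the saturation in the measured values.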
(3) The discussion section lacks information about the model’s limitations and a comprehensive comparison with other models. Additionally, differences in model performance across various proteins and their respective predictive powers are not addressed.
We thank the reviewer for the insightful comments and will compare the performance of the IDEA model with state-of-the-art methods. We will also perform detailed analyses of the learned energy models across different proteins and examine their correlation with the model’s predictive powers.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
Hahn et al use bystander BRET, NanoBiT assays, and APEX2 proteomics to investigate endosomal signaling of CCR7 by two agonists, CCL19 and CCL21. The authors suggest that CCR7 signals from early endosomes following internalisation. They use spatial proteomics to try to identify novel interacting partners that may facilitate this signaling and use this data to specifically enhance a Rac1 signaling pathway. Many of the results in the first few figures showing simultaneous recruitment of Barr and G proteins by CCR7 have been shown previously (Laufer et al, 2019, Cell Reports), as has signaling from endomembranes, and Rac1 activation at intracellular sites. The new findings are the APEX2 proteomics studies, which could be useful to the scientific community. Unfortunately, the authors only follow up on a single finding, and the expansion of this section would improve the manuscript.
First of all, we would like to thank the reviewer for helping with the manuscript. The summary is mostly accurate except for the statement that simultaneous recruitment of barr and G protein to CCR7 has been shown before. It should also be noted that neither the activation of G proteins by CCR7 from endosomes nor the functional role of this signaling mechanism has been demonstrated previously. However, that CCR7 activity at endomembranes is associated with Rac1 signaling was demonstrated in the Laufer et al. study as the reviewer correctly points out.
Strengths:
(1) The APEX2 resource will be valuable to the GPCR and immunology community. It offers many opportunities to follow up on findings and discover new biology. The resource could also be used to validate earlier findings in the current manuscript and in previous manuscripts. Was there enrichment of early endosomal markers, Barr and Gi as this would provide further evidence for their earlier claims regarding endosomal signaling? Previous studies have suggested signaling from the TGN, so it is possible that the different ligands also direct to different sites. This could easily be investigated using the APEX2 data.
Thank you for your comment. We do in fact observe enrichment of TGN/Golgi markers in response to chemokine stimulation, which we now have highlighted in the manuscript (fourth paragraph on page 7).
(2) The results section is well written and can be followed very easily by the reader.
We are glad that the reviewer found the results section very readable.
(3) Some findings verify previous studies (e.g. endomembrane signalling). This should be acknowledged as this shows the validity of the findings of both studies.
This is correct. We have now included more discussion of previous work related to CCR7 signaling at endomembranes (third paragraph on page 10).
Weaknesses:
(1) The findings are interesting although the studies are almost all performed in HEK293 cells. I understand that these are commonly used in GPCR biology and are easy to transfect and don't express many GPCRs at high concentrations, but their use is still odd when there are many cell-lines available that express CCR7 and are more reflective of the endogenous state (e.g. they are polarised, they can perform chemotaxis/ migration). Some of the findings within the study should also be verified in more physiologically relevant cells. At the moment only the final figure looks at this, but findings need to be verified elsewhere.
We thank the reviewer for raising this point and giving us an opportunity to elaborate in further detail. The major goal of our study was to investigate whether CCR7 activates G protein from endosomes, the underlying mechanism, and functions of this potential signaling mechanism. The reason we chose CCR7 as our model receptor was that it belongs to a group of GPCRs, the chemokine receptors, that most often have features associated with the ability to promote endosomal G protein activation (phosphorylation site clusters in the C-terminal region).
Specific detection of G protein activation at distinct subcellular compartments is currently very challenging in truly endogenous systems despite new innovative biosensors that are available (not just related to CCR7, but GPCRs in general). To our knowledge, most if not all studies that detect direct activation of G protein at a specific compartment, whether at the plasma membrane, endosome, Golgi, or other compartments, have overexpressed either the receptor, G protein, or both. This is why we chose the HEK293 cell system, which is easy to manipulate, for most of our experiments. That being said, we did confirm major findings in an indirect manner using Jurkat T-cells, which express CCR7 endogenously and are physiologically relevant. Our hope is that in the future we will be able to use highly sensitive biosensors to directly confirm our findings in such a cell system as the reviewer wisely suggests.
(2) The authors acknowledge that the kinetic patterns of the signals at the early endosome are not consistent with the rates of internalisation. They mention that this could be due to trafficking elsewhere. This could be easily looked at in their APEX2 data. Is there evidence of proximity to markers of other membranes? Perhaps this could be added to the discussion. Similarly, previous studies have shown that CCR7 signaling may involve the TGN. Was there enrichment of these markers? If not, this could also be an interesting finding and should be discussed. It is also possible that the Rab5 reporter is just not as efficient as the trafficking one, especially as in later figures the very convincing differences in the two ligands are not as robust as the differences in trafficking.
Excellent point. We have now highlighted the possibility of CCR7 being further trafficked to the trans-Golgi network (TGN) as a possible explanation for the transient translocation of activated CCR7 to the early endosome in Fig. 1G-H (second paragraph on page 3).
Furthermore, in the APEX2 experiment we observe enrichment of proteins involved in lysosomal trafficking (LAMP1, VPS16, VAMP7, WDR91, and PP4P1) by CCL19 stimulation at 25 min, and recycling endosomes/TGN markers (SNX6, RAB7L, and GGA) by CCL21 stimulation at 25 min. In addition to this, several markers of TGN/Golgi (SNX3, COG5, YIF1A, SC22B, and AP3S1) were enriched as well in response to both CCL19 and CCL21 stimulation. We have now included a statement in the manuscript, which describes the likely trafficking of CCR7 to the TGN/Golgi in response to CCL19 and CCL21 stimulation (fourth paragraph on page 7).
(3) In the final sentence of paragraph 2 of the results the authors state that the internalisation is specific to CCR7 as there isn't recruitment to V2R. I'm not sure this is the best control. The authors can only really say it doesn't recruit to unrelated receptors. The authors could have used a different chemokine receptor which does not respond to these ligands to show this.
The point with this control experiment was to demonstrate that the loss of NanoBiT signal in response to CCL19 in CCR7-SmBiT/LgBiT-CAAX expressing cells, but not in V2R-SmBiT/LgBiT-CAAX expressing cells, was a result of bona fide CCR7 internalization rather than potential artifactual effects of CCL19 on the NanoBiT system. Our intent was not to demonstrate specificity of CCL19 among chemokine receptors, which already has been thoroughly tested in previous studies. We have now modified the sentence (second paragraph on page 3) “Moreover, CCL19/CCL21-stimulation of receptor internalization to endosomes is specific to CCR7 as none of the chemokines promote internalization or trafficking to endosomes of the vasopressin type 2 receptor (V<sub>2</sub>R)-SmBiT construct (Fig. S1E-F)” to “Moreover, CCL19/CCL21-stimulation did not promote internalization or trafficking to endosomes of the vasopressin type 2 receptor (V<sub>2</sub>R)-SmBiT construct, which validates that these chemokines act specifically via the CCR7-SmBiT system (Fig. S1E-F).”
(4) The miniGi-Barr1 and imaging showing co-localisation could be more convincing if it was also repeated in a more physiological cell line as in the final figure. Imaging of CCR7, miniGi, and Barr1 would also provide further evidence that the receptor is also present within the complex.
We agree with the reviewer’s assessment. However, as mentioned above, it is currently extremely challenging to detect endogenous G protein coupling/activation to endogenous receptors. In addition, we are not sure if overexpressing fluorophore-tagged receptor, miniG, and barr1 in a physiologically relevant cell line would provide truly physiological conditions, as the expression of these proteins would still be artificially high. This is why we chose to conduct these mechanistic experiments in HEK293 cells and then indirectly verify key findings in an endogenous and physiologically relevant cell line.
(5) The findings regarding Rac1 are interesting, although an earlier paper found similar results (Laufer et al, 2019, Cell Reports), so perhaps following up on another APEX2-identified protein pathway would have been more interesting. The authors' statement that Rac1 is specifically activated, and RhoA and Cdc42 are not, is unconvincing from the current data. Only a single NanoBiT assay was used, and as raw values are not reported it is difficult for the reader to glean some essential information. The authors should show evidence that these reporters work well for other receptors (or cite previous studies) and also need evidence from an independent (i.e. non-NanoBiT or BRET) assay.
The major focus of the study was to investigate whether CCR7 can activate G protein after having been internalized into endosomes via formation of CCR7-Gi/o-barr megaplexes, and to dissect out potential functions of said endosomal G protein signaling. To do this, we used CCL19 and CCL21, which stimulate G protein to the same extent but differ in their ability to promote barr recruitment and receptor internalization, with CCL19 being superior to CCL21. We found that CCL19 also promotes endosomal G protein activation to a greater extent than CCL21, and therefore, we specifically looked for proteins enriched by CCL19 in our APEX experiment. This led us to some Rho GTPase regulators that were differentially enriched by CCL19 and CCL21. We agree that there were other interesting effectors related to CCR7 biology identified in the APEX experiment such as EYA2, GRIP2, and EI24. However, those proteins were enriched similarly by CCL19 and CCL21 challenge, and thus, do not seem to be activated specifically at endosomes. Following the same argument, we also did not observe any difference in the activity of RhoA or Cdc42 when stimulated with CCL19 or CCL21, so we cannot conclude that these signaling proteins are activated specifically in endosomes. On the other hand, Rac1 was stimulated to a larger degree by CCL19 than CCL21, and its activity was inhibited by the Gi/o inhibitor PTX and the endocytosis inhibitors Dyngo-4a and PitStop2. CCR7-mediated Rac1 signaling was also inhibited by expression of a dominant negative dynamin mutant that inhibits receptor internalization, and Rac1 was not activated by an internalization-deficient CCR7-DS/T mutant. Finally, the involvement of Rac1 in CCR7-mediated chemotaxis of Jurkat T cells was also demonstrated. We believe that these findings together provide a strong basis for the claim that endosomal Gi/o protein signaling by CCR7 activates Rac1.
Following the reviewer’s suggestion, we have now included experiments to show that the activation of RhoA, Rac1, and Cdc42 by CXCR4 also can be detected by the NanoBiT biosensors (Fig. S7D-F). We have also added the appropriate references to the original studies where these biosensors were developed in the results section (first paragraph on page 8).
(6) At present, the studies in Figure 7 do not go beyond those in the previous Laufer et al study in which they showed blocking endocytosis affected Rac1 signalling. The authors could show that Rac1 signalling is from early endosomes to improve this, otherwise, it could be from the TGN as previously reported.
The major purpose of Figure 7 was to indirectly confirm findings from the HEK293 cell experiments and to tie them to physiological functions. Our experiments using Jurkat T-cells show that CCL19 promotes a stronger chemotactic response than CCL21 despite a similar Gi/o response. In addition, we showed that CCR7-mediated Gi/o activation, receptor endocytosis, as well as Rac1 activity, are required to drive chemotaxis. The Laufer et al. study did not investigate whether CCR7 activates G protein after having been internalized into endosomes via formation of CCR7-Gi/o-barr megaplexes, and thus, did not focus on functional outcomes of this signaling mechanism. Based on this, we believe our work provides new and valuable knowledge to the field.
Reviewer #2 (Public Review):
Summary:
This manuscript describes a comprehensive analysis of signalling downstream of the chemokine receptor CCR7. A comprehensive dataset supports the authors' hypothesis that G protein and beta-arrestin signalling can occur simultaneously at CCR7 with implications for continued signalling following receptor endocytosis.
We would like to thank the reviewer for helping with the manuscript. We agree on all points made and have now updated the manuscript accordingly.
Strengths:
The experiments are well controlled and executed, employing a wide range of assays using - in the main - CCR7 transfectants. Data are well presented, with the authors' claims supported by the data. The paper also has an excellent narrative which makes it relatively easy to follow. I think this would certainly be of interest to the readership of the journal.
We appreciate the positive assessment of strengths.
Weaknesses:
Since the authors show a differential enrichment of RhoGTPases by CCR7 stimulation with CCL19 versus CCL21, I think that they also need to show that the Gi/o coupling of HEK-293-CCR7-APEX2 cells to both CCL19 and CCL21 is not perturbed by the modification. Currently, the authors only show data for CCL19 signalling, which leaves the potential for a false negative finding in terms of CCL21 signalling being selectively impaired. This should be relatively easy to do and should strengthen the authors' conclusions.
We agree with the reviewer and have now included experiments to show that both CCL19- and CCL21-mediated CCR7-APEX2 stimulation leads to Gi/o activation (Fig. S4C). In addition, our proteomics experiments show strong effects of both CCL19 and CCL21 stimulation, which suggest that the receptor is activated by both ligands.
The authors conclude the discussion by suggesting that their findings highlight endosomal signalling as a general mechanism for chemokine receptors in cell migration. I think this is an overreach. The authors chose several studies of CXC chemokine receptors to support their argument that C-terminal truncation or mutation of the C-terminal phosphorylation sites impairs endocytosis and chemotaxis (refs 40-42). However, in some instances, e.g. at the related chemokine receptor CCR4, C-terminal removal of these sites impairs endocytosis but promotes chemotaxis (Nakagawa et al, 2014; Anderson et al, 2020). I therefore think that either the final statement needs to be tempered down or the counterargument discussed a little.
We appreciate the reviewer highlighting this point. We have now modified the concluding sentence from “Thus, the findings from our study highlight endosomal G protein signaling by chemokine receptors as a potential general mechanism that regulates key aspects of cell migration” to “Thus, the findings from our study highlight endosomal G protein signaling by some chemokine receptors as a potential mechanism that regulates key aspects of cell migration.” We hope that the tone of this sentence is now more appropriate.
References:
Anderson, C. A. et al. A degradatory fate for CCR4 suggests a primary role in Th2 inflammation. J Leukocyte Biol 107, 455-466 (2020).
Nakagawa, M. et al. Gain-of-function CCR4 mutations in adult T cell leukaemia/lymphoma. Journal of Experimental Medicine 211, 2497-2505 (2014).
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
(1) The results section is well written, although the introduction needs more information on what is known about CCR7 trafficking and endomembrane signaling. I understand this is because the authors wanted to focus on GPCR signaling, but the study will equally be of interest to researchers in the immunology and chemokine fields, and therefore more CCR7-focussed discussion in the introduction would be useful. Similarly, the discussion would benefit from more discussion of previous studies of CCR7 trafficking and endomembrane signaling (in particular the Laufer et al paper) to acknowledge that many of the findings within this paper verify previous studies.
We have now included additional immunology/endomembrane background information about CCR7 at the place where the receptor is introduced (first paragraph on page 3). We have also expanded our discussion of our work in relation to the Laufer et al. study (third paragraph on page 10).
(2) On page 5, the authors state that 'The response to chemokine stimulation was not observed in mock transfected HEK293 cells'. Figure S4D does not have a legend so it is difficult to see what they mean by mock transfected. Do they mean not transfecting with anything or not with the receptor? The better control would be transfecting the reporters but not the receptor. This may have been done, but the wording needs clarifying and S4D needs a legend.
Thanks for pointing this out. We believe the reviewer refers to Figure S2D and we have now highlighted/clarified the legend better. Mock transfected conditions refer to HEK293 cells transfected with the reporter, but not the receptor. This is written in the legend as “(D) Change in luminescence signal generated between SmBiT-barr1 and LgBiT-miniGi in response to 100 nM CCL19 or 100 nM CCL21 in mock transfected HEK293 cells (no CCR7)”, which we believe should be clear to the audience.
(3) The validation of the APEX2 receptor construct relies on a single assay with one ligand. The authors should show that the receptor expresses at the cell surface, is internalised normally, and that both ligands activate the receptor.
We have now included additional data to show that (1) the receptor is expressed at the cell surface, (2) CCR7-APEX2 recruits barr1 to the plasma membrane, (3) this association leads to barr1 translocation to the early endosomes as an indirect measurement of receptor internalization, and (4) both CCL19- and CCL21-stimulation inhibit forskolin-induced cAMP production (Fig. S4A-C, and described in the fifth paragraph on page 6).
(4) The APEX2 section is very short, especially as this is novel data. It lacks some important information, e.g. when the authors state that 'we identified a total of 579 proteins', is this in total for both ligands, separately or were some shared? More information on each ligand separately and combined would make this clearer.
We have now specified that the total number of proteins enriched in our APEX2 approach refers to cells stimulated with either CCL19 or CCL21 (third paragraph on page 7). Furthermore, we have included a Venn diagram in Fig. S5C to show how many proteins were enriched by CCL19 or CCL21 stimulation and how many of those were shared at different time points.
(5) The discussion would benefit from some further work. The current first two paragraphs just reiterate the introduction and don't discuss the current paper so could be removed completely. The Laufer et al study needs much more discussion as they report many of the findings of the current paper (signaling following endocytosis, Rac1 endomembrane signaling) five years ago. The APEX2 findings that are discussed, though interesting, are not followed up by further experimental evidence and there is little discussion of why the two ligands have different responses or what the physiological effects could be.
We appreciate the reviewer’s effort in helping with the discussion. To this end, we have now expanded our discussion of the mentioned paper further as suggested (third paragraph on page 10). We agree that the findings from our APEX experiment are interesting, but the focus of this study relates to proteins enriched specifically at endosomes. Several of the most enriched proteins did not show this localization bias, which is why these proteins were not further investigated.
Minor changes:
(1) The authors should remove the word 'recent' at the start of the first sentence of the third paragraph. Endosomal signaling by GPCRs was described 15 years ago so cannot really be seen as recent anymore.
We have now adjusted the manuscript accordingly.
(2) Tukey defaulted to Turkey in some places.
We thank the reviewer for pointing out these typos, which now have been corrected.
Reviewer #2 (Recommendations For The Authors):
Minor Points:
(1) ACKRs do not couple to G proteins so it is peculiar to see them in this table. I would limit the table to the conventional CCR1-10, CXCR1-6 and XCR1. The ligand for XCR1 is XCL1 which is absent from the table.
We have now modified the table accordingly.
(2) CCL19 (formerly known as ELC) has been long known to be a more efficacious and potent ligand in chemotaxis assays (Bardi et al, 2001). This earlier reference should be added to the citations in the preceding statement on page 10.
This is an important study showing that CCL19 is more efficacious than CCL21 in promoting chemotaxis, which has thus been known for decades. We have now included the reference accordingly (reference 59 in second paragraph on page 11).
(3) Figure 6, Panel Q. I think the legends for CCR7 and CCR7 delta ST might be flipped.
We thank the reviewer for pointing out this error. We have now corrected the figure panel.
(4) Figure S5 (or 5) might benefit from simple Venn diagrams showing the numbers of differentially enriched proteins following treatment with the two ligands at different time points.
We have included a Venn diagram in Fig. S5C to show how many proteins were enriched by CCL19 or CCL21 stimulation and how many of those were shared.
Reference:
Bardi, G., Lipp, M., Baggiolini, M. & Loetscher, P. The T cell chemokine receptor CCR7 is internalized on stimulation with ELC, but not with SLC. European Journal of Immunology 31, 3291-3297 (2001).
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Summary:
Understanding the mechanisms of how organisms respond to environmental stresses is a key goal of biological research. Assessment of transcriptional responses to stress can provide some insights into those underlying mechanisms. The researchers quantified traits, fitness, and gene expression (transcriptional) response to salinity stress (control vs stress treatments) for 130 accessions of rice (three replicates for each accession), which were grown in the field in the Philippines. This experimental design allowed for many different types of downstream analyses to better understand the biology of the system. These analyses included estimating the strength of selection imposed on transcription in each environment, evaluating possible trade-offs in gene expression, testing whether salinity induces transcriptional decoherence, and conducting various eQTL-type analyses.
Strengths:
The study provides an extensive analysis of gene expression responses to stress in rice and offers some insights into underlying mechanisms of salinity responses in this important crop system. The fact that the study was conducted under field conditions is a major plus, as the gene expression responses to soil salinity are more realistic than if the study was conducted in a greenhouse or growth chamber. The preprint is generally well-written and the methods and results are mostly well-described.
Weaknesses:
While the study makes good use of analyzing the dataset, it is not clear how the current work advances our understanding of gene regulatory evolution or plant responses to soil salinity generally. Overall, the results are consistent with other prior studies of gene expression and studies of selection across environmental conditions. Some of the framing of the paper suggests that there is more novelty to this study than there is in reality. That said, the results will certainly be useful for those working in rice and should be interesting to scientists interested in how gene expression responses to stress occur under field conditions. I detail other concerns I had about the preprint below:
The abstract on lines 33-35 illustrates some of my concerns about the overstatement of the novelty of the current study. For example, is it really true that the role of gene expression in mediating stress response and adaptation is largely unexplored? There have been numerous studies that have evaluated gene expression responses to stresses in a wide range of organisms. Perhaps, I am missing something critically different about this study. If so, I would recommend that the authors reword this sentence to clarify what gap is being filled by this study. Further, is it really the case that none of them have evaluated how the correlational structure of gene expression changes in response to stresses in plants, as implied in lines 263-265? Don't the various modules and PC analyses of gene expression get at this question?
We have re-worded these sentences, and highlighted the novelty of our work.
There were some places in the methods of the preprint that required more information to properly evaluate. For example, more information should be provided on lines 664-668 about how G, E, and GxE effects were established, especially since this is so central to this study. What programs/software (R? SAS? Other?) were used for these analyses? If R, how were the ANOVAs/models fit? What type of ANOVA was used? How exactly was significance determined for each term? Which effects were considered fixed and which were random? If the goal was to fit mixed models, why not use an approach like voom-limma (Law et al. 2014 Genome Biology)? More details should also be added to lines 688-709 about these analyses, including what software/programs were used for these analyses.
We have added more details in the methods. Also, although we could in principle use voom-limma to fit our mixed model, to be able to partition variance into G, E and G×E, we need to use the function fitExtractVarPartModel (from the package variancePartition), which requires all categorical variables to be modeled as random effects. Therefore, we couldn’t model environment as a fixed effect.
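To make the idea of partitioning expression variance into G, E and G×E concrete, here is a minimal sketch for a single transcript in a balanced design. It is not the random-effects model fitted by fitExtractVarPartModel: it swaps in a plain fixed-effects sum-of-squares decomposition, and the function name, data layout, and example values are all hypothetical.

```python
def partition_variance(data):
    """Split the total sum of squares of one transcript into G, E, GxE, residual.

    data[g][e] is a list of replicate expression values for genotype g in
    environment e. The design is assumed to be balanced (same number of
    replicates in every cell). Returns fractions of the total sum of squares.
    """
    genotypes = list(data)
    envs = list(data[genotypes[0]])
    n_rep = len(data[genotypes[0]][envs[0]])

    def mean(values):
        return sum(values) / len(values)

    grand = mean([v for g in genotypes for e in envs for v in data[g][e]])
    g_mean = {g: mean([v for e in envs for v in data[g][e]]) for g in genotypes}
    e_mean = {e: mean([v for g in genotypes for v in data[g][e]]) for e in envs}
    cell = {(g, e): mean(data[g][e]) for g in genotypes for e in envs}

    # Main effects: deviation of marginal means from the grand mean.
    ss_g = n_rep * len(envs) * sum((g_mean[g] - grand) ** 2 for g in genotypes)
    ss_e = n_rep * len(genotypes) * sum((e_mean[e] - grand) ** 2 for e in envs)
    # Interaction: what the cell mean adds beyond the two main effects.
    ss_gxe = n_rep * sum(
        (cell[g, e] - g_mean[g] - e_mean[e] + grand) ** 2
        for g in genotypes for e in envs
    )
    # Residual: replicate scatter around each cell mean.
    ss_res = sum(
        (v - cell[g, e]) ** 2 for g in genotypes for e in envs for v in data[g][e]
    )
    total = ss_g + ss_e + ss_gxe + ss_res
    return {"G": ss_g / total, "E": ss_e / total,
            "GxE": ss_gxe / total, "residual": ss_res / total}


# Hypothetical transcript whose reaction norms cross between environments.
crossing = {
    "accession_A": {"control": [1.0, 1.0], "salt": [3.0, 3.0]},
    "accession_B": {"control": [3.0, 3.0], "salt": [1.0, 1.0]},
}
fractions = partition_variance(crossing)
print(fractions)
```

In this constructed example the genotype and environment means are identical, so all of the variation is assigned to the G×E term, which is the pattern a crossing reaction norm produces.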
One thing that I found a bit confusing throughout was the intermixing of different terms and types of selection. In particular, there seemed to be some inconsistencies with the usage of quantitative genetics terms for selection (e.g. directional, stabilizing) vs molecular evolution terms for selection (e.g. positive, purifying). I would encourage the authors to think carefully about what they mean by each of these terms and make sure that those definitions are consistently applied here.
We have defined the selection terms used in the study and used these terms consistently throughout the manuscript.
It would be useful to clarify the reasons for the inherent bias in the detection of conditional neutrality (CN) and antagonistic pleiotropy (AP; Lines 187-196). It is also not clear to me what the authors did to deal with the bias in terms of adjusting P-value thresholds for CN and AP the way it is currently written. Further, I found the discussion of antagonistic pleiotropy and conditional neutrality to be a bit confusing for a couple of reasons, especially around lines 489-491. First of all, does it really make sense to contrast gene expression versus local adaptation, when lots of local adaptation likely involves changes in gene expression? Second, the implication that antagonistic pleiotropy is more common for local adaptation than the results found in this study seems questionable. Conditional neutrality appears to be more common for local adaptation as well: see Table 2 of Wadgymar et al. 2017 Methods in Ecology and Evolution. That all said, it is always difficult to conclude that there are no trade-offs (antagonistic pleiotropy) for a particular locus, as the detecting trade-offs may only manifest in some years and not others and can require large sample sizes if they are subtle in effect.
We have now explained the cause of the inherent bias in the detection of CN, and also elaborated on how we deal with this bias. Also, we have edited our discussion and added relevant citations to indicate that both conditional neutrality and antagonistic pleiotropy can lead to local adaptation, and added the caveat regarding detecting antagonistic pleiotropy.
Reviewer #2 (Public Review):
The authors investigate the gene expression variation in a rice diversity panel under normal and saline growth conditions to gain insight into the underlying molecular adaptive response to salinity. They present a convincing case to demonstrate that environmental stress can induce selective pressure on gene expression, which is in agreement with their earlier study (Groen et al, 2020). The data seems to be a good fit for their study and overall the analytic approach is robust.
(1) The work started by investigating the effect of genotype and their interaction at each transcript level using 3'-end-biased mRNA sequencing, and detecting a widespread G×E effect. Later, using the total filled grain number as a proxy of fitness, they estimated the strength of selection on each transcript and reported stronger selective pressure in a saline environment. However, this current framework relies on precise estimation of fitness and, therefore can be sensitive to the choice of fitness proxy.
We now acknowledge this caveat in the discussion.
(2) Furthermore, the authors decomposed the genetic architecture of expression variation into cis- and trans-eQTL in each environment separately and reported more unique environment-specific trans-eQTLs than cis-. The relative contribution of cis- and trans-eQTL depends on both the abundance and effect size. I wonder why the latter was not reported while comparing these two different genetic architectures. If the authors were to compare the variation explained by these two categories of eQTL instead of their frequency, would the inference that trans-eQTLs are primarily associated with expression variation still hold?
We have now also reported the effect sizes for both cis- and trans-eQTLs in the two environments and shown that trans-eQTLs have larger effect sizes than cis-eQTLs, indicating that they explain a higher proportion of the variation in transcript abundance in the two environments.
(3) Next, the authors investigated the relationship between cis- and trans-eQTLs at the transcript level and revealed an excess of reinforcement over the compensation pattern. Here, I struggle to understand the motivation for testing the relationship by comparing the effect of cis-QTL with the mean effect of all trans-eQTLs of a given transcript. My concern is that taking the mean can diminish the effect of small trans-eQTLs potentially biasing the relationship towards the large-effect eQTLs.
We wanted to estimate compensating vs reinforcing effects, which essentially entails identifying genes whose cis- and trans-effects act in opposing directions. To obtain the total trans-effect, we took the mean effect of all trans-eQTLs of a transcript. This mean was used only to identify the compensating/reinforcing genes; although taking the mean diminishes the contribution of small-effect trans-eQTLs, it was not used in downstream analyses.
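The classification described in this response can be sketched roughly as follows. This is a minimal illustration, not the authors' script: the function name `classify_cis_trans`, the gene names, and the effect values are hypothetical, and the sketch assumes that a transcript is labelled by comparing the sign of its cis effect with the sign of the mean trans effect.

```python
from statistics import mean


def classify_cis_trans(cis_effect, trans_effects):
    """Label a transcript from the signs of its cis effect and mean trans effect.

    Hypothetical helper: same sign -> reinforcing, opposite sign -> compensating.
    """
    if not trans_effects:
        return "no trans-eQTL"
    total_trans = mean(trans_effects)  # mean effect over all trans-eQTLs
    product = cis_effect * total_trans
    if product > 0:
        return "reinforcing"
    if product < 0:
        return "compensating"
    return "undetermined"  # one of the two effects is exactly zero


# Made-up effect sizes, for illustration only.
examples = {
    "geneA": (0.8, [0.3, 0.1, 0.2]),
    "geneB": (0.5, [-0.6, -0.1]),
    "geneC": (-0.4, [0.05, 0.9, -0.02]),
}
labels = {gene: classify_cis_trans(cis, trans) for gene, (cis, trans) in examples.items()}
print(labels)
```

In `geneC` the single large trans-eQTL determines the sign of the mean, which is the bias towards large-effect eQTLs that the reviewer raises.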
Reviewer #3 (Public Review):
In this work, the authors conducted a large-scale field trial of 130 indica accessions in normal vs. moderate salt stress conditions. The experiment consists of 3 replicates for each accession in each treatment, making it 780 plants in total. Leaf transcriptome, plant traits, and final yield were collected. Starting from a quantitative genetics framework, the authors first dissected the heritability and selection forces acting on gene expression. After summarizing the selection force acting on gene expression (or plant traits) in each environment, the authors described the difference in gene expression correlation between environments. The final part consists of eQTL investigation and categorizing cis- and trans-effects acting on gene expression.
Building on the group's previous study and using a similar methodology (Groen et al. 2020, 2021), the unique aspect of this study is in incorporating large-scale empirical field works and combining gene expression data with plant traits. Unlike many systems biology studies, this study strongly emphasizes the quantitative genetics perspective and investigates the empirical fitness effects of gene expression data. The large amounts of RNAseq data (one sample for each plant individual) also allow heritability calculation. This study also utilizes the population genetics perspective to test for traces of selection around eQTL. As there are too many genes to fit in multiple regression (for selection analysis) and to construct the G-matrix (for breeder's equation), grouping genes into PCs is a very good idea.
Building on large amounts of data, this study conducted many analyses and described some patterns, but a central message or hypothesis would still be necessary. Currently, the selection analysis, transcript correlation structure change, and eQTL parts seem to be independent. The manuscript currently looks like a combination of several parallel works, and this is reflected in the Results, where each part has its own short introduction (e.g., 185-187, 261-266, 349-353). It would be great to discuss how these patterns observed could be translated to larger biological insights. On a related note, since this and the previous studies (focusing on dry-wet environments) use a similar methodology, one would also wonder what the conclusions from these studies would be. How do they agree or disagree with each other?
We acknowledge that the manuscript currently presents some analyses in a somewhat independent manner. Although it would be ideal to have a central hypothesis/message, our study is meant to broadly outline the various responses and fitness effects of salinity stress in rice. Throughout the manuscript, we have also included comparisons between our findings and those of our previous studies on drought stress to highlight any consistent themes or novel insights.
Many analyses were done separately for each environment, and results from these two environments are listed together for comparison. Especially for the eQTL part, no specific comparison was discussed between the two environments. It would be interesting to consider whether one could fit the data in more coherent models specifically modeling the X-by-environment effects, where X might be transcripts, PCs, traits, transcript-transcript correlation, or eQTLs.
We do plan to consider fitting models that explicitly incorporate X-by-environment interactions to provide a more detailed understanding of the genetics of plasticity between the two environments, but it is beyond the scope of this paper. This will be explored in a separate report.
As stated, grouping genes into PCs is a good idea, but although in theory, the PCs are orthogonal, each gene still has some loadings on each PC (ie. each PC is not controlled by a completely different set of genes). Another possibility is to use any gene grouping method, such as WGCNA, to group genes into modules and use the PC1 of each module. There, each module would consist of completely different sets of genes, and one would be more likely to separate the biological functions of each module. I wonder whether the authors could discuss the pros and cons of these methods.
We recognize that individual genes can contribute to multiple PCs, and this is precisely why we chose PCA over WGCNA, in which one gene can belong to only one module. Our aim was to recognize all biological processes that could be under selection in either environment, and since one gene can be involved in several different processes, we wanted to identify the contribution of these genes to different processes, which can be done effectively with PCA.
Reviewer #4 (Public Review):
The manuscript examines how patterns of selection on gene expression differ between a normal field environment and a field environment with elevated salinity based on transcript abundances obtained from leaves of a diverse panel of rice germplasm. In addition, the manuscript also maps expression QTL (eQTL) that explains variation in each environment. One highlight from the mapping is that a small group of trans-mapping regulators explains some gene expression variation for large sets of transcripts in each environment. The overall scope of the datasets is impressive, combining large field studies that capture information about fecundity, gene expression, and trait variation at multiple sites. The finding related to patterns indicating increased LD among eQTLs that have cis-trans compensatory or reinforcing effects is interesting in the context of other recent work finding patterns of epistatic selection. However, other analyses in the manuscript are less compelling or do not make the most of the value of collected data. Revisions are also warranted to improve the precision with which field-specific terminology is applied and the language chosen when interpreting analytical findings.
Selection of gene expression:
One strength of the dataset is that gene expression and fecundity were measured for the same genotypes in multiple environments. However, the selection analyses are largely conducted within environments. The addition of phenotypic selection analyses that jointly analyze gene expression across environments and or selection on reaction norms would be worthwhile.
We do plan to consider fitting models that explicitly incorporate G×E interactions to provide a more detailed understanding of the genetics of plasticity between the two environments, but it is beyond the scope of this paper. This will be explored in a separate report.
Gene expression trade-offs:
The terminology and possibly methods involved in the section on gene expression trade-offs need amendment. I specifically recommend discontinuing reference to the analysis presented as an analysis of antagonistic pleiotropy (rather than more general trade-offs) because pleiotropy is defined as a property of a genotype, not a phenotype. Gene expression levels are a molecular phenotype, influenced by both genotype and the environment. By conducting analyses of selection within environments as reported, the analysis does not account for the fact that the distribution of phenotypic values, the fitness surface, or both may differ across environments. Thus, this presents a very different situation than asking whether the genotypic effect of a QTL on fitness differs across environments, which is the context in which the contrasting terms antagonistic pleiotropy and conditional neutrality have been traditionally applied. A more interesting analysis would be to examine whether the covariance of phenotype with fitness has truly changed between environments or whether the phenotypic distribution has just shifted to a different area of a static fitness surface.
We recognize that pleiotropy is a property of a genotype, not a phenotype, but since our phenotype (gene expression) is strongly coupled with the genotype, we chose to refer to these trade-offs as antagonistic pleiotropy. That being said, we did test whether the covariance of gene expression with phenotype varies significantly between environments, and found that it does.
Biological processes under selection / Decoherence: PCs are likely not the most ideal way to cluster genes to generate consolidated metrics for a selection gradient analysis. Because individual genes will contribute to multiple PCs, the current fractional majority-rule method applied to determine whether a PC is under direct or indirect selection for increased or decreased expression comes across as arbitrary and with the potential for double-counting genes. A gene co-expression network analysis could be more appropriate, as genes only belong to one module and one can examine how selection is acting on the eigengene of a co-expression module. Building gene co-expression modules would also provide a complementary and more concrete framework for evaluating whether salinity stress induces "decoherence" and which functional groups of genes are most impacted.
We recognize that individual genes can contribute to multiple PCs, and this is precisely why we chose PCA over WGCNA, in which one gene can belong to only one module. Our aim was to recognize all biological processes that could be under selection in either environment, and since one gene can be involved in several different processes, we wanted to identify the contribution of these genes to different processes, which can be done effectively with PCA. However, as pointed out by the reviewer, our PCs did contain a contribution (even if negligible) from each gene, so to identify the ‘primary’ biological processes represented by the PCs, we chose the majority rule. As for testing decoherence, we agree that a co-expression module analysis would have provided additional support for the specific test performed in our manuscript, but since it would only be additional support, we chose not to add it to the manuscript.
Nevertheless, based on the recommendation of the reviewer(s), we did perform a WGCNA analysis and found a total of 14 and 13 modules in normal and saline conditions, respectively, of which 0 and 2 modules (with no significant GO enrichment) were under directional selection. This supports our reasoning that a module-based approach could miss processes under selection.
Selection of traits:
Having paired organismal and molecular trait data is a strength of the manuscript, but the organismal trait data are underutilized. The manuscript as written only makes weak indirect inferences based on GO categories or assumed gene functions to connect selection at the organismal and molecular levels. Stronger connections could be made for instance by showing a selection of co-expression module eigengene values that are also correlated with traits that show similar patterns of selection, or by demonstrating that GWAS hits for trait variation co-localize to cis-mapping eQTL.
We did perform a GWAS for all the traits collected in both the normal and saline environments, and only found significant hits for fecundity (in both the normal and saline environments) and chlorophyll_a content (in the saline environment). But these regions did not overlap with any candidate genes or cis-mapping eQTL. Hence we choose to mention it in the manuscript. Additionally, using the WGCNA modules, we found that the only two modules under selection in the saline environment were not significantly correlated with any of the traits measured.
Genetic architecture of gene expression variation:
The descriptive statistics of the eQTL analysis summarize counts of eQTLs observed in each environment, but these numbers are not broken down to the molecular trait level (e.g., what are the median and range of cis- and trans-eQTLs per gene). In addition, genetic architecture is a combination of the numbers and relative effect sizes of the QTLs. It would be useful to provide information about the relative distributions of phenotypic variance explained by the cis- vs. trans- eQTLs and whether those distributions vary by environment. The motivation for examining patterns of cis-trans compensation specifically for the results obtained under high salinity conditions is unclear to me. If the lines sampled have predominantly evolved under low salinity conditions and the hypothesis being evaluated relates to historical experience of stabilizing selection, then my intuition is that evaluating the eQTL patterns under normal conditions provides the more relevant test of the hypothesis.
We have added the median number of eQTLs per gene in each environment. Additionally, we recognize that genetic architecture is a combination of the numbers and effect sizes of QTLs, and we have added information regarding the effect sizes of eQTLs by type and by environment, as recommended by another reviewer. We did explore the distributions of phenotypic variance explained by the cis- vs. trans-eQTLs as recommended here, and found that trans-eQTLs explain more phenotypic variance than cis-eQTLs in both environments and that the distribution of either type of eQTL does not vary by environment. We have chosen not to add this to the main text due to space limitations. Lastly, we examined the patterns of cis-trans compensation/reinforcement under both normal and salinity conditions and have compared and contrasted the results from both in the main text.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Lines 126: I would recommend citing those who originally developed the 3' end targeted RNA sequencing methods (e.g. Meyer et al 2011 Molecular Ecology).
We have cited the recommended paper.
Lines 128-130: It would be useful to include a description here of what models were fit to the data to partition out G, E, and GxE effects.
Due to space limitations, we have added only a brief sentence to this effect.
Line 139: I would suggest changing "found little" to "no" since the test was not significant.
The sentence has been modified to say no evidence.
Line 313: I think you mean directional selection instead of positive selection.
We have corrected the text.
Lines 362-363: Would the authors also expect an enrichment of reinforcing genes for most scenarios where there has been divergent selection, such as local adaptation among populations?
Based on our hypothesis, we would indeed expect an enrichment of reinforcing genes for scenarios of local adaptation where different alleles are maintained in different populations due to local adaptation.
Reviewer #3 (Recommendations For The Authors):
Figures 1d-e are not mentioned in the Results.
The figures have been referenced in appropriate places.
Lines 41-45: Terms such as reinforcement and compensation need to be explained in this specific context. Also "different selection regimes" is a bit broad and vague.
Due to the word-count limit, we have chosen not to elaborate on the terms reinforcement and compensation in the abstract (these are commonly used in the literature, and we have also defined them in the main text). Additionally, we now explicitly state the selection pressures associated with cis- and trans-eQTLs.
Table 1: Please explain S and C in the footnote.
We have added the recommended footnote.
Figures: Some panel labels (a, b, c...) are mingled with the graphs.
We have re-made our figures such that the panel labels do not mingle with the plots.
Lines 588-591: font.
Modified
Lines 620-633: Please describe how these RNAseq libraries were allocated/pooled into different sequencing lanes to avoid potential batch effects among sequencing lanes.
The sequencing was performed on the same Illumina NextSeq 500 machine and we have added the sequencing libraries/pool plan in the methods (lines 688-689).
Lines 690-692: At the beginning of this paragraph, it was mentioned that the un-standardized coefficients were estimated. But here, it seems like the transcript data were already standardized in the data preparation step. What do lines 687-688 refer to? Further standardizing those estimated coefficients so that the whole distribution has mean=0 and sd=1?
Thank you for pointing out our oversight. We checked our scripts and data preparation did not include transcript standardization, and we have removed the above line from the manuscript.
Lines 705-711: Please explain why assigning the positive/negative selection status for each gene is important. "Positive selection" here is defined as genes whose increased expression also increases fitness, but traditionally positive selection was defined as "the derived state is favored over the ancestral state". For a gene whose ancestral expression is high but lower expression increases fitness in this experiment, could we also say this gene is under positive selection? Given that we don't know the ancestral state here, maybe the authors could explain whether this definition is necessary. Also, given that many genes positively or negatively regulate each other in a pathway, it is also unclear whether it is necessary to assign the positive/negative status for a PC using the majority rule (lines 710-711).
We have now defined the different selection terms with respect to our study and use them consistently throughout the manuscript.
Lines 711-715: If I understand correctly, PCs were used as traits, and by definition PCs should all be orthogonal. Is this section saying only retaining PCs whose correlation < 0.6 with each other? What is the rationale?
PCA was performed on transcript abundance, and the resulting orthogonal PCs that each explained over 0.5% of the variance were all retained for selection analyses.
We also performed selection analyses on the functional traits measured in the field, but since these functional traits are correlated (and as such would not satisfy the independent variable requirement of regression analyses), we retained only those functional traits that had a Pearson correlation coefficient < 0.6 with each other.
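The trait-filtering step described above can be illustrated with a small sketch. The response does not say which trait of a correlated pair was kept, so the greedy, order-dependent rule below, the use of the absolute correlation, the helper names `pearson` and `filter_correlated`, and the trait values are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def filter_correlated(traits, threshold=0.6):
    """Greedily keep traits whose |r| with every already-kept trait is below threshold."""
    kept = []
    for name, values in traits.items():
        if all(abs(pearson(values, traits[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Made-up trait measurements for five plants.
traits = {
    "height": [1.0, 2.0, 3.0, 4.0, 5.0],
    "biomass": [1.1, 2.1, 2.9, 4.2, 5.0],      # nearly collinear with height
    "chlorophyll": [2.0, 1.0, 3.0, 1.0, 2.0],  # uncorrelated with height
}
kept = filter_correlated(traits)
print(kept)
```

Because the rule is order-dependent, listing `biomass` before `height` would retain `biomass` instead; any real analysis would need an explicit criterion for that choice.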
Line 729: Please briefly describe what CLIP is doing.
We have added the required description.
Lines 736-741: The accession numbers do not add up to 125.
Thank you for catching our oversight. We have edited the text, and the numbers now add up to 125.
Line 796: Please remind readers where these 247k SNPs come from. Supposedly all accessions have been whole-genome sequenced, so the total number of SNPs should be larger than this.
We have detailed how the SNPs were obtained and processed in the lines preceding this. Indeed, the initial number of SNPs was much larger, but the stringent cutoffs and linkage disequilibrium pruning reduced our dataset to about 247k SNPs.
Lines 154-160: This is a bit confusing. The authors first mentioned, for the raw selection differentials, the mean and variance differ between environments, meaning they are misleading (why?). The next sentence then says non-standardized selection differentials will be used.
The mean and variance for transcript abundances vary between the two environments. Because traits are usually measured in different scales, it is recommended to standardize trait values using variance or mean before estimating selection coefficients. Multiplying this variance (or mean) standardized selection differential with heritability gives the expected response to selection in standard deviation (or mean) units. But if the trait variance (or mean) varies between traits or environments, it leads to a conflation between the standardized selection differential and trait variance (or mean), which can be misleading. So to avoid this, and given that our traits (transcript abundance in this case) were all measured on the same scale, we chose to not standardize our trait values and estimated raw selection differentials.
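The distinction drawn here can be made concrete with a toy calculation. The sketch below assumes the usual definition of the raw selection differential as the covariance between a trait and relative fitness, and of the variance-standardized differential as that covariance divided by the trait's standard deviation; the function name and all numbers are hypothetical.

```python
from statistics import mean, pstdev


def selection_differentials(trait, fitness):
    """Return (raw, variance-standardized) selection differentials.

    raw = cov(trait, relative fitness); standardized = raw / sd(trait).
    """
    w_bar = mean(fitness)
    w = [f / w_bar for f in fitness]  # relative fitness, mean 1
    z_bar = mean(trait)
    raw = sum((z - z_bar) * (wi - 1.0) for z, wi in zip(trait, w)) / len(trait)
    return raw, raw / pstdev(trait)


# Two made-up environments with the same raw differential
# but a fourfold difference in trait variance.
raw_a, std_a = selection_differentials([1, 2, 3, 4, 5], [10, 20, 30, 40, 50])
raw_b, std_b = selection_differentials([-1, 1, 3, 5, 7], [20, 25, 30, 35, 40])
print(raw_a, raw_b)  # equal raw differentials
print(std_a, std_b)  # standardized values differ only because the variance differs
```

In this toy case the two environments would look different on the standardized scale purely because of the change in trait variance, which is the conflation the response describes.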
Figure 1 c-e: Please explain how the horizontal axis values were obtained. Is it assuming these selection differentials have a normal distribution of mean=0 & sd=1?
Yes, the horizontal axis represents the theoretical quantiles of the selection differentials, assuming they follow a normal distribution with mean=0 and sd=1. This has been added to the figure legend.
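For readers unfamiliar with quantile-quantile plots, the horizontal-axis values can be obtained along these lines. This is a minimal sketch under stated assumptions: the plotting positions (i - 0.5)/n are one common convention and may differ from the one used for the figure, and the function names and example values are hypothetical.

```python
from statistics import NormalDist


def theoretical_quantiles(n):
    """Standard normal quantiles at plotting positions (i - 0.5) / n."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


def qq_points(values):
    """Pair each sorted observed value with its theoretical normal quantile."""
    return list(zip(theoretical_quantiles(len(values)), sorted(values)))


# Made-up selection differentials for five transcripts.
points = qq_points([0.12, -0.05, 0.01, 0.30, -0.20])
for theoretical, observed in points:
    print(f"{theoretical:+.3f}  {observed:+.3f}")
```

Points that fall on a straight line indicate that the observed differentials are approximately normally distributed; departures in the tails indicate an excess or deficit of extreme values.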
Line 162-168: Please clarify this part. What does “general trend towards stronger positive compared to negative selection on gene expression” mean? Does it mean the whole distribution of S is significantly different from 0, the difference in the number of genes in the S>0 vs S<0 category, or the a-bit-higher median |S| in the S>0 vs S<0 category? If it is the last one, are the small differences biologically meaningful (0.053 vs. 0.047 for control & 0.051 vs. 0.050 for salt conditions), given that the authors defined |S|<0.1 as neutral?
By “general trend towards stronger positive compared to negative selection on gene expression”, we mean that more transcripts were under positive directional selection as compared to negative directional selection. We have also clarified this in the text now.
Line 177-178: This sentence implies disruptive selection is more important than stabilizing selection in the saline environment, but the test was not significant (line 176).
Although there was no significant difference in the magnitude of stabilizing vs disruptive selection within the saline environment, the number of transcripts experiencing stronger disruptive selection in the saline condition was greater than the number of transcripts experiencing disruptive selection in the normal condition. Thus, when comparing between conditions, disruptive selection plays an important role in the saline condition.
Line 188-190: How CN vs. AP was statistically defined was not mentioned in the Methods section.
We have added this to the main text within the Results section.
Line 203-214: How do these results fit with the previous observations that almost all transcripts have significant heritability?
Although we do find that all but three transcripts have a significant genetic effect (and thus have significant heritability), the median broad-sense heritability for the 51 antagonistically pleiotropic genes is 0.23. Given that, and since our sample size is n=130, we would only be able to detect SNPs that regulate gene expression with a large effect size. Additionally, we used a very stringent criterion (FDR < 0.001) to define eQTLs. These two factors in combination could explain why we were unable to detect significant eQTLs for AP genes.
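To illustrate how strict an FDR < 0.001 cutoff is, the sketch below applies a Benjamini-Hochberg adjustment to a handful of made-up p-values. The response does not state which FDR procedure was used, so Benjamini-Hochberg is an assumption here, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up association p-values for four SNP-transcript pairs.
pvals = [0.00001, 0.0004, 0.03, 0.5]
adjusted = benjamini_hochberg(pvals)
significant = [q < 0.001 for q in adjusted]
print(adjusted)
print(significant)
```

An association with a nominal p-value of 0.03, which would pass a conventional 0.05 threshold, is far from significant at this cutoff, consistent with the response's point that only large-effect eQTLs would be detected.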
Line 246-250: Please explain why the current conclusion would be opposite from the previous study. Supposedly the PCA, G matrix, and breeder’s equation were done for each environment separately. It makes sense that the G matrix and response to selection could be different between saline and drought treatments, but for the control treatments in the two studies, do they still differ? Why? Also in Table S7, it would be nice to show the % variation explained by each PC.
Although our two studies had largely overlapping samples, about 20% of the samples were unique to each study. Additionally, although both studies were performed at the same site, we found significant temporal differences in gene expression due to micro-environmental differences. Both of these factors can lead to changes in direct and indirect selection and in the response to selection, and we are examining these differences as part of a separate study. We also highlight these caveats in our discussion.
Information on the percent variance explained by each PC is given in Table S5.
Figure 2b: The vertical axis was labeled as “selection gradient”, but I think the responses to selection (D, I, T) have different units.
We have re-labeled the vertical axis as “selection”.
Reviewer #4 (Recommendations For The Authors):
The manuscript mixes terminology for selection from quantitative genetics with that from population genetics. This is problematic, and the adjectives positive and negative should be replaced as descriptors of selection by instead rewording, for example, positive directional selection as directional selection for higher transcript abundance.
Lines 193-196: The phrasing here reads as if the selection is solely acting on the presence/absence of expression rather than on quantitative variation in expression. During revision, it would be worth considering including an analysis of genes that parses genes that show the presence/absence of variation of expression within or across environments separately from genes that are expressed to non-trivial levels in both environments.
We have modified the sentence in question now. Also, we pre-processed RNA-seq data to remove all transcripts with low expression signals (sigma signal < 20), and further retained only transcripts that had non-trivial expression in at least 10% of the population, which we believe represents presence/absence of variation of expression within or across environments.
Lines 216-231: Is this analysis solely for directional selection? Not clear since previous sections examined both directional and stabilizing selection.
Yes, we performed this analysis for only directional selection, and have clarified this in the text too.
Lines 224-226: The meaning of this sentence is unclear and should be written more concretely.
We have rephrased the sentence to make it clearer.
Lines 232-241: The description of the scientific logic here could be read as implying that genes interacting in networks are the sole source of indirect selection. I recommend revising the language to indicate this cause is one of several potential causes.
We have reworded the sentence to indicate that selection acting on interacting genes is just one of several potential causes of indirect selection.
The strength of the conclusions of the decoherence analysis should be evaluated in light of caveats with such analyses (see Cai and Des Marais New Phytologist 2023).
We have added the caveat with relevant citation in the manuscript.
Rename this section as "Selection on Organismal Traits", as the previous sections have also been investigating selection on traits, just molecular traits.
We have renamed the section as recommended.
Lines 314-318: Rewrite for clarity. Most environments select for an optimal phenotype; it is just the case here that the phenotypic distribution in the high salinity environment overlaps with the optimum.
We have rephrased and clarified the statement.
Lines 343-345: Rephrase to "These results indicate that natural variation in gene regulation under..."
Rephrased.
Line 354: "most" reads as too strong a descriptor here if the majority is ~60%.
We have reworded the sentence to read “more than half”.
Lines 359-361: It is unclear to me how this interpretation follows from the above analysis.
We have reworded the sentence so that the claim follows our analysis.
Line 372: Is the expectation here more specifically one of epistatic selection? Other processes could stochastically lead to the genetic fixation of compensatory/reinforcing variants, but I think only epistasis for fitness would cause the interesting patterns of LD observed.
The expectation here is that certain cis and trans variants only exist to compensate/reinforce, potentially through epistasis. We have clarified this in the text.
Line 405: Change "adaptive organismal responses of organisms" to "organismal responses." As written, the sentence reads as being about plasticity rather than evolutionary responses, which are by populations, not organisms. None of the analyses included in the manuscript specifically test for adaptive plasticity.
Rephrased.
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
The conserved AAA-ATPase PCH-2 has been shown in several organisms including C. elegans to remodel classes of HORMAD proteins that act in meiotic pairing and recombination. In some organisms the impact of PCH-2 mutations is subtle but becomes more apparent when other aspects of recombination are perturbed. Patel et al. performed a set of elegant experiments in C. elegans aimed at identifying conserved functions of PCH-2. Their work provides such an opportunity because in C. elegans meiotically expressed HORMADs localize to meiotic chromosomes independently of PCH-2. Work in C. elegans also allows the authors to focus on nuclear PCH-2 functions as opposed to cytoplasmic functions also seen for PCH-2 in other organisms.
The authors performed the following experiments:
(1) They constructed C. elegans animals with SNPs that enabled them to measure crossing over in intervals that cover most of four of the six chromosomes. They then showed that double crossovers, which were common on most of the four chromosomes in wild-type, were absent in pch-2. They also noted shifts in crossover distribution in the four chromosomes.
(2) Based on the crossover analysis and previous studies they hypothesized that PCH-2 plays a role at an early stage in meiotic prophase to regulate how SPO-11 induced double-strand breaks are utilized to form crossovers. They tested their hypothesis by performing ionizing irradiation and depleting SPO-11 at different stages in meiotic prophase in wild-type and pch-2 mutant animals. The authors observed that irradiation of meiotic nuclei in zygotene resulted in pch-2 mutants having a larger number of nuclei with 6 or more crossovers (as measured by COSA-1 foci) compared to wild type. Consistent with this observation, SPO-11 depletion, starting roughly in zygotene, also resulted in pch-2 nuclei having an increase in 6 or more COSA-1 foci compared to wild type. The increased number at this time point appeared beneficial because a significant decrease in univalents was observed.
(3) They then asked if the above phenotypes correlated with the localization of MSH-5, a factor that stabilizes crossover-specific DNA recombination intermediates. They observed that pch-2 mutants displayed an increase in MSH-5 foci at early times in meiotic prophase and an unexpectedly higher number at later times. They conclude based on the differences in early MSH-5 localization and the SPO-11 and irradiation studies that PCH-2 prevents early DSBs from becoming crossovers and early loading of MSH-5. By analyzing different HORMAD proteins that are defective in forming the closed conformation acted upon by PCH-2, they present evidence that MSH-5 loading was regulated by the HIM-3 HORMAD.
(4) They performed a crossover homeostasis experiment in which DSB levels were reduced. The goal of this experiment was to test if PCH-2 acts in crossover assurance. Interestingly, in this background PCH-2 negative nuclei displayed higher levels of COSA-1 foci compared to PCH-2 positive nuclei. This observation and a further test of the model suggested that "PCH-2's presence on the SC prevents crossover designation."
(5) Based on their observations indicating that early DSBs are prevented from becoming crossovers by PCH-2, the authors hypothesized that the DNA damage kinase CHK-2 and PCH-2 act to control how DSBs enter the crossover pathway. This hypothesis was developed based on their finding that PCH-2 prevents early DSBs from becoming crossovers and previous work showing that CHK-2 activity is modulated during meiotic recombination progression. They tested their hypothesis using a mutant synaptonemal complex component that maintains high CHK-2 activity that cannot be turned off to enable crossover designation. Their finding that the pch-2 mutation suppressed the crossover defect (as measured by COSA-1 foci) supports their hypothesis.
Based on these studies the authors provide convincing evidence that PCH-2 prevents early DSBs from becoming crossovers and controls the number and distribution of crossovers to promote a regulated mechanism that ensures the formation of obligate crossovers and crossover homeostasis. As the authors note, such a mechanism is consistent with earlier studies suggesting that early DSBs could serve as "scouts" to facilitate homolog pairing or to coordinate the DNA damage response with repair events that lead to crossing over. The detailed mechanistic insights provided in this work will certainly be used to better understand functions for PCH-2 in meiosis in other organisms. My comments below are aimed at improving the clarity of the manuscript.
We thank the reviewer for their concise summary of our manuscript and their assessment of our work as “convincing” and providing “detailed mechanistic insight.”
Comments
(1) It appears from reading the Materials and Methods that the SNPs used to measure crossing over were obtained by mating Hawaiian and Bristol strains. It is not clear to this reviewer how the SNPs were introduced into the animals. Was crossing over measured in a single animal line? Were the wild-type and pch-2 mutations made in backgrounds that were isogenic with respect to each other? This is a concern because it is not clear, at least to this reviewer, how much of an impact crossing different ecotypes will have on the frequency and distribution of recombination events (and possibly the recombination intermediates that were studied).
We have clarified these issues in the Materials and Methods of our updated preprint. The control and pch-2 mutants were isogenic in either the Bristol or Hawaiian backgrounds. Control lines were the original Bristol and Hawaiian lines and pch-2 mutants were originally made in the Bristol line and backcrossed at least 3 times before analysis. Hawaiian pch-2 mutants were made by backcrossing pch-2 mutants at least 8 times to the Hawaiian background and verifying the presence of Hawaiian SNPs on all chromosomes tested in the recombination assay. To perform the recombination assays, these lines were crossed to generate the relevant F1s.
(2) The authors state that in pch-2 mutants there was a striking shift of crossovers (line 135) to the PC end for all of the four chromosomes that were tested. I looked at Figure 1 for some time and felt that the results were more ambiguous. Map distances seemed similar at the PC end for wildtype and pch-2 on Chrom. I. While the decrease in crossing over in pch-2 appeared significant for Chrom. I and III, the results for Chrom. IV, and Chrom. X. seemed less clear. Were map distances compared statistically? At least for this reviewer the effects on specific intervals appear less clear and without a bit more detail on how the animals were constructed it's hard for me to follow these conclusions.
We hope that the details added above make the results of these assays clearer. Map distances were compared statistically and the differences did not reach statistical significance, except where indicated. While we agree that the comparisons between control animals and pch-2 mutants may seem less clear for individual chromosomes, we argue that more general, consistent patterns emerge when multiple chromosomes are analyzed. Indeed, this is why we expanded our recombination analysis beyond Chromosome III and the X Chromosome, as reported in Deshong, 2014. We have edited this sentence to: “Moreover, there was a striking and consistent shift of crossovers to the PC end of all four chromosomes tested.”
(3) Figure 2. I'm curious why non-irradiated controls were not tested side-by-side for COSA-1 staining. It just seems like a nice control that would strengthen the authors' arguments.
We have added these controls in the updated preprint as Figure 2B.
(4) Figure 3. It took me a while to follow the connection between the COSA-1 staining and DAPI staining panels (12 hrs later). Perhaps an arrow that connects each set of time points between the panels or just a single title on the X-axis that links the two would make things clearer.
To make this figure more clear, we have generated two different cartoons for the assay that scores GFP::COSA-1 foci and the assay that scores bivalents. We have also edited this section of the results to make it more clear.
Reviewer #2 (Public review):
Summary:
This paper has some intriguing data regarding the different potential roles of Pch-2 in ensuring crossing over. In particular, the alterations in crossover distribution and Msh-5 foci are compelling. My main issue is that some of the models are confusingly presented and would benefit from some reframing. The role of Pch-2 across organisms has been difficult to determine, and the ability to separate pairing and synapsis roles in worms provides a great advantage for this paper.
Strengths:
Beautiful genetic data, clearly made figures. Great system for studying the role of Pch-2 in crossing over.
We thank the reviewers for their constructive and useful summary of our manuscript and the analysis of its strengths.
Weaknesses:
(1) For a general audience, definitions of crossover assurance, crossover eligible intermediates, and crossover designation would be helpful. This applies to both the proposed molecular model and the cytological manifestation that is being scored specifically in C. elegans.
We have made these changes in an updated preprint.
(2) Line 62: Is there evidence that DSBs are introduced gradually throughout the early prophase? Please provide references.
We have referenced Woglar and Villeneuve 2018 and Joshi et al. 2015 to support this statement in the updated preprint.
(3) Do double crossovers show strong interference in worms? Given that the PC is at the ends of chromosomes don't you expect double crossovers to be near the chromosome ends and thus the PC?
Despite their rarity, double crossovers do show interference in worms. However, the PC is limited to one end of the chromosome. Therefore, even if interference ensures the spacing of these double crossovers, the preponderance of one of these crossovers toward one end (and not both ends) suggests something functionally unique about the PC end.
(4) Line 155 - if the previous data in Deshong et al is helpful it would be useful to briefly describe it and how the experimental caveats led to misinterpretation (or state that further investigation suggests a different model etc.). Many readers are unlikely to look up the paper to find out what this means.
We have added this to the updated preprint: “We had previously observed that meiotic nuclei in early prophase were more likely to produce crossovers when DSBs were induced by the Mos transposon in pch-2 mutants than in control animals but experimental caveats limited our ability to properly interpret this experiment.”
(5) Line 248: I am confused by the meaning of crossover assurance here - you see no difference in the average number of COSA-1 foci in Pch-2 vs. wt at any time point. Is it the increase in cells with >6 COSA-1 foci that shows a loss of crossover assurance? That is the only thing that shows a significant difference (at the one time point) in COSA-1 foci. The number of DAPI bodies shows the loss of Pch-2 increases crossover assurance (fewer cells with unattached homologs). So this part is confusing to me. How does reliably detecting foci vs. DAPI bodies explain this?
We have removed this section to avoid confusion.
(6) Line 384: I am confused. I understand that in the dsb-2/pch-2 mutant there are fewer COSA-1 foci. So fewer crossovers are designated when DSBs are reduced in the absence of PCH-2.
How then does this suggest that PCH-2's presence on the SC prevents crossover designation? Its absence is preventing crossover designation at least in the dsb-2 mutant.
We have tried to make this clearer in the updated preprint. In this experiment, we had identified three possible explanations for why PCH-2 persists on some nuclei that do not have GFP::COSA-1 foci: 1) PCH-2 removal is coincident with crossover designation; 2) PCH-2 removal depends on crossover designation; and 3) PCH-2 removal facilitates crossover designation. The decrease in the number of GFP::COSA-1 foci in dsb-2::AID;pch-2 mutants argues against the first two possibilities, suggesting that the third might be correct. We have edited the sentence to read: “These data argue against the possibility that PCH-2’s removal from the SC is simply in response to or coincident with crossover designation and instead, suggest that PCH-2’s removal from the SC somehow facilitates crossover designation and assurance.”
(7) Discussion Line 535: How do you know that the crossovers that form near the PCs are Class II and not the other way around? Perhaps early forming Class I crossovers give time for a second Class II crossover to form. In budding yeast, it is thought that synapsis initiation sites are likely sites of crossover designation and class I crossing over. Also, the precursors that form class I and II crossovers may be the same or highly similar to each other, such that Pch-2's actions could equally affect both pathways.
We do not know that the crossovers that form near the PC are Class II but hypothesize that they are, based on the close functional relationship that exists between Class I crossovers and synapsis and the apparent antagonistic relationship that exists between Class II crossovers and synapsis. We agree that Class I and Class II crossover precursors are likely to be the same or highly similar and to exhibit extensive crosstalk that may complicate straightforward analysis, and that PCH-2 is likely to affect both, as strongly suggested by our GFP::MSH-5 analysis. We present this hypothesis based on the apparent relationship between PCH-2 and synapsis in several systems but agree that it needs to be formally tested. We have tried to make this argument clearer in the updated preprint.
Reviewer #3 (Public review):
Summary:
This manuscript describes an in-depth analysis of the effect of the AAA+ ATPase PCH-2 on meiotic crossover formation in C. elegans. The authors reach several conclusions, and attempt to synthesize a 'universal' framework for the role of this factor in eukaryotic meiosis.
Strengths:
The manuscript makes use of the advantages of the 'conveyor belt' system within the C. elegans reproductive tract to enable a series of elegant genetic experiments.
We thank this reviewer for the useful assessment of our manuscript and the articulation of its strengths.
Weaknesses:
A weakness of this manuscript is that it heavily relies on certain genetic/cell biological assays that can report on distinct crossover outcomes, without clear and directed control over other aspects and variables that might also impact the final repair outcome. Such assays are currently out of reach in this model system.
In general, this manuscript could be more accessible to non-C. elegans readers. Currently, the manuscript is hard to digest for non-experts (even if meiosis researchers). In addition, the authors should be careful to consider alternative explanations for certain results. At several steps in the manuscript, results could ostensibly be caused by underlying defects that are currently unknown (for example, can we know for sure that pch-2 mutants do not suffer from altered DSB patterning, and how can we know what the exact functional and genetic interactions between pch-2 and HORMAD mutants tell us?). Alternative explanations are possible and it would serve the reader well to explicitly name and explain these options throughout the manuscript.
We have made the manuscript more accessible to non-C. elegans readers and discuss alternate explanations for specific results in the updated preprint.
Recommendations for the authors:
Reviewing Editor Comments:
(1) Please provide 'n' values for each experiment.
n values are now included in the Figure legends for each experiment.
(2) Line 129: Please represent the DCOs as percent or fraction (1%-9.8%, instead of 1-13).
We have made this change.
(3) Figure 3A legend: the grey bar should read 20hr. COSA-1/ 32 hr DAPI. In Figure 3E, it is not clear why 36hr Auxin and 34hr Auxin show a significant difference in DAPI bodies between control and pch-2, but 32hr Auxin treatment does not. Here again 'n' values will help.
We have made this change. We also are not sure why the 32 hour auxin treatment did not show a significant difference in DAPI stained bodies. We have included the n values, which are not very different between timepoints and therefore are unlikely to explain the difference. The difference may reflect the time that it takes for SPO-11 function to be completely abrogated.
(4) Line 360: Please provide the fraction of PCH-2 positive nuclei in dsb-2.
We have made this change.
Please also address all reviewer comments.
Reviewer #1 (Recommendations for the authors):
(1) Page 3, line 52. While I agree that crossing over is important to generate new haplotypes, work has suggested that the contribution by an independent assortment of homologs to generate new haplotypes is likely to be significantly greater. One reference for this is: Veller et al. PNAS 116:1659.
We deeply appreciate this reviewer pointing us to this paper, especially since it argues that controlling crossover distribution contributes to gene shuffling, and we now cite it in our introduction! While we agree that this paper concludes that independent assortment likely explains the generation of new haplotypes to a greater degree than crossovers, the authors performed this analysis with human chromosomes and explicitly include the caveat that their modeling assumes uniform gene density across chromosomes. We know that this assumption does not hold in C. elegans. It would be interesting to perform the same analysis with C. elegans chromosomes in control and pch-2 mutants, taking into account this important difference.
(2) Figure 2. It would really help the reader if an arrow and text were shown below each irradiation sign to indicate the stage in meiosis in which the irradiation was done as well as another arrow in the late pachytene box to show when the COSA-1 foci were analyzed. In general, having text in the figures that help stage the timing in meiosis would help the non C. elegans reader. This is also an issue where staging of C. elegans is shown (Figure 4).
We have made these changes to Figure 2. To help readers interpret Figure 4, we have added TZ and LP to the graphs in Figure 4B and 4D and indicated what these acronyms (transition zone and late pachytene, respectively) are in the Figure legend.
(3) Page 12, line 288. It would be valuable to first outline why the him-3R93Y and htp-3H96Y alleles were chosen. This was eventually done on Page 13, but introducing this earlier would help the reader.
We have introduced these mutations earlier in the manuscript.
(4) Page 13, line 323. A one sentence description of the OLLAS tagging system would be useful.
We have added this sentence: “we generated wildtype animals and pch-2 mutants with both GFP::MSH-5 and a version of COSA-1 that has been endogenously tagged at the N-terminus with the epitope tag, OLLAS, a fusion of the E. coli OmpF protein and the mouse Langerin extracellular domain”
Reviewer #2 (Recommendations for the authors):
(1) The title is a little awkward. Consider: PCH-2 controls the number and distribution of crossovers in C. elegans by antagonizing their formation
We have made this change.
(2) Abstract:
Consider removing "that is observed" from line 20.
We have made this change.
I'm confused by the meaning of "reinforcement of crossover-eligible intermediates" from line 27.
We have removed this phrase from the abstract.
A definition of crossover assurance would be helpful in the abstract.
We have added this to the abstract: “This requirement is known as crossover assurance and is one example of crossover control.”
(3) Line 36: I know I'm being a stickler, but many meioses only produce one haploid gamete (mammalian oocytes, for example).
Thanks for the reminder! We have removed the “four” from this sentence.
(4) Line 284 - are you defining MSH-5 foci as crossover-eligible intermediates? If so, please state this earlier.
We have added this to the introduction to this section of the results: “In C. elegans, these crossover-eligible intermediates can be visualized by the loading of the pro-crossover factor MSH-5, a component of the meiosis-specific MutSγ complex that stabilizes crossover-specific DNA repair intermediates called joint molecules”
(5) Can the control be included in Figure S1?
We have made this change.
(6) Can you define that crossover designation is the formation of a COSA-1 focus?
We did this in the section introducing GFP::MSH-5: “In the spatiotemporally organized meiotic nuclei of the germline, a functional GFP tagged version of MSH-5, GFP::MSH-5, begins to form a few foci in leptotene/zygotene (the transition zone), becoming more numerous in early pachytene before decreasing in number in mid pachytene to ultimately colocalize with COSA-1 marked sites in late pachytene in a process called designation”
(7) Would it be easier to see the effect of DSB to crossover eligible intermediates in Spo-11, Pch-2 vs. Spo-11 mutant with irradiation using your genetic maps? At least for early vs. late breaks?
Unfortunately, irradiation does not show the same bias towards genomic location that endogenous double strand breaks do so it is unlikely to recapitulate the effects on the genetic map.
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Weaknesses:
In my estimation, the following would improve this manuscript:
(1) The physiological relevance of these data could be better highlighted. For instance, future work could revolve around incubating oocytes with oviduct fluid (or OVGP1) to reduce polyspermy in porcine IVF, and naturally improve sperm selection in human IVF.
Thank you for the suggestions. We have added these points on physiological relevance at the end of the Discussion.
(2) Biological and technical replicate values for each experiment are unclear - for semen, oocytes, and oviduct fluid pools. I suggest providing in the Materials and Methods and/or Figure legends.
Biological and technical replicates are now indicated in the Materials and Methods. The numbers of oocytes or ZPs used were already indicated in every Supplementary Table.
(3) Although differences presented in the bar charts seem obvious, providing statistical analyses would strengthen the manuscript.
Statistical analyses are now indicated in each bar chart.
(4) Results are presented as ± SEM (line 677); however, I believe standard deviation is more appropriate.
This was a mistake; all the results are reported as standard deviation.
(5) Given the many independent experimental variables and combinations, a schematic depiction of the experimental design may benefit readers.
A schematic depiction of the experimental design is now included as Figure 1. This new figure shifts the numbering of the remaining figures.
(6) Attention to detail can be improved in parts, as delineated in the "author recommendation" review section.
Done
Reviewer #2 (Public review):
Weaknesses:
The authors postulate a role for oviductal fluid in species-specific fertilization, but in my opinion, they cannot rule out hormonal effects or differences in the method of oocyte maturation employed.
As we indicate below, the effect of hormones has been analyzed, and we have demonstrated that it is not the cause of zona pellucida specificity.
They also cannot unequivocally prove that OVGP1 is the oviductal protein involved in the effect. Additional experiments are necessary to rule out these alternative explanations.
Our work does not rule out the involvement of other proteins, but it does show that OVGP1 is involved in the process.
When performing the EZPT assay on mouse oocytes obtained either from the ovary or from the oviduct, the oocytes obtained from the ovary came from mice primed with eCG, whereas the ones collected from the oviduct were obtained from superovulated mice (eCG plus hCG). This difference in the hormonal environment may make a difference in the properties of the ZP. Additionally, the ones obtained from the ovary were in vitro matured, which is also different from the freshly ovulated eggs and, again, may change the properties of the ZP. I suggest doing this experiment superovulating both groups of mice but collecting the fully matured MII eggs from the ovary before they get ovulated. In that way the hormonal environment will be the same in both groups and in both groups, oocytes will be matured in vivo. Hence, the only difference will be the exposure to oviductal fluids.
In Figure 2, we compare ZPs from murine oocytes obtained from the ovary using only PMSG with ZPs from oviductal oocytes treated with both hCG and PMSG. In Figure 7, however, we compared ZPs from murine oocytes exposed only to PMSG, the only difference being whether or not they had been in contact with OVGP1. This shows that it is not the hormone treatment but rather contact with OVGP1 that determines their specificity.
Mice with OVGP1 deletion are viable and fertile. It would be quite interesting to investigate the species-specificity of sperm-ZP binding in this model. That would indicate whether OVGP1 is the only glycoprotein involved in determining species-specificity. Alternatively, the authors could immunodeplete OVGP1 from oviductal fluid and then ascertain whether this depleted fluid retains the ability to impede cross-species fertilization.
We agree with the reviewer that it would be interesting to investigate sperm-ZP binding in this model. Unfortunately, we do not have the OVGP1 knockout mouse strain. We also believe that immunodepletion of OVGP1 would not completely remove the protein, so its effect would likely not be entirely eliminated.
What is the concentration of OVGP1 in the oviduct? How did the authors decide what concentration of protein to use in the experiments where they exposed ZPs to purified OVGP1? Why did they use this experimental design to check the structure of the ZP by SEM? Why not do it on oocytes exposed to oviductal fluid, which would be more physiological?
We have now stated in the manuscript that the concentration of OVGP1 in the oviductal fluid was quantified with ImageJ software by comparing the mean gray value of the OVGP1 band in the oviductal fluid lane with that of the band in the recombinant protein lane. From this ratio, together with the known amount of recombinant protein loaded and the total amount of protein in the oviductal fluid, the concentration of OVGP1 in the oviductal fluid was determined as the average of three western blots. The concentration of OVGP1 in oviductal fluid was in the range of 100-150 ng/µL in mice and 150-200 ng/µL in cows. We have also included in the manuscript the concentration used for the EZPTs: 30 ng/µL of recombinant OVGP1 (bovine, murine and human) for 30 minutes in 20 µL drops. At this concentration, we observed a clear effect on zona specificity with no negative impact on the gametes.
As you can see in Supplementary Fig. S8B, we have already performed SEM of oocytes exposed to oviductal fluid.
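The densitometric estimate described in this response can be sketched as follows. This is only an illustration under stated assumptions: band signal is taken to scale linearly with protein amount, the estimate is normalized by the volume of oviductal fluid loaded, and the function name, parameter names and all numerical values are hypothetical rather than taken from the manuscript.

```python
from statistics import mean


def ovgp1_concentration(band_of, band_rec, rec_amount_ng, of_volume_ul):
    """Estimate OVGP1 concentration (ng/uL) from one western blot.

    band_of, band_rec: mean gray values (e.g. measured in ImageJ) of the
        OVGP1 band in the oviductal-fluid lane and the recombinant lane.
    rec_amount_ng: known amount of recombinant OVGP1 loaded (ng).
    of_volume_ul: volume of oviductal fluid loaded (uL); this
        normalization step is an assumption of the sketch.
    """
    if band_rec <= 0 or of_volume_ul <= 0:
        raise ValueError("reference band and loaded volume must be positive")
    # Assumed linear relation between band intensity and protein amount.
    ovgp1_ng = band_of / band_rec * rec_amount_ng
    return ovgp1_ng / of_volume_ul


# Hypothetical measurements from three independent blots.
blots = [
    {"band_of": 120.0, "band_rec": 100.0, "rec_amount_ng": 200.0, "of_volume_ul": 2.0},
    {"band_of": 90.0, "band_rec": 80.0, "rec_amount_ng": 200.0, "of_volume_ul": 2.0},
    {"band_of": 130.0, "band_rec": 100.0, "rec_amount_ng": 200.0, "of_volume_ul": 2.0},
]

estimates = [ovgp1_concentration(**blot) for blot in blots]
average = mean(estimates)  # reported value = average of the three blots
print([round(e, 1) for e in estimates], round(average, 1))
```

Averaging per-blot estimates, rather than pooling raw gray values, keeps each blot's own recombinant lane as its reference, which matters because exposure differs between blots.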
None of the figures show any statistical analysis. Please perform analysis for all the data presented, include p values, and indicate which statistical tests were performed. The Statistical analysis section in the Methods indicating that repeated measures ANOVA was used must refer to the tables. Was normality tested? I doubt all the data are normally distributed, in which case using ANOVA is not appropriate.
Statistical results are now included in each Figure and Table. All the statistical analyses are described; all the data passed tests of normality, homogeneity of variance and independence, and for this reason the data were analyzed using a one-way ANOVA followed by Tukey's post hoc test. The significance level was set at p < 0.05.
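For readers less familiar with the test named in this response, the F statistic of a one-way ANOVA can be sketched as below. The group values are hypothetical and the function name is illustrative; the p-value and Tukey's post hoc comparisons would in practice come from a statistics package and are not reproduced here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: list of lists, one list of measurements per treatment group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more values than groups")
    grand_mean = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variance; F is undefined")
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical values for three treatment groups, three replicates each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f_stat, df_between, df_within)
```

A large F indicates that differences between group means are large relative to the scatter within groups; the post hoc test then identifies which pairs of groups differ.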
Why was OVGP1 selected as the probable culprit of the species specificity? In the Results section entitled "Homology of bovine, human and murine OVGP1 proteins..." the authors delve into the possible role of this protein without any rationale for investigating it. What about other oviductal proteins?
A sentence indicating this rationale for investigating OVGP1 has been introduced in this paragraph.
Reviewer #3 (Public review):
Weaknesses:
The manuscript began with a well-written introduction, but problems started to surface in the Results section, in the Discussion, as well as in the Materials and Methods. Major concerns include inconsistencies, misinterpretation of results, lacking up-to-date literature search, numerous errors found in the figure legends, misleading and incorrect information given in the Materials and Methods, missing information regarding statistical analysis, and inadequate discussion. These concerns raise questions regarding the authenticity of the study, reliability of the findings, and interpretation of the results. The manuscript does not provide solid and convincing findings to support the conclusion.
We have modified and clarified all the issues, some of which were misunderstandings. We have also performed the suggested experiment of putting sperm in contact with OVGP1.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
(1) Ensure consistency in (past) tense, for example, "decondensed" (line 102), "induced" (line 103), and elsewhere.
Done
(2) Replace "table" with "Table" throughout.
Done
(3) The authors often refer to "co-incubation". I believe this should read "incubation". My understanding is that oocytes were incubated with oviduct fluid or sperm but never both simultaneously as "co-incubation" implies.
Done
(4) Synonymous terms "OVGP1" and "oviductin" are used interchangeably. Consider using one or the other for consistency.
We believe that by using both terms, reading is more fluid.
(5) Delete "around" on line 256 and "approximately" on line 263 and provide actual percentages.
Done
(6) The point of the sentence on lines 311-313 is unclear to me.
Rewritten
(7) Suggest specifying "wildtype" on line 419.
All the mice used in this work are wildtype.
(8) Do the authors have details regarding cattle oocyte donor breeds?
Done
(9) What do the authors mean by "strengthen" on line 500?
The word "strengthen" has been changed to "carefully isolated".
(10) Ponceau and vinculin (Figure 3) details are not provided in the manuscript.
Ponceau and vinculin details are now included in the manuscript.
(11) Address formatting issues (e.g. citation 26 among others).
Done
(12) Primary and secondary antibody controls for immunofluorescent imaging (to fully exclude autofluorescence) are lacking.
Controls for immunofluorescent imaging are indicated in Supplementary Figure S7.
(13) The corresponding author on the manuscript and in the eLife submission system are different.
This was a problem during submission; it has now been corrected.
Reviewer #2 (Recommendations for the authors):
(1) For the experiment depicted in Figures 3C and D, the authors need to perform a negative control to demonstrate that this fluorescent signal is specific. What happens if they express a different FLAG-tagged protein instead of bOVGP1 and mOVGP1? FLAG antibodies give quite strong non-specific binding. Or if they expressed untagged bovine and mouse OVGP1?
The negative controls are shown in Supplementary Figure S7. A rabbit polyclonal antibody to human OVGP1 was used on murine and bovine IVM ZPs from ovaries and on murine superovulated ZPs recovered from mouse oviducts. Given the specificity of the antibody, there is a remarkable difference between the ZPs that were not incubated with any OVGP1 and those carrying the endogenous protein.
Also, IVM mouse and bovine oocytes incubated or not with OF were immunoblotted with anti-Flag-tag antibody. Since none of them carries Flag-tagged OVGP1, there is no signal in the immunofluorescence.
(2) For the Western blots of recombinant proteins, why are the authors not showing the blots using His and FLAG tag antibodies? Is the 50-kDa band observed for the mouse OVGP1 detected with His-Tag antibody?
We have included a Supplementary Figure S6 showing the western blots with anti-His and anti-Flag antibodies. The band at around 50 kDa is not specific (there is no signal with anti-Flag). This new figure shifts the numbering of the remaining supplementary figures (S6-S8).
(3) How was the estrous cycle stage determined in mice? It is not described in the Methods.
Estrous cycle stage was determined in mice by visual examination of the vaginal opening and cytological examination of the vaginal smear. This is now included in the Materials and Methods.
(4) For sperm binding, what does the percentage mean?
This was a mistake: the percentages referred to pronuclear formation and cleavage, not to sperm binding. This is now corrected.
(5) In Figure 3A, the labels for regions C, D, and E are mixed up. It is regions A and C that are conserved (or orange and blue, if the letters are incorrect). The purple region is only present in the mouse (E?), and the red region (D?) is only in the human form. Also, the legend for this panel is repeated verbatim in the Results section. Please remove one of them.
Errors in Figure 3A have been corrected. The repeated legend has been removed.
(6) In the title of Figure 1B and in different places in the text, it should be mouse (not mice) oocytes.
Done
(7) In line 140, I would change the part indicating "We extracted the cytoplasmic contents from the oocytes". It is not only the cytoplasm, but all the oocyte, including the nucleus and membranes, that are being removed.
Done
(8) Please rephrase the sentence in lines 245-247, as it is quite confusing.
Done
(9) In line 236, the authors indicate that "During in vitro maturation (IVM), oocytes displayed a porous ZP structure...". Do they mean after IVM? When were those oocytes collected for SEM?
The sentence has been modified to read “after IVF”. Bovine oocytes were collected from slaughterhouse ovaries and were similar to those used in the rest of the experiments in the manuscript.
(10) In the legend of Figure 1, please indicate what the parthenogenic group is.
Done
(11) In the legend to Figure 1G, the text indicates "Note sperm only appear outside the zona". However, I cannot see any sperm in that image.
The phrase has been removed: when the image was enlarged to better show the sperm inside the zona, the sperm outside it were no longer visible.
(12) In the legend to Figure 2 describing the different zona pictures, the letters of the panels are not correct.
Done
(13) In line 999, please provide the right concentration for NMase (it indicates 10 μ/mL).
Done
(14) Where does the model depicted at the end of the manuscript go? Is it a Figure? A graphical abstract? In that model, please correct some typos: it should be "ZP obtained from ovarian oocytes"; and change specie for species in all three panels.
Done. It is a model (Fig. 10)
(15) The FITC-PNA staining to visualize acrosomes is not described in the Methods section.
Done
Reviewer #3 (Recommendations for the authors):
The present study reports findings from a series of experiments suggesting that bovine oviductal fluid and species-specific oviductal glycoprotein (OVGP1 or oviductin) from bovine, murine, or human sources modulate the species specificity of bovine and murine oocytes. The manuscript began with a well-written introduction, but problems started to surface in the Results section, Discussion as well as in the Materials and Methods. Major concerns include inconsistencies, misinterpretation of results, lacking up-to-date literature search, numerous errors found in the figure legends, misleading and incorrect information given in the Materials and Methods, missing information regarding statistical analysis, and inadequate discussion.
We have modified and clarified all the issues, some of which were misunderstandings. We have also performed the suggested experiment of putting sperm in contact with OVGP1.
Specific comments:
(1) Lines 142 to 143 on page 5: It is stated that "Because this experiment was done on empty ZPs, we called this test "empty zona penetration test" (EZPT)". In fact, the experiment was not actually done on empty ZPs, but on oocytes with the ooplasm extracted. Therefore, the zona pellucidae used in the experiment were not empty but contained an intact zona matrix of glycoproteins. The term "EZPT" used by the authors in the manuscript is a misnomer. A better term should be used to reflect the ZPs which were intact and not empty.
We extracted the cytoplasmic contents, including all the organelles, the nucleus and the membranes, together with the polar body. This has been clarified in the text.
(2) The authors need to distinguish between sperm penetration and sperm binding in the manuscript. In lines 169 to 177 on page 6, the authors mixed up the terms "penetration" and "binding" in the text. In writing about events leading to fertilization in reproductive biology, the term "sperm binding" refers to the interaction between the sperm plasma membrane and the oocyte zona pellucida (ZP), whereas the term "sperm penetration" refers to the passage of the sperm through the ZP. Therefore, the statements in lines 169 to 177 describing the binding of bovine, murine, and human sperm to bovine oocytes with and without prior treatment with oviductal fluid are misleading and not correct. In fact, Figure 2 and Table 6 show sperm penetration and not sperm binding.
Figures 2A and 2B (now 3A and 3B) and Table S6 show both sperm penetration (% penetration rate and average sperm in penetrated ZPs) and sperm binding (average sperm bound to ZPs). Throughout the manuscript, a clear distinction is made between sperm attached to the ZP and sperm that have penetrated it.
(3) Lines 182 to 187 on page 6: What is being described in the text here does not match what is being shown in Figure 3A. As a result, the information provided in lines 182 to 187 is not correct and misleading. For example, it is stated in lines 182 to 183 that "As depicted in Fig. 3A, the sequences of these three OVGP1 have five distinct regions (A, B, C, D and E)." However, Figure 3A shows that hOVGP1 and mOVGP1 both have only 4 regions and bOVGP1 has only 3 regions. None of the three has 5 regions. In lines 183 to 184, the authors continued to state that "Regions A and D are conserved in the different mammals." This statement is also not true because Figure 3A shows that only region A is conserved in all three species but not region D which is found only in the human. What is stated in lines 186 to 187 is also not correct based on the information provided in Figure 3A. It is stated here that "Region C is an insertion present only in the mouse (Mus) and region E is typical of human oviductin." However, based on the color codes provided in Figure 3A, region C is present in all three species while region E is present only in the mouse.
Errors with naming regions in Figure 3A (now 4A) have been corrected.
(4) In lines 195 to 197 on page 6, the authors stated that "Western blots of the three OVGP1 recombinants indicated expected sizes based on those of the proteins: 75 kDa for human and murine OVGP1 and around 60 kDa for bovine OVGP1 (Fig. 3B)." However, the expected size of the recombinant human OVGP1 is not in agreement with what has been published in literature regarding the molecular weight of recombinant human OVGP1. It has been previously reported that a single protein band of approximately 110-150 kDa was detected for recombinant human OVGP1 using an antibody against human OVGP1. The authors provided Western blots of murine oviductal fluid and bovine oviductal fluid in Figure 3B but not a Western blot of native human oviductal fluid. The latter should have been included for a comparison with the recombinant human OVGP1.
We do not have human oviductal fluid, but we have now included a supplementary figure (Figure 6S) showing a western blot with antibodies against His and Flag (both present in the recombinant OVGP1), which shows that the size of the recombinant protein is as indicated in Figure 3B (now 4B).
(5) Lines 220 to 229 on page 7: In this experiment, the authors conducted the EZPT using ZPs from bovine oocytes that were either treated with or without bOVGP1 followed by incubation, respectively, with homologous sperm (bovine) and heterologous sperm (human and murine). This is a logical experiment to determine if OVGP1 plays a species-specific role in setting the specificity of the zona pellucida. However, in the in vivo situation, sperm that reach the lumen of the ampulla region of the oviduct where fertilization takes place are also exposed to oviductal fluid of which OVGP1 is a major constituent. Therefore, an additional experiment in which sperm are treated with OVGP1 prior to incubation with ZP should be carried out for a comparison.
The additional experiment in which sperm are treated with OVGP1 prior to incubation with ZP has been done (Table S9). No effects were observed. This is now included in the manuscript.
(6) Regarding the results obtained with the use of neuraminidase (lines 278 to 293 on pages 8 to 9), if neuraminidase treatment of bovine ZP prevented bovine sperm penetration regardless of whether ZPs had been or had not been in contact with OVGP1, that means OVGP1 is not responsible for penetration despite the description of earlier findings in the manuscript. Sialic acid is likely associated with the sugar side chains of ZP glycoproteins and not sugar side chains of OVGP1. To attribute the species-specific property of sialic acid to OVGP1 for sperm binding, an experiment in which OVGP1 will be treated with neuraminidase prior to performing the EZPT is needed.
We conducted the experiment by treating only OVGP1 with neuraminidase and then isolating OVGP1 from the enzyme before incubating the treated OVGP1 with ZPs. The results agree with our previous findings, indicating the importance of sialic acid on OVGP1 for sperm binding and penetration, and confirming that OVGP1 is responsible for species-specific penetration. Results are shown in Fig. 9 and Table S14.
(7) The Discussion appears superficial and a more in-depth discussion regarding the results obtained in the present study in relation to other reports about OVGP1 published in literature is needed (e.g. a recent paper published by Kenji Yamatoya et al. (2023) Biology of Reproduction https://doi.org/10.1093/biolre/ioad159). Lines 317 to 342 of the Discussion on pages 10 to 11 should belong to the Introduction.
The results of Yamatoya et al. are now included in the Discussion. Part of the Discussion from lines 317 to 342 is now in the Introduction.
(8) It is not clear what the authors exactly want to say in lines 343 to 344 of the Discussion on page 11. It is stated here that "The empty zona penetration test (EZPT) enables heterologous sperm to overcome the oocyte's second barrier, the plasma membrane or oolemma." Do the authors mean that the sperm can now enter the empty space encircled by the ZP without having to go through the plasma membrane or oolemma? In Figure S4 which depicts the method used to empty the ooplasm in the bovine oocyte, does the method extract only the ooplasm (or cytoplasmic contents) leaving behind the intact plasma membrane or oolemma? This needs to be clearly shown and clearly explained. High magnifications of the zona pellucida are also needed to show whether the plasma membrane (or oolemma) is still present and intact after extraction of the ooplasm.
This is clearly explained in the text. To obtain empty ZP, everything except ZP (nucleus, organelles, membranes and cytoplasmic contents of the oocytes) was removed using a micromanipulator, following the procedure outlined in Figure S4.
(9) The authors stated in the Discussion in lines 383 to 383 on page 12 that "After ovulation, the changes reported in the carbohydrate composition of the ZP (3, 25) are likely induced by the addition of glycoproteins of oviductal origin, as we have seen here with OVGP1." There is no evidence in the present study to suggest that OVGP1 or glycoproteins of oviductal origin have changed or can change the carbohydrate composition of the ZP. At present, it is not known if OVGP1 or glycoproteins of oviductal origin directly interact with ZP glycoproteins (including ZP1, ZP2, ZP3 and/or ZP4) that make up the zona matrix.
There is scientific evidence suggesting that oviductal glycoproteins, including OVGP1, interact with the zona pellucida (ZP) glycoproteins of the oocyte. Studies have shown that OVGP1 binds to the ZP of the oocyte. Specifically, OVGP1 is thought to interact with ZP glycoproteins, such as ZP2 and ZP3, in a way that may help stabilize the oocyte or modify the ZP structure during its passage through the oviduct. This interaction is believed to influence processes like sperm binding, oocyte maturation, and potentially the prevention of polyspermy during fertilization. For example, in several studies, the absence of OVGP1 in knockout animals (such as in Ovgp1-KO hamsters) has been associated with impaired fertilization and embryonic development, which indicates the importance of this interaction. However, the detailed molecular mechanisms and functional significance of these interactions require further exploration. We have used the word “likely” to soften this statement.
Velásquez, J. G., Canovas, S., Barajas, P., Marcos, J., Jiménez‐Movilla, M., Gallego, R. G., ... & Coy, P. (2007). Role of sialic acid in bovine sperm–zona pellucida binding. Molecular reproduction and development, 74(5), 617-628.
Kunz, P., et al. (2013). "The role of oviductal glycoprotein 1 in sperm–egg interaction and early embryonic development." Reproduction, 145(3), 225-233. DOI: 10.1530/REP-12-0300
Yamatoya, K., Kurosawa, M., Hirose, M., Miura, Y., Taka, H., Nakano, T., ... & Araki, Y. (2024). The fluid factor OVGP1 provides a significant oviductal microenvironment for the reproductive process in golden hamster. Biology of reproduction, 110(3), 465-475.
(10) Lines 390 to 391 page 12: The statement "This determines that OVGP1 modifications are critical to define the barrier among the different species of mammals." needs to be rephrased because there is no evidence in the present study showing that OVGP1 has been modified. There are many concerns with errors, important information that is missing, and inconsistencies as well as wrong and misleading information in the Materials and Methods which are troublesome. These concerns raise questions regarding the authenticity and reliability of the study. Some of the major concerns are listed below:
All concerns have been fixed
(11) It says in line 399 on page 13 that "Human semen samples were obtained from a normozoospermic donor...". Do the authors really mean that the semen samples were obtained from only one donor?
Samples were obtained from 3 normozoospermic donors; this is now indicated in the M&M.
(12) In lines 409 to 411 on page 13, what do the authors mean by "...the samples were frozen into pellets..."? Was centrifugation of the samples carried out prior to freezing the samples? Secondly, what do the authors mean by "....and stored in liquid nitrogen at -196°C or lower.", particularly what do the authors mean by "or lower"? The temperature of liquid nitrogen is -196°C. What is the "lower" temperature?
The samples were not centrifuged at this stage. A more detailed protocol is now included. The word "lower" has been removed.
(13) Line 424 on page 13: Provide the full name of "M2" when it is first used in the text then followed by the abbreviation.
Done
(14) Is there a reason why different counting chambers were used to determine sperm concentrations? In line 432 on page 13, a Thomas cell counting chamber was used to determine the sperm count of epididymal mouse sperm whereas it is mentioned in line 441 on page 14 that a Neubauer cell counting chamber was used to determine epididymal cat sperm. Furthermore, where did the cat's sperm come from?
The cat sperm was obtained and processed at the Faculty of Veterinary Medicine and the rest of the samples were processed in the INIA-CSIC lab, and different chambers were used in both places.
(15) The mention of the use of cat spermatozoa in line 439 on page 14 is a worrisome problem of the manuscript. The present study used bovine, mouse, and human sperm and not cat. Therefore, the sudden mentioning of the use of cat spermatozoa in the Materials and Methods is troublesome and worrisome. It appears that the paragraph from lines 439 to 450 was directly copied and pasted from previously published work. Furthermore, lines 441 to 445 do not flow and do not make sense. In fact, what is described in this paragraph (lines 439 to 450) does not appear to correspond to the method(s) used to obtain the results presented in the Results section of the manuscript.
I don't understand why the reviewer says we don't use cat sperm. This study uses cat sperm. Results for cat sperm are shown in Figure 1A (now 2A). We have modified the M&M to clarify the description of freezing.
(16) Similarly, several problems are also found in the paragraphs (lines 453-478 on page 14) describing the methods and procedures to obtain homologous and heterologous IVF of bovine oocytes. Firstly, it is mentioned here (in line 460) that COCs were co-incubated with selected sperm without removing the cumulus cells. However, the results of the sperm penetration experiment indicated otherwise. Figures 2 and 3 show that the oocytes were denuded of cumulus cells. Secondly, it is very worrisome and troublesome to read what is written in line 468 on page 14 that "...from other species (cat, human, mouse, and rabbit)." One wonders where the cat and rabbit came from. Again, it appears that this paragraph was directly copied and pasted from previously published work.
Cat sperm was used in this manuscript, and this is correctly indicated in every section and figure. Regarding the IVF and EZPT protocols: in the IVF protocol for bovine oocytes, COCs were used without removing the cumulus cells. For the EZPT, cumulus cells were removed; this is described in the following sections of the Materials and Methods. The word "rabbit" was a mistake and has been removed.
(17) In lines 468 to 469 on page 14, it is mentioned that "Sperm-egg interactions were assessed through a sperm-ZP binding assay...". The authors only examined sperm penetration in their study. Therefore, this needs to be specified in the Materials and Methods. Secondly, the authors did not use the conventional sperm-ZP binding assay in their study. Instead, they used the EZPT in their study. There appear to be many inconsistencies throughout the manuscript.
When the IVF experiments using bovine COCs were done (Fig 2A and C, Fig 1S to 3S, and Tables 1S to 4S) conventional sperm-egg interaction was assessed at 2.5 hours after IVF. EZPT was used in the rest of experiments. IVF with COCs and EZPT with ZPs are different experiments.
(18) Lines 480 to 489 on page 15 under the sub-heading of "In vitro culture of presumptive zygotes to first cleavage embryos on Day 2" do not provide the correct methodology used for obtaining the results presented in the manuscript. In line 482, it is not clear where the "synthetic oviductal fluid" came from. In fact, in the Results section, none of the results came from the use of synthetic oviductal fluid. In line 487, humans and rabbits are mentioned here. However, human and rabbit oocytes were not used in the present study. It is very strange indeed to read human and rabbit in the sentence.
A reference for SOF is now included. Human results are in Fig 1A; the sentence refers to the cultures of bovine oocytes inseminated with sperm of bull, human, mouse or cat. The word "rabbit" was a mistake and has now been removed from the manuscript.
(19) In line 500 on page 15, what do the authors mean by "Each oviduct was strengthen by removing the adjacent tissue..."?
The sentence has been modified.
(20) On page 15 in the Materials and Methods, the authors described the collection of bovine and mouse oviductal fluid. However, there is no mention of human oviductal fluid and how it was collected. This important information is missing.
We have not used human oviductal fluid in this manuscript.
(21) Line 510 on page 15: The sub-heading of "Preparation of empty zonae pellucidae from bovine ovarian oocytes" should be rephrased. As pointed out earlier in my review, the ZPs prepared by the authors were intact and not "empty". It was the oocyte which was empty after extraction of the ooplasm.
Everything except the ZP was removed from the oocyte; this is now clarified in the manuscript.
(22) Line 518 on page 16 and line 553 on page 17: "Figure S5" should be "Figure 4S".
Done
(23) Line 538 and line 547 on page 16: "mice oocytes" should be "mouse oocytes".
Done
(24) On page 17, the procedures for in vitro fertilization, sperm penetration, and binding assessment in mice were described here in lines 560 to 574. Several problems are noted in this paragraph as listed below:
a. As mentioned earlier the authors in the present manuscript mixed up sperm penetration and sperm binding which are two separate events. Based on the results presented in the manuscript, they represent sperm penetration and not sperm binding. Therefore, the authors need to precisely explain in the manuscript whether the results presented refer to sperm penetration or sperm binding.
Both sperm penetration and binding have been analyzed in this work.
b. In line 570 on page 17, the term "insemination" is wrongly used here. Insemination is the introduction of semen into the female reproductive tract either through sexual intercourse or through an instrument. The procedure used in the present study was carried out in vitro in a co-incubation manner and not by transferring sperm into the female reproductive tract.
The word insemination has been changed to incubation
c. Information regarding procedures for treatment with various oviductal fluid and OVGP1s are all missing in the Materials and Methods.
This information is now in M&M
d. The concentrations of various oviductal fluids and OVGP1s used and the number of ZPs used in each incubation are also missing.
Concentrations are now indicated in the manuscript. All the numbers and ZPs used are indicated in supplementary figures.
(25) Lines 577 to 603 on pages 17 to 18: Were recombinant bovine and murine glycoproteins prepared using the same methodology? In line 595 on page 18, it is stated that "Supernatant was saved in subsequent experiments." It is not clear exactly what experiments the supernatant was subsequently used in.
Details about how the bovine and murine glycoproteins were prepared are now included. The sentence about subsequent experiments has been deleted; the supernatant was used for the next steps of protein purification.
(26) What is being described in lines 604 to 609 on page 18 is problematic. The paragraph starts by saying that "Human recombinant oviductin was obtained from Origene Technologies....". Strangely, the paragraph continues by saying that the recombinant proteins were produced by transfection in HEK293T...". If recombinant human OVGP1 had already been obtained from Origene Technologies, why did the authors want to produce it again? It does not make sense.
We briefly described the method that Origene used for the production of the human recombinant OVGP1
(27) In lines 626 to 627 on page 18, it is stated that "Zonae pellucidae previously incubated with OVGP1 proteins from several species and murine oviductal fluid...". Were the zonae pellucidae previously incubated with only murine oviductal fluid or also with others?
It was only incubated with OVGP1 or with oviductal fluid, this is now clarified in the text.
(28) In lines 638 and 639 on page 19, can the authors please explain the difference between "endogenous OVGP1 and bOVGP1" and "exogenous recombinant hOVGP1 and mOVGP1"?
This is now clarified
(29) As stated in lines 676 to 679 on page 20, statistical analysis was performed in the study. Strangely, no "n" numbers and p values were provided in any of the figures that require statistical analysis. This is problematic.
Statistical analysis and significant differences are now included in the figures, all the numbers used are included in the supplementary tables that are related with the figures.
There are also many errors noted in the Figure Legends. These concerns raise questions regarding the reliability of the findings and interpretation of the results. Some major ones that require attention are listed below:
(30) Figure legend 1 on page 27: In line 912, where did the "cat sperm" come from? In line 913, where did the "feline sperm" come from? In line 918, as pointed out earlier, the term "empty zona penetration test (EZPT)" is a misnomer and should be replaced with a better term. In line 924, it is stated that "Note sperm only appear outside the zona." However, no sperm can be seen outside the zona pellucida shown in Figure 1.
Cat sperm is used in this manuscript. The term EZPT is now clarified. The sentence about sperm outside the ZP has been removed.
(31) Figure legend 2 on page 27 (lines 928 to 940) needs to be rewritten. Some of the sentences are not clearly written. Authors, please check all the capital labeling letters some of which appear to be wrong.
Done
(32) As is written, Figure legend 3 on pages 28 and 29 (lines 943 to 959) presents many problems:
a. Contrary to what is stated in the figure legend, not all five regions are present in the hOVGP1, mOVGP1, and bOVGP1.
Done
b. Contrary to what is stated in line 946, region D is not conserved in the mouse and bull as shown in Figure 3A, and region C is not present only in the mouse.
Done
c. Based on what is shown in Figure 3A, region E is present only in the mouse and not in the human.
Done
d. What is stated in line 951 that "Proteins were expressed in mammalian cells..." is not correct. Based on the information provided in the manuscript, recombinant human OVGP1 was obtained from Origene Technologies and was not expressed in mammalian cells as claimed.
All the recombinant proteins were produced in mammalian cells.
(33) Figure legend 6 on page 28: In lines 985 to 986, what do the authors mean by "...and combinations of the three oviductins with sperm of the three species."? As is written, it appears that the bovine ZPs were pretreated with a combination of all three oviductins and then co-incubated with sperm from the bull, mouse and human together.
We have clarified this sentence
(34) What is described in the figure legend for the supplemental figure (Figure S7) does not make sense.
Legend of Fig S7 (now S8) is related to pictures A to E, the legend is now clarified.
(35) In addition to the figures and supplemental figures provided in the manuscript, there is also an additional figure labeled with "Model" showing three diagrams. Strangely, there is no mention of this additional figure in the manuscript. There is no figure legend for or description of this figure. It is not clear what is being shown in this figure, and it is not clear about the purpose of the use of this figure.
We have included a legend to the model that is now Figure 10.
-
-
www.researchsquare.com www.researchsquare.com
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
In this study, a chromosome-level genome of the rose-grain aphid M. dirhodum was assembled with high quality, and A-to-I RNA-editing sites were systematically identified. The authors then demonstrated that: 1) Wing dimorphism induced by crowding in M. dirhodum is regulated by 20E (ecdysone signaling pathway); 2) an A-to-I RNA editing prevents the binding of miR-3036-5p to CYP18A1 (the enzyme required for 20E degradation), thus elevating CYP18A1 expression, decreasing 20E titer, and finally regulating the wing dimorphism of offspring.
Strengths:
The authors present both genome and A-to-I RNA editing data. An interesting finding is that an A-to-I RNA editing site in CYP18A1 ruins the miRNA binding site of miR-3036-5p, and that loss of miR-3036-5p regulation leads to less 20E and more winged offspring.
Weaknesses:
How crowding represses the miR-3036-5p is still unclear.
Reviewer #2 (Public Review):
Summary:
Environmental influences on development are ubiquitous, affecting many phenotypes in organisms. However molecular genetic and cellular mechanisms transducing environmental signals are still only barely understood. This study examines part of one such intracellular mechanism in a polyphenic (or dimorphic) aphid.
Strengths:
While other published reports have linked phenotypic plasticity to RNA editing before, this study reports such an interaction in insects. The study uses a wide array of molecular tools to identify connections upstream and downstream of the RNA editing to elucidate the regulatory mechanism, which is illuminating.
Weaknesses:
While this system is intriguing, this report does not foster confidence in its conclusions. Many of the analyses seem based on very small sample sizes. It is itself problematic that sample sizes are not obvious in most figures, although based on Methods section covering RNAseq, they seem to be either 3, 6 or 9, depending on whether stages were pooled, but that point is not made clear. With such small sample sizes, statistical tests of any kind are unreliable. Besides the ambiguity on sample sizes, it's unclear what error bars or whiskers show in plots throughout this study. When sample sizes are small estimates of variance are not reliable. Student's t-test is not appropriate for comparisons with such small sample sizes. Presently, it is not possible to replicate the tests shown in Figures 3, 4 and 6. (Besides the HT-seq reads, other data should also be made publicly available, following the journal's recommendations.) Regardless, effect sizes in some comparisons (Fig 3J, 4A-C, 6E, H) are clearly not large, making confidence in conclusions low. The authors should be cautious about over-interpreting these data.
We very much appreciate the time the reviewers spent on our manuscript, and we thank the referees for their valuable suggestions and comments.
To Reviewer #1:
At present, research on miRNAs focuses mainly on their role in gene regulation through binding to the mRNA of target genes; how miRNAs themselves are regulated has received less attention.
Recent research indicates that the expression of miRNAs is also regulated at the transcriptional and post-transcriptional levels. Transcriptional regulation, including changes in the promoters of miRNA genes, and post-transcriptional mechanisms, such as changes in miRNA processing and stability, can both affect the final expression level of miRNAs.
This article did not address how crowding treatment regulates miRNA expression. This is a very interesting question, and we will address it in our future research.
Thank you for this suggestion.
To Reviewer #2:
(1) “Transgenerational wing dimorphism was observed in M. dirhodum in which crowding of the parent (100 mother aphids in a 10 cm³ tube) increased the winged offspring (Fig 3E).” In this experiment, over 250 offspring per group were used to calculate the proportion of winged and wingless individuals in the normal (277), crowding (255) and crowding+20E (272) groups.
“The RNAi-mediated knockdown of CYP18A1 and ADAR2 can significantly increase the titer of 20E (Fig. 4E) and reduce the number of winged offspring by 29.6% and 24.4% (Fig. 4F), respectively.” In this experiment, over 245 offspring per group were used to calculate the proportion of winged and wingless individuals in the dsEGFP (273), dsCYP18A1 (248), and dsADAR2 (250) groups.
“miR-3036-5p agomir and antagomir treatments could affect the proportion of winged offspring under normal conditions (Fig. 6F), but have no effect on the wing dimorphism of offspring under crowded conditions (Fig. 6L).” In this experiment, over 235 offspring were used to calculate the proportion of winged and wingless individuals in each group.
We therefore think that our conclusion that crowding treatment, A-to-I RNA editing, and miRNAs could affect the wing dimorphism of offspring in M. dirhodum is very reliable, because the number of aphids counted in each group is sufficient.
(2) The quantitative PCR method is used to detect changes in gene expression levels of CYP18A1 and ADAR2 after treatment with crowding, 20E, dsRNA, miRNA agomir and antagomir, and the results are shown in Fig. 3J, 4A-C, 5B, 6B, H, respectively. 5 biological replicates (more than 100 aphids were used for each biological replicate) were used in each sample, which might be sufficient for qPCR experiments. And among these biological replicates, the differences in gene expression levels are relatively small.
(3) The titer of 20E was detected after treatment with crowding, 20E, dsRNA, miRNA agomir and antagomir, and the results are shown in Fig. 3I, 4E, 6E, K, respectively. 8 biological replicates (more than 100 aphids were used for each biological replicate) were used in each sample.
The number of biological replicates used in each analysis and the number of aphids included in each biological replicate have been added in the Materials and Methods section. Thank you very much for pointing out this important issue.
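The sample-size argument in point (1) can be made concrete with a short calculation. The following is only a minimal sketch, not the analysis used in the manuscript: it applies a standard two-sided two-proportion z-test to the normal (277) and crowding (255) group sizes quoted above. The winged counts are hypothetical placeholders, because the response reports group totals but not the numbers of winged offspring, and the function name is ours. The test also assumes that each offspring is an independent observation.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for a difference between two independent proportions.

    x1, x2 are the numbers of winged offspring; n1, n2 are the group sizes.
    Returns the z statistic and the two-sided p-value.
    """
    p1, p2 = x1 / n1, x2 / n2
    # Pooled proportion under the null hypothesis of no difference
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Group sizes come from the response above.
n_normal, n_crowding = 277, 255
# HYPOTHETICAL winged counts, for illustration only (not reported in the response).
winged_normal, winged_crowding = 55, 140

z, p = two_proportion_z_test(winged_crowding, n_crowding, winged_normal, n_normal)
print(f"z = {z:.2f}, two-sided p = {p:.3g}")
```

With group sizes of this order, a large difference in proportions is easily detected. Whether the independence assumption holds for offspring raised from the same crowded tube is a separate question that such a test does not address.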
Reviewer #1 (Recommendations For The Authors):
Several questions:
(1) This study was conducted on the rose-grain aphid M. dirhodum. However, pea aphid Acyrthosiphon pisum seems to be a better object in wing dimorphism and development studies. Have the authors also identified the A-to-I RNA editing on pea aphids or other aphids?
Wheat is one of the main grain crops in China as well as in the world. Metopolophium dirhodum is one of the most important wheat aphids around China, and has posed a significant threat to grain production. The current study was conducted to determine the regulatory mechanism of wing dimorphism on M. dirhodum, which might be of great significance to better control this pest in wheat production.
Surely the pea aphid offers more established experimental tools and genomic resources. However, with the development of high-throughput sequencing technology, the chromosome level genomes of many insect species have been assembled. That means any of various insects might be studied as a model species, and not limited to Drosophila melanogaster, Acyrthosiphon pisum, etc.
We did not identify A-to-I RNA editing in pea aphids or other aphids. A recent study has shown that editing events are poorly conserved across different Xenopus species. Even sites that are detected in both X. laevis and X. tropicalis show largely divergent editing levels or developmental profiles. In protein-coding regions, only a small subset of sites that are found mostly in the brain are well conserved between frogs and mammals. The conservation of RNA editing in aphids is still unknown, and we will continue to pay attention to this issue in our future research.
Reference: Nguyen TA, Heng JWJ, Ng YT, Sun R, Fisher S, Oguz G, Kaewsapsak P, Xue S, Reversade B, Ramasamy A, Eisenberg E, Tan MH. Deep transcriptome profiling reveals limited conservation of A-to-I RNA editing in Xenopus. BMC Biology. 2023, 21(1):251.
(2) "Two miRNA-target prediction software programs, miRanda and RNAhybrid, were used to identify the miRNAs that potentially act on CYP18A1. The results showed that miR-3036-5p could bind to the sequence containing edited position (editing site 528) of CYP18A1 in M. dirhodum." Is there any other miRNA that can also act on CYP18A1, thereby regulating its expression?
The prediction results indicate that several other miRNAs can act on CYP18A1, but none of them can bind to this editing site (editing site 528). Therefore, we did not pursue the other miRNAs.
(3) 11678 A-to-I RNA-editing sites were systematically identified in M. dirhodum. Does that mean RNAi-mediated knockdown of ADAR2 may affect the RNA-editing and expression of a large number of genes? Please clarify.
It is of course possible that RNAi-mediated knockdown of ADAR2 affects the RNA editing and expression of a large number of genes. A-to-I RNA editing was also observed in five other genes involved in the 20E biosynthesis and signaling pathways, but no evident difference in RNA editing or expression levels was identified for these five genes after crowding treatment (Fig. S2, Table S5). This suggests that A-to-I RNA editing of CYP18A1 might be crucial in 20E-mediated wing dimorphism in M. dirhodum.
(4) It is interesting that "the transcriptional level of ADAR2 was 2.19 fold higher in the crowding+20E treatment parent than that in the normal group, but no significant difference was identified between the crowding and normal groups". ADAR2 can be induced by 20E, rather than crowding. How should the author explain? It seems that 20E induction can also cause many RNA editing events.
20-hydroxyecdysone (20E) affects growth and development, molting, metamorphosis, and reproduction in insects. According to this result, 20E induction can also cause RNA editing events by regulating the expression of ADAR2, which may provide a valuable reference for future studies of 20E. We will continue to pay attention to this issue in our future research.
(5) Authors provided a lot of text to describe the genome assembly. I don't think it's necessary, authors can make appropriate deletions.
Thank you for this suggestion. This is the first high-quality chromosome-level genome of M. dirhodum, which will be very helpful for the cloning, functional verification, and evolutionary analysis of genes in this important species or even other Hemiptera insects. Therefore, I think it is necessary to provide a detailed description. We will also make appropriate deletions in the “Result and Discussion” sections.
Reviewer #2 (Recommendations For The Authors):
Additional concerns
- With an existing genome sequence available for the pea aphid *Acyrthosiphon pisum*, why have these authors chosen to use the rose-grain aphid for this study? It would be helpful to address any limitations in *Acyrthosiphon pisum* or advantages in *Metopolophium dirhodum* that explain that decision.
Wheat is one of the main grain crops in China as well as in the world. Metopolophium dirhodum is one of the most important wheat aphids around China, and has posed a significant threat to grain production. The current study was conducted to determine the regulatory mechanism of wing dimorphism on M. dirhodum, which might be of great significance to better control this pest in wheat production.
Surely the pea aphid offers more established experimental tools and genomic resources. However, with the development of high-throughput sequencing technology, the chromosome level genomes of many insect species have been assembled. That means any of various insects might be studied as a model species, and not limited to Drosophila melanogaster, Acyrthosiphon pisum, etc.
- In Figure 5E, what anatomy is being shown in FISH? Moreover, this represents a single sample. It would be preferable to include a supplemental figure with comparable images from at least 3 additional specimens.
It is the whole aphid body, and we have uploaded two additional FISH images to the supplementary material (Fig. S5). Thank you for this suggestion.
- L190: Conservation alone seems inadequate to conclude that a chromosome functions as a sex chromosome. It would be fine to note the homology between Chr1 and the X of other Aphidini, but there are other explanations for that. Inference that Chr 1 is a sex chromosome might come from observations in karyotypes (by relative size comparisons or ideally from FISH) or from comparison of reads mapped to the chromosomes, suggesting Chr1 is hemizygous in males.
Karyotype analysis was not conducted in this study, so the sex chromosome was inferred from chromosome homology between the M. dirhodum and A. pisum genomes. We have modified the description in the article accordingly. Thank you for this suggestion.
- L205: It's unclear to me how to interpret RNA editing results, based on RNAseq data, that map to "intergenic regions", especially when this is such a large fraction (37.3%) of the total result. Does this suggest a fundamental problem with the analysis, that so much RNAseq data maps to parts of the genome that are not annotated as genes?
Non-coding RNA regions often account for a large proportion of the genome, and these RNA-seq data map to non-coding RNA transcription regions (37.3%) located between protein-coding genes (intergenic regions).
- L288-290: What degrees of confidence are attached to the predictions of these miRNA targets?
There is no clear research indicating the accuracy of miRNA target prediction software. However, by comprehensively utilizing multiple prediction tools and experimental verification, the accuracy and reliability of prediction can be significantly improved.
Actually, the prediction of miRNA targets is only a preliminary identification step, and we have subsequently demonstrated that miR-3036-5p can act on CYP18A1 through dual-luciferase reporter assay, RNA immunoprecipitation and FISH, etc.
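The idea of combining prediction tools can be sketched as a simple intersection: only miRNA-target pairs reported by both programs are kept as candidates for experimental verification. This is a minimal sketch and not the pipeline used in the study; the function name `consensus_pairs`, the input lists, and the miRNA names other than miR-3036-5p are hypothetical.

```python
def consensus_pairs(miranda_hits, rnahybrid_hits):
    """Return miRNA-target pairs predicted by both tools, sorted for stable output."""
    return sorted(set(miranda_hits) & set(rnahybrid_hits))


# Hypothetical (miRNA, target) pairs parsed from each tool's output.
miranda_hits = [("miR-3036-5p", "CYP18A1"), ("miR-A", "CYP18A1")]
rnahybrid_hits = [("miR-3036-5p", "CYP18A1"), ("miR-B", "CYP18A1")]

print(consensus_pairs(miranda_hits, rnahybrid_hits))
# [('miR-3036-5p', 'CYP18A1')]
```

An intersection reduces the number of candidates at the cost of possibly missing true targets found by only one tool, which is why the wet-lab assays remain the decisive step.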
- L296-298: The mechanism proposed in this study seems to imply that miR-3036-5p should be absent (not expressed) in aphids under crowded conditions. Therefore, relative realtime PCR is not particularly useful here. Finding that the miR relative expression is reduced by 48.8% is meaningless, because in *relative* expression, zero has no special meaning. In this case, absolute quantitative PCR, measuring actual transcript numbers, would be far more informative.
miR-3036-5p is not absent in aphids under crowded conditions. Only a significant decrease of miR-3036-5p in expression level under crowded conditions was identified compared to normal feeding conditions (Fig. 5B). So it should be reasonable to use relative quantitative methods for expression level analysis.
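For readers less familiar with relative quantification, a reduction such as the 48.8% mentioned in the comment is a fold change relative to a control group, not an absolute transcript count. The sketch below assumes the widely used 2^-ΔΔCt calculation; this response does not state which formula was applied, and the Ct values are invented for illustration.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target in treated vs. control samples by 2^-ddCt."""
    # Normalise each target Ct to the reference gene measured in the same sample.
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


# Invented Ct values: the target amplifies one cycle later under treatment.
fold = relative_expression(25.0, 18.0, 24.0, 18.0)
print(f"fold change = {fold:.2f}")  # fold change = 0.50
```

A fold change of 0.50 corresponds to a 50% reduction relative to the control, which is the sense in which a relative decrease is reported.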
- L361: Isn't alternative mRNA splicing a more common post-transcriptional modification?
I'm very sorry, this sentence has been modified to “A-to-I RNA editing is one of the most prevalent forms of posttranscriptional modification in animals, plants, and other organisms.” Thank you for this suggestion.
- L372: "Functional wing polymorphism is commonly observed in insects as a form of adaptation and a source of variation for natural selection (14)." The relationship between plastic phenotypic variation and natural selection is complex, and there is a large theoretical literature in evolutionary biology and evo-devo on this topic, but it is not a focus in the cited review by Zhang et al.. It would be helpful if the authors could expand on this idea with reference to some of this literature (e.g. Levins 1968; Harrison 1980; Moran 1992; Roff 1996; West-Eberhard 2003; Zera 2009).
I have changed the citation and expanded on this idea. “Wing polymorphism is commonly observed in insects, resulting from variation in both genetic factors and environmental factors (Zera 2009).”
- L404: Use of the word "accurate" seems inappropriate in this context. Both morphs are equally "accurate".
This sentence has been modified to “resulting in the alteration of CYP18A1 expression and wing dimorphism of offspring regulated by miR-3036-5p”. Thank you for this suggestion.
- L412: Reference 67 seems irrelevant to this point.
References have been changed and added.
67. E.J. Duncan, C.B. Cunningham, P.K. Dearden. Phenotypic plasticity: what has DNA methylation got to do with it? Insects. 13(2):110 (2022).
68. K.J. Rangan, S.L. Reck-Peterson, RNA recoding in cephalopods tailors microtubule motor protein function. Cell 186, 2531-2543 (2023).
- L443: Is this referring to "mixed stage" aphids?
Yes. To make it clearer, this sentence has been modified to “Approximately 200 mg of fresh M. dirhodum with mixed stages (including first- to fourth-instar nymphs and winged and wingless adults)”.
- L483: What mass or number of individual aphids was used? I assume multiple individuals were pooled?
Each sample contains approximately 200 aphids.
- L499: Why was k = 17 used? The default is k = 21.
The selection of k is usually an odd number between 15 and 21, which ensures that the types of k-mers can cover the genome while being small enough to avoid erroneous effects. Therefore, using 17 is very reasonable.
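How k enters a k-mer analysis can be illustrated with a short sketch: every window of length k in each read is counted, and the resulting counts are what a genome survey is built on. This is an illustration only, using toy reads and k = 3 for readability; it is not the tool used in the study, and the function names are hypothetical.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-length substring across reads, skipping windows containing N."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


def number_of_possible_kmers(k):
    """Size of the k-mer space over the alphabet A, C, G, T."""
    return 4 ** k


reads = ["ACGTACGT", "CGTAN"]  # toy reads
counts = count_kmers(reads, k=3)
print(counts["CGT"])                 # 3
print(number_of_possible_kmers(17))  # 17179869184
```

The second number shows the trade-off behind the choice of k: the k-mer space must be large compared with the genome so that most k-mers are unique, while a shorter k keeps more k-mers free of sequencing errors.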
- L574: what does it mean "multiple editing types"? What different types are possible? Are you referring to things other than A-to-I editing?
That means besides A-to-I, this locus may also have other editing situations, such as A-to-C. If this situation occurs, it will be discarded.
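The filtering rule described here can be sketched as follows: a locus is kept only if every mismatch observed there is of a single substitution type. This is a minimal sketch under assumed inputs; the record layout (chromosome, position, reference base, observed base) and the function name are hypothetical and do not reflect the actual pipeline.

```python
from collections import defaultdict


def single_type_sites(mismatches):
    """Keep loci whose observed mismatches all share one substitution type."""
    types_by_site = defaultdict(set)
    for chrom, pos, ref, alt in mismatches:
        types_by_site[(chrom, pos)].add(f"{ref}>{alt}")
    return {site: next(iter(types))
            for site, types in types_by_site.items() if len(types) == 1}


# Hypothetical mismatch calls; A>G is how A-to-I editing appears in RNA-seq reads.
calls = [
    ("chr1", 1200, "A", "G"),
    ("chr1", 1200, "A", "G"),
    ("chr2", 100, "A", "G"),
    ("chr2", 100, "A", "C"),  # second type at the same locus, so it is discarded
]
print(single_type_sites(calls))  # {('chr1', 1200): 'A>G'}
```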
- L635: Which luciferase construct or plasmid has been used in this experiment? Citation to that source is necessary.
The pmirGLO vector (Promega, Leiden, Netherlands) was used in this experiment, and a reference has been added.
B. Zhu, L. Li, R. Wei, P. Liang, X. Gao. Regulation of GSTu1-mediated insecticide resistance in Plutella xylostella by miRNA and lncRNA. PLoS Genetics. 17(10), e1009888 (2021).
- L644: Did cDNA synthesis employ random primers or a poly-dT primer?
This kit provides mixed primers, including random and poly-dT primers. (PrimeScript™ RT reagent Kit with gDNA Eraser (Perfect Real Time), Takara Biotechnology, Dalian, China).
- Fig 4D: Seems like this panel should be divided to cover the two sites, as in Fig 3F. Right now the x-axis labels seem redundant.
Done. Thank you for this suggestion.
- Fig 7: Consider adding ADAR2 to this figure.
Done. Thank you for this suggestion.
- Table 1: It would be helpful to represent this data in a figure where the phylogenetic relationships among the species can be shown.
The phylogenetic relationships among the species are shown in Fig. 1D, and the table here presents the genome information in more detail.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review)
This paper focuses on secondary structure and homodimers in the HIV genome. The authors introduce a new method called HiCapR which reveals secondary structure, homodimer, and long-range interactions in the HIV genome. The experimental design and data analysis are well-documented and statistically sound. However, the manuscript could be further improved in the following aspects.
Major comments:
(1) Please give the full name of an abbreviation the first time it appears in the paper, for example, in L37, "5' UTR" "RRE".
Thank you for your suggestion. We have added the full name of these abbreviations.
(2) The introduction could be strengthened by discussing the limitations of existing methods for studying HIV RNA structures and interactions and highlighting the specific advantages of the HiCapR method.
Thank you for your insightful suggestion. We have modified sentences in the introduction section (lines 66-71 and 80-81 in the revised manuscript).
(3) Please reorganize Results Part 1.
Thank you for your advice. We have reorganized Results Part 1. We hope the revision provides a logical flow and clarity to the results, making it easier for readers to follow the progression of the study and the significance of the findings regarding the HiCapR method.
(4) Is there any reason that the authors mention "genome structure of SARS-CoV-2" in L95?
Thank you for your insightful question. We have deleted this sentence in the revised paper.
Initially, the mention of our previous work on SARS-CoV-2 served two purposes: firstly, to demonstrate our capability to perform proximity ligation assays on viral samples; and secondly, to underscore the necessity of the hybridization step, which is particularly relevant for the study of HIV.
Unlike SARS-CoV-2, which is highly abundant in infected cells and does not require post-library hybridization, HIV-1 presents a unique challenge due to its typically low viral RNA input within cells. The simplified SPLASH protocol, while effective for more abundant viral RNAs, does not provide the necessary coverage for high-resolution analysis when applied directly to HIV samples.
We have now deleted this sentence, as you suggested, and discuss the technical difference elsewhere.
(5) L102: Please clarify the purpose of comparing "NL4-3" and "GX2005002." Additionally, could you explain what NL4-3 and GX2005002 are? The connection between NL4-3, GX2005002, and HIV appears to be missing.
Thank you for your question, and we apologize for the confusion. "NL4-3" and "GX2005002" are two distinct HIV-1 strains that are prevalent in different geographical regions. The NL4-3 strain is a well-characterized laboratory strain that is widely used in HIV research and is representative of the HIV-1 subtype B, which is highly prevalent in Europe and the Americas. On the other hand, GX2005002 is a primary isolate of the CRF01_AE subtype, which is one of the most prevalent strains in Southeast Asia, particularly in China.
The reason for comparing these two strains in our study is twofold. Firstly, it allows us to assess the applicability and versatility of our HiCapR method across different HIV-1 strains that may have distinct genetic and structural features. This is crucial for understanding the potential broad utility of our method in studying various HIV-1 strains globally. Secondly, by comparing these strains, we can begin to elucidate any strain-specific differences in RNA structure, homodimer formation, and long-range interactions, which may have implications for viral pathogenesis, transmission, and response to therapeutic interventions.
The connection between NL4-3, GX2005002, and HIV lies in their representation of different subtypes of the HIV-1 virus, which exhibit genetic diversity and are associated with different geographical distributions. This diversity is epidemiologically and clinically relevant, as it may be associated with different pathogenesis and resistance mechanisms, and may have implications for vaccine development and treatment strategies.
(6) Figure 1A is not able to clearly present the innovation point of HiCapR.
Thank you for your comment. We have revised this figure to more clearly illustrate the steps and principles of the post-library capture process using HIV pooled probes hybridization and streptavidin pull down to enrich HIV RNA-derived chimeras.
(7) Please compare the contact metrics detected by HiCapR and current techniques like SHAPE on the local interactions to assess the accuracy of HiCapR in capturing local RNA interactions relative to established methods.
Thank you for this suggestion to compare HiCapR with established techniques such as SHAPE on local interactions.
In this study, HiCapR has demonstrated its ability to identify key structural elements within the HIV genome, including TAR, polyA, SL1, SL2, and SL3, as well as the polyA-SL1 in the monomeric conformation. These elements are crucial for understanding the local RNA structures involved in HIV replication and pathogenesis. By visualizing the base pairing probability as a heatmap, we have identified the most stable base pairs in the 5’ UTR of HIV, which is consistent across both NL4-3 and GX2005002 strains (Figure 2D). This consistency suggests robustness in the overall structure despite sequence variations and alternative RNA conformations, indicating a high level of agreement between HiCapR and SHAPE methods in detecting local interactions.
Furthermore, HiCapR not only confirms the presence of known structural elements but also reveals alternative conformations of the 5'UTR that support the alternative conformations found in SHAPE analysis. This additional layer of information provides a more comprehensive view of the RNA structures, highlighting HiCapR's ability to capture local RNA interactions with a high degree of accuracy comparable to established methods like SHAPE.
(8) The paper needs further language editing.
We have thoroughly revised the paper. We hope it’s improved significantly.
Reviewer #2 (Public review):
Summary:
In the manuscript "Mapping HIV-1 RNA Structure, Homodimers, Long-Range Interactions and persistent domains by HiCapR" Zhang et al report results from an omics-type approach to mapping RNA crosslinks within the HIV RNA genome under different conditions i.e. in infected cells and in virions. Reportedly, they used a previously published method which, in the present case, was improved for application to RNAs of low abundance.
Their claims include the detection of numerous long-range interactions, some of which differ between cellular and virion RNA. Further claims concern the detection and analysis of homodimers.
Strengths:
(1) The method developed here works with extremely little viral RNA input and allows for the comparison of RNA from infected cells versus virions.
(2) The findings, if validated properly, are certainly interesting to the community.
Thank you for your comprehensive review and insightful comments on our manuscript. We appreciate your recognition of the strengths of our HiCapR method and the potential interest of our findings to the scientific community.
Weaknesses:
(1) On the communication level, the present version of the manuscript suffers from a number of shortcomings. I may be insufficiently familiar with habits in this community, but for RNA aficionados just a little bit outside of the viral-RNA-X-link community, the original method (reference 22) and the presumed improvement here are far too little explained, namely in something like three lines (98-100). This is not at all conducive to further reading.
Thank you for your feedback on the clarity of our manuscript, particularly regarding the explanation of the HiCapR method and its improvements over the original method mentioned in reference 22.
In response to your feedback, we have expanded the description of the HiCapR method in the revised manuscript to make it accessible to a broader audience. The revised text provides a more thorough comparison between HiCapR and the original method, detailing the specific improvements and how they enable the analysis of low-abundance viral RNAs such as HIV. These include:
Post-library Hybridization: Unlike the original method, HiCapR incorporates a post-library hybridization step. This innovation allows for the capture of target RNA involved in interactions after library construction, offering additional flexibility and enhancing the resolution of the analysis.
Enhanced Sensitivity: HiCapR has been optimized to work with extremely low viral RNA input, which is a significant advancement over the original method. This is crucial for studying viruses like HIV, where obtaining high quantities of viral RNA can be challenging. As a matter of fact,
(2) Experimentally, the manuscript seems to be based on a single biological replicate, so there is strong concern about reproducibility.
Thank you for raising the issue of reproducibility in our study. We understand the importance of experimental replication in ensuring the reliability of our findings. In response to your concern, we would like to provide the following clarification and additional details regarding the reproducibility of our HiCapR experiments:
Replicates in HiCapR Experiments: All ligation and control samples in our HiCapR experiments were performed in three biological replicates. This was done to ensure the high reproducibility of our results. The high degree of correlation (r > 0.99) between these replicates underscores the reliability of our findings.
Dimer Validation Experiments: To validate the dimer formation of RRE and 5’-UTR, we employed multiple independent methods, including Native agarose gel electrophoresis, Agilent 4200 TapeStation Capillary electrophoresis, and Biomolecular Binding Kinetics Assays. These methods provide complementary perspectives on the dimer formation, enhancing the robustness of our validation process. The data presented in Figure 3C and Supplementary figure S12 are representative results from these experiments, which consistently support our findings on dimer formation.
Agreement Between Cellular and Virion RNA: Our study also demonstrates a significant similarity between virions in the supernatant and infected cells from the same viral strain, as shown in Supplementary Figure S3. This consistency further validates the reproducibility and reliability of our HiCapR method in capturing RNA structures and interactions under different conditions.
Consistency across two strains: Our study includes a comprehensive analysis of two distinct HIV-1 strains, NL4-3 and GX2005002, which are prevalent in Europe and Southeast Asia, respectively. The consistency in our findings across these strains serves as a strong indicator of the reproducibility and general applicability of our HiCapR method. Specifically, presence of key structural elements such as TAR, polyA, SL1, SL2, and SL3 in both NL4-3 and GX2005002 strains, suggests a robust structural framework that is conserved across different strains, despite sequence variations. Additionally, our study reveals approximately 20 candidate dimer peaks conserved between the NL4-3 and GX2005002 strains along the genome. The conservation of these dimer peaks across strains indicates a reproducible pattern of dimerization.
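A replicate check of the kind summarised by r > 0.99 can be sketched as a Pearson correlation between the contact counts of two replicates. The sketch below uses only the standard library and invented counts; how contacts were binned or normalised before correlation is not stated in this response, so the flat lists of per-bin counts are an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Invented chimeric-read counts for the same contact bins in two replicates.
rep1 = [120, 15, 0, 48, 300, 7]
rep2 = [118, 17, 1, 50, 310, 6]
print(pearson_r(rep1, rep2) > 0.99)  # True
```

Because a few highly populated bins can dominate a Pearson coefficient, such a check is often complemented by a rank-based correlation or by correlating log-transformed counts.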
(3) The authors perform an extensive computational analysis from a limited number of datasets, which are in thorough need of experimental validation.
Thank you for your comment.
In response to your concern, we would like to clarify that while our manuscript does present an extensive computational analysis, we have also conducted a series of experiments. Specifically, we have validated dimer formation using multiple independent methods (discussed above).
Given the time-consuming nature of additional experiments, we have chosen to share the HiCapR data with the community in a timely manner. This approach allows for more immediate communication and evaluation of the data on HIV structure, which we believe is valuable for advancing the field.
We are committed to further investigating the functional implications of our structural findings. We plan to conduct more experiments linking these structural insights to HIV function, which will help to deepen our understanding of the virus's replication and of potential antiviral strategies.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
I suggest a major revision of the manuscript.
Minor comments:
(1) The article lacks consistency in its presentation. The expression of the proper noun is wrong in the paper. For example, (a) L89, "RNA:RNA interaction" →RNA-RNA interaction; (b) L431, "SARS-COV-2" → SARS-CoV-2;
We are sorry for the inconsistency. We have corrected the mistakes.
(2) "We identified dimers based on the methodology described in23." is not a complete sentence.
Thank you for your insightful comment. We have revised the sentence to provide a complete and clear description of our methodology. The revised sentence is as follows: "Homodimers were identified in accordance with the methods previously reported in the literature."
Reviewer #2 (Recommendations for the authors):
(1) The authors perform an extensive computational analysis from a limited number of datasets, which are in thorough need of experimental validation. There is a single series on in vitro validation of the interaction of a homodimerization site, described in five lines (278-283) plus the Figure panel 3c with a very brief legend, and an extremely minimalist Figure S12. The panel to Figure 3c contains Kd values which have not been assessed for significant digits.
Thank you for your constructive feedback on our manuscript.
We acknowledge that our computational analysis is based on a limited number of datasets. Due to the initial exploratory nature of our study and the logistical challenges of generating additional datasets, we have focused on in-depth analysis of the available data. We are currently working on further validating our findings and are committed to publishing these results in a follow-up study.
Regarding Experimental Validation:
We agree that the initial description of our in vitro validation of the homodimerization site was concise. To address this, we have expanded the description of our experimental procedures. Specifically, we have detailed the methods used for the in vitro transcription, the preparation of RNA samples, and the use of the Octet R8 platform for biomolecular binding kinetics assays.
For the Kd values presented in Figure 3c, we have now included the standard error of the mean and have defined the significant digits in the figure legend. This revision provides a more accurate representation of the binding affinities.
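One common convention for reporting a fitted value with its uncertainty is to round the standard error to one or two significant digits and then round the value to the same decimal place. A minimal sketch of that convention follows; the numbers are invented and are not the Kd values in Figure 3c, and the function name `format_with_error` is hypothetical.

```python
from math import floor, log10


def format_with_error(value, error, error_sig_digits=1):
    """Round error to the given significant digits and value to the same decimal place."""
    if error <= 0:
        raise ValueError("error must be positive")
    exponent = floor(log10(error))           # position of the leading digit of the error
    decimals = error_sig_digits - 1 - exponent
    rounded_error = round(error, decimals)
    rounded_value = round(value, decimals)
    places = max(decimals, 0)                # negative decimals mean rounding to tens, hundreds, ...
    return f"{rounded_value:.{places}f} ± {rounded_error:.{places}f}"


print(format_with_error(12.3456, 0.234))  # 12.3 ± 0.2
print(format_with_error(1523.7, 48.0))    # 1520 ± 50
```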
(2) As a further example to be experimentally validated, splice sites are discussed after lines 354, for which unsophisticated validation techniques such as targeted RT-PCR are widely accepted.
In response to your comment, we would like to clarify that the splice sites mentioned in our study are well-established and widely recognized in the literature. They have been previously characterized and are considered canonical within the HIV research community. Given their established nature, we have relied on this foundational knowledge in our analysis.
However, we concur with the importance of validating the regulatory role of homodimers in splicing, which is a significant aspect of HIV biology. While we have provided evidence for the presence of these homodimers and their potential implications for splicing, we acknowledge the need for further functional studies to elucidate their mechanistic role.
Due to the scope and length constraints of the current manuscript, we have chosen to focus on the structural and interaction analyses provided by HiCapR. The functional validation of these homodimers and their impact on splicing will be pursued in subsequent studies, which we plan to initiate promptly. We believe that a dedicated follow-up study will allow for a more in-depth exploration of this complex and important aspect of HIV gene regulation.
We are committed to advancing our understanding of the functional significance of these homodimers in the context of HIV splicing and will ensure that this line of investigation is thoroughly addressed in our future work.
Thank you again for your valuable feedback. We look forward to contributing further to the field with our ongoing research.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife assessment
“This work presents valuable data demonstrating that a camelid single-domain antibody can selectively inhibit a key glycolytic enzyme in trypanosomes via an allosteric mechanism. The claim that this information can be exploited for the design of novel chemotherapeutics is incomplete and limited by the modest effects on parasite growth, as well as the lack of evidence for cellular target engagement in vivo.”
We agree with this assessment. In this re-worked version, we implemented the textual changes suggested by the reviewers and performed additional in silico work. The reviewers also presented valuable suggestions for additional experiments. However, we currently don’t have dedicated hands and funding for this project, which renders it impossible for us to perform additional “wet lab” experiments at this stage. We have thus not included new experimental “wet lab” data. Finally, the claim that our results may be exploited for the design of novel chemotherapeutics perhaps came across more strongly than we intended. We still believe our findings indicate a potential for such an endeavor, but this clearly requires further investigation and experimental evidence. We have softened this statement by removing it from the abstract and have edited the discussion to end as follows.
“Based on the presented results, we propose that sdAb42 may pinpoint a site of vulnerability on trypanosomatid PYKs that could potentially be exploited for the design of novel chemotherapeutics. Indeed, antibodies (or fragments thereof) are valuable drug discovery tools. Antibodies (and camelid sdAbs especially) are known for their ability to "freeze out" specific conformations of highly dynamic antigens, thereby exposing target sites of interest, which could be exploited for rational drug design (the development of so-called "chemo-superiors", (Lawson, 2012; Khamrui et al., 2013; van Dongen et al., 2019)). While the design of a "chemo-superior" inspired on the sdAb42-mediated allosteric inhibition mechanism will require further investigation, the results presented here provide a foundation to fuel such an endeavour.”
REVIEWER 1:
Summary:
The authors identified nanobodies that were specific for the trypanosomal enzyme pyruvate kinase in previous work seeking diagnostic tools. They have shown that a site involved in the allosteric regulation of the enzyme is targeted by the nanobody and used elegant structural approaches to pinpoint where binding occurs, opening the way to the design of small molecules that could also target this site.
Strengths:
The structural work shows the binding of a nanobody to a specific site on Trypanosoma congolense pyruvate kinase and provides a good explanation as to how binding inhibits enzyme activity. The authors go on to show that by expressing the nanobodies within the parasites they can get some inhibition of growth, which albeit rather weak, they provide a case on how this could point to targeting the same site with small molecules as potential trypanocidal drugs.
Weaknesses:
The impact on growth is rather marginal. Although explanations are offered on the reasons for that, including the high turnover rate of the expressed nanobody and the difficulty in achieving the high levels of inhibition of pyruvate kinase required to impact energy production sufficiently to kill parasites, this aspect of the work doesn't offer great support to developing small molecule inhibitors of the same site.
Recommendations for authors:
Generally, the paper is very well written and the figures and their legends are clear.
Comment 1.1: I thought the Introduction could give more focus to the need for new drugs for veterinary trypanosomiasis. The reality is that with fexinidazole now available and acoziborole soon to be available, with <1,000 cases of human African trypanosomiasis in each of the last five years, the case for needing new drugs is difficult to make. For Animal trypanosomiasis, however, the need for novel drugs is much more pressing.
We agree with this comment and have included an additional section in the Introduction’s second paragraph, which reads as follows.
“Hence, there is a need for alternative compounds, preferably with novel modes of action and/or designed based on mechanistic insights of the target’s structure-function relationship (Field et al., 2017; De Rycker et al., 2018). This need is especially pressing for AAT, which strongly impedes sustainable livestock rearing in Sub-Saharan Africa. AAT results in drastic reductions of draft power, meat, and milk production by the infected animals (small and large ruminants), and its control relies mainly on vector control and chemotherapy, with only few drugs currently available. The lack of routine field diagnosis has resulted in the misuse of trypanocidal drugs, thereby accelerating the rise of parasite resistance and further exacerbating the problem (Richards et al., 2021). As such, AAT-inflicted annual losses are estimated at around $5 billion (and the necessity to invest another $30 million each year to control AAT through chemotherapy), thereby having a devastating impact on the socio-economic development of Sub-Saharan Africa (Fetene et al., 2021). In contrast, HAT is perceived as a minor threat as it has reached a post-elimination phase as a public health problem with less than 1,000 yearly documented cases (Franco et al., 2022). In addition, new and effective drugs for HAT treatment have recently become available (De Rycker et al., 2023). HAT control currently relies on case detection and treatment, and vector control (Büscher et al., 2017).”
Comment 1.2: A few pedantic things can be tidied up too, for example on line 61 it is stated tsetse flies are part of the life cycle for all trypanosomes while some veterinary species e.g. T. evansi and some T. vivax strains use other biting flies for transmission. I'd also add in the Introduction that pyruvate kinase is not a glycosomal enzyme (it is covered in the legend to figure 1 but I think it is quite important to clarify in the Introduction too, so that readers aren't left wondering if "intrabodies" can get targeted there).
We agree with this comment and have included an additional section in the Introduction’s third paragraph to expand on the life cycles of African trypanosomes, which reads as follows.
“African trypanosomes are extracellular parasites that have a bipartite life cycle involving insect vectors and mammals as hosts (Radwanska et al., 2018). Most HAT (T. brucei gambiense and T. b. rhodesiense) and AAT (T. b. brucei and T. congolense) causing trypanosomes are uniquely vectored by tsetse flies (Glossina spp.) and are confined to Sub-Saharan Africa. T. b. evansi and T. vivax (both causative agents of AAT) have expanded beyond the tsetse belt due to their ability to be mechanically transmitted by a variety of biting flies (Glossina, Stomoxys, and Tabanus spp.). Finally, T. b. equiperdum infects equids and represents an exception as it is transmitted directly from animal to animal through sexual contact.”
The introduction now also explicitly mentions that pyruvate kinase is not a glycosomal enzyme.
Comment 1.3: The introduction would also be a good place to include some more information on what is known about the allosteric effectors of pyruvate kinase in trypanosomes, and emphasize where gaps in knowledge exist too.
We agree with this comment and have included an additional section in the Introduction’s third paragraph, which reads as follows.
“Pyruvate kinase (PYK) represents another attractive glycolytic target. This non-glycosomal enzyme catalyses the last step of glycolysis (the irreversible conversion of phosphoenolpyruvate (PEP) to pyruvate; Figure 1A). The importance of this reaction is two-fold: i) the generation of ATP through the transfer of a phosphoryl group from PEP to ADP and ii) the formation of pyruvate, a crucial metabolite of the central metabolism. Like most PYKs, trypanosomatid PYKs are homotetramers. The PYK monomer is a ∼55 kDa protein organized into four domains termed ’N’, ’A’, ’B’, and ’C’ (Figure 1B). The A domain constitutes the largest part of the PYK monomer and is characterized by an (α/β)<sub>8</sub>-TIM barrel fold that contains the active site. Together with the N-terminal domain, it is also involved in the formation of the PYK tetramer AA’ dimer interfaces. The B domain is known as the flexible ’lid’ domain that shields the active site during enzyme-mediated phosphotransfer. Finally, the C domain harbors the binding pocket for allosteric effectors and stabilizes the PYK tetramer by formation of CC’ dimer interfaces. Because of their role in ATP production and distribution of fluxes into different metabolic branches, the activity of trypanosomatid PYKs is tightly regulated through an allosteric mechanism known as the "rock and lock" model (Morgan et al., 2010, 2014; Pinto Torres et al., 2020). In this model (which is detailed in Figure 1C), the binding of substrates and/or effectors (and analogs thereof) to the active and effector sites, respectively, triggers a conformational change from the enzymatically inactive T state to the catalytically active R state. Known effector molecules for trypanosomatid PYKs are fructose 2,6-bisphosphate (F26BP), fructose 1,6-bisphosphate (F16BP) and sulfate (SO<sub>4</sub><sup>2-</sup>), with F26BP being the most potent one (van Schaftingen et al., 1985; Callens and Opperdoes, 1992; Ernest et al., 1994; Tulloch et al., 2008).
Interestingly, trypanosomatid PYKs seem to be largely unresponsive to the allosteric regulation of enzyme activity by free amino acids (Callens et al., 1991), which contrasts with human PYKs (Chaneton et al., 2012; Yuan et al., 2018). Known trypanosomatid PYK inhibitors impair enzymatic activity through occupation of the PYK active site (Morgan et al., 2011).”
In the Results, although I am not qualified to analyse the structural data in detail I am confident in the ability of the authors to do so.
Comment 1.4: Differences in nanobody binding kinetics to the T. congolense enzyme when compared to T. brucei and Leishmania enzymes are attributed to the relatively few amino acid differences in those sites. It is desirable to test site-directed mutagenesis of those residues.
This is a highly valuable suggestion from the reviewer. However, we currently don’t have dedicated hands and funding for this project, which renders it impossible for us to perform additional experiments at this stage.
Comment 1.5: In the section on slow-binding inhibition kinetics (lines 194-220) I found it difficult to follow whether it was just the R<>T transition that slowed nanobody inhibition, or whether competition with effectors at the site would also impact on those inhibition kinetics. Can this be clarified?
Since the sdAb42 epitope is located relatively far away from both active and effector sites (~20 and ~40 Å, respectively), it seems highly unlikely the observed “slow-binding inhibition” kinetics are the result of a competition between sdAb42 on one hand and substrates and/or effectors on the other for enzyme binding. Instead, given that sdAb42 selectively binds and locks the enzyme’s inactive T state, these data can be explained by the idea that sdAb42 can only bind to trypanosomatid PYKs after having undergone an R- to T-state transition. To clarify this matter, we slightly reformulated the discussion as indicated below. We also included a small discussion on the observation that there is a 400-fold difference between the Kd and the IC50.
“Since the sdAb42 epitope is located relatively far away from both active and effector sites (~20 and ~40 Å, respectively), it seems highly unlikely that the observed “slow-binding inhibition” kinetics are the result of a direct competition between sdAb42 and substrates and/or effectors. Instead, given that sdAb42 selectively binds and locks the enzyme’s inactive T state, these data can be explained by the idea that sdAb42 can only bind to trypanosomatid PYKs after they have undergone an R- to T-state transition. An additional observation in this context is the 400-fold difference between the K<sub>D</sub> and IC<sub>50</sub> values. Although we currently do not have a mechanistic explanation, similar differences have been observed for the sdAb-mediated allosteric inhibition of other kinases (Singh et al., 2022).”
For the intrabody expression work, the reference cited on line 230 actually points to a growing ability to genetically modify T. congolense. However, it is justifiable to work on T. brucei given the much wider availability and advanced status of the genetic tools.
The growth inhibition data shown in Figure 7 is weak, albeit significant and the case is made as to why that might be.
Comment 1.6: The authors do point to the fact that inhibiting other parts of the glycolytic pathway might be helpful in getting a better growth inhibitory effect. It would be useful, in this regard, to test the ability of the PFK inhibitors in the Macnae et al. paper in the intrabody expressing line, and possibly other inhibitors e.g. 2-deoxy-D-glucose to see if these combinations do have the desired impacts. Also, looking at the metabolome of the intrabody expressors under induction could also give some further insights into changes in flux (although perhaps not on its own given the weak effects on the growth seen).
This is a highly valuable suggestion from the reviewer. However, we currently don’t have dedicated hands and funding for this project, which renders it impossible for us to perform additional experiments at this stage. We would like to point out that, in our experience, studying the effect of enzyme inhibition on the metabolome is usually only useful shortly after the onset of inhibition. The system adapts to the lowered flux and relevant changes are mostly transient. Since the induced expression of sdAb42 is almost certainly slow, we expect the metabolic changes to be minimal.
REVIEWER 2:
Summary:
In this work, the authors show that the camelid single-chain antibody sdAb42 selectively inhibits Trypanosome pyruvate kinase (PYK) but not human PYK. Through the determination of the crystal structure and biophysical experiments, the authors show that the nanobody binds to the inactive T-state of the enzyme, and in silico analysis shows that the binding site coincides with an allosteric hotspot, suggesting that nanobody binding may affect the enzyme active site. Binding to the T-state of the enzyme is further supported by non-linear inhibition kinetics. PYK is an important enzyme in the glycolytic pathway, and inhibition is likely to have an impact on organisms such as trypanosomes, that heavily rely on glycolysis for their energy production. The nanobody was generated against Trypanosoma congolense PYK, but for technical reasons the authors progressed to testing its impact on cell viability in Trypanosoma brucei brucei. First, they show that sdAb42 is able to inhibit Tbb PYK, albeit with lower potency. Cell-based experiments next show that expression of sdAb42 has a modest, and dose-dependent effect on the growth rate of Tbb. The authors conclude that their data indicate that targeting this allosteric site affects cell growth and is a valuable new option for the development of new chemotherapeutics for trypanosomatid diseases.
Strengths:
The work clearly shows that sdAb42 inhibits Trypanosome and Leishmania PYK selectively, with no inhibition of the human orthologue. The crystal structure clearly identifies the binding site of the nanobody, and the accompanying analysis supports that the antibody acts as an allosteric inhibitor of PYK, by locking the enzyme in its apo state (T-state).
Weaknesses:
(1) The most impactful claim of this work is that sdAb42-mediated inhibition of PYK negatively affects parasite growth and that this presents an opportunity to develop novel chemotherapeutics for trypanosomatid diseases. For the following reasons I think this claim is not sufficiently supported:
Comment 2.1: The authors do not provide evidence of target-engagement in cells, i.e. they do not show that sdAb42 binds to, or inhibits, Tbb PYK in cells and/or do not provide a functional output consistent with PYK inhibition (e.g. effect on ATP production). Measuring the extent of target engagement and inhibition is important to draw conclusions from the modest effect on growth.
The authors do not explore the selectivity of sdAb42 in cells. Potentially sdAb42 may cross-react with other proteins in cells, which would confound interpretation of the results.
We understand the reviewer’s concern. While it is theoretically possible that sdAb42 may non-specifically (cross-)react with other proteins in the cell, this would be highly unlikely based on the following arguments. First, many studies have employed sdAbs as intrabodies and reported on specific sdAb-mediated effects (outstanding reviews on the topic are Cheloha et al. (PMID 32868455) and Soetens et al. (PMID 33322697)). Second, it has been demonstrated that selecting sdAbs from an immune library through phage display or “bacteriomatch” (a bacterial system similar to yeast two hybrid) yields highly similar results (Pellis et al., PMID 22583807), thereby indicating that sdAbs interact specifically with their target antigens in an intracellular environment. Third, we identified TcoPYK as the target for sdAb42 by employing sdAb42 as bait in a pull-down from a parasite whole cell lysate (Pinto Torres et al., PMID 29899344). The pull-down fractions were analysed by SDS-PAGE and we observed a clear prominent band, which was further analysed by mass spectrometry and revealed TcoPYK as the target with great certainty. Even though the affinity of sdAb42 for TbrPYK is lower, it still remains high (nM affinity) and we expect it to bind TbrPYK with high specificity.
Regarding measuring the effect on ATP production, we would like to state that such experiments are not obvious. Instead of measuring ATP levels, one should measure ATP turnover as ATP levels may not necessarily be decreased. The latter was observed to be the case for the specific inhibition of trypanosomal PFK (Nare et al. PMID 36864883). The specific trypanosomal PFK inhibitor inhibits motility (and growth) of T. congolense IL3000 at concentrations that only slightly affect ATP levels. One could think of repeating the sdAb42 experiments in a T. congolense model. However, T. congolense BSF metabolism is more complicated than that of T. brucei BSF. First, the T. congolense glucose metabolic network is more expanded, allowing a lower glucose consumption rate to produce ATP and metabolites for growth. Second, pyruvate is not excreted but further metabolised, in part in the mitochondrion. Steketee et al. (PMID 34310651) have shown that T. congolense also takes up pyruvate from the medium. One can thus check if (increased) external pyruvate (partially) rescues the growth inhibition by sdAb42. It will not provide proof, but maybe an indication. As mentioned above, we are currently unable to perform such additional experiments due to lack of dedicated hands and funding.
Comment 2.2: sdAb42 causes only minor growth inhibition in Tbb. The growth defect is used as the main evidence to support targeting this site with chemotherapeutics, however based on the very modest effect on the parasites, one could reasonably claim that PYK is actually not a good drug target. The strongest effect on growth is seen for the high expressor clone in Figure 4a, however here the uninduced cells show an unusual profile, with a sudden increase in growth rate after 4 days, something that is not seen for any of the other control plots. This unexplained observation accentuates the growth difference between induced and uninduced, and the growth differences seen in all other experiments, including those with the highest expressors (clones 54 and 55) are much more modest. The loss of expression of sdAb42 over time is presented as a reason for the limited effect, and used to further support the hypothesis that targeting the allosteric site is a suitable avenue for the development of new drugs. However, strong evidence for this is missing.
We agree that the growth effect of sdAb42 expression is modest, and we have provided several explanations as to why this could be the case. In addition, as mentioned at the start of this rebuttal, the claim that our results may be exploited for the design of novel chemotherapeutics was perhaps expressed more strongly than we intended. We still believe our findings indicate a potential for such an endeavor, but this clearly requires further investigation and experimental evidence as mentioned by the reviewer.
We, however, disagree that PYK would not be a good drug target. Its potential to serve as a drug target is related to its fundamentally important role in trypanosomal glycolysis and not to the features of sdAb42. Steketee et al. (PMID 34310651) have shown that glycolysis is essential for T. congolense BSF, despite a lower glycolytic flux than in T. brucei BSF. The T. congolense glucose metabolic network is more expanded, allowing a lower glucose consumption rate to produce ATP and metabolites for growth. Also here, PYK is thus almost certainly essential and from that perspective a good drug target.
Comment 2.3: For chemotherapeutic interventions to be possible, a ligandable site is required. There is no analysis provided of the antibody binding site to indicate that small molecule binding is indeed feasible.
We agree with the reviewer’s comment and have included APOP analysis on the TcoPYK T state crystal structure (see also reply to Comment 3.1). Briefly, APOP works by detecting pockets and then perturbing each pocket in the protein's elastic network (GNM) by adding stiffer springs between the surrounding residues. The pockets are scored and ranked based on the calculated shifts in the eigenvalues of the global GNM modes and their local hydrophobic densities, thereby also considering the pocket’s surface accessibility, which renders it suitable for the identification of allosteric (and druggable) pockets. The APOP analysis identifies pockets overlapping with the sdAb42 epitope as highly ranking allosteric ligand binding pockets. The data have been summarized in an additional supplementary figure (Figure 4 – figure supplement 1). The manuscript also contains details on the performed APOP analysis in the Materials and Methods section.
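As an illustration only, the perturbation idea described above can be sketched with a toy Gaussian network model (GNM). This is not the APOP implementation: the 7 Å contact cutoff, the unit spring constants, the added stiffness, the random-walk "Cα" coordinates, and the pocket residue indices are all assumptions chosen for the example, and the hydrophobic-density term of the scoring is omitted.

```python
import numpy as np


def toy_chain(n_residues, seed=0):
    """Hypothetical C-alpha trace: a random walk with 3.8 A steps."""
    rng = np.random.default_rng(seed)
    steps = rng.normal(size=(n_residues, 3))
    steps *= 3.8 / np.linalg.norm(steps, axis=1, keepdims=True)
    return np.cumsum(steps, axis=0)


def kirchhoff(coords, cutoff=7.0):
    """GNM connectivity (Kirchhoff) matrix with unit springs within the cutoff."""
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    contact = (dist < cutoff) & ~np.eye(len(coords), dtype=bool)
    gamma = -contact.astype(float)
    np.fill_diagonal(gamma, contact.sum(axis=1))
    return gamma


def stiffen_pocket(gamma, pocket, extra=1.0):
    """Add an extra spring of stiffness `extra` between every pair of pocket residues."""
    perturbed = gamma.copy()
    for a in range(len(pocket)):
        for b in range(a + 1, len(pocket)):
            i, j = pocket[a], pocket[b]
            perturbed[i, j] -= extra
            perturbed[j, i] -= extra
            perturbed[i, i] += extra
            perturbed[j, j] += extra
    return perturbed


def mode_shifts(coords, pocket, n_modes=5):
    """Eigenvalue shifts of the slowest non-zero GNM modes after stiffening a pocket."""
    gamma = kirchhoff(coords)
    before = np.linalg.eigvalsh(gamma)[1:n_modes + 1]  # skip the zero mode
    after = np.linalg.eigvalsh(stiffen_pocket(gamma, pocket))[1:n_modes + 1]
    return after - before
```

Calling `mode_shifts(toy_chain(50), [10, 11, 12, 30, 31])` returns one shift per global mode; in this toy picture, pockets that produce larger shifts in the slowest modes would be the ones ranked as more strongly coupled to the global dynamics.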
Comment 2.4: The authors comment on the modest growth inhibition, and refer to the need to achieve over 88% reduction in Vmax of PYK to see a strong effect, something that may or may not be achieved in the cell-based model (no target-engagement or functional readout provided). The slow binding model and switch of species are also raised as potential explanations. While these may be plausible explanations, they are not tested which leaves us with limited evidence to support targeting the allosteric site on PYK.
We understand this remark to be related to Comments 2.1 and 2.2 and thus refer to our answers formulated above.
Comment 2.5: The evidence to support an allosteric mechanism is derived from structural studies, including the in silico allosteric network predictions. Unfortunately, standard enzyme kinetics mode of inhibition studies are missing. Such studies could distinguish uncompetitive from non-competitive behaviour and strengthen the claim that sdAb42 locks the enzyme complex in the apo form.
We agree with the referee that a thorough kinetic analysis could distinguish between uncompetitive (i.e., sdAb only binds to the enzyme if substrate is bound) or non-competitive (i.e., sdAb can bind to apo enzyme and substrate-bound enzyme) inhibition. In both cases, however, this would correspond to an allosteric mechanism of inhibition. Although such a thorough kinetic analysis would be interesting in its own right, we would like to argue that this type of very detailed kinetics is outside the scope of this paper. This is especially the case taking into account that this analysis could be complicated by the slow-onset inhibition behavior.
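For readers less familiar with the distinction, the two textbook steady-state rate laws can be written as a minimal sketch. These are the classical Michaelis-Menten forms for a rapidly equilibrating inhibitor; they do not account for the slow-onset behavior or for cooperativity, and the parameter names and values are purely illustrative.

```python
def rate_uncompetitive(s, i, vmax, km, ki):
    """Inhibitor binds only the enzyme-substrate complex."""
    alpha = 1.0 + i / ki
    return vmax * s / (km + alpha * s)


def rate_noncompetitive(s, i, vmax, km, ki):
    """Inhibitor binds free and substrate-bound enzyme with the same Ki."""
    alpha = 1.0 + i / ki
    return vmax * s / (alpha * (km + s))
```

In the uncompetitive case the apparent Vmax and Km both drop by the factor (1 + [I]/Ki), so inhibition vanishes at low substrate concentration; in the pure non-competitive case only the apparent Vmax drops. Varying substrate and inhibitor concentrations together would therefore separate the two.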
Comment 2.6: As general comment, the graphical representation of the data could be improved in line with recent recommendations: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002128, https://elifesciences.org/inside-elife/5114d8e9/webinar-report-transforming-data-visualisation-to-improve-transparency-and-reproducibility.
- Bar-charts for potency are ideally presented as dot plots, showing the individual data points, or box plots with datapoints shown.
- Images in Figure 7 show significant heterogeneity of nanobody expression, but the extent of this can not be gleaned from Figure 7B. It would be much better to use box plots or violin plots for each cell line on this figure panel. The same applies to Figure 10.
We thank the reviewer for these suggestions but have taken the decision not to act upon these as the other reviewers explicitly mentioned that our figures are very clear.
Recommendations for authors:
Please find below some minor comments:
Comment 2.7: Line 24: "increasing number of drug failures": This does not really reflect the current situation for human African trypanosomiasis, with NECT treatment retaining efficacy, fexinidazole now being registered, and acoziborole progressing towards registration. It may be worth considering focusing the introduction more on Nagana, as all Trypanosome species used in the paper are animal infective, and the nanobody was discovered for T. congolense.
We refer to our answer formulated in response to Comment 1.1.
Comment 2.8: Line 55: "alarming number of reports describing ..." While resistance is a big problem, this mainly applies to malaria, bacterial and fungal diseases. For kinetoplastids, the number of reports describing resistance in the clinic is pretty limited. However, the drug discovery pipeline for these diseases is sparse, so I definitely agree there is a need to develop new compounds with differentiated mechanisms.
We agree with the reviewer and have slightly adapted our wording here as follows.
“Unfortunately, a number of reports describe treatment failure or parasite resistance to the currently available drugs (De Rycker et al., 2018).”
Comment 2.9: This manuscript is about pyruvate kinase, but the enzyme is not properly introduced. I suggest a short paragraph introducing PYK at line 77 (without duplicating Figure 1), covering its role in glycolysis, the importance of pyruvate, any essentiality data from the literature, and any known inhibitors.
We refer to our answer formulated in response to Comment 1.3.
Comment 2.10: Figure 6: For the top insets it would be useful to somehow show the increasing antibody concentration, either by using a changing intensity or size for each line.
We thank the reviewer for this suggestion, but decided not to act upon it as we found that the inclusion of this information in the figure made it “too crowded”, which is why we opted to provide this information in the figure legend.
“Only a subset of the traces is shown for the sake of clarity. The following curves are shown (from bottom to top): TcoPYK (0.15 nM sdAb42, 500 nM sdAb42, 750 nM sdAb42, 1000 nM sdAb42, 1500 nM sdAb42, 2000 nM sdAb42, no enzyme control), LmePYK (5 nM sdAb42, 750 nM sdAb42, 1250 nM sdAb42, 1500 nM sdAb42, 2500 nM sdAb42, 3000 nM sdAb42, no enzyme control), and TbrPYK (1 nM sdAb42, 1000 nM sdAb42, 1750 nM sdAb42, 2000 nM sdAb42, 3500 nM sdAb42, 4000 nM sdAb42, no enzyme control).”
Comment 2.11: You refer to the curves as biphasic, but they look like 1st order kinetics, and there are no clear 1st and 2nd phases (or at least they are not marked). It may be more appropriate to label these as non-linear.
We agree that the term “biphasic” is potentially an over-simplification of the actual situation. What we mean is that the formation of product as a function of time ([P] versus [t] curve) is not linear at short time ranges but evolves from an initial “weakly inhibited” rate (v<sub>0</sub>) to a “strongly inhibited” steady-state rate (v<sub>ss</sub>). This conversion from v<sub>0</sub> to v<sub>ss</sub> indeed occurs in a fashion following single exponential behavior. With the term “biphasic” we thus meant a non-linear phase (before v<sub>ss</sub> is reached) followed by a linear phase (after v<sub>ss</sub> is reached). To avoid any confusion, we replaced the term “biphasic” by “non-linear”.
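The behavior described above corresponds to the standard slow-binding progress-curve expression, in which the rate relaxes from v<sub>0</sub> to v<sub>ss</sub> with a single exponential. A minimal sketch follows; the rate constant name `k_obs` and all numerical values are assumptions for illustration, not fitted values from the manuscript.

```python
import math


def product_formed(t, v0, v_ss, k_obs):
    """[P](t) for slow-onset inhibition: initial rate v0 relaxing to steady-state rate v_ss."""
    return v_ss * t + (v0 - v_ss) * (1.0 - math.exp(-k_obs * t)) / k_obs


def instantaneous_rate(t, v0, v_ss, k_obs):
    """d[P]/dt: decays exponentially from v0 to v_ss with rate constant k_obs."""
    return v_ss + (v0 - v_ss) * math.exp(-k_obs * t)
```

At short times the curve is visibly non-linear; once t is several multiples of 1/k_obs, the exponential term has decayed and [P] grows linearly with slope v<sub>ss</sub>.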
Comment 2.12: IC50s - would be useful to provide a comparison with IC50s generated in the pre-incubation experiments - is the antibody less potent without pre-incubation? I could not find IC50s for the pre-incubation experiments shown in Figure 2.
We determined an IC50 value for sdAb42 against TcoPYK under pre-incubation conditions, but initially decided not to include this into the manuscript. We agree with the reviewer that a comparison between IC50 values determined under pre- and post-incubation conditions would be of interest, and have therefore included the pre-incubation IC50 data for TcoPYK in Figure 2 (panel B). The data indeed show that sdAb42 far more efficiently inhibits an enzyme that is not continuously cycling between R and T states (IC50 values of 15 nM and 359 nM under pre- and post-incubation conditions, respectively). This is now discussed in the results section of the manuscript. We did not determine IC50 values for sdAb42 against TbrPYK and LmePYK under pre-incubation conditions, but suspect that a similar observation will be made upon comparing these values to IC50 under post-incubation conditions.
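As a quick worked check of the comparison above, the potency shift can be expressed as a fold change. The helper name is hypothetical; the two IC50 values are those quoted in the response.

```python
def fold_shift(ic50_post_nm, ic50_pre_nm):
    """Ratio of post- to pre-incubation IC50; values above 1 mean weaker inhibition without pre-incubation."""
    return ic50_post_nm / ic50_pre_nm


# IC50 values for sdAb42 against TcoPYK quoted in the response (nM)
tco_fold = fold_shift(359.0, 15.0)
```

With the quoted values, sdAb42 is roughly 24-fold less potent when the enzyme is not pre-incubated with the nanobody.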
REVIEWER 3:
Summary:
Out of the 20 Neglected Tropical Diseases (NTD) highlighted by the WHO, three are caused by members of the trypanosomatids, namely Leishmaniasis, Trypanosomiasis, and Chagas disease. Trypanosomal glycolytic enzymes including pyruvate kinase (PyK) have long been recognised as potential targets. In this important study, single-chain camelid antibodies have been developed as novel and potent inhibitors of PyK from T. congolense. To gain structural insight into the mode of action, binding was further characterised by biophysical and structural methods, including crystal structure determination of the enzyme-nanobody complex. The results revealed a novel allosteric mechanism/pathway with significant potential for the future development of novel drugs targeting allosteric and/or cryptic binding sites.
Strengths:
This paper covers an important area of science towards the development of novel therapies for three of the Neglected Tropical Diseases. The manuscript is very clearly written with excellent graphics making it accessible to a wide readership beyond experts. Particular strengths are the wide range of experimental and computational techniques applied to an important biological problem. The use of nanobodies in all areas from biophysical binding experiments and X-ray crystallography to in-vivo studies is particularly impressive. This is likely to inspire researchers from many areas to consider the use of nanobodies in their fields.
Weaknesses:
There is no particular weakness, but I think the computational analysis of allostery, which basically relies on a single server could have been more detailed.
Recommendations for authors:
Overall an excellent paper, there are just a couple of points that the authors could consider, if time allows.
Comment 3.1: As mentioned above the computational analysis of allostery appears to be based on a single server based on coordinates alone with no in-depth analysis. It would be extremely interesting to see if more sophisticated methods based on elastic network model and/or molecular dynamics simulation gave similar results. I realize that this would require quite a lot of work though.
We agree with the reviewer’s comment and have complemented the perturbation analysis (previously presented in the manuscript) with dGNM and APOP analyses to identify allosteric communication pathways and allosteric binding pockets, respectively. dGNM, which is based on transfer entropy, allows for a detailed characterization of the dynamic coupling and information transfer between residues. Meanwhile, APOP employs a perturbation-based approach to detect and rank allosteric pockets. The findings are in good agreement with the previously presented perturbation data and have been summarized in an additional supplementary figure (Figure 4 – figure supplement 1). The manuscript also contains details on the performed transfer entropy and APOP analyses in the Materials and Methods section.
Comment 3.2: The figures are excellent and really help the reader - with the exception of the screenshots (Figure 8). Using pymol or chimera (or any other more expensive commercial package) would really help the reader and will not take much time.
We agree with the referee that this is not the most beautiful figure. However, we find the quality and clarity of the figure to be adequate for its purpose (i.e., a supplemental figure).
Comment 3.3: Finally, I would have liked to see at least the PDB validation files. This is a highly regarded and experienced team, nevertheless, the resolution is rather mediocre. As the crystal coordinates were used as input for the computational, any experimental inaccuracies will affect the computational results.
We agree with the reviewer that we could have provided the validation report together with the submitted manuscript and we apologise for the inconvenience. The validation reports will be released together with the structures following final manuscript publication. Regarding the resolution of the crystal structures, we agree with the reviewer’s comment, but we obviously employed data sets from our best diffracting crystals and could not obtain a higher resolution despite our best efforts.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
eLife Assessment:
This important study investigates the propensity of the intravacuolar pathogen, Leishmania, to scavenge lipids which it utilizes for its accelerated growth within macrophages. Although some of the data compellingly links increased lipid acquisition to parasite growth, data to support the underlying mechanism to describe the proposed model is incomplete. The study adds to other work that has implicated pathogen-derived processes in the selective recruitment of vesicles to the pathogen-containing vacuole, based on the content of the cargo.
We appreciate the time and effort that the Editor and Reviewers have devoted to assessing our work (eLife: eLife-RP-RA-2024-102857), and we thank them all for this assessment.
Regarding some of the concerns raised by Reviewer 1, particularly the lack of data on NPC-1 knockdown, we would like to clarify that this information was included in our original submission (as elaborated in detail in the following section). Additionally, we acknowledge that one of the major concerns about the completeness of our work stems from Reviewer 1’s comments on the isolation and purity of the parasitophorous vacuole (PV). Reviewer 2 has also emphasized the importance of this experiment in strengthening the technical rigor of our study, and we fully agree with this recommendation. We acknowledge that this is a very appropriate suggestion by both Reviewers, and we will include these data in the revised manuscript for re-evaluation. Also, ahead of a full revision of the paper, we would like to address the concerns raised by the reviewers and outline our revision plans.
Public Reviews:
Reviewer #1 (Public review):
Although the use of antimony has been discontinued in India, the observation that there are Leishmania parasites that are resistant to antimony in circulation has been cited as evidence that these resistant parasites are now a distinct strain with properties that ensure their transmission and persistence. It is of interest to determine what are the properties that favor the retention of their drug resistance phenotype even in the absence of the selective pressure that would otherwise be conferred by the drug. The hypothesis that these authors set out to test is that these parasites have developed a new capacity to acquire and utilize lipids, especially cholesterol which affords them the capacity to grow robustly in infected hosts.
We sincerely appreciate Reviewer 1's thoughtful and positive evaluation of our manuscript. We acknowledge that the reviewer has a few major concerns, and we would like to address them one by one in the following section of this initial response before submitting a full revision of our work.
Major issues:
(1) There are several experiments for which they do not provide sufficient details, but proceed to make significant conclusions.
Experiments in section 5 are poorly described. They supposedly isolated PVs from infected cells. No details of their protocol for the isolation of PVs are provided. They reference a protocol for PV isolation that focused on the isolation of PVs after L. amazonensis infection. In the images of infection that they show, by 24 hrs, infected cells harbor a considerable number of parasites. Is it at the 24 hr time point that they recover PVs? What is the purity of PVs? The authors should provide evidence of the success of this protocol in their hands. Earlier, they mentioned that using imaging techniques, the PVs seem to have fused or interconnected somehow. Does this affect the capacity to recover PVs? If more membranes are recovered in the PV fraction, it may explain the higher cholesterol content.
We would like to thank the reviewer for correctly pointing out the lack of details regarding PV isolation and purity. The reviewer raises multiple questions, which we answer one by one below:
Firstly, “Is it at the 24 hr time point that they recover PVs?”
The ‘Methods’ section of the original submission (lines 606–611) contains a separate subsection, “Parasitophorous vacuole (PV) Isolation and cholesterol measurement”, which states: “24Hrs LD infected KCs were lysed by passing through a 22-gauge syringe needle to release cellular contents. Parasitophorous vacuoles (PV) were then isolated using a previously outlined protocol [Ref: 73].” However, we acknowledge that further detail would strengthen this section, and we will therefore include the following in the revised manuscript: “10<sup>7</sup> KCs were seeded in a 100 mm plate and allowed to adhere for 24 hours. Following infection with Leishmania donovani (LD) for 24 hours, the infected KCs were harvested by gentle scraping and lysed through five successive passages through an insulin needle to ensure membrane disruption while preserving organelle integrity. The lysate was centrifuged at 200 × g for 10 minutes at 4°C to remove intact cells and large debris. The resulting supernatant was carefully collected and subjected to a discontinuous sucrose density gradient (60%, 40%, and 20%). The gradient was centrifuged at 700 × g for 25 minutes at 4°C to facilitate organelle separation. The interface between the 40% and 60% sucrose layers, enriched with PVs, was carefully collected and subjected to a final centrifugation step at 12,000 × g for 25 minutes at 4°C. The supernatant was discarded, and the resulting pellet was enriched for purified parasitophorous vacuoles, suitable for downstream biochemical and molecular analyses.”
Secondly, What is the purity of PVs? Earlier, they mentioned that using imaging techniques, the PVs seem to have fused or interconnected somehow. Does this affect the capacity to recover PVs? If more membranes are recovered in the PV fraction, it may explain the higher cholesterol content.
We thank the reviewer for pointing out this critical gap in the current version of the manuscript. As the reviewer rightly notes, we need to assess the purity of the isolated PVs. In the Revised manuscript we will provide data on the purity of the isolated fraction by performing Western blots on the PV and cytoplasmic fractions, along with a biochemical quantification of the total PV membrane isolated under the different experimental conditions using the Amplex Red kit (Invitrogen™ A12216) or a similar method.
(2) In section 6 they evaluate the mechanism of LDL uptake in macrophages. Several approaches and endocytic pathway inhibitors are employed. The authors must be aware that the role of cytochalasin D in the disruption of fluid phase endocytosis is controversial. Although they reference a study that suggests that cytochalasin D has no effect on fluid-phase endocytosis, other studies have found the opposite (doi: 10.1371/journal.pone.0058054). It wasn't readily evident what concentrations were used in their study. They should consider testing more than 1 concentration of the drug before they make their conclusions on their findings on fluid phase endocytosis.
We thank the reviewer for this insightful comment and apologise for omitting the Cytochalasin D concentration. To clarify, LDL uptake by LD-R infected KCs is LDL-receptor independent, as shown in Section 6, Figure 4A, Figure S4A, Figure S4B i and Figure S4B ii of the Submitted manuscript. In Figure 4F and Figure S4D of the Submitted manuscript, to which the Reviewer refers, Cytochalasin D was used at a concentration of 2.5µg/ml. At this concentration, we did not observe any effect of Cytochalasin D on LDL-receptor independent fluid phase endocytosis: intracellular LD-R amastigotes were able to take up LDL and proliferate in infected Kupffer cells. In contrast, Latrunculin-A (5µM) treatment completely inhibited intracellular proliferation of LD-R amastigotes by blocking only receptor independent fluid phase endocytosis (Movie 2A and 2B and Figure 4E in the Submitted manuscript). The study cited by the reviewer (doi: 10.1371/journal.pone.0058054) used 4µg/ml Cytochalasin D, which affected both LDL-receptor dependent and receptor independent endocytosis in bone marrow derived macrophages. We would also like to clarify that, during our preliminary experiments, we tested a higher concentration of Cytochalasin D (5µg/ml). Even at this higher concentration, there was no significant effect of Cytochalasin D on LD-R induced LDL-receptor independent fluid phase endocytosis, as judged from the intracellular LD-R amastigote counts shown in Author response image 1. We therefore believe that Cytochalasin D does not affect LD-R induced fluid phase endocytosis at the concentrations tested. We will include this in the Discussion section of the revised manuscript to avoid any confusion, and the concentrations of all inhibitors used in the study will be stated in the Results section as well as in the revised Figure legends.
Author response image 1.
A. Giemsa-stained images illustrating the impact of two concentrations of CYT-D (2.5 and 5 µg/ml) on LD-R-infected Kupffer cells. Black arrows indicate intracellular amastigotes. Scale bar 10 µm. B. Graphical representation depicting the effect of varying concentrations of CYT-D on the intracellular growth of LD-R. ‘ns’ depicts no significant change.
(3) In Figure 5 they present a blot that shows increased Lamp1 expression from as early as 4 hrs after infection with LD-R and by 12 hrs after infection of both LD-S and LD-R. Increased Lamp1 expression after Leishmania infection has not been reported by others. By what mechanism do they suggest is causing such a rapid increase (at 4hrs post-infection) in Lamp-1 protein? As they report, their RNA seq data did not show an increase in LAMP1 transcription (lines 432 - 434).
We thank the reviewer for highlighting the novelty of this observation. Indeed, to the best of our knowledge, no similar findings have been reported previously in primary macrophages infected with Leishmania donovani (LD). First, we would like to point out that the Methods section (lines 562–566) of the Submitted manuscript states: “Flow-sorted metacyclic LD promastigotes were used at a MOI of 1:10 (with variations of 1:5 and 1:20 in some cases) for 4 hours, which was considered the 0th point of infection. Macrophages were subsequently washed to remove any extracellular loosely attached parasites and incubated further as per experimental requirements.” This indicates that our actual study point corresponds to approximately the 8th hour post-infection. We mention this to prevent any potential confusion.
Regarding LAMP1 expression, although we could not find any previous reports of its expression in LD infected primary macrophages, a previous report (doi.org/10.1128/mBio.01464-20) has shown a similar punctate LAMP-1 upregulation (as we observed in Figure 5A i of the Submitted manuscript) in response to Leishmania infection in non-phagocytic fibroblasts. It is tempting to speculate that the increased LAMP-1 expression observed in LD-R infected macrophages might be due to increased lysosomal biogenesis, required for degrading the increased endocytosed LDL into bioavailable cholesterol. However, since the RNA seq data (Figure 6 of the Submitted manuscript) show no change in LAMP-1 expression, we can only speculate that this occurs through post transcriptional or post translational modifications. Further work will be required to investigate this mechanism in detail, which is beyond the scope of this work. That is why, in the Submitted manuscript (lines 432–435), we discussed this as follows: “Although available RNAseq analysis (Figure 6) did not support this increased expression of lamp-1 in the transcript level, it did reflect a notable upregulation of vesicular fusion protein (VSP) vamp8 and stx1a in response to LD-R-infection. LD infection can regulate LAMP-1 expression, and the role of VSPs in LDL-vesicle fusion with LD-R-PV is worthy of further investigation.”
However, we agree with the reviewer that this might not be sufficient clarification. Hence, in the revised manuscript we plan to update this part as follows: “Although available RNAseq analysis (Figure 6) did not support this increased expression of lamp-1 in the transcript level, it did reflect a notable upregulation of vesicular fusion protein (VSP) vamp8 and stx1a in response to LD-R-infection. How LD infection regulates LAMP-1 expression, and the role of VSPs in LDL-vesicle fusion with LD-R-PV, are worthy of further investigation. It is possible, and has been reported earlier, that LD infection can regulate host protein expression through post transcriptional and post translational modifications (doi.org/10.1111/pim.12156, doi.org/10.3389/fmicb.2017.00314, doi: 10.3389/fimmu.2023.1287539). It is tempting to speculate that LD-R amastigotes might promote increased lysosomal biogenesis through such a mechanism to increase the supply of bioavailable cholesterol through the action of lysosomal acid hydrolases on LDL.”
(4) In Figure 6, amongst several assays, they reported on studies where SPC-1 is knocked down in PECs. They failed to provide any evidence of the success of the knockdown, but nonetheless showed greater LD-R after NPC-1 was knocked down. They should provide more details of such experiments.
Although we understand the concern raised by the reviewer, the statement in question is factually incorrect. In Figure 6 F i of the Submitted manuscript, we demonstrated decreased NPC-1 staining following transfection with NPC-1-specific siRNA, whereas no such reduction was observed with scrambled RNA. Similar immunofluorescence data confirming LDL-receptor knockdown were also provided in Figure S4B i of the Submitted manuscript. However, we acknowledge that the reviewer may be referring to the lack of quantitative validation of the knockdown by Western blot. We already had these data but did not include them, to avoid duplication and to reduce the data density of the manuscript. As suggested by the reviewer, we will include Western blots for both NPC-1 and LDL-receptor knockdown in the revised manuscript, as shown in Author response image 2. Additionally, we noticed a lack of detail in the Methods section of the submitted manuscript concerning siRNA mediated knockdown (KD). We therefore plan to include more detail in the revised manuscript, which will read: “For all siRNA transfections, Lipofectamine® RNAiMAX Reagent (Life Technologies, 13778100) specifically designed for knockdown assays in primary cells was used according to the manufacturer's instructions with slight modification. PECs were seeded into 24-well plates at a density of 1x10<sup>5</sup> per well, and incubated at 37°C with 5% CO2. The transfection complex, comprising (1µl Lipofectamine® RNAiMAX and 50µl Opti MEM) and (1 µl siRNA and 50µl Opti MEM) mixed together, was directly added to the incubated PECs. Gene silencing was checked by IFA and by Western blot as mentioned previously.”
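For readers who wish to scale the per-well transfection volumes quoted above to a full plate, the arithmetic can be sketched as follows. This is only an illustrative calculation: the function name and the 10% pipetting overage are our assumptions and are not part of the protocol as stated.

```python
def transfection_master_mix(n_wells, overage=0.1):
    """Scale the per-well volumes (in µl) to a master mix for n_wells.

    Per-well volumes are taken from the Methods text above.
    `overage` is a hypothetical pipetting excess (10% here), not a
    value stated in the protocol.
    """
    per_well_ul = {
        "RNAiMAX": 1.0,
        "Opti-MEM (lipid tube)": 50.0,
        "siRNA": 1.0,
        "Opti-MEM (siRNA tube)": 50.0,
    }
    factor = n_wells * (1 + overage)
    return {name: round(vol * factor, 1) for name, vol in per_well_ul.items()}


# Example: one full 24-well plate
mix = transfection_master_mix(24)
for reagent, volume in mix.items():
    print(f"{reagent}: {volume} µl")
```

The two Opti MEM volumes are kept separate because the lipid and the siRNA are diluted in separate tubes before being combined.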
Author response image 2.
SiRNA-mediated gene knockdown analysis. (A-i, A-ii) Representative immunofluorescence microscopy image and corresponding Western blot analysis demonstrating the knockdown efficiency of NPC1 following SiRNA-mediated gene silencing, scale bar 10µm. (B-i, B-ii) Immunofluorescence image and Western blot confirming LDLr knockdown upon SiRNA treatment. Scrambled RNA (ScRNA) was used as a negative control, while Small Interfering RNA (SiRNA) specifically targeted NPC1 and LDLr transcripts, scale bar 10µm. TR-1 and TR-2 represent independent experimental trials. β-Actin was used as an endogenous loading control for Western blot normalization.
Minor issues
(1) There is an implication that parasite replication occurs well before 24hrs post-infection? Studies on Leishmania parasite replication have reported on the commencement of replication after 24hrs post-infection of macrophages (PMCID: PMC9642900). Is this dramatic increase in parasite numbers that they observed due to early parasite replication?
We thank the reviewer for this insightful comment and appreciate the opportunity to clarify our findings. As the Reviewer rightly assumes, our data suggest, and we also believe, that this increase in intracellular amastigote number is a consequence of early replication of Leishmania donovani. As mentioned in our response to Point 3 raised by Reviewer 1, the Methods section (lines 562–566) states: “Flow-sorted metacyclic LD promastigotes were used at a MOI of 1:10 (with variations of 1:5 and 1:20 in some cases) for 4 hours, which was considered the 0th point of infection. Macrophages were subsequently washed to remove any extracellular loosely attached parasites and incubated further as per experimental requirements.” This effectively means that our actual study points correspond to approximately the 8th and 28th hours post-infection; we mention this to avoid any confusion.
Regarding the specific concern, the study cited by the reviewer on the commencement of replication after 24hrs was conducted on Leishmania major, which may differ significantly from Leishmania donovani owing to species- and strain-specific characteristics. In fact, the doubling time of Leishmania donovani (LD) has previously been reported to be approximately 11.4 hours (doi: 10.1111/j.1550-7408.1990.tb01147.x). Moreover, multiple studies have indicated an exponential increase in intracellular LD amastigote number (more than a two-fold increase) by 24hrs post infection. By 48hrs post-infection, however, the replication rate appeared to slow down, with amastigote numbers not doubling proportionally (doi:10.1128/AAC.01196-07, doi.org/10.1016/j.ijpara.2011.07.013). We made a similar observation for both infected PECs and KCs, as depicted in Figure 1Ci and Figure S1Ci of the Submitted manuscript and in Author response image 3. Hence, it was an informed decision on our part to focus on the 24-hour time point for the analysis of intracellular proliferation.
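As a rough consistency check on the figures quoted above, the fold increase expected from an 11.4-hour doubling time can be worked out directly. The sketch below assumes ideal, unconstrained exponential growth, which is a simplification on our part and should be read as an upper bound rather than a prediction for intracellular amastigotes.

```python
def expected_fold_increase(hours, doubling_time_h=11.4):
    """Fold increase after `hours` under ideal exponential growth.

    The 11.4 h doubling time is the value cited in the text; the
    assumption of uninterrupted exponential growth is ours.
    """
    return 2 ** (hours / doubling_time_h)


fold_24h = expected_fold_increase(24)
print(f"Expected fold increase at 24 h: {fold_24h:.1f}")
```

Under this idealisation, 24 hours spans just over two doublings, so a roughly four-fold increase is the theoretical ceiling, which is compatible with the more than two-fold increase reported by 24hrs post infection.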
Author response image 3.
Graph representing number of intracellular LD-R (MHOM/IN/2009/BHU575/0) parasite burden at different time points post-infection. *** signifies p value < 0.0001, * signifies p value < 0.05.
(2) Several of the fluorescence images in the paper are difficult to see. It would be helpful if a blown-up (higher magnification image of images in Figure 1 (especially D) for example) is presented.
We apologise for the inconvenience. We provided zoomed images for several figures in the Submitted manuscript, such as Figure 4, Figure 5, Figure 6 and Figure 8. However, this was not feasible for every figure (for example, Figure 1D) because of space and figure arrangement constraints. To accommodate the Reviewer’s request, we will provide a blown-up image for Figure 1D in the Revised version, as shown in Author response image 4. If the reviewer would like a similar representation for any other figure, we will be happy to provide it.
Author response image 4.
Three-Dimensional morphometric representation of Parasitophorous Vacuoles (PVs) in Leishmania infected Kupffer Cells at 24 Hours Post-Infection: Confocal 3D reconstruction illustrating the spatial distribution of parasitophorous vacuoles (PVs) in Kupffer cells (KCs) infected for 24 hours. ATP6V0D2, a lysosomal vacuolar ATPase subunit, is visualized in magenta, while the nucleus is depicted in cyan. The final panel highlights PV structural grooves outlined in red solid lines, with intracellular Leishmania donovani (LD) amastigotes indicated by white arrows. Higher magnification of Figure 1D further emphasizes the increased abundance of PVs in LD-R infected cells, suggesting enhanced intracellular replication and adaptation mechanisms of drug-resistant strains. Scale bar 5 µm. The yellow and magenta solid-line boxes mark the same area of the image.
(3) The times at which they choose to evaluate their infections seem arbitrary. It is not clear why they stopped analysis of their KC infections at 24 hrs. As mentioned above, several studies have shown that this is when intracellular amastigotes start replicating. They should consider extending their analyses to 48 or 72 hrs post-infection. Also, they stop in vitro infection of Apoe-/- mice at 11 days. Why? No explanation is given for why only 1 point after infection.
Reviewer has raised two independent concerns and we would like to address them individually.
Firstly, “The times at which they choose to evaluate their infections seem arbitrary. It is not clear why they stopped analysis of their KC infections at 24 hrs. As mentioned above, several studies have shown that this is when intracellular amastigotes start replicating. They should consider extending their analyses to 48 or 72 hrs post-infection.”
We have already provided a detailed justification for the time point selection in our response to Reviewer 1's Minor Comment 1. As mentioned there, we observed a significant and sharp rise in the number of intracellular amastigotes between 4 and 24Hrs post-infection (Author response image 3), with the replication rate not increasing proportionally after that. This early stage of rapid replication of LD amastigotes therefore likely coincides with a critical period of lipid acquisition by intracellular amastigotes (Movie 2A and 2B and Figure 4E in the submitted manuscript), and thus 24hrs infected KCs were specifically selected. We would also like to add that at 72hrs post-infection, a notable number of infected Kupffer cells began detaching from the wells, with extracellular amastigotes probably egressing from the infected KCs. This phenomenon potentially reflects the severe impact of prolonged infection on Kupffer cell viability and adhesion properties, as shown in Author response image 5 and Author Response Video 1. This further influenced our decision to conclude all infection studies in Kupffer cells by 48Hrs post-infection, which required ending the infection period at 24 Hrs to allow Amp-B treatment for another 24 Hrs (Figure 8 and Figure S5 in the Submitted manuscript). We acknowledge that we should have been clearer about our selection of time points and, as the Reviewer has suggested, we plan to include this information in the revised manuscript.
Author response image 5.
Representative images of Kupffer cells infected with Leishmania donovani at 72Hrs post-infection, showing significant morphological changes. Infected cells exhibit a rounded morphology and progressive detachment. Scale bar 10µm.
Secondly “Also, they stop in vitro infection of Apoe-/- mice at 11 days. Why? No explanation is given for why only 1 point after infection.”
We apologize for not explaining the selection of the 11-day time point for the Apoe-/- experiments (Figure 2 of the Submitted manuscript). Our rationale is based on both previous literature and the specific objectives of our study. A previous report suggests that Leishmania donovani infection in Apoe-/- mice triggers a heightened inflammatory response at approximately six weeks post-infection compared to C57BL/6 mice, leading to more efficient parasite clearance. This is owing to the unique membrane composition of Apoe-/-, which rectifies Leishmania-mediated defective antigen presentation at a later stage of infection (DOI 10.1194/jlr.M026914). Additionally, previous studies (doi: 10.1128/AAC.47.5.1529-1535.2003) have indicated that Leishmania donovani infection is well established in vivo within 6 to 11 days post-infection in murine models. Given that in this experiment we specifically aimed to assess the early infection status (parasite load) in diet-induced hypercholesterolemic mice, we would argue that the selection of the 11-day time point was intentional and aligned with our study objectives: time points within this window are optimal for capturing the initial parasite burden, which depends on initial lipid utilization, before host-driven immune clearance mechanisms can significantly alter infection dynamics. We will include this explanation in the Revised manuscript as suggested by the Reviewer.
Reviewer #2 (Public review):
Summary:
This study by Pradhan et al. offers critical insights into the mechanisms by which antimony-resistant Leishmania donovani (LD-R) parasites alter host cell lipid metabolism to facilitate their own growth and, in the process, acquire resistance to amphotericin B therapy. The authors illustrate that LD-R parasites enhance LDL uptake via fluid-phase endocytosis, resulting in the accumulation of neutral lipids in the form of lipid droplets that surround the intracellular amastigotes within the parasitophorous vacuoles (PV) that support their development and contribute to amphotericin B treatment resistance. The evidence provided by the authors supporting the main conclusions is compelling, presenting rigorous controls and multiple complementary approaches. The work represents an important advance in understanding how intracellular parasites can modify host metabolism to support their survival and escape drug treatment.
We sincerely thank the reviewer for appreciating our work and for finding the evidence compelling in addressing the emergence of drug resistance in infections with intracellular protozoan pathogens. Before we submit a full revision of the paper, we would like to provide a preliminary response addressing the reviewer's concerns.
Strengths:
(1) The study utilizes clinical isolates of antimony-resistant L. donovani and provides interesting mechanistic information regarding the increased LD-R isolate virulence and emerging amphotericin B resistance.
(2) The authors have used a comprehensive experimental approach to provide a link between antimony-resistant isolates, lipid metabolism, parasite virulence, and amphotericin B resistance. They have combined the following approaches:
(a) In vivo infection models involving BL/6 and Apoe-/- mice.
(b) Ex-vivo infection models using primary Kupffer cells (KC) and peritoneal exudate macrophages (PEC) as physiologically relevant host cells.
(c) Various complementary techniques to ascertain lipid metabolism including GC-MS, Raman spectroscopy, microscopy.
(d) Applications of genetic and pharmacological tools to show the uptake and utilization of host lipids by the infected macrophage resident L. donovani amastigotes.
(3) The outcome of this study has clear clinical significance. Additionally, the authors have supported their work by including patient data showing a clear clinical significance and correlation between serum lipid profiles and treatment outcomes.
(4) The present study effectively connects the basic cellular biology of host-pathogen interactions with clinical observations of drug resistance.
(5) Major findings in the study are well-supported by the data:
(a) Intracellular LD-R parasites induce fluid-phase endocytosis of LDL independent of LDL receptor (LDLr).
(b) Enhanced fusion of LDL-containing vesicles with parasitophorous vacuoles (PV) containing LD-R parasites both within infected KCs and PECs cells.
(c) Intracellular cholesterol transporter NPC1-mediated cholesterol efflux from parasitophorous vacuoles is suppressed by the LD-R parasites within infected cells.
(d) Selective exclusion of inflammatory ox-LDL through MSR1 downregulation.
(e) Accumulation of neutral lipid droplets contributing to amphotericin B resistance.
Weaknesses:
The weaknesses are minor:
(1) The authors do not show how they ascertain that they have a purified fraction of the PV post-density gradient centrifugation.
(2) The study could have benefited from a more detailed analysis of how lipid droplets physically interfere with amphotericin B access to parasites.
We address both of these concerns in detail in our preliminary response in the “Recommendations for the authors” section below, before we submit a complete Revised manuscript.
Impact and significance:
This work makes several fundamental advances:
(1) The authors were able to show the link between antimony resistance and enhanced parasite proliferation.
(2) They were also able to reveal how parasites can modify host cell metabolism to support their growth while avoiding inflammation.
(3) They were able to show a certain mechanistic basis for emerging amphotericin B resistance.
(4) They suggest therapeutic strategies combining lipid droplet inhibitors with current drugs.
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
(1) Experimental suggestions:
a) The authors could have provided a more detailed analysis of lipid droplet composition. This is a critically missing piece in this nice study.
We completely agree with the reviewer: a more detailed analysis of lipid droplet composition, the dynamics of droplet formation, and the mechanism of lipid transfer to amastigotes residing within the PV is worthy of further investigation. We are already conducting investigations in this direction and have very promising initial results, which we are willing to share with the reviewer as unpublished data if requested. Since we plan to address these questions independently, we hope the reviewer will understand our hesitation to include these data in the present work, which is already immensely data dense. We sincerely believe that the existence of lipid droplet contact sites with the PV, along with the specific lipid types transferred to amastigotes and the mechanism of transfer, requires special attention and could stand as an independent work.
b) The macrophages (PEC, KC) could have been treated with latex beads as a control, which would indicate that cholesterol and lipids are indeed utilized by the Leishmania parasitophorous vacuole (PV) and essential for its survival and proliferation.
We thank the reviewer for this nice suggestion, which we believe will further strengthen the conclusion of this work. This has also been suggested by Reviewer 1 and we are planning to conduct this experiment and will include this data in the revised version of this manuscript.
c) HMGCoA reductase is an important enzyme for the mevalonate pathway and cholesterol synthesis. The authors have not commented on this enzyme in either host or parasite. Additionally, western blots of these enzymes along with SREBP2 could have been performed.
We appreciate the concern and see why the reviewer suggests this. Regarding HMGCoA reductase, we already have real time qPCR data that align with our RNAseq data (Figure 6 Ai in the Submitted manuscript), showing significant downregulation specifically in LD-R infected KCs compared to uninfected controls. We are including these data as Author response image 6. We did not proceed to check the level of HMGCoA reductase at the protein level, because several previous reports have suggested that it remains under the transcriptional control of SREBP2 (doi.org/10.1016/j.cmet.2011.03.005, doi: 10.1194/jlr.C066712, doi:10.1194/jlr.RA119000201), which acts as the master regulator of the mevalonate pathway and cholesterol synthesis (doi.org/10.1161/ATVBAHA.122.317320). However, as suggested by the Reviewer, we will perform this experiment and will update the Revised manuscript with the expression data on HMGCoA reductase, probably in the Supplementary section.
Author response image 6.
qPCR Analysis of HMGCR Expression Following Leishmania donovani Infection: Quantitative PCR analysis showing the relative expression of hmgcr (3-hydroxy-3-methylglutaryl-CoA reductase) in Kupffer cells after 24 hours of Leishmania donovani (LD) infection compared to uninfected control cells. Gene expression levels are normalized to β-actin as an internal control, and fold change is represented relative to the uninfected condition.
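The legend above describes expression normalized to β-actin and reported as fold change relative to the uninfected condition. One common way to compute such a value is the 2^-ΔΔCt method, sketched below. Whether this exact method was used is our assumption, and the Ct values in the example are invented purely for illustration.

```python
def fold_change_ddct(ct_target_infected, ct_ref_infected,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    target = gene of interest (e.g. hmgcr), ref = reference gene
    (e.g. beta-actin). Assumes roughly 100% amplification efficiency
    for both primer pairs.
    """
    d_ct_infected = ct_target_infected - ct_ref_infected
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_infected - d_ct_control
    return 2 ** -dd_ct


# Hypothetical Ct values, not data from this study
fc = fold_change_ddct(ct_target_infected=26.0, ct_ref_infected=18.0,
                      ct_target_control=25.0, ct_ref_control=18.0)
print(f"Fold change relative to uninfected control: {fc:.2f}")
```

A fold change below 1 indicates downregulation relative to the uninfected control; in this invented example the target amplifies one cycle later in infected cells, which corresponds to a halving of expression.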
d) The authors should discuss the expression pattern of any enzyme of the mevalonate pathway that they have found to be dysregulated in the transcript data.
As per the reviewer’s suggestion, we have looked into the RNA seq data and observed that, apart from hmgcr, hmgcs (3-hydroxy-3-methylglutaryl-CoA synthase), another key enzyme in the mevalonate pathway, is significantly downregulated in host PECs in response to LD-R infection compared to LD-S infection. We will update this in the Discussion section of the Revised manuscript, which will read: “Further analysis of RNA sequencing data revealed a significant downregulation of hmgcs (3-hydroxy-3-methylglutaryl-CoA synthase) in LD-R infected PECs as compared to LD-S infection. HMGCS catalyzes the condensation of acetyl-CoA with acetoacetyl-CoA to form 3-hydroxy-3-methylglutaryl-CoA (HMG-CoA), which serves as an intermediate in both cholesterol biosynthesis and ketogenesis. The downregulation of hmgcs further supports our observation that LD-R-infected PECs preferentially rely on endocytosed low-density lipoprotein (LDL)-derived cholesterol rather than de novo synthesized cholesterol for their metabolic needs.”
e) The authors have followed a previously published protocol by Real F (reference 73) to enrich for parasitophorous vacuole (PV). However, they do not show how they ascertain that they have a purified fraction of the PV post-density gradient centrifugation. The authors should at least show Western blot data for LAMP1 for different fractions of density gradient from which they enriched the PV.
As we previously stated in our response to Reviewer 1, the Revised manuscript will include a detailed analysis of purity for different fractions during PV isolation. We sincerely appreciate the reviewer for highlighting this important concern and for suggesting an approach to conduct the experiment. We believe this experiment is crucial and will further reinforce the conclusions of our study.
(2) Presentation improvements:
a) Add a clear timeline for infection experiments.
Certainly. We will include a timeline schematic in revised Figures 2 and 7.
b) Provide more details on patient sample collection and analysis.
We plan to include more details on the sample collection in the Method section of the Revised manuscript as follows, “Blood samples were collected from a total of 22 individuals spanning a diverse age range (8 to 70 years) by RMRI, Bihar, India. Among these, nine samples were obtained from healthy individuals residing in endemic regions to serve as controls. Serum was isolated from each blood sample through centrifugation, and the lipid profile was subsequently analysed using a specialized diagnostic kit (Coral Clinical System) following the manufacturer's protocol.”
c) Consider reorganizing figures to better separate mechanistic and clinical findings.
We thank the reviewer for this suggestion. However, we feel that the arrangement of the figures in the Original Submission supports a smooth flow of the story, and we would prefer not to disturb it. That said, if the reviewer has a specific suggestion regarding the rearrangement of any particular figure, we will be happy to consider it.
(3) Technical clarifications needed:
a) Specify exact concentrations used for inhibitors.
We apologise for this oversight. We will clearly state the concentrations of all inhibitors used in this study in the Results section and in the revised Figure legends. The revised section will read: “Finally, we infected the KCs with GFP expressing LD-R for 4Hrs, washed and allowed the infection to proceed in presence of fluorescent red-LDL and Latrunculin-A (5µM), a compound which specifically inhibits fluid phase endocytosis by inducing actin depolymerization [41]. Real-time fluorescence tracking demonstrated that Latrunculin-A treatment not only prevented the uptake of fluorescent red-LDL but also severely impacted intracellular proliferation of LD-R amastigotes (Movie 2A and 2B and Figure 4E). In contrast, treatment with Cytochalasin-D (2.5µg/ml), which alters cellular F-actin organization but does not affect fluid phase endocytosis”
b) Include more details on image analysis methods.
Please note that in specific sections of the Submitted manuscript (lines 574-579, 653-658 and 1047-1049), we paid special attention to describing the image analysis process. However, we agree that in some cases readers will appreciate more detail. Hence, we will include an additional Image Analysis section in the Methods of the revised manuscript. This section will read as, “Image processing and analysis were conducted using Fiji (ImageJ). For optimal visualization, Giemsa-stained macrophages (MΦs) were represented in grayscale to enhance contrast and structural clarity. To improve the distinction of different fluorescent signals, pseudo-colors were assigned to fluorescence images, ensuring better differentiation between various cellular components. For colocalization analysis (Figures 3, 5, 6, and S2), we utilized the RGB profile plot plugin in ImageJ, which allows for the precise assessment of signal overlap by generating fluorescence intensity profiles across selected regions of interest. This approach provided quantitative insights into the spatial relationship between labeled molecules within infected cells. Additionally, for analyzing the distribution of cofilin in Figure 4, the ImageJ surface plot plugin was employed. This tool enabled three-dimensional visualization of fluorescence intensity variations, facilitating a more detailed examination of cofilin localization and its potential reorganization in response to infection.”
c) Clarify statistical analysis procedures.
Response: We have already provided a dedicated Statistical Analysis section in the Methods and have shown the groups being compared in the Figures and Figure Legends of the Submitted manuscript. Furthermore, we plan to add clarification regarding the statistical analysis performed in the Revised manuscript. For example, in the Revised manuscript this section will read as, “All statistical analyses were performed using GraphPad Prism 8 on raw datasets to ensure robust and reproducible results. For datasets involving comparisons across multiple conditions, one-way or two-way analysis of variance (ANOVA) was conducted, followed by Tukey’s post hoc test to assess pairwise differences while controlling for multiple comparisons. A 95% confidence interval (CI) was applied to determine the statistical reliability of the observed differences. For non-parametric comparisons across multiple groups, Wilcoxon rank-sum tests were employed, maintaining a 95% confidence interval, which is particularly useful for analysing skewed data distributions. In cases where only two groups were compared, Student’s t-test was used to determine statistical significance, ensuring an accurate assessment of mean differences. All quantitative data are represented as mean ± standard error of the mean (SEM) to illustrate variability within experimental replicates. Statistical significance was determined at P ≤ 0.05. Notation for significance levels: *P ≤ 0.05; **P ≤ 0.001; ***P ≤ 0.0001.”
(4) Minor corrections:
a) Methods section could benefit from more details on Raman spectroscopy analysis.
We agree with the Reviewer's suggestion. For clarity, we will incorporate additional details in the Raman methodology of the Revised manuscript. The updated section will read as follows: “For confocal Raman spectroscopy, spectral data were acquired from individual cells at 1000× magnification using a 100 × 100 μm scanning area, following previously established specifications. After spectral acquisition, distinct Raman shifts corresponding to specific biomolecular signatures were extracted for further analysis. These included: Cholesterol (535–545 cm⁻¹), Nuclear components (780–790 cm⁻¹), Lipid structures (1262–1272 cm⁻¹), Fatty acids (1436–1446 cm⁻¹). Following spectral extraction, pseudo-color mapping was applied to highlight the spatial distribution of each biomolecular component within the cell. These processed spectral images are presented in Figure 3D1, where the first four panels illustrate the individual biomolecular distributions. A merged composite image was then generated to visualize the co-localization of these biomolecules within the cellular microenvironment, with the final panel specifically representing the spatial distribution of key biomolecules.”
b) In the methods section line 609, page 14, the authors cite Real F protocol as reference 73 for PV enrichment. However, in the very next section on GC-MS analysis (lines 615-616, page 15), they state they have used reference 74 for PV enrichment. Can they explain this discrepancy in the PV isolation references? Reference 74 does not mention anything related to PV isolation.
We sincerely apologise for this confusion, which probably arose from the way we wrote this section. We confirm that our PV isolation protocol is based on the published Real F protocol (reference 73). The next section of the submitted manuscript describes the GC-MS analysis, which was performed following the protocol in reference 74. In the Revised manuscript, we will avoid this confusion by placing the references correctly. The revised section will read as,
“GC-MS analysis of LD-S and LD-R-PV
Following a 24 h infection period, KCs were harvested, washed with phosphate-buffered saline (PBS), and pelleted. Subsequent to this, PV isolation was carried out using the previously described method [73]. The resulting parasitophorous vacuole (PV) pellet was processed for sterol isolation for GC-MS analysis following a previously established protocol [74], with slight modification. Briefly, the PV pellet was resuspended in 20 ml of dichloromethane:methanol (2:1, vol/vol) and incubated at 4°C for 24 hours. After centrifugation (11,000 g, 1 hour, 4°C), the supernatant was checked through thin layer chromatography (TLC) and subsequently evaporated under vacuum. The residue and pellet were saponified with 30% potassium hydroxide (KOH) in methanol at 80°C for 2 hours. Sterols were extracted with n-hexane, evaporated, and dissolved in dichloromethane. A portion of the clear yellow sterol solution was treated with N,O-bis(trimethylsilyl)trifluoroacetamide (BSTFA) and heated at 80°C for 1 hour to form trimethylsilyl (TMS) ethers. Gas chromatography/mass spectrometry (GC/MS) analysis was performed using a Varian model 3400 chromatograph equipped with DB5 columns (methyl-phenylsiloxane ratio, 95/5; dimensions, 30 m by 0.25 mm). Helium was used as the gas carrier (1 ml/min). The column temperature was maintained at 270°C, with the injector and detector set at 300°C. A linear gradient from 150 to 180°C at 10°C/min was used for methyl esters, with MS conditions set at 280°C, 70 eV, and 2.2 kV.”
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
This work by Ding et al uses agent-based simulations to explore the role of the structure of molecular motor myosin filaments in force generation in cytoskeletal structures. The focus of the study is on disordered actin bundles which can occur in the cell cytoskeleton and have also been investigated with in vitro purified protein experiments.
Strengths:
The key finding is that cooperative effects between multiple myosin filaments can enhance both total force and the efficiency of force generation (force per myosin). These trends were possible to obtain only because the detailed structure of the motor filaments with multiple heads is represented in the model.
We appreciate your comments about the strength of our study.
Weaknesses:
It is not clearly described what scientific/biological questions about cellular force production the work answers. There should be more discussion of how their simulation results compare with existing experiments or can be tested in future experiments.
Thank you for the comment. First, our study explains why non-muscle myosin II in stress fibers shows focal distributions rather than uniform distributions; if the filaments stay close together, they can generate much larger forces in the stress fibers via the cooperative overlap. Our study also predicts a difference between bipolar structures (found in skeletal muscle myosins and non-muscle myosins) and side polar structures (found in smooth muscle myosins) in terms of the likelihood of the cooperative overlap. As shown below, myosin filaments with the bipolar structure can add up their forces better than those with the side polar structure when their overlap level is the same. We will add discussion of these points in the revised manuscript.
Author response image 1.
As the reviewer noticed, our results were briefly compared with prior observations in Ref. 4 (Thoresen et al., Biophys J, 2013) where different myosin isoforms were used for in vitro actin bundles. We will add more quantitative comparisons between the in vitro study and our results.
In addition, at the end of the conclusion section, we suggested future experiments that can be used for verifying our results. In particular, experiments with synthetic myosin filaments with tunable geometry seem to be suitable for verifying our computational predictions and observations.
The model assumptions and scientific context need to be described better.
We apologize for the insufficient description of the model. We will revise those parts to better explain the model assumptions and scientific context.
The network contractility seems to be a mere appendix to the bundle contractility which is presented in much more detail.
We included some cases run with the two-dimensional network in this study to demonstrate the generality of our conclusions. We included only minimal preliminary results because we are currently working on a follow-up study with network structures. We hope that the reviewer will understand our intention and situation.
Reviewer #2 (Public review):
Summary:
In this study, the authors use a mechanical model to investigate how the geometry and deformations of myosin II filaments influence their force generation. They introduce a force generation efficiency that is defined as the ratio of the total generated force and the maximal force that the motors can generate. By changing the architecture of the myosin II filaments, they study the force generation efficiency in different systems: two filaments, a disorganized bundle, and a 2D network. In the simple two-filament systems, they found that in the presence of actin cross-linking proteins motors cannot add up their force because of steric hindrances. In the disorganized bundle, the authors identified a critical overlap of motors for cooperative force generation. This overlap is also influenced by the arrangement of the motor on the filaments and influenced by the length of the bare zone between the motor heads.
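The efficiency described in this summary can be written down in a few lines. The sketch below is only illustrative: it assumes the maximal force is the sum of per-motor maximal (stall) forces, and the function name, parameter names, and numbers are hypothetical rather than taken from the manuscript.

```python
def force_generation_efficiency(total_force, motor_stall_forces):
    """Ratio of the force actually generated to the summed maximal motor forces.

    Assumption: the "maximal force that the motors can generate" is the sum
    of the per-motor stall forces.
    """
    max_force = sum(motor_stall_forces)
    if max_force <= 0:
        raise ValueError("maximal force must be positive")
    return total_force / max_force


# Hypothetical numbers: 8 motors, each able to produce at most 5.0 pN,
# together generating 30.0 pN.
stall_forces = [5.0] * 8
efficiency = force_generation_efficiency(30.0, stall_forces)
print(efficiency)  # 0.75
```

An efficiency of 1 would mean that every motor contributes its full force; values below 1 indicate that forces do not add up completely.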
Strengths:
The strength of the study is the identification of organizational principles in myosin II filaments that influence force generation. It provides a complementary mechanistic perspective on the operation of these motor filaments. The force generation efficiency and the cooperative overlap number are quantitative ways to characterize the force generation of molecular motors in clusters and between filaments. These quantities and their conceptual implications are most likely also applicable in other systems.
Thank you for the comments about the strength of our study.
Weaknesses:
The detailed model that the authors present relies on over 20 numerical parameters that are listed in the supplement. Because of this vast amount of parameters, it is not clear how general the findings are. On the other hand, it was not obvious how specific the model is to myosin II, meaning how well it can describe experimental findings or make measurable predictions. The model seems to be quantitative, but the interpretation and connection to real experiments are rather qualitative in my point of view.
As the reviewer mentioned, all agent-based computational models for simulating the actin cytoskeleton inevitably involve a large number of parameters. Some of the parameter values are not well known, so we have tuned our parameter values carefully by comparing our results with experimental observations in our previous studies since 2009.
We were aware of the importance of a rigorous representation of the unbinding and walking rates of myosin motors, so we implemented the parallel cluster model, which can predict those rates from the mechanochemical rates of myosin II, in our model. Thus, we are confident that our motors represent myosin II.
In our manuscript, our results were compared with prior observations in Ref. 4 (Thoresen et al., Biophys J, 2013) several times. In particular, larger force generation with more myosin heads per thick filament was consistent between the experiment and our simulations.
Our study can make various predictions. First, our study explains why non-muscle myosin II in stress fibers shows focal distributions rather than uniform distributions; if the filaments stay close together, they can generate much larger forces in the stress fibers via the cooperative overlap. Our study also predicts a difference between bipolar structures (found in skeletal muscle myosins and non-muscle myosins) and side polar structures (found in smooth muscle myosins) in terms of the likelihood of the cooperative overlap. As shown in Author response image 1, myosin filaments with the bipolar structure can add up their forces better than those with the side polar structure when their overlap level is the same. We will add more discussion of these points in the revised manuscript.
It was often difficult for me to follow what parameters were changed and what parameters were set to what numerical values when inspecting the curve shown in the figures. The manuscript could be more specific by explicitly giving numbers. For example, in the caption for Figure 6, instead of saying "is varied by changing the number of motor arms, the bare zone length, the spacing between motor arms", the authors could be more specific and give the ranges: "is varied by changing the number of motor arms from ... to ..., the bare zone length from ... to ..., and the spacing between motor arms from ... to ...".
This unspecificity is also reflected in the text: "We ran simulations with a variation in either L<sub>sp</sub> or L<sub>bz</sub>" What is the range of this variation? "When L<sub>M</sub> was similar" similar to what? "despite different N<sub>M</sub>." What are the different values for N<sub>M</sub>? These are only a few examples that show that the text could be way more specific and quantitative instead of qualitative descriptions.
We appreciate the comment. We will specify the range of the variation in each parameter in the revised manuscript.
In the text, after equation (2) the authors discuss assumptions about the binding of the motor to the actin filament. I think these model-related assumptions and explanations should be discussed not in the results section but rather in the "model overview" section.
Thank you for pointing this out. We will reorganize the text in the revised manuscript.
The lines with different colors in Figure 2A are not explained. What systems and parameters do they represent?
The different colors in Fig. 2A distinguish the 20 cases. We will add an explanation of the colors to the figure caption in the revised manuscript.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Reviewer #2 (Recommendations for the authors):
While the authors have responded to most of the comments, a number of issues remain, most of which pertain to imprecise writing, as previously mentioned.
In the second revision of our manuscript, we tried our best to make our writing more precise.
For example, at high concentrations of PRG-GEF, the authors repeatedly state that RhoA is inhibited (including in the summary). While this may be functionally valid, it is imprecise. RhoA is activated (not inhibited), but its ability to promote contractility is impaired, presumably as a consequence of sequestration of the active GTPase by the PH domain of PRG-GEF. To put a finer point on this, the activity of RhoA•GTP is to bind to proteins that selectively bind active RhoA. One such protein is the PH domain of PRG. In the case where PRG is overexpressed, RhoA•GTP binds to PRG. Due to the high concentrations of PRG in some cells, this outcompetes the ability of RhoA•GTP to bind other effectors such as formins or ROCK. However, there is no strong evidence that RhoA is inhibited. The only hint of such evidence is a reduction in the biosensor for active RhoA, but this too is likely outcompeted by the overexpressed active GEF. There does not appear to be any disagreement about the mechanism, but rather a semantic difference.
We thank Reviewer #2 for emphasizing this semantic concern, which indeed requires clarification. We agree that RhoA is not chemically inactivated; rather, the protein remains active but is functionally sequestered. Our use of the term “inhibition” was intended to describe functional inhibition, consistent with the definition of inhibition as the act of reducing, preventing, or blocking a process, activity, or function. However, we recognize that this terminology could be interpreted as imprecise. To address this, we have clarified the text by explicitly referring to "functional inhibition of RhoA signaling" where appropriate, or by rewording to terms such as "competitive inhibition of RhoA effector binding" to more accurately reflect the mechanism.
Overall, the manuscript is written in a conversational style, not with the precision expected of a scientific manuscript.
We acknowledge Reviewer #2’s comment regarding the style of our manuscript. While our manuscript adopts a somewhat conversational tone, this was a deliberate choice. We believe this style helps engage the reader and facilitates understanding of our reasoning, guided by the philosophy that science is conducted by humans and should be communicated in a way that resonates with them. That said, we fully agree that this approach should not compromise scientific precision. In response to this feedback, we have revised the manuscript to ensure greater clarity and precision while maintaining the approachable style we have chosen.
To exemplify this, I provide an alternative phrasing of one such paragraph.
Lines 51-62:
Here, contrarily to previous optogenetic approaches, we report a serendipitous discovery where the optogenetic recruitment at the plasma membrane of GEFs of RhoA triggers both protrusion and retraction in the same cell type, polarizing the cell in opposite directions. In particular, one GEF of RhoA, PDZ-RhoGEF (PRG), also known as ARHGEF11, was most efficient in eliciting both phenotypes. We show that the outcome of the optogenetic perturbation can be predicted by the basal GEF concentration prior to activation. At high concentration, we demonstrate that Cdc42 is activated together with an inhibition of RhoA by the GEF leading to a cell protrusion. Thanks to the prediction of a minimal mathematical model, we can induce both protrusion and retraction in the same cell by modulating the frequency of light pulses. Our ability to control both phenotypes with a single protein on timescales of second provides a clear and causal demonstration of the multiplexing capacity of signaling circuits.
Here, we report that the phenotypic consequences of plasma membrane recruitment of a guanine nucleotide exchange factor (GEF), PDZ-RhoGEF (PRG, aka ARHGEF11), depend on the level of expression and degree of recruitment of the GEF. At low concentrations, recruitment of PRG induces cell retraction, consistent with the expected function of a GEF for RhoA. However, at high concentrations, Cdc42 is activated, leading to cell protrusion. A minimal mathematical model predicts, and experimental observations confirm, that the extent of recruitment determines the consequences of GEF recruitment. The ability of a single GEF to induce disparate outcomes demonstrates the multiplexing capacity of signaling circuits.
We thank Reviewer #2 for providing an alternative phrasing for lines 51–62. We appreciate the effort to enhance clarity and precision in this key section of the manuscript. While we agree with many aspects of the suggested revision and have incorporated several elements to improve the text, we have also retained aspects of our original phrasing that align with the overall tone and structure of the manuscript. Specifically, we have ensured that the balance between precision and accessibility is maintained while integrating the reviewer's suggestions. We hope that the revised text now addresses the concerns raised.
Key points to correct throughout the manuscript are:
- overexpression of PRG does not "inhibit" RhoA.
- retraction and protrusion are distinct phenotypes, they are not opposite phenotypes. One results from RhoA activation, the other results from Cdc42 activation.
Regarding the term “inhibition,” we agree with the reviewer’s point and have addressed this in our earlier comment.
Regarding the terminology of "opposite phenotypes," we believe this description is valid. While protrusion and retraction arise from distinct signaling pathways (Cdc42 activation and RhoA activation, respectively), we describe them as opposite phenotypes because they represent mutually exclusive cellular behaviors. A cell cannot protrude and retract at the same location simultaneously; instead, these behaviors represent opposing ends of the dynamic spectrum of cell morphology.
Here are some other places where editing would improve the manuscript (a noncomprehensive list).
We went through the whole manuscript to improve its scientific precision in line with Reviewer #2's comment on the term “inhibition”.
line 15 "inhibition of RhoA by the PH domain of the GEF at high concentrations."
We modified the wording: “sequestration of active RhoA by the GEF PH domain at high concentrations”
line 51 "Here, contrarily to previous optogenetic approaches"
We removed “contrarily to previous optogenetic approaches”.
line 141 "We next wonder what could differ in the activated cells that lead to the two opposite phenotypes." (the state of mind of the authors is not relevant)
As explained earlier, we made the choice to keep our writing style.
line 185 "Very surprised by this ability of one protein to trigger opposite phenotypes"
As explained earlier, we made the choice to keep our writing style.
lines 206 ff "As our optogenetic tool prevented us from using FRET biosensors because of spectral overlap, we turned to a relocation biosensor that binds RhoA in its GTP form. This highly sensitive biosensor is based on the multimeric TdTomato, whose spectrum overlaps with the RFPt fluorescent protein used for quantifying optoPRG recruitment. We thus designed a new optoPRG with iRFP, which could trigger both phenotypes *but was harder to transiently express* (?? what does this have to do with the spectral overlap), giving rise to a majority of retracting phenotype. *Looking at the RhoA biosensor*, we saw very different responses for both phenotypes (Figure 3G-I). "
We have clarified.
lines 231ff "RhoA activity shows a very different behavior: it first decays, and then rises. It seems that, adding to the well-known activation of RhoA, PRG DH-PH can also negatively regulate RhoA activity." again, RhoA activity may appear to decay, but this is a limitation of the measurements. RhoA is likely activated to the GTP-bound form. PRG is not negatively regulating RhoA activity. An activity that prevents nucleotide exchange by RhoA or accelerates its hydrolysis would constitute negative regulation of RhoA.
We modified the wording to clarify the sentence.
The attempts to quantify the degree of overexpression, though rough, should be included in the version of record. It is not clear how that estimate was generated.
The estimate of absolute concentration (switch at 200nM) was obtained by comparing fluorescent intensities of purified RFPt and cells under a spinning disk microscope while keeping the exact same acquisition settings. The whole procedure will be described in a manuscript in preparation, focused on Rac1 GEFs.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary and Strengths:
The ability of Wolbachia to be transmitted horizontally during parasitoid wasp infections is supported by phylogenetic data here and elsewhere. Experimental analyses have shown evidence of wasp-to-wasp transmission during coinfection (eg Huigins et al), host to wasp transmission (eg Heath et al), and mechanical ('dirty needle') transmission from host to host (Ahmed et al). To my knowledge this manuscript provides the first experimental evidence of wasp to host transmission. Given the strong phylogenetic pattern of host-parasitoid Wolbachia sharing, this may be of general importance in explaining the distribution of Wolbachia across arthropods. This is of interest as Wolbachia is extremely common in the natural world and influences many aspects of host biology.
Weaknesses:
The first observation of the manuscript is that the Wolbachia strains in hosts are more closely related to those in their parasitoids. This has been reported on multiple occasions before, dating back to the late 1990s. The introduction cites five such papers (the observation is made in other studies too that could be cited) but then dismisses them by stating "However, without quantitative tests, this observation could simply reflect a bias in research focus." As these studies include carefully collected datasets that were analysed appropriately, I felt this claim of novelty was rather strong. It is unclear why downloading every sequence in GenBank avoids any perceived biases, when presumably the authors are reanalysing the data in these papers.
Thank you for bringing this to our attention. In this study, we downloaded all wsp sequences from GenBank and conducted a systematic analysis. We acknowledge that there could still be a bias in research focus, but a systematic analysis, compared to a limited dataset, may reduce this bias. We agree with the reviewer's point, and we have revised this statement to make it more accurate. Now the new sentence reads: "However, there is still a lack of systematic statistical analyses to support this hypothesis." (Lines 69–70 in the revised manuscript)
I do not doubt the observation that host-parasitoid pairs tend to share related Wolbachia, as it is corroborated by other studies, the effect size is large, and the case study of whitefly is clearcut. It is also novel to do this analysis on such a large dataset. However, the statistical analysis used is incorrect as the observations are pseudo-replicated due to phylogenetic non-independence. When analysing comparative data like this it is essential to correct for the confounding effects of related species tending to be similar due to common ancestry. In this case, it is well-known that this is an issue as it is a repeated observation that related hosts are infected by related Wolbachia. However, the authors treat every pairwise combination of species (nearly a million pairs) as an independent observation. Addressing this issue is made more complex because there are both the host and symbiont trees to consider. The additional analysis in lines 123-124 (including shuffling species pairs) does not explicitly address this issue.
We agree with your point about the non-independence of data due to phylogenetic relationships. In the analysis of species traits, a conventional phylogenetic correction assumes that traits follow a Brownian motion model (Felsenstein, 1985). The variance of the trait values for a species i is given by:
$$\mathrm{Var}[Y_i] = \sigma^2 T_i,$$
where $T_i$ represents the time from the root to the tip for species $i$. Consequently, the covariance between traits of species $i$ and $j$ is:
$$\mathrm{Cov}[Y_i, Y_j] = \sigma^2 T_{ij},$$
where $T_{ij}$ is the time from the root to the most recent common ancestor (MRCA) of species $i$ and $j$. Linear model analysis incorporates the covariance matrix to correct for the effects of non-independence. Mathematically, this method is equivalent to the independent contrasts approach (Felsenstein, 1985).
In our analysis, we treat the minimum interspecific wsp distance between two species as a trait for the species pair $(i, j)$. Similarly, for any two pairs of species $(i, j)$ and $(k, l)$, we postulate that the covariance between their traits is given by:
$$\mathrm{Cov}[Y_{ij}, Y_{kl}] = \sigma^2 \, (T_{ik} + T_{jl}),$$
where $T_{ik}$ denotes the time from the root to the MRCA of species $i$ and $k$, and $T_{jl}$ represents the time from the root to the MRCA of species $j$ and $l$. This covariance matrix is then incorporated into our linear model analysis to account for the effects of phylogenetic non-independence.
However, when extending trait analysis to pairs of species, the computational demands increase substantially. For instance, with a dataset of 1,377 species, forming all possible pairs yields 947,376 unique species combinations. Consequently, constructing a covariance matrix for these pairs would necessitate storing 897,521,285,376 entries, a requirement that far exceeds the memory capabilities of standard computing systems.
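The counts quoted above can be checked with a few lines of arithmetic. The sketch below reproduces the number of species pairs and covariance entries; the 8 bytes per entry (double-precision storage) is an assumption used only to illustrate the memory scale and is not stated in the response.

```python
from math import comb

n_species = 1377

# Number of unordered species pairs: C(1377, 2).
n_pairs = comb(n_species, 2)

# A full covariance matrix has one entry per (pair, pair) combination.
n_entries = n_pairs ** 2

# Assumption: each entry is stored as an 8-byte double-precision float.
bytes_needed = n_entries * 8

print(n_pairs)              # 947376
print(n_entries)            # 897521285376
print(bytes_needed / 1e12)  # about 7.18 (terabytes)
```

Under the same assumption, a covariance matrix for 1,000 sampled pairs has only 1,000,000 entries, about 8 MB.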
To address this, we randomly sampled 1,000 pairs from the total of 947,376 species pairs within the 'Others' category, thereby reducing the computational load without compromising the representativeness of our analysis. Ultimately, even after accounting for phylogenetic correction using covariance, the effect of parasitism remains highly significant (p < 0.0001).
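As a rough illustration of the covariance structure described above, the following sketch builds the pairwise covariance matrix for a small random sample of species pairs. It is not the authors' script (which is available at the figshare link given in the response); the function name, the four-species matrix of root-to-MRCA times, and the sample size are all hypothetical.

```python
import random
from itertools import combinations


def pair_covariance(pairs, shared_time, sigma2=1.0):
    """Covariance between species-pair traits.

    Entry [(i, j), (k, l)] is sigma2 * (T[i][k] + T[j][l]), where T[a][b] is
    the time from the root to the MRCA of species a and b.
    """
    return [
        [sigma2 * (shared_time[i][k] + shared_time[j][l]) for (k, l) in pairs]
        for (i, j) in pairs
    ]


# Hypothetical root-to-MRCA times for four species
# (diagonal entries are root-to-tip times).
shared_time = [
    [1.0, 0.6, 0.2, 0.2],
    [0.6, 1.0, 0.2, 0.2],
    [0.2, 0.2, 1.0, 0.7],
    [0.2, 0.2, 0.7, 1.0],
]

all_pairs = list(combinations(range(4), 2))  # 6 unordered pairs
rng = random.Random(0)                       # fixed seed for reproducibility
sampled_pairs = rng.sample(all_pairs, 3)     # stand-in for sampling 1,000 pairs
cov = pair_covariance(sampled_pairs, shared_time)
```

Sampling pairs before building the matrix keeps its size quadratic in the sample rather than in the full set of pairs, which is what makes the corrected linear model tractable.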
We have added a “Phylogenetic correction” section to Materials and Methods (Lines 392–405 in the revised manuscript). The corresponding results are described on lines 120–121 and in supplementary Note 1. The data and scripts for this analysis are available at https://doi.org/10.6084/m9.figshare.24718119.
REFERENCE
Felsenstein J, 1985. Phylogenies and the comparative method. The American Naturalist, 125(1), 1-15.
The sharing of Wolbachia between whitefly and their parasitoids is very striking, although this has been reported before (eg the authors recently published a paper entitled "Diversity and Phylogenetic Analyses Reveal Horizontal Transmission of Endosymbionts Between Whiteflies and Their Parasitoids"). In Lines 154-164 it is suggested that from the tree the direction of transfer between host and parasitoid can be inferred from the data. This is not obvious to me given the poor resolution of the tree due to low sequence divergence. There are established statistical approaches to test the direction of trait changes on a tree that could have been used (a common approach is to use the software BEAST).
We thank the reviewer for this constructive feedback on our interpretation of Wolbachia transfer between whiteflies and their parasitoids. Inspired by the reviewer's comments, we have now incorporated a trait-based approach, using the taxonomic order of the source species of the wsp gene as a discrete trait for ancestral state reconstruction on the wsp tree. The estimated ancestral trait state for one clade, which clusters wsp sequences from whiteflies and parasitoids, is Hymenoptera, suggesting that within this clade, the direction of Wolbachia transfer may have been from parasitoids to hosts. Conversely, in another clade characterized by the ancestral trait state of Hemiptera, the inferred direction of transfer appears to be from hosts to parasitoids. We have added an “Ancestral state reconstruction” section to Materials and Methods (Lines 406–412 in the revised manuscript). The corresponding results are described on lines 159–163 and 167–168. The data and script for this analysis are available at https://doi.org/10.6084/m9.figshare.24718119.
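To illustrate the idea of reconstructing a discrete ancestral state on a tree, here is a minimal sketch of the bottom-up pass of Fitch parsimony. This is a swapped-in technique chosen for brevity: the response does not say which reconstruction method was used, and the tree shape and tip names below are hypothetical.

```python
def fitch(tree, tip_states):
    """Bottom-up Fitch parsimony on a rooted binary tree of nested tuples.

    Returns (candidate_states, n_changes) for the root of `tree`.
    """
    if isinstance(tree, str):  # a tip: its state is observed
        return {tip_states[tree]}, 0
    left, right = tree
    left_states, left_changes = fitch(left, tip_states)
    right_states, right_changes = fitch(right, tip_states)
    shared = left_states & right_states
    if shared:  # children agree on at least one state: no change needed
        return shared, left_changes + right_changes
    # children disagree: keep the union and count one state change
    return left_states | right_states, left_changes + right_changes + 1


# Hypothetical wsp clade: three parasitoid sequences and one whitefly sequence.
tree = ((("wasp_A", "wasp_B"), "whitefly_1"), "wasp_C")
tip_states = {
    "wasp_A": "Hymenoptera",
    "wasp_B": "Hymenoptera",
    "wasp_C": "Hymenoptera",
    "whitefly_1": "Hemiptera",
}

root_states, n_changes = fitch(tree, tip_states)
print(root_states, n_changes)  # {'Hymenoptera'} 1
```

In this toy clade the root is reconstructed as Hymenoptera with a single change to Hemiptera, which would be read as a parasitoid-to-host transfer.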
Reviewer #2 (Public Review):
The paper by Yan et al. aims to provide evidence for horizontal transmission of the intracellular bacterial symbiont Wolbachia from parasitoid wasps to their whitefly hosts. In my opinion, the paper in its current form consists of major flaws.
Weaknesses:
The dogma in the field is that although horizontal transmission events of Wolbachia occur, in most systems they are so rare that the chances of observing them in the lab are very slim.
For the idea of bacteria moving from a parasitoid to its host, the authors have rightfully cited the paper by Hughes, et al. (2001), which presents the main arguments against the possibility of documenting such transmissions. Thus, if the authors want to provide data that contradict the large volume of evidence showing the opposite, they should present a very strong case.
In my opinion, the paper fails to provide such concrete evidence. Moreover, it seems the work presented does not meet the basic scientific standards.
We are grateful for your critical perspective on our work. Nonetheless, we are confident in the credibility of our findings regarding the horizontal transmission of Wolbachia from En. formosa to B. tabaci. Our study has documented this phenomenon through phylogenetic tree analyses, and we have further substantiated our observations with rigorous experiments in both cages and petri dishes. The horizontal transfer of Wolbachia was confirmed via PCR, with the wsp sequences in B. tabaci showing complete concordance with those in En. formosa. Additionally, we utilized FISH, vertical transmission experiments, and phenotypic assays to demonstrate that the transferred Wolbachia could be vertically transmitted and induce significant fitness cost in B. tabaci. All experiments were conducted with strict negative controls and a sufficient number of replicates to ensure reliability, thereby meeting basic scientific standards. The collective evidence we present points to a definitive case of Wolbachia transmission from the parasitoid En. formosa to the whitefly B. tabaci.
My main reservations are:
- I think the distribution pattern of bacteria stained by the probes in the FISH pictures presented in Figure 4 looks very much like Portiera, the primary symbiont found in the bacteriome of all whitefly species. In order to make a strong case, the authors need to include Portiera probes along with the Wolbachia ones.
We thank you for your critical evaluation of the specificity of FISH in our study. We are confident in the reliability of our FISH results for several reasons.
(1) We implemented rigorous negative controls which exhibited no detectable signal, thereby affirming the specificity of our hybridization. (2) The central region of the whitefly nymphs is a typical oviposition site for En. formosa. Post-parasitism, we observed FISH signals around the introduced parasitoid eggs, distinct from bacteriocyte cells which are rich in endosymbionts including Portiera (Fig 3e-f). This observation supports the high specificity of our FISH method. (3) In the G3 whiteflies, we detected the presence of Wolbachia in bacteriocytes in nymphs and at the posterior end of eggs in adult females (Fig. 4). This distribution pattern aligns with previously reported localizations of Wolbachia in B. tabaci (Shi et al., 2016; Skaljac et al., 2013). Furthermore, the distribution of Wolbachia in the whiteflies does indeed exhibit some overlap with that of Portiera (Skaljac et al., 2013; Bing et al., 2014). (4) The probes used in our FISH assays have been widely cited (Heddi et al., 1999) and validated in studies on B. tabaci and other systems (Guo et al., 2018; Hegde et al., 2024; Krafsur et al., 2020; Rasgon et al., 2006; Uribe-Alvarez et al., 2019; Zhao et al., 2013).
Taking all these points into consideration, we stand by the reliability of our FISH results.
REFERENCES
Bing XL, Xia WQ, Gui JD, et al., 2014. Diversity and evolution of the Wolbachia endosymbionts of Bemisia (Hemiptera: Aleyrodidae) whiteflies. Ecol Evol, 4(13):2714-37.
Guo Y, Hoffmann AA, Xu XQ, et al., 2018. Wolbachia-induced apoptosis associated with increased fecundity in Laodelphax striatellus (Hemiptera: Delphacidae). Insect Mol Biol, 27:796-807.
Heddi A, Grenier AM, Khatchadourian C, Charles H, Nardon P, 1999. Four intracellular genomes direct weevil biology: nuclear, mitochondrial, principal endosymbiont, and Wolbachia. Proc Natl Acad Sci USA, 96:6814-6819.
Hegde S, Marriott AE, Pionnier N, et al., 2024. Combinations of the azaquinazoline anti-Wolbachia agent, AWZ1066S, with benzimidazole anthelmintics synergise to mediate sub-seven-day sterilising and curative efficacies in experimental models of filariasis. Front Microbiol, 15:1346068.
Krafsur AM, Ghosh A, Brelsfoard CL, 2020. Phenotypic response of Wolbachia pipientis in a cell-free medium. Microorganisms, 8.
Rasgon JL, Gamston CE, Ren X, 2006. Survival of Wolbachia pipientis in cell-free medium. Appl Environ Microbiol, 72:6934-6937.
Shi P, He Z, Li S, et al., 2016. Wolbachia has two different localization patterns in whitefly Bemisia tabaci AsiaII7 species. PLoS One, 11: e0162558.
Skaljac M, Zanić K, Hrnčić S, et al., 2013. Diversity and localization of bacterial symbionts in three whitefly species (Hemiptera: Aleyrodidae) from the east coast of the Adriatic Sea. Bull Entomol Res, 103(1):48-59.
Uribe-Alvarez C, Chiquete-Félix N, Morales-García L, et al., 2019. Wolbachia pipientis grows in Saccharomyces cerevisiae evoking early death of the host and deregulation of mitochondrial metabolism. MicrobiologyOpen, 8: e00675.
Zhao DX, Zhang XF, Chen DS, Zhang YK, Hong XY, 2013. Wolbachia-host interactions: Host mating patterns affect Wolbachia density dynamics. PLoS One, 8: e66373.
- If I understand the methods correctly, the phylogeny presented in Figure 2a is supposed to be based on a wide search for Wolbachia wsp gene done on the NCBI dataset (p. 348). However, when I checked the origin of some of the sequences used in the tree to show the similarity of Wolbachia between Bemisia tabaci and its parasitoids, I found that most of them were deposited by the authors themselves in the course of the current study (I could not find this mentioned in the text), or originated in a couple of papers that in my opinion should not have been published to begin with.
We appreciate your meticulous examination of the sources for our sequence data. All the sequences included in our phylogenetic analysis were indeed downloaded from the NCBI database as of July 2023. The sequences used to illustrate the similarity of Wolbachia between B. tabaci and its parasitoids include those from our previously published study (Qi et al., 2019), which were sequenced from field samples. Additionally, some sequences were obtained from other laboratories (Ahmed et al., 2009; Baldo et al., 2006; Van Meer et al., 1999). We acknowledge that in our prior research (Qi et al., 2019), the sequences were directly submitted to NCBI and, regrettably, we did not update the corresponding publication information after the article was published. It is not uncommon for sequences on NCBI to lack publication details: some are never followed by a published paper (e.g., FJ710487-FJ710511 and JF426137-JF426149), and others do not have their associated publication details updated post-publication (for instance, sequences MH918776-MH918794 from Qi et al., 2019, and KF017873-KF017878 from Fattah-Hosseini et al., 2018). We recognize that this practice can lead to confusion and apologize for the oversight in our work.
REFERENCES
Ahmed MZ, Shatters RG, Ren SX, Jin GH, Mandour NS, Qiu BL, 2009. Genetic distinctions among the Mediterranean and Chinese populations of Bemisia tabaci Q biotype and their endosymbiont Wolbachia populations. J Appl Entomol, 133:733-741.
Baldo L, Dunning Hotopp JC, Jolley KA, et al., 2006. Multilocus sequence typing system for the endosymbiont Wolbachia pipientis. Appl Environ Microbiol. 72(11):7098-110.
Fattah-Hosseini S, Karimi J, Allahyari H, 2018. Molecular characterization of Iranian Encarsia formosa Gahan populations with natural incidence of Wolbachia infection. J Entomol Res Soc, 20(1):85–100.
Qi LD, Sun JT, Hong XY, Li YX, 2019. Diversity and phylogenetic analyses reveal horizontal transmission of endosymbionts between whiteflies and their parasitoids. J Econ Entomol, 112(2):894-905.
Van Meer MM, Witteveldt J, Stouthamer R, 1999. Phylogeny of the arthropod endosymbiont Wolbachia based on the wsp gene. Insect Mol Biol, 8(3):399-408.
- The authors fail to discuss or even acknowledge a number of published studies that specifically show no horizontal transmission, such as the one claimed to be detected in the study presented.
Thank you for bringing this to our attention. We have revised the Discussion section (Lines 256–271 in the revised manuscript) and now discuss the published studies that report no evidence of horizontal transmission. The added sentences read: “Experimental confirmations of Wolbachia horizontal transfer remain relatively rare, with only a limited number of documented cases (24, 27, 37, 38). Additionally, some experiments have found no evidence of horizontal transmission of Wolbachia (39-42).” (Lines 260–263 in the revised manuscript)
Reviewer #3 (Public Review):
This is a very ordinary research paper. The horizontal transmission of endosymbionts, including Wolbachia, Rickettsia etc. has been reported in detail in the last 10 years, and parasitoid vectored as well as plant vectored horizontal transmission is the mainstream of research. For example, Ahmed et al. 2013 PLoS One, 2015 PLoS Pathogens, Chiel et al. 2014 Environmental Entomology, Ahmed et al. 2016 BMC Evolutionary Biology, Qi et al. 2019 JEE, Liu et al. 2023 Frontiers in Cellular and Infection Microbiology, all of these reported the parasitoid vectored horizontal transmission of endosymbiont. While Caspi-Fluger et al. 2012 Proc Roy Soc B, Chrostek et al. 2017 Frontiers in Microbiology, Li et al. 2017 ISME Journal, Li et al. 2017 FEMS, Shi et al. 2024 mBio, all of these reported the plant vectored horizontal transmission of endosymbiont. For the effects of endosymbiont on the biology of the host, Ahmed et al. 2015 PLoS Pathogens explained the effects in detail.
Thank you for the insightful comments and for highlighting the relevant literature in the field of horizontal transmission of endosymbionts, including Wolbachia and Rickettsia. After careful consideration of the studies mentioned in the comments, we believe that our work presents significant novel contributions to the field. 1) Regarding the parasitoid-mediated horizontal transmission of Wolbachia, most of the cited articles, such as Ahmed et al. 2013 in PLoS One and Ahmed et al. 2016 in BMC Evolutionary Biology, propose hypotheses but do not provide definitive evidence. The transmission of Wolbachia within the whitefly cryptic species complex (Ahmed et al. 2013) or between moths and butterflies (Ahmed et al. 2016) could be mediated by parasitoids, plants, or other unknown pathways. 2) Chiel et al. 2014 in Environmental Entomology reported “no evidence for horizontal transmission of Wolbachia between and within trophic levels” in their study system. 3) The literature you mentioned about Rickettsia, rather than Wolbachia, indirectly reflects the relative scarcity of evidence for Wolbachia horizontal transmission. For example, the evidence for plant-mediated transmission of Wolbachia remains isolated, with Li et al. 2017 in the ISME Journal being one of the few reports supporting this mode of transmission. 4) While the effects of endosymbionts on their hosts are not the central focus of our study, the effects of transgenerational Wolbachia on whiteflies are primarily demonstrated to confirm the infection of Wolbachia into whiteflies. Furthermore, the effects we report of Wolbachia on whiteflies are notably different from those reported by Ahmed et al. 2015 in PLoS Pathogens, likely due to different whitefly species and Wolbachia strains. 5) More importantly, our study reveals a mechanism of parasitoid-mediated horizontal transmission of Wolbachia that is distinct from the mechanical transmission suggested by Ahmed et al. 2015 in PLoS Pathogens.
Their study implies transmission primarily through a “dirty needle” mechanism, without Wolbachia infection of the parasitoid, suggesting host-to-host transmission at the same trophic level, where parasitoids serve as phoretic vectors. In contrast, our findings demonstrate transmission from parasitoids to hosts through unsuccessful parasitism, which represents cross-trophic level transmission. To our knowledge, this is the first experimental evidence that Wolbachia can be transmitted from parasitoids to hosts. We believe these clarifications and the novel insights provided by our research contribute valuable knowledge to the field.
REFERENCES
Ahmed MZ, De Barro PJ, Ren SX, Greeff JM, Qiu BL, 2013. Evidence for horizontal transmission of secondary endosymbionts in the Bemisia tabaci cryptic species complex. PLoS One, 8(1):e53084.
Ahmed MZ, Li SJ, Xue X, Yin XJ, Ren SX, Jiggins FM, Greeff JM, Qiu BL, 2015. The intracellular bacterium Wolbachia uses parasitoid wasps as phoretic vectors for efficient horizontal transmission. PLoS Pathog, 10(2):e1004672.
Ahmed MZ, Breinholt JW, Kawahara AY, 2016. Evidence for common horizontal transmission of Wolbachia among butterflies and moths. BMC Evol Biol, 16(1):118.
Caspi-Fluger A, Inbar M, Mozes-Daube N, Katzir N, Portnoy V, Belausov E, Hunter MS, Zchori-Fein E, 2012. Horizontal transmission of the insect symbiont Rickettsia is plant-mediated. Proc Biol Sci, 279(1734):1791-6.
Chiel E, Kelly SE, Harris AM, Gebiola M, Li X, Zchori-Fein E, Hunter MS, 2014. Characteristics, phenotype, and transmission of Wolbachia in the sweet potato whitefly, Bemisia tabaci (Hemiptera: Aleyrodidae), and its parasitoid Eretmocerus sp. nr. emiratus (Hymenoptera: Aphelinidae). Environ Entomol, 43(2):353-62.
Chrostek E, Pelz-Stelinski K, Hurst GDD, Hughes GL, 2017. Horizontal transmission of intracellular insect symbionts via plants. Front Microbiol, 8:2237.
Li SJ, Ahmed MZ, Lv N, Shi PQ, Wang XM, Huang JL, Qiu BL, 2017. Plant-mediated horizontal transmission of Wolbachia between whiteflies. ISME J, 11(4):1019-1028.
Li YH, Ahmed MZ, Li SJ, Lv N, Shi PQ, Chen XS, Qiu BL, 2017. Plant-mediated horizontal transmission of Rickettsia endosymbiont between different whitefly species. FEMS Microbiol Ecol, 93(12).
Liu Y, He ZQ, Wen Q, Peng J, Zhou YT, Mandour N, McKenzie CL, Ahmed MZ, Qiu BL, 2023. Parasitoid-mediated horizontal transmission of Rickettsia between whiteflies. Front Cell Infect Microbiol, 12:1077494.
Qi LD, Sun JT, Hong XY, Li YX, 2019. Diversity and phylogenetic analyses reveal horizontal transmission of endosymbionts between whiteflies and their parasitoids. J Econ Entomol, 112(2):894-905.
Shi PQ, Wang L, Chen XY, Wang K, Wu QJ, Turlings TCJ, Zhang PJ, Qiu BL, 2024. Rickettsia transmission from whitefly to plants benefits herbivore insects but is detrimental to fungal and viral pathogens. mBio, 15(3):e0244823.
Weaknesses:
In the current study, the authors downloaded the MLST or wsp genes from a public database and analyzed the data using other methods, and I think the authors may not be familiar with the research progress in the field of insect symbiont transmission, and the current stage of this manuscript lacking sufficient novelty.
We appreciate your critical perspective on our study. However, we respectfully disagree with the viewpoint that our manuscript lacks sufficient novelty.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
The data and scripts from the experimental section of the paper are not made publicly available. This would be good practice. It may well be a requirement for this journal too, but I have not read the journal policy on this matter.
Thank you for the kind reminder. We have uploaded the data and scripts to a public repository at https://doi.org/10.6084/m9.figshare.24718119.
• Line 16 should read 'intertrophic' not 'intertropical'.
Corrected.
• Line 50 should not say 'the most infectious' as this is an incorrect use of the word 'infectious'. Maybe 'common'? Should also add something like 'likely' here.
Corrected. The new sentence reads “Together, these characteristics make Wolbachia likely the most common microbe on Earth in terms of the number of species it infects (7, 8).” (Lines 47–49 in the revised manuscript).
• Line 54 These references are all about mosquito disease vectors, not pests. More generally, in this paragraph, the research interest in Wolbachia relates overwhelmingly to blocking arbovirus transmission and not controlling pest populations.
To enhance consistency with our statements, we have revised the supporting references as follows:
X. Zheng et al., "Combined incompatible and sterile insect techniques eliminate mosquitoes," Nature 572, 56-61 (2019).
A. A. Hoffmann et al., "Wolbachia establishment in Aedes populations to suppress dengue transmission," Nature 476, 454-457 (2011).
J. T. Gong, T. P. Li, M. K. Wang, X. Y. Hong, "Prospects of Wolbachia in agricultural pest control," Current Opinion in Insect Science 57, 101039 (2023).
J. T. Gong et al., "Stable integration of plant-virus-inhibiting Wolbachia into planthoppers for rice protection," Current Biology 30, 4837-4845.e4835 (2020).
Regarding the content of the articles:
Zheng et al. (2019) detail the successful suppression of wild mosquito populations through the release of male mosquitoes artificially infected with Wolbachia.
Gong et al. (2020) present the potential of releasing Wolbachia-infected brown planthoppers to inhibit plant viruses and control pest populations.
Gong et al. (2023) provide a comprehensive review on the application and future of Wolbachia in managing agricultural pests.
• Line 60-61. This sentence seems poorly supported by theory or data. I suggest it is deleted. Why should CI cause extinction, and why would it have a major effect on genetic diversity beyond mtDNA?
We have deleted the statements about extinction or genetic diversity. Now the sentence reads “It may also spread to nontarget organisms, potentially disrupting their population dynamics.” (Lines 57–58 in the revised manuscript)
• Line 66. Reword to make clear these routes are not an exhaustive list.
We have reworded these sentences. The new sentences now read “Similar to other symbionts, Wolbachia host shifts may occur through three main routes: parasitism, predation, and shared plant or other food sources (17). However, it is important to note that these are not the only routes through which transmission may occur, and the specific contributions of each to the overall process of host shift are not yet fully understood.” (Lines 62–66 in the revised manuscript).
• Line 77-79. This could do with mentioning studies of parasitoid-to-host transmission like Ahmed et al. given that it is common knowledge that insects commonly survive parasitoid attacks.
We have added sentences acknowledging the common occurrence of insects surviving parasitoid attacks and referenced and described the Ahmed et al. 2015 study. The added sentences read:
“However, it is common in nature for hosts to survive parasitoid attacks (27-29). For example, whiteflies can survive after attacks of Eretmocerus parasitoids (27). These parasitoids can act as phoretic vectors, facilitating the spread of Wolbachia within whitefly populations through the contamination of their mouthparts and ovipositors with Wolbachia during the probing process (27).” (Lines 77–82 in the revised manuscript).
• Line 173. Mention that there are three replicates of each cage. In Figures 2C and D, it is better to show each replicate as a separate line to see how consistent they are.
In accordance with the reviewer's suggestion, we have included a statement highlighting the replication of our experiments: “Notably, each cage setup was replicated three times to ensure experimental rigor.” (Lines 179–180 in the revised manuscript).
Regarding Figures 2C and D, we have revised the figures to display each replicate as a separate line, as suggested. However, this introduces visual clutter that may detract from the clarity of the figures. Additionally, in Figure 2C, the three black lines, all representing zero values, do not allow individual trends to be distinguished. Therefore, we recommend retaining the original figure format. In accordance with eLife's data policy, we have also provided the source data for all figures, ensuring that readers can access the detailed data, thus balancing the need for visual simplicity with the provision of comprehensive data.
Author response image 1.
• The GloBI database is central to the phylogenetic analysis and it would be helpful to have a few words in the results stating where this information comes from.
The revised sentence now reads: “To investigate potential horizontal transmission of Wolbachia, we retrieved 4685 wsp sequences from the NCBI database, and species interaction relationships were extracted from the GloBI database (for details, see Methods and Materials).” (Lines 94–96 in the revised manuscript).
Reviewer #3 (Recommendations For The Authors):
To improve the quality of this manuscript, I have some questions and suggestions.
Introduction:
Line 41-42, I don't agree with this statement, as mentioned above, the ways of insect symbiont transmission have been studied in the last 10 years.
According to the reviewer’s suggestion, we have deleted this statement.
Line 75-76, Again, the statement is not correct, many studies have clearly revealed and confirmed that Wolbachia CAN be transferred from parasitoid to their insect hosts including whitefly Bemisia tabaci.
Thank you for your insightful comments. After careful consideration of the studies you have mentioned above, none of these articles provided definitive evidence supporting the transfer of Wolbachia from parasitoids to their insect hosts. A closely related study is Ahmed et al. (2015) in PLoS Pathogens. This article demonstrates that parasitoid wasps can act as phoretic vectors mediating the transmission of Wolbachia between whiteflies. However, Wolbachia did not infect the parasitoid wasps themselves. Therefore, this study does not provide evidence for intertrophic transmission of Wolbachia from parasitoids to their hosts. To avoid confusion, we have cited the Ahmed et al. (2015) reference following this statement and described its findings accordingly. (Lines 88-92 in revised manuscript).
Results:
Line 133-134, Ahmed et al. 2016 BMC Evolutionary Biology, clearly revealed and confirmed the "common horizontal transmission of Wolbachia between butterflies and moths".
We thank you for guiding us to the relevant study. Ahmed et al. 2016 in BMC Evolutionary Biology suggested common horizontal transmission of Wolbachia between butterflies and moths and proposed that this horizontal transmission might be caused by parasitoid wasps. Here, we present the potential Wolbachia transfer between Trichogramma and their lepidopteran hosts (Lines 135–136 in revised manuscript). Taken together with the results of Ahmed et al. 2016, our result also suggests that Trichogramma wasps may be the vectors for horizontal transmission of Wolbachia among lepidopteran hosts. We have discussed this point in the discussion section and cited Ahmed et al. 2016 (Lines 239–246 in revised manuscript).
Line 176-177, as we know Wolbachia in Encarsia formosa is a strain of parthenogenesis, why did it reduce the female ratio of whitefly progeny after it was transmitted to whitefly B. tabaci, it needs a convincing explanation.
Wolbachia induces parthenogenesis in En. formosa. However, we observed that Wolbachia from En. formosa failed to induce parthenogenesis in B. tabaci, possibly due to the requirement for host gene compatibility. Additionally, we noted a reduced female ratio in B. tabaci infected with En. formosa Wolbachia. We speculate that this might result from the burden imposed by En. formosa Wolbachia on the new host, potentially reducing fertilization success rates and indirectly leading to a decrease in the female ratio. Similarly, we observed a decline in female fecundity, egg hatching rate, and immature survival rate in B. tabaci infected with En. formosa Wolbachia. The mechanisms underlying these fitness costs remain unclear and warrant further in-depth research.
Line 189-190, do the authors have convincing evidence that the 60Gy irradiation only has effects on the reproduction of En. formosa, but does not have any negative effects on the activity of Wolbachia? I think there may be.
We observed that after irradiation, the titer of Wolbachia within En. formosa significantly decreased (Fig S3). We agree that the irradiation may cause other negative effects on Wolbachia, which are worth close investigation. However, even with a significant reduction in Wolbachia titer, irradiation increased the infection rate of Wolbachia in surviving B. tabaci after wasp attacks (Fig 3C). We speculate that this may be due to irradiation of En. formosa increasing the rate of parasitic failure. While the full extent of the effects of irradiation on Wolbachia is not yet clear in our experiments, it does not alter our conclusion that Wolbachia can be transmitted from En. formosa to whitefly hosts through failed parasitism.
Discussion:
Line 289-290, I don't understand, why the authors think from parasitoid Eretmocerus to whitefly, and from Trichogramma to moth, are the same trophic level, they are indeed two different trophic levels.
Thank you for your feedback. We have conducted a thorough search but were unable to locate the specific statement you are referring to. If there has been any ambiguity in our manuscript that has led to confusion, we sincerely apologize for any misunderstanding it may have caused. We agree with your perspective and have always considered the parasitoid Eretmocerus and whitefly, as well as Trichogramma and moth, to be at different trophic levels. However, in the context of specific references, such as Ahmed et al. 2015 in PLoS Pathogens, we believe that Wolbachia is transmitted within the same trophic level without infecting the parasitoid Eretmocerus, merely serving as a phoretic vector to facilitate the spread of Wolbachia among whitefly hosts. Similarly, in the case of Huigens et al. 2000 in Nature, Wolbachia uses lepidopteran hosts as vectors to promote its transmission among Trichogramma without the need to infect the lepidopteran hosts themselves.
Materials and Methods
Line 348, what is tblastn?
We have corrected tblastn to TBLASTN. We are grateful to the reviewer for pointing this out. Here, we utilized TBLASTN instead of BLASTN to avoid missing the rapidly evolving wsp sequences, because alignment at the protein level is generally more sensitive than at the nucleotide level. TBLASTN is a bioinformatics tool within the BLAST (Basic Local Alignment Search Tool) suite used for comparing a protein query sequence against a nucleotide database. Specifically, TBLASTN aligns a given protein sequence with nucleotide sequences in a database by translating the nucleotide sequences into all possible protein sequences (considering all six reading frames) and comparing them to the query protein sequence.
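To make the translation step concrete, the following is a minimal sketch of six-frame translation using the standard genetic code. It only illustrates the idea of translating a nucleotide sequence in every reading frame; it is not how BLAST is implemented, and the function names and the short example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(dna):
    """Return translations for frames +1..+3 (forward) and -1..-3 (reverse strand)."""
    dna = dna.upper()
    reverse_complement = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse_complement[offset:])
    return frames


example = "ATGGCCTAA"  # hypothetical sequence, not a real wsp fragment
for frame, protein in six_frame_translations(example).items():
    print(frame, protein)
```

A protein query is then compared against each of the six translated frames, so a match can still be found when synonymous nucleotide substitutions have accumulated in a rapidly evolving gene.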
Line 383, how was the Wolbachia-free line of B. tabaci established, by antibiotics? If so, how do we ensure the antibiotic does not have any negative to other symbionts in whitefly B. tabaci?
The Wolbachia-free line of B. tabaci was collected from the field and was not treated with antibiotics. We have made revisions in the Materials and Methods section to clarify this, stating, "An iso-female line of B. tabaci, which is naturally Wolbachia-free and has not been treated with antibiotics, was established." (Lines 417–418 in the revised manuscript)
Line 419-421 as I mentioned before, the irradiation may have negative effects on Wolbachia too, so change the biology of both Encarsia and whitefly host.
We observed that after irradiation, the titer of Wolbachia within En. formosa significantly decreased (Fig S3). However, even with a significant reduction in Wolbachia titer, irradiation increased the infection rate of Wolbachia in surviving B. tabaci after wasp attacks (Fig 3C). We speculate that this may be due to irradiation of En. formosa increasing the rate of parasitic failure. While the full extent of the effects of irradiation on Wolbachia is not yet clear in our experiments, it does not alter our conclusion that Wolbachia can be transmitted from En. formosa to whitefly hosts through failed parasitism.
Line 452-453, From egg to eclosion, it needs about 21 days under suitable temperature and other conditions, during this period, the egg and nymphs can not move, so how to keep the cut-leaf fresh enough in a Petri dish for 21 days?
We apologize for not clearly describing the materials and methods. By wrapping the end of the leaf petiole in wet cotton, we can keep the leaves fresh for up to a month. We have included this detail in the materials and methods to enhance the reproducibility of the experiment. “A single irradiated wasp was subsequently introduced into a Petri dish, which contained a tomato leaf infested with Wolbachia-free third or fourth instar whitefly nymphs, and wet cotton was used to wrap the end of the leaf petiole to keep the leaf fresh.” (Lines 455–458 in the revised manuscript)
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
Ghone et al show that HIV-1 Vif causes a pseudo-metaphase arrest rather than a G2 arrest. The metaphase arrest correlates with misregulation of the kinetochore which could be explained by the loss of phosphatase functions that determine chromosome-microtubule interactions.
Strengths:
The single-cell imaging using different reporters of cell cycle progression is very elegant and the quantitation is convincing. The authors clearly show that what others have characterized as a G2 arrest by flow cytometry is somewhat later in metaphase and correlates with kinetochore misregulation.
We sincerely appreciate the reviewer recognizing the quality and precision of our study, particularly our use of long-term live cell imaging combined with single-cell resolution analysis.
Weaknesses:
(1) The major problem with the paper is trying to connect what is observed in tumor cell lines with actual infections in primary T cells. While all of the descriptive work in cell lines is convincing, none of these cells are relevant targets and tumor cells have different cell death and cell cycle regulation than primary T cells. Thus, while Vif might well do all of the things described in the manuscript, it is a stretch to connect any of it to what happens in vivo.
We fully agree with this point. It is indeed technically challenging to perform 48-120 hours of live-cell imaging at high magnification and short intervals using primary T cells because of their non-adherent nature. We also agree that Vif’s functions in pseudo-metaphase arrest and the consequent induction of cell death, observed in cancer cells (e.g., Cal51, HeLa, and MDA-MB-231 cell lines) or normal non-transformed epithelial cells (e.g., the RPE1 cell line), may differ in T cells. Further studies and refined approaches will be required to address this important question. We have revised the manuscript to include a discussion of this issue in the section on the limitations of this study.
(2) Line 109 and elsewhere. The ability of Vif to cause cell cycle arrest and bind PP2A subunits is not a completely conserved feature. Rather, it is quite variable in different HIV-1 strains. (e.g. https://doi.org/10.1016/j.bbrc.2020.04.123 and https://elifesciences.org/articles/53036). Therefore, it is necessary for the authors to quite clearly use strain designations in the manuscript rather than a generic "Vif", and to more clearly describe the viruses being used.
Thank you for raising this important point. We utilized the NL4-3 strain in our study and have revised the manuscript to specify this detail. While this study uncovered part of the mechanism by which Vif modulates phosphatase regulation during mitosis, further research is required to elucidate the full mechanism, particularly how this degradation induces a robust pseudo-metaphase arrest.
(3) Figure 5: This figure shows disruption of PP2A-B56 at the kinetochores. However, is this specific to the kinetochores? Since Vif has been described to more broadly degrade PP2A-B56, could this not be a result of a more general decrease in PP2A activity throughout the cell?
Thank you for highlighting this critical point. PP2A is a major serine/threonine phosphatase that regulates numerous essential cell cycle processes. To the best of our knowledge, Vif selectively targets the degradation of the B56 family of PP2A regulatory subunits, without affecting the other three B-type subunits or the catalytic core of PP2A itself. During early mitosis, all five members of the B56 family (B56α, B56β, B56γ, B56δ, and B56ε) accumulate at kinetochores and centromeres, where they play critical roles in chromosome alignment. Many PP2A-B56 substrates are also localized to kinetochores and chromosomes during mitosis. Depletion of specific B56 isoforms or introduction of phosphorylation-deficient mutants of PP2A-B56 substrates at kinetochores has been shown to result in mitotic defects, underscoring the crucial roles of PP2A-B56 in regulating kinetochore, centromere, and chromosomal functions during mitosis. Interestingly, we observed no significant cell cycle arrest during G1, S, or G2 phases in Vif-expressing cells. While PP2A-B56 likely has important roles outside of mitosis, Vif-mediated degradation of PP2A-B56 appears to selectively disrupt its mitotic functions, particularly at the kinetochore. This finding highlights a targeted mechanism by which Vif interferes with PP2A-B56-mediated regulation of mitotic processes. However, further experiments are required to elucidate the precise mechanisms underlying Vif's inhibition of the specific mitotic roles of PP2A-B56.
Reviewer #2 (Public review):
Summary
The authors characterize the cell-cycle arrest induced by HIV-1 Vif in infected cells. They show this arrest is not at G2/M as previously thought but during metaphase. They show that the metaphase plate forms normally but progression to anaphase is massively delayed, and chromosome segregation is dysregulated in a manner consistent with impaired assembly of microtubules at the kinetochore. This correlates with the lack of recruitment of B56 subunits of the PP2A phosphatase, which are known degradation targets of Vif, suggesting that this weakens and unbalances the microtubule-mediated forces on the separating chromosomes.
Strengths
The authors present a very well-performed set of quantitative live cell imaging experiments that convincingly show a difference between Vif and Vpr-mediated cell cycle arrests. Through an in-depth characterization of the Vif-mediated block in metaphase, they make a strong case for this phenotype being tied to the degradation of PP2A-B56 by Vif. Furthermore, it is important that they have performed most of these experiments with virally infected cells, meaning that their observations are observable at relevant viral expression levels of Vif.
We appreciate the reviewer’s recognition of the importance and significance of our study.
Weaknesses
Experimentally there is very little to criticize with respect to the cellular systems used. Data from 10.1016/j.bbrc.2020.04.123 has identified selective mutants that fail to degrade B56 while maintaining A3G degradation by Cul5, and it would be nice to confirm that such a mutant behaves like the delta-Vif virus when examining metaphase, but I would expect selective ablation of B56 during mitosis to mimic Vif to be very challenging and beyond the scope.
Thank you for your valuable suggestion. As also highlighted by Reviewer #1, it is true that certain variants of Vif, as discussed in 10.1016/j.bbrc.2020.04.123, differentially impact B56 degradation. Notably, some variants degrade A3G without inducing cell cycle arrest. We agree that investigating whether Vif's effects on B56 are directly linked to the mitotic arrest phenotype is an important direction for future research. Equipped with our advanced imaging tools, we are now preparing to extend our studies to include Vif variants from additional HIV-1 subtypes, including primary isolates. As you rightly pointed out, depletion of B56 is expected to be challenging as the B56 family comprises multiple isoforms, each with distinct and partially redundant roles in mitosis, particularly in microtubule assembly and spindle assembly checkpoint regulation. The functions of PP2A-B56 in mitosis are well-documented compared to the relatively new studies on Vif’s role in PP2A-B56 degradation. In human cells, the B56 family comprises 5 isoforms (B56α, B56β, B56γ, B56δ, and B56ε). While all B56 isoforms localize to kinetochores or centromeres during early mitosis, the reasons for their slightly different localization patterns (to either kinetochores or centromeres) remain unclear (Vallardi et al., eLife, 2019). Notably, these isoforms exhibit functional redundancy; thus, the depletion of any single isoform does not result in severe mitotic defects (Foley et al., Nature Cell Biology, 2011; Neumann et al., Nature, 2010). Supporting this redundancy, the overexpression of a single isoform (tested only B56α and B56γ) can rescue kinetochore function when all other isoforms are depleted (Foley et al., Nature Cell Biology, 2011; Vallardi et al., eLife, 2019). This complexity poses significant challenges to modulating the relative levels of individual B56 isoforms experimentally. 
While these specific experiments are beyond the current scope of our study, we remain committed to advancing our understanding of the mechanisms driving Vif-induced pseudo-metaphase arrest. Your suggestion aligns with our ongoing efforts, and we will consider these experiments as we further explore this fascinating area.
Where I would raise some criticism is in the relevance of these observations to the replication and pathogenesis of the virus itself, which the authors do not address or discuss. Firstly, despite clear data that both Vpr and Vif can lead to a cell cycle arrest in cycling cells, it has never been particularly clear why the virus does this. While I would agree with the authors that Vif results in the metaphase arrest through targeting B56-PP2A, this may not be the reason WHY the virus targets one of the cell's major phosphatases, but rather a knock-on effect of doing so. I appreciate that this is beyond the scope of the study, but it is something I feel should be discussed rather than the narrow mechanistic points made in the discussion. Secondly, the authors suggest that this activity of Vif is a major cause of apoptosis in infected cells and perhaps CD4+ T cell depletion in vivo. It would be good to quantify how much apoptosis is Vif-dependent in infected primary human CD4+ T cells rather than transformed tumor cells, and whether this correlates with the Vif-mediated induction of a pseudometaphase.
Thank you for highlighting this important point. We completely agree that the full scope of Vif’s bi-functional roles, in both degrading the APOBEC3 family, which is essential for HIV-1 infection, and inducing cell cycle arrest, is not yet fully understood. The connection between Vif’s role in cell cycle arrest and the HIV-1 life cycle remains unclear. One possible explanation, as discussed in our study, is that Vif-induced pseudo-metaphase arrest may contribute to cell death, suggesting that Vif could play a role in the reduction of CD4+ T cells. Alternatively, Vif’s impact on cell cycle arrest, or its disruption of phosphatase activity, could facilitate HIV-1 virus production. However, further experiments, especially using primary human CD4+ T cells with similar approaches as in this study, are essential to gain deeper insights. This discussion has been included in the Limitations section of our study.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
(1) The first paragraph of the Introduction is not necessary and anyway is quite outdated about the current state of HIV pathogenesis. Likewise, the discussion implies that HIV pathogenesis is due to virally-induced cell death, which is also outdated by more than a decade of work demonstrating that chronic immune activation is the driver of CD4 cell decline rather than direct cytotoxicity due to viral proteins.
We have revised the first paragraph of the Introduction.
(2) Line 134. I do not know what Cal51 cells are or why they are being used for an HIV study here. Some rationale for their being the cell line of choice for this study should be included.
Thank you for this suggestion. We have revised the text to clearly articulate the rationale for selecting the Cal51 cell line in this study. Briefly, this study focuses on the robust mitotic arrest induced by Vif. To capture this phenomenon, long-term live-cell imaging was required with a range of 48–120 hours, with imaging intervals of 6–12 minutes and 3–4 z-stacks per time point. These parameters presented considerable technical challenges. The Cal51 cell line was chosen as it has been genetically engineered by the CRISPR-Cas9 method to express mScarlet-tagged Histone H2B and mNeonGreen-tagged Tubulin, enabling extended live-cell imaging. Furthermore, the Cal51 cell line exhibits wild-type p53 expression and maintains a stable near-diploid karyotype, making it an ideal model for studying cell cycle progression.
(3) A description of the viruses being used is necessary. Although the authors cite a previous paper, the names in that paper do not exactly match the names used here. I presume that is the NL4.3 strain?
Thank you for raising this important point. We utilized the subtype B HIV-1 NL4-3 strain in our study and have revised the manuscript to specify this detail.
-
www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Reviews):
Summary:
This study examines to what extent pre-saccadic foveal prediction varies with the visibility of the saccade target. Visibility is defined as the contrast level of the target with respect to the noise background, and it is related to the signal-to-noise ratio of the target. A more visible target facilitates the planning and execution of oculomotor behavior; however, as speculated by the authors, it can also benefit foveal prediction even if the visibility of the foveal stimulus is held constant. Remarkably, the authors show that presenting a highly visible saccade target is beneficial for foveal vision, as the detection of stimuli with an orientation similar to that of the saccade target is improved; the lower the saccade target visibility, the less prominent the effect.
Strengths:
The results are convincing and the research methodology is technically sound.
Weaknesses:
Discussion on how this phenomenon may unfold in natural viewing conditions when the foveal and saccade target stimuli are complex and are constituted by different visual properties is lacking. Some speculations regarding feedforward vs feedback neural processing involved in the phenomenon and the speed of the feedforward signal in relation to the visibility of the target, are not well justified and not clearly supported by the data.
We thank the reviewer for their comment. In general, we tried to address conceptual points only briefly in this Research Advance if we had discussed them in depth in our main article, to which this advance will be linked (Kroell & Rolfs, 2022: https://elifesciences.org/articles/78106). However, the reviews showed us that this made our theoretical reasoning in the current manuscript appear incomplete. In the revised Discussion section, we have elaborated on several conceptual questions. In particular, we expand on the transferability of our findings to natural viewing conditions:
“Foveal prediction in natural visual environments
As noted above, human observers typically move their eyes towards the most conspicuous objects in their environment (‘t Hart, Schmidt, Roth, & Einhäuser, 2013). Foveal prediction seems to benefit from this strategy as the strength of the predicted signal increases with the conspicuity of the eye movement target. Nonetheless, natural visual environments as well as naturalistic viewing behavior pose several challenges for the foveal prediction mechanism (see Kroell & Rolfs, 2022, for an initial discussion).
First, naturalistic saccade target stimuli will likely exhibit complex shapes and, more often than not, will include feature conjunctions rather than isolated features. Previous findings suggest that the foveal feedback mechanism is capable of operating at this level of complexity: High-level peripheral information such as the category of novel, rendered objects (Williams et al., 2008) has been successfully decoded from activation in foveal retinotopic cortex. If, indeed, temporal object-specific areas such as area TE send feedback, the foveal prediction mechanism may even be specialized for the transfer of complex visual properties.
Second, foveal input will often be of high contrast in natural visual environments. Whether fed-back predictive signals can influence foveal perception in the presence of high-contrast feedforward input remains to be established. In our main investigation (Kroell & Rolfs, 2022; Figure 2B) as well as in previous studies (Hanning & Deubel, 2022b), pre-saccadic foveal detection performance decreased markedly in the course of saccade preparation, presumably because visuospatial attention gradually shifted towards the saccade target and away from the foveal location. This presaccadic decrease in foveal sensitivity may boost the relative weight of fed-back signals by attenuating the conspicuity of high-contrast feedforward input. In other words, the strength of feedforward input to the fovea is reduced gradually across saccade preparation. At the same time, the strength of the fed-back predictive signal should profit from the high contrast of naturalistic saccade targets.
Third, while foveal and peripheral information was congruent on 50% of all ‘probe present’ trials in our investigation, peripheral and foveal features will often be weakly correlated or even uncorrelated in natural environments (see Samonds, Geisler, & Priebe, 2018). Again, the presaccadic attenuation of foveal feedforward processing may allow fed-back peripheral signals to influence perception even if they are uncorrelated with foveal information. Moreover, in piloting variations of our paradigm, we observed that the subjective impression of perceiving the saccade target at the pre-saccadic foveal location is most pronounced if the foveal noise region is replaced with a black Gaussian blob at certain time points before saccade onset (unpublished phenomenological accounts). In consequence, fed-back signals do not seem to require correlated feedforward input to influence perception. Quantitative evidence, however, remains to be established.
Lastly, pre-saccadic foveal input is likely less relevant during natural viewing behavior than it is in our task. It is possible that this task-induced prioritization of the foveal location facilitated the emergence of congruency effects. In a previous experiment (Kroell & Rolfs, 2022; Figure 1D), however, the perceptual probe could appear anywhere on a horizontal axis of 9 dva length around the fixation location. Despite this spatial unpredictability, congruency effects peaked at the presaccadic foveal location, even after peripheral baseline performances had been raised to a foveal level through an adaptive increase in probe opacity. On a similar note, the orientation of the saccade target is irrelevant to the behavioral task in our design, mirroring naturalistic situations: The eye movement can be planned and executed based on local contrast variations alone, and observers are never required to report on the orientation of the peripheral target stimulus. Ultimately, however, an influence of task demands on visual processing can only be fully excluded through techniques that provide a direct readout of perceptual contents without requiring overt responses. In psychophysical investigations, a prediction of saccade target motion may be read out from observers’ eye velocities (Kroell, Mitchell, & Rolfs, 2023; Kwon, Rolfs, & Mitchell, 2019). In electroencephalographic (EEG) and electrophysiological studies, foveal predictions should manifest in early visually evoked potentials (e.g., Creel, 2019) and increased firing rates of feature-selective foveal neurons in early visual areas, respectively. In conclusion, previous findings (Williams et al., 2008), the assumed properties of the neuronal feedback mechanism (Williams et al., 2008; Bullier, 2001) and characteristics of our current and previous experimental paradigms collectively suggest that foveal feature predictions are likely to transfer to naturalistic environments and viewing situations. Experimental evidence remains to be established.”
We have furthermore modified the Abstract to emphasize the connection of the current manuscript to the main article.
With respect to the reviewer’s point that “speculations regarding feedforward vs feedback neural processing involved in the phenomenon and the speed of the feedforward signal in relation to the visibility of the target, are not well justified”:
Again, we understand that we should have elaborated on our theoretical reasoning in this Research Advance. The assumption that our initial findings rely on neuronal feedback to foveal retinotopic cortex is derived from Williams et al.’s (2008) seminal findings: In an fMRI study, the category of peripherally presented objects could be decoded from voxels in foveal retinotopic cortex, suggesting that peripheral visual information was available to neurons with strictly foveal receptive fields. We extended these findings to saccade preparation, suggesting that feedback from higher-order, non-retinotopically organized visual areas may transmit information without the requirement of efference copies (see Kroell, 2023; Dissertation; https://doi.org/10.18452/27204, pp. 54-59): Irrespective of the vector of the upcoming saccade, the features of the attended saccade target would invariably be relayed to foveal retinotopic cortex. Ultimately, only anatomical and functional studies in non-human primates can conclusively establish the role of feedback connections in the observed foveal prediction effects. At present, however, this parsimonious model could account for all of our current and previous findings, that is, a temporally, spatially and feature-specific anticipation of saccade target properties in the presaccadic center of gaze. Nonetheless, we are open to considering any other mechanism that may account for our findings, and have integrated the explanation provided by the reviewer into the paragraph on potential thalamic mechanisms (see the reviewer’s Major Point 1).
Concerning the point that “some speculations regarding feedforward vs feedback neural processing […] and the speed of the feedforward signal in relation to the visibility of the target are not well justified and not clearly supported by the data”:
Theoretical considerations on the impact of peripheral target contrast on feedforward processing speed were a main motivation for the current study. We apologize if our theoretical reasoning was incomplete and have added additional references and elaborations to the Introduction:
“In particular, neuronal response latencies decrease systematically as the contrast of visual input increases. While this phenomenon is reliably observed at varying stages of the visual processing hierarchy—such as the lateral geniculate nucleus (Lee, Elepfandt, & Virsu, 1981b), primary visual cortex (e.g., Albrecht, 1995; Carandini & Heeger, 1994; Carandini, Heeger, & Movshon, 1997; Carandini, Heeger, & Senn, 2002), and anterior superior temporal sulcus (STSa; Oram, Xiao, Dritschel, & Payne, 2002; van Rossum, van der Meer, Xiao, & Oram, 2008)—influences of contrast on neuronal response latency are particularly pronounced in higher-order visual areas: A doubling of stimulus contrast has been shown to decrease the latency of V1 neurons by 8 ms, compared to a reduction of 33 ms in area STSa (Oram et al., 2002; van Rossum et al., 2008). Assuming that the peripheral target is processed in a bottom-up fashion until it reaches higher-order object processing areas, the time point at which peripheral signals are available for feedback should be dictated by the temporal dynamics of visual feedforward processing.”
Concerning the interpretation of the observed time courses, and regarding the reviewer’s Major points 3 & 6, we substantially revised the Results and Discussion sections. In brief, we deemphasized the claim/interpretation of faster enhancement with increasing target opacity and instead focused on describing the oscillatory pattern mentioned by the reviewer. We provide a more temporally resolved pre-saccadic time course using a moving-window analysis and discuss all suggested and further alternative explanations (i.e., saccade-locked perceptual or attentional oscillations, longer signal accumulation intervals for low-contrast information, oscillatory nature of feedback signaling). Details and full revised paragraphs are provided in the response to this reviewer’s Major points 3 & 6.
Unfortunately, there is no line numbering in the manuscript version I downloaded so I cannot refer to the specific lines of text here.
We apologize for the inconvenience and have added line numbers.
Major:
(1) The authors speculate that the phenomenon of pre-saccadic foveal prediction arises from feedback connections from higher-order visual areas, which relay relevant saccade target features to the foveal retinotopic cortex. These feedback signals are then presumably combined with feedforward foveal input to the early visual cortex and facilitate the detection of target-congruent features at the center of gaze. This interpretation is sensible, however, it may not be the only plausible scenario. The thalamus receives copies of feedforward and feedback connections between all visual areas and is a likely candidate hub for combining information across visual space. In this latter case, the phenomenon of pre-saccadic foveal prediction may not arise from feedback from higher-order visual areas, but rather from a combination of signals occurring at the level of the thalamus. The authors should either acknowledge this possibility and the fact that this phenomenon is not necessarily the result of a feedback loop, or they should explain their rationale for excluding this scenario.
We thank the reviewer for their highly thoughtful suggestion, and for alerting us to relevant literature. We have added the following paragraph to the Discussion section. In brief, we discuss the thalamic pulvinar as either an intermediate modulatory region or as the final receiver of the fed-back signal. Yet, we assume that, to solve the combinatorial issue associated with a transfer of feature information before saccades with any possible direction and amplitude, the contribution of non-retinotopic, higher-order object processing areas is likely required.
“Neural implementation of foveal prediction
Based on the body of our findings as well as previous literature, we suggested a parsimonious feedback mechanism to underlie the observed effects: the preparation of a saccadic eye movement, and the concomitant shift of pre-saccadic attention (e.g., Kowler, Anderson, Dosher, & Blaser, 1995; Deubel & Schneider, 1996), selects the peripheral target stimulus among competing information. Higher-order visual areas feed selected feature input back to early retinotopic areas, specifically to neurons with foveal receptive fields. Fed-back feature information combines with congruent, foveal feedforward input, resulting in the enhancement effects we observe. Especially in the context of active vision, this feedback mechanism is appealing as it resolves a combinatorial issue associated with feature-specific information transfer before saccades. Consider a simplified case in which, right before a saccadic eye movement, the activation of a feature-selective neuron that encodes a certain retinal location is transferred to a neuron within the same brain area that will encode said retinal location after saccade landing. For this mechanism to function for any possible saccade direction and amplitude, most neurons would need to be connected to most other neurons (or, in a simplified version, to neurons with foveal receptive fields) in a given brain area. Assuming an information transmission via feedback rather than horizontal connections significantly reduces this dimensionality: Higher-order visual areas that encode object properties (largely) detached from retinotopic or spatiotopic reference frames selectively transfer feature information to neurons with foveal receptive fields, irrespective of the vector of the upcoming saccade. This parsimonious mechanism would have shortcomings. In particular, foveal feedback should become less effective during saccade sequences where several peripheral targets are simultaneously attended. Feature information at both attended target locations may be fed back in temporal succession or weighted and erroneously combined into a single fed-back signal. In most cases, however, foveal feedback may reasonably achieve what established transsaccadic mechanisms struggle to explain: An anticipation of the features of a single saccade target (which typically constitutes the currently most relevant object in the visual field) in foveal vision.
While direct feedback connections from higher-order to early visual areas would constitute the most straightforward implementation, it is conceivable that feedback signals are relayed through and modulated by subcortical areas. In particular, the thalamic pulvinar has been identified as a connection hub for visual processing that receives copies of feedforward and feedback connections from different visual areas and may even combine information across visual space (Cortes, Ladret, Abbas-Farishta, & Casanova, 2024). In the case of foveal prediction, thalamic neurons may receive fed-back signals from higher-order areas and enhance those signals before passing them on to cortical neurons with foveal receptive fields. Perhaps, a modification of foveal activation within the thalamic pulvinar itself is sufficient to influence perception. To the best of our understanding, however, the fed-back signal must originate in non-retinotopic, higher-order object processing areas to reduce the number of necessary neuronal connections.”
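To make the combinatorial argument in the quoted paragraph concrete, the following is a rough counting sketch. It is purely illustrative and not part of the authors' model: the function names and all population sizes are hypothetical, and real connectivity is of course neither all-to-all nor uniform. It only contrasts how the number of required links scales under the two schemes described above.

```python
def horizontal_links(n_retinotopic: int) -> int:
    """Directed links needed if every neuron in a retinotopic area must be
    able to pass its activation to every other neuron in the same area
    (hypothetical all-to-all scheme, no self-links)."""
    return n_retinotopic * (n_retinotopic - 1)


def feedback_links(n_object_units: int, n_foveal: int) -> int:
    """Directed links needed if each higher-order, non-retinotopic unit
    projects only to neurons with foveal receptive fields."""
    return n_object_units * n_foveal


if __name__ == "__main__":
    # All sizes below are made-up illustration values, not measurements.
    n_retinotopic = 1000   # neurons tiling the visual field in one area
    n_foveal = 20          # subset of those with foveal receptive fields
    n_object_units = 50    # higher-order units encoding object features

    print("horizontal:", horizontal_links(n_retinotopic))
    print("feedback:  ", feedback_links(n_object_units, n_foveal))
```

Under these assumed sizes the horizontal scheme grows with the square of the retinotopic population, whereas the feedback scheme grows only with the product of two much smaller populations and does not depend on the saccade vector.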
(2) The results presented are very compelling. I wonder to which extent they generalize to situations in which the foveal input and the peripheral input are more heterogenous (e.g., faces or complex objects composed of many different features, orientations, and other visual properties). I think the current research raises a number of interesting questions. In general, it would be important for the readers to elaborate more on how the mechanism of pre-saccadic foveal prediction may play out in normal viewing conditions or in conditions in which the foveal input is completely irrelevant to the task.
We agree and have reiterated this point in the current manuscript (see our first reply to “Weaknesses”). We also explicitly refer to Kroell & Rolfs (2022) for an extensive initial discussion of this question.
(3) On page 10 the authors state that their data suggest that foveal enhancement emerges in earlier stages of saccade preparation as target opacity increases. However, this is not clear from the figures, when performance is locked to saccade onset (Fig 3 C), for the highest opacity targets performance seems to oscillate, however, the authors do not comment on that. There is literature showing how saccades can reset perceptual oscillations, and maybe what is observed here is just a stronger performance oscillation when the saccade target is more visible. Why would performance drop systematically 75 ms before saccade onset and then increase again 25 ms before the onset? Can the authors elaborate more on this?
In response to this comment, we inspected the pre-saccadic time course of enhancement effects in a more temporally resolved fashion and, indeed, observed pronounced oscillations for the two higher target opacity conditions (see Results):
“Especially at higher target opacities, the temporal development of foveal enhancement appears to exhibit an oscillatory pattern. To inspect this incidental observation in a more temporally resolved fashion, we determined mean enhancement values in a boxcar window of 50 ms duration sliding along all saccade-locked probe offset time points (step size = 10 ms; x-axis values in Figure 4 indicate the latest time point in a certain window). We then fitted 6th order polynomials (with no constraints on parameters) to the resulting time courses and compared the fitted values against zero using bootstrapping (see Methods). The average foveal enhancement across target opacities reached significance starting 115 ms before saccade onset (gray curve in Figure 4; all ps < .046). For every individual target opacity condition, we observed significant enhancement immediately before saccade onset, although only very briefly for the lowest opacity (-2–0 ms for 25%; -39–0 ms for 39%, -106–0 ms for 59% & -13–0 ms for 90%; all ps < .050; yellow to dark red curves in Figure 4). Especially for the higher two target opacities, we observed a local maximum preceding eye movement onset by approximately 80 ms. Interestingly, assuming a peak in enhancement in approximately 80 ms intervals (i.e., at x-axis values of -80 and 0 ms in Figure 4) would correspond to an oscillation frequency of 12.5 Hz. In contrast to rapid feedforward processing, feedback signaling is associated with neural oscillations in the alpha and beta range (i.e., between 7 and 30 Hz; Bastos et al., 2015; Jensen, Bonnefond, Marshall, & Tiesinga, 2015; van Kerkoerle et al., 2015).”
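The moving-window step described in the quoted paragraph can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names, the -200 ms start of the analysis range, and the half-open window convention are introduced here for illustration, and the bootstrap comparison against zero is omitted. Only the 50 ms window, the 10 ms step, the labeling of each window by its latest time point, and the 6th-order polynomial are taken from the text.

```python
import numpy as np


def sliding_window_mean(times_ms, values, window_ms=50.0, step_ms=10.0,
                        t_start=-200.0, t_end=0.0):
    """Average `values` in a boxcar window sliding along saccade-locked time.

    Each window covers (end - window_ms, end] and is labelled by `end`,
    its latest time point. `t_start` is an assumed analysis range.
    """
    times_ms = np.asarray(times_ms, dtype=float)
    values = np.asarray(values, dtype=float)
    window_ends = np.arange(t_start + window_ms, t_end + step_ms / 2, step_ms)
    means = np.full(window_ends.shape, np.nan)
    for i, end in enumerate(window_ends):
        in_window = (times_ms > end - window_ms) & (times_ms <= end)
        if in_window.any():
            means[i] = values[in_window].mean()
    return window_ends, means


def fit_polynomial(window_ends, means, order=6):
    """Unconstrained least-squares polynomial fit to the windowed means."""
    valid = ~np.isnan(means)
    # Polynomial.fit rescales x internally, which keeps a 6th-order fit stable.
    return np.polynomial.Polynomial.fit(window_ends[valid], means[valid], deg=order)


def peak_interval_to_frequency_hz(interval_ms):
    """Convert the spacing between successive peaks (ms) to a frequency (Hz)."""
    return 1000.0 / interval_ms
```

With peaks roughly 80 ms apart, `peak_interval_to_frequency_hz(80.0)` gives 1000 / 80 = 12.5 Hz, the value quoted above.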
We had observed an oscillatory pattern in multiple previous investigations, both in hit rates to foveal orientation content and in reflexive gaze velocities in response to peripheral motion information. So far, we have been unsure how to explain it. The literature on thalamic visual processing mentioned by the reviewer alerted us to the oscillatory nature of feedback signaling itself. Interestingly, the temporal frequency range of feedback oscillations includes the frequency of ~12.5 Hz observed in our data. We have included this and alternative explanations in the Discussion section (see below). Throughout, we highlight that we are aware that our analysis approach is purely descriptive and that the potential explanations we give are speculative.
“Moreover, foveal congruency effects appear to exhibit an oscillatory pattern, with peaks in a medium saccade preparation stage (~80 ms before the eye movement) and immediately before saccade onset. We have noticed this pattern in several investigations with substantially different visual stimuli and behavioral readouts. For instance, using a full-screen dot motion paradigm, we observed a pre-saccadic, small-gain ocular following response to coherent motion in the saccade target region (Kroell, Rolfs, & Mitchell, 2023, conference abstract; Kroell, 2023, dissertation). Predictive ocular following first reached significance ~125 ms before the eye movement, then decreased and subsequently ramped up again ~25 ms before saccade onset. Several explanatory mechanisms appear conceivable. Unlike rapid feedforward processing, feedback propagation has been shown to follow an oscillatory rhythm in the alpha and beta range, that is, between 7 and 30 Hz (Bastos et al., 2015; Jensen, Bonnefond, Marshall, & Tiesinga, 2015; van Kerkoerle et al., 2015). In our case, it is possible that the object-processing areas that send feedback to retinotopic visual cortex do so at a temporal frequency of ~12.5 Hz. At higher stimulus contrasts, feedforward signals may be fed back instantaneously and without the need for signal accumulation in feedback-generating areas. The resulting perceptual time courses may reflect innate temporal feedback properties most veridically. Alternatively, the initial enhancement peak may be related to the sudden onset of the saccade target stimulus and not to movement preparation itself. In this case, the initial peak should become particularly apparent if enhancement is aligned to the onset of the target stimulus. Yet, Figure 3 and Figure 4 suggest more prominent oscillations in saccade-locked time courses. In accordance with this, perceptual and attentional processes have been shown to exhibit oscillatory modulations that are phase-locked to action onset (e.g., Tomassini, Spinelli, Jacono, Sandini, & Morrone, 2015; Hogendoorn, 2016; Wutz, Muschter, van Koningsbruggen, Weisz, & Melcher, 2016; Benedetto & Morrone, 2017; Tomassini, Ambrogioni, Medendorp, & Maris, 2017; Benedetto, Morrone & Tomassini, 2019). Whether the oscillatory pattern of foveal enhancement, as well as its increased prominence at higher target contrasts, relies on innate temporal properties of feedback signaling, signal accumulation, saccade-locked oscillatory modulations of feedforward processing or attention, or a combination of these factors, one conclusion remains: task-induced cognitive influences suggested to underlie the considerable variability in temporal characteristics of foveal feedback during passive fixation (e.g., Fan et al., 2016; Weldon et al., 2016; 2020) are not the only possible explanation. Low-level target properties such as its luminance contrast modulate the resulting time course and should be equally considered, at least in our paradigm.”
In the revised Abstract, we removed our claim on an earlier emergence of enhancement at higher opacities and have added this summary instead:
“Second, the time course of foveal enhancement appeared to show an oscillatory pattern that was particularly pronounced at higher target opacities. Interestingly, the temporal frequency of these oscillations corresponded to the frequency range typically associated with neural feedback signaling.”
(4) What was the average difference in latency between short and long latencies? It would be good to report it in the main text.
We apologize for the oversight. The difference was 61 ms, with latencies of md = 247±18 ms for short- and md = 308±18 ms for long-latency saccades. We have added this information to the main text.
(5) From the saccade latency graphs in Figure S1 it seems there is some variability in the latency of saccades across subjects, I wonder if there is a correlation between saccade latency and the magnitude of the foveal prediction effect across subjects.
We had inspected a connection between saccade latency and congruency in our first investigation (Kroell & Rolfs, 2022; not reported) and observed that participants with lower latencies tended to show more enhancement, albeit non-significantly. Likewise, we observed a non-significant negative correlation between the median saccade latency and the mean foveal prediction effect (across opacities and time points) in the current investigation, r = -0.22, p = .572. While our study involved a small number of observers (n = 9), the analysis approach illustrated in Figure 2 A-C instead makes use of the large number of trials collected per participant (mean n = 2841 trials per observer) and demonstrates a reliable influence of saccade latency on an individual-observer level.
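As a rough consistency check on the correlation reported above, the following sketch converts a Pearson r and a sample size into a t statistic and a two-sided p-value. It is not the authors' analysis code: it assumes a standard two-sided test with n - 2 degrees of freedom, the function names are hypothetical, and the p-value is obtained by simple numerical integration of the Student t density so that only the standard library is needed.

```python
import math

def t_from_r(r, n):
    """t statistic for a Pearson correlation r observed in n pairs (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1 - r * r))

def student_t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * integral of the density from 0 to |t| (Simpson's rule)."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = student_t_pdf(0.0, df) + student_t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)

# Values taken from the response above: r = -0.22 across n = 9 observers.
t_stat = t_from_r(-0.22, 9)
p_value = two_sided_p(t_stat, df=9 - 2)
print(round(t_stat, 3), round(p_value, 2))
```

Because r is reported to two decimals, the recomputed p-value can only be expected to land near the reported value, not to match it exactly.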
(6) Page 14, the authors state that their findings suggest that the feedforward processing of the peripheral saccade target is accelerated when it is presented at high contrast. I find this a bit too speculative, both in terms of assuming that there is a feedforward vs a feedback process (see my point 1) and in terms of speculating that the feedforward process is accelerated as I do not see a clear hint of this in the data (see my point 3) and it is a bit of a stretch to speculate on delays or accelerations of neural processing. It is possible that the feedforward signal is always delivered at the same speed but it is weaker in one case and the effect needs more time to build up.
We fully agree and hope to have addressed the reviewer’s arguments in the sections preceding this point. We included the reviewer’s last sentence in the Discussion section as well:
“Alternatively, or in addition, it is conceivable that weaker feedforward signals require a longer accumulation interval before the feedback process can be initiated.”
Minor:
(1) I think the description of the linear mixed-effects model can go in the supplemental methods, if possible, and its results can be briefly mentioned in the text.
In previous work, we have been asked to move linear mixed-effects model descriptions from supplemental to main method (or even results) sections for clarity. We have followed this suggestion ever since and, due to the relevance of the models for the interpretation of the presented results, would like to keep their description in the methods section.
(2) This is just a minor point, but I would suggest using a different word instead of opacity (maybe visibility?).
We had gone back and forth on this. We decided to use the term ‘conspicuity’ when we discuss our findings conceptually and the term ‘opacity’ when we refer to the experimental manipulation (since we directly manipulate the transparency, i.e., 1-opacity, of the target patch against the background). To compute the slopes in Figures 2 and 5, we ordered observers’ performances by the linearly spaced opacity conditions. Since the term ‘opacity’ is closest to both the experimental manipulation and the variable entered into analysis, we would like to adhere to this terminology. However, we have added an explicit note to the end of our introduction to avoid confusion:
“Throughout the paper, we use the term ‘opacity’ when we refer to the experimental manipulation (that is, a variation of the transparency, i.e., 1-opacity of the target patch against the background noise) and the term ‘conspicuity’ when we discuss our findings conceptually.”
Reviewer #2 (Public Review):
Summary:
In this manuscript, the authors ran a dual task. Subjects monitored a peripheral location for a target onset (to generate a saccade to), and they also monitored a foveal location for a foveal probe. The foveal probe could be congruent or incongruent with the orientation of the peripheral target. In this study, the authors manipulated the conspicuity of the peripheral target, and they saw changes in performance in the foveal task. However, the changes were somewhat counterintuitive.
Strengths:
The authors use solid analysis methods and careful experimental design.
Weaknesses:
I have some issues with the interpretation of the results, as explained below. In general, I feel that a lot of effects are being explained by attention and target-probe onset asynchrony etc, but this seems to be against the idea put forth by the authors of "foveal prediction for visual continuity across saccades". Why would foveal prediction be so dependent on such other processes? This needs to be better clarified and justified.
We address the described weaknesses in the respective sections below. In general, as we point out in response to Reviewer 1 as well, the current submission is a Research Advance article meant to supplement our main article (Kroell & Rolfs, 2022, https://doi.org/10.7554/eLife.78106). To comply with the eLife recommendations for Research Advance submissions, we addressed conceptual points only briefly, especially if they had been explained in detail in our main article. To make the nature and format of the current submission as explicit as possible, and to emphasize its connection to our previous work, we refer to the submission format in our abstract and introduction now.
Specifics:
The explanation of decreased hit rates with increased peripheral target opacity is not convincing. The authors suggest that higher contrast stimuli in the periphery attract attention. But, then, why are the foveal results occurring earlier (as per the later descriptions in the manuscript)? And, more importantly, why would foveal prediction need to be weaker with stronger pre-saccadic attention to the periphery? What is the function of foveal prediction? What of the other interpretation that could be invoked in general for this type of task used by the authors: that the dual task is challenging and that subjects somehow misattribute what they saw in the peripheral task when planning the saccade. i.e. foveal hit rates are misperceptions of the peripheral target. When the peripheral target is easier to see, then the foveal hit rate drops.
We will address these comments one by one:
The authors suggest that higher contrast stimuli in the periphery attract attention. But, then, why are the foveal results occurring earlier (as per the later descriptions in the manuscript)?
We consider these observations to rely on separate processes. Already in the main publication (Kroell & Rolfs, 2022), we had observed a continuous decrease of target-congruent and target-incongruent foveal Hit Rates (HRs) during saccade preparation, and suggested that this decrease (similarly observed in Hanning & Deubel, 2022b) is likely caused by the pre-saccadic shift of visuospatial attention to the target. In other words, as attentional resources shift towards the periphery, foveal detection performance is hampered, irrespective of peripheral and foveal feature (in-)congruency. In the current investigation, we again observed a pronounced pre-saccadic decrease of foveal HRs, irrespective of foveal probe orientation. Our argument that high-contrast peripheral saccade targets attract more attention relies on the clear observation that this decrease becomes more pronounced as the contrast of the saccade target increases. To the best of our judgment and experience with doing the task ourselves, this interpretation appears very conceivable. We explain this rationale in the Abstract and the Results sections of the manuscript (see below).
Our hypotheses and interpretations concerning the time course of foveal prediction refer to the difference between target-congruent and target-incongruent foveal HRs (i.e., to predictive foveal feature enhancement). Irrespective of the general, feature-unspecific decrease of foveal detection performances, we had hypothesized that the peripheral target is processed faster if it exhibits a high contrast. This assumption is based on temporal processing properties of many visual neurons that we have expanded on in our revision:
“In particular, neuronal response latencies decrease systematically as the contrast of visual input increases. While this phenomenon is reliably observed at varying stages of the visual processing hierarchy—such as the lateral geniculate nucleus (Lee et al., 1981b), primary visual cortex (e.g., Albrecht, 1995; Carandini et al., 1997, 2002; Carandini and Heeger, 1994), and anterior superior temporal sulcus (STSa; Oram et al., 2002; van Rossum et al., 2008)— influences of contrast on neuronal response latency are particularly pronounced in higher-order visual areas: A doubling of stimulus contrast has been shown to decrease the latency of V1 neurons by 8 ms, compared to a reduction of 33 ms in area STSa (Oram et al., 2002; van Rossum et al., 2008). Assuming that the peripheral target is processed in a bottom-up fashion until it reaches higher-order object processing areas, the time point at which peripheral signals are available for feedback should be dictated by the temporal dynamics of visual feedforward processing.”
Of note, both reviewers asked us to explore the oscillatory nature of the difference between target-congruent and target-incongruent HRs. We will post our changes in response to the reviewer’s remark below.
And, more importantly, why would foveal prediction need to be weaker with stronger pre-saccadic attention to the periphery?
We hope that our previous reply has cleared up that the opposite is true: In general, and irrespective of the feature congruency of target and foveal probe, foveal HRs decrease as target contrast increases. As we have stated in our Abstract and Results, “foveal Hit Rates for target-congruent and incongruent probes decreased as target opacity increased, presumably since attention was increasingly drawn to the target the more salient it became. Crucially, foveal enhancement defined as the difference between congruent and incongruent Hit Rates increased with opacity”. This finding did not appear counterintuitive to us and was, in fact, pre-registered as a main hypothesis (see https://osf.io/wceba).
We are unsure if this goes beyond the reviewer’s concern but we, in fact, speculate in the revised Discussion section as well as in our original eLife article that the overall, feature-unspecific decrease in foveal detection performances may aid feature-specific foveal prediction:
“This pre-saccadic decrease in foveal sensitivity may boost the relative weight of fed-back signals by attenuating the conspicuity of high-contrast feedforward input. In other words, the strength of feedforward input to the fovea is reduced gradually across saccade preparation. At the same time, the strength of the fed-back predictive signal should profit from the high contrast of naturalistic saccade targets.”
What is the function of foveal prediction?
Please refer to the section ‘What is the function of foveal prediction?’ in our main article. We have pasted this paragraph below for the reviewer’s convenience.
“What is the function of foveal prediction?
As stated above, previous investigations on foveal feedback required observers to make peripheral discrimination judgments. We, in contrast, did not ask observers to generate a perceptual judgment on the orientation of the saccade target. Instead, detecting the target was necessary to perform the oculomotor task. While the identification of local contrast changes would have sufficed to direct the eye movement, the orientation of the target enhanced foveal processing of congruent orientations. The automatic nature of foveal enhancement showcases that perceptual and oculomotor processing are tightly intertwined in active visual settings: planning an eye movement appears to prioritize the features of its target; commencing the processing of these features before the eye movement is executed may accelerate post-saccadic target identification and ultimately provide a head start for corrective gaze behavior (Deubel et al., 1982; Ohl and Kliegl, 2016; Tian et al., 2013).”
What of the other interpretation that could be invoked in general for this type of task used by the authors: that the dual task is challenging and that subjects somehow misattribute what they saw in the peripheral task when planning the saccade. i.e. foveal hit rates are misperceptions of the peripheral target. When the peripheral target is easier to see, then the foveal hit rate drops.
Alternative explanations in general: In our main article, we ruled out—either through direct experimentation or by considering relevant properties of our findings—the following alternative explanations: i) spatially global feature-based attention to the target orientation, ii) a multiplicative combination of spatial and feature-based attention, and iii) shifts of decision criterion. While dual tasks (i.e., simultaneous oculomotor planning and perceptual detection) are standard in psychophysical investigations of active vision, we acknowledge the potential influence of an explicit foveal task in the revised manuscript, and in response to both reviewers:
“Lastly, pre-saccadic foveal input is likely less relevant during natural viewing behavior than it is in our task. It is possible that this task-induced prioritization of the foveal location facilitated the emergence of congruency effects. In a previous experiment (Kroell & Rolfs, 2022; Figure 2D), the perceptual probe could appear anywhere on a horizontal axis of 9 dva length around the screen center. Despite this spatial unpredictability, however, congruency effects peaked at the pre-saccadic foveal location, even after peripheral baseline performances had been raised to a foveal level through an adaptive increase in probe opacity. Ultimately, an influence of task demands on visual processing can only be fully excluded through techniques that provide a direct readout of perceptual contents without requiring keyboard responses. In psychophysical investigations, a prediction of saccade target motion may be read out from observers’ eye velocities (Kroell, Mitchell, & Rolfs, 2023; Kwon, Rolfs, & Mitchell, 2019). In electroencephalographic (EEG) and neurophysiological studies, foveal predictions should manifest in early visual evoked potentials (e.g., Creel, 2019) and increased firing rates of feature-selective foveal neurons in early visual areas, respectively.”
Difficulty of the task: Concerning the perceptual detection task, every experimental session was preceded by an adaptive staircase procedure that adjusted the transparency of the foveal probe (and, thus, task difficulty) depending on the respective observer’s performance (see Methods for details). Concerning the oculomotor task, observers were able to perform accurate saccades with typical movement latencies for all target opacity conditions (see Results, Supplements & Figure S1). In general, we are unsure how high task difficulty could produce a feature-, temporally and spatially specific enhancement of both filtered and incidental target-congruent foveal orientation information. In fact, a main finding of our current submission is that foveal HRs decrease as the target becomes easier to see and the oculomotor task thus becomes easier to perform.
Perceptual confusion of target and probe stimulus: We observe a specific increase in HRs for foveal probes that exhibit the same orientation as the peripheral saccade target. Just like in our main article, a response is defined as a ‘Hit’ if a foveal probe is presented and the observer generates a ‘present’ judgment. To our understanding, the suggestion that a confusion of target and probe stimuli may account for these effects necessarily implies that this confusion hinges on the congruency between peripheral and foveal feature inputs. In other words, peripheral and foveal signals should be more readily “confused” if they exhibit similar features. We assume that peripheral feature information is fed back to neurons with foveal receptive fields and combines with feature-congruent feedforward input. Whether this combination of signals can be described as low-level perceptual “confusion” likely depends on individual linguistic judgments (it would certainly be a novel description of feedback-feedforward interactions). Perhaps a defining difference between the reviewer’s concern and our assumed mechanism is the spatial specificity of the resulting congruency effects. We suggest that only neurons with foveal receptive fields receive feature information via feedback. And indeed, we demonstrate a clear spatial specificity of congruency effects around the pre-saccadic foveal location, even after parafoveal performances had been raised to a foveal level by an adaptive increase in probe opacity (see Kroell & Rolfs, 2022; Figure 2C & Figure 3). In other words, observers’ perception is altered in their pre-saccadic center of gaze while the target is presented peripherally. We struggle to conceive a scenario in which a confusion of signals should be feature-specific as well as specific to an interaction between peripheral and foveal signals without being meaningful at the same time. If the reviewer is referring to confusions on the response or decision level, we would like to point them towards the Discussion section ‘Can our findings be explained by established mechanisms other than foveal prediction?’ in our main article. In this paragraph, we provide detailed arguments for a dissociation between our findings and shifts in decision criterion that would exceed the scope of a Research Advance.
When the peripheral target is easier to see, then the foveal hit rate drops.
We agree. Target-congruent and incongruent foveal HRs decreased as the contrast of the target increased. However, and as we stated in response to the reviewer’s first comment, the difference between target-congruent and target-incongruent foveal HRs (and, thus, foveal enhancement of the target orientation) increased with peripheral target contrast.
The analyses of Fig. 3C appear to be overly convoluted. They also imply an acknowledgment by the authors that target-probe temporal difference matters. Doesn't this already negate the idea that the foveal effects are associated with the saccade generation process itself? If the effect is related to target onset, how is it interpreted as related to a foveal prediction that is associated with the saccade itself?
We indeed conducted analyses that can reveal an influence of target presentation duration at probe onset, the saccade preparation stage at probe offset, as well as a combination of both factors. The fact that target presentation duration may have an influence on foveal prediction would not negate a simultaneous influence of saccade preparation and vice versa. In the main article, we directly investigated the influence of saccade preparation on foveal enhancement by introducing a passive fixation condition (Kroell & Rolfs, 2022; Figure 5). At identical target-probe offset durations, pre-saccadic foveal enhancement was significantly more pronounced and accelerated compared to enhancement during passive fixation. We have added a purely saccade-locked time course (uncorrected by target-probe interval) to our Results section and to Figure 3 (second row). We still believe that the target-locked, saccade-locked and combined analyses are informative for future investigations and would like to present them all for completeness.
Also, the oscillatory nature of the effect in Fig. 3C for 59% and 90% opacity is quite confusing and not addressed. The authors simply state that enhancement occurs earlier before the saccade for higher contrasts. But, this is not entirely true. The enhancement emerges then disappears and then emerges again leading up to the saccade. Why would foveal prediction do that?
In response to this comment and a suggestion by Reviewer 1, we inspected the pre-saccadic time course of enhancement effects in a more temporally resolved fashion and, indeed, observed pronounced oscillations for the two higher target opacity conditions (see Results):
“Especially at higher target opacities, the temporal development of foveal enhancement appears to exhibit an oscillatory pattern. To inspect this incidental observation in a more temporally resolved fashion, we determined mean enhancement values in a boxcar window of 50 ms duration sliding along all saccade-locked probe offset time points (step size = 10 ms; x-axis values in Figure 4 indicate the latest time point in a certain window). We then fitted 6th order polynomials to the resulting time courses and compared the fitted values against zero using bootstrapping (see Methods). The average foveal enhancement across target opacities reached significance starting 115 ms before saccade onset (gray curve in Figure 4; all ps < .046). For every individual target opacity condition, we observed significant enhancement immediately before saccade onset, although only very briefly for the lowest opacity (-2–0 ms for 25%; -39–0 ms for 39%, -106–0 ms for 59% & -13–0 ms for 90%; all ps < .050; yellow to dark red curves in Figure 4). Especially for the higher two target opacities, we observed a local maximum preceding eye movement onset by approximately 80 ms. Interestingly, assuming a peak in enhancement in approximately 80 ms intervals (i.e., at x-axis values of -80 and 0 ms in Figure 4) would correspond to an oscillation frequency of 12.5 Hz. In contrast to rapid feedforward processing, feedback signaling is associated with neural oscillations in the alpha and beta range (i.e., between 7 and 30 Hz; Bastos et al., 2015; Jensen, Bonnefond, Marshall, & Tiesinga, 2015; van Kerkoerle et al., 2015).”
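The moving-window step described in the quoted passage can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the trial representation (probe offset relative to saccade onset, congruency flag, hit flag), the function names, and the default time range are hypothetical, and the subsequent polynomial fit and bootstrapping are omitted. Enhancement is taken as the congruent minus the incongruent Hit Rate, as defined in the response, and the last helper shows the arithmetic that turns an ~80 ms peak-to-peak interval into 12.5 Hz.

```python
def sliding_enhancement(trials, window_ms=50, step_ms=10, t_min=-200, t_max=0):
    """Mean foveal enhancement in a boxcar window sliding over probe offset times.

    trials: iterable of (offset_ms, congruent, hit), where offset_ms is the probe
    offset relative to saccade onset (negative = before the saccade).
    Returns a list of (latest_time_in_window_ms, enhancement).
    """
    trials = list(trials)
    out = []
    latest = t_min + window_ms  # each window is labelled by its latest time point
    while latest <= t_max:
        in_window = [t for t in trials if latest - window_ms < t[0] <= latest]
        congruent = [int(hit) for (_, cong, hit) in in_window if cong]
        incongruent = [int(hit) for (_, cong, hit) in in_window if not cong]
        if congruent and incongruent:  # skip windows lacking either trial type
            hr_congruent = sum(congruent) / len(congruent)
            hr_incongruent = sum(incongruent) / len(incongruent)
            out.append((latest, hr_congruent - hr_incongruent))
        latest += step_ms
    return out

def peak_interval_to_hz(interval_ms):
    """Frequency implied by a peak-to-peak interval given in milliseconds."""
    return 1000.0 / interval_ms

# Toy data (made up for illustration): two congruent and two incongruent trials.
toy = [(-10, True, True), (-20, True, False), (-15, False, False), (-30, False, False)]
print(sliding_enhancement(toy, t_min=-50, t_max=0))  # one window ending at 0 ms
print(peak_interval_to_hz(80))  # 12.5
```

Because consecutive windows overlap by 40 ms with these defaults, neighbouring values share most of their trials, which is the source of the smoothing the authors mention elsewhere in the response.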
We had observed an oscillatory pattern in multiple previous investigations, and in both Hit Rates to foveal orientation content and reflexive gaze velocities in response to peripheral motion information. So far, we have been unsure how to explain it. The literature on thalamic visual processing mentioned by the reviewer alerted us to the oscillatory nature of feedback signaling itself. Interestingly, the temporal frequency range of feedback oscillations includes the frequency of ~12.5 Hz observed in our data. We have included this and alternative explanations in the Discussion section (see below). We are aware, and acknowledge in the manuscript, that our analysis approach is purely descriptive, and that the potential explanations we give are speculative.
“Moreover, foveal congruency effects appeared to exhibit an oscillatory pattern, with peaks in a medium saccade preparation stage (~80 ms before the eye movement) and immediately before saccade onset. We have noticed this pattern in several investigations with substantially different visual stimuli and behavioral readouts. For instance, using a full-screen dot motion paradigm, we observed a pre-saccadic, small-gain ocular following response to coherent motion in the saccade target region (Kroell, Rolfs, & Mitchell, 2023, conference abstract; Kroell, 2023, dissertation). Predictive ocular following first reached significance ~125 ms before the eye movement, then decreased and subsequently ramped up again ~25 ms before saccade onset. Several explanatory mechanisms appear conceivable. Unlike rapid feedforward processing, feedback propagation has been shown to follow an oscillatory rhythm in the alpha and beta range, that is, between 7 and 30 Hz (Bastos et al., 2015; Jensen, Bonnefond, Marshall, & Tiesinga, 2015; van Kerkoerle, et al., 2015). In our case, it is possible that the object-processing areas that send feedback to retinotopic visual cortex do so at a temporal frequency of ~12.5 Hz. At higher stimulus contrasts, feedforward signals may be fed back instantaneously and without the need for signal accumulation in feedback-generating areas. The resulting perceptual time courses may reflect innate temporal feedback properties most veridically. Alternatively, the initial enhancement peak may be related to the sudden onset of the saccade target stimulus and not to movement preparation itself. In this case, the initial peak should become particularly apparent if enhancement is aligned to the onset of the target stimulus. Yet, Figure 3 and Figure 4 suggest more prominent oscillations in saccade-locked time courses. 
In accordance with this, perceptual and attention processes have been shown to exhibit oscillatory modulations that are phase-locked to action onset (e.g., Tomassini, Spinelli, Jacono, Sandini, & Morrone, 2015; Hogendoorn, 2016; Wutz, Muschter, van Koningsbruggen, Weisz, & Melcher, 2016; Benedetto & Morrone, 2017; Tomassini, Ambrogioni, Medendorp, & Maris, 2017; Benedetto, Morrone & Tomassini, 2019). Whether the oscillatory pattern of foveal enhancement, as well as its increased prominence at higher target contrasts, relies on innate temporal properties of feedback signaling, signal accumulation, saccade-locked oscillatory modulations of feedforward processing or attention, or a combination of these factors, one conclusion remains: task-induced cognitive influences suggested to underlie the considerable variability in temporal characteristics of foveal feedback during passive fixation (e.g., Fan et al., 2016; Weldon et al., 2016; 2020) are not the only possible explanation. Low-level target properties such as its luminance contrast modulate the resulting time course and should be equally considered, at least in our paradigm.”
The interpretation of Fig. 4 is also confusing. Doesn't the longer latency already account for the lapse in attention, such that visual continuity can proceed normally now that the saccade is actually eventually made? In all results, it seems that the effects are all related to the dual nature of the task and/or attention, rather than to the act of making the saccade itself. Why should visual continuity (when a saccade is actually made, whether with short or long latency) have different "fidelity"? And, isn't this disruptive to the whole idea of visual continuity in the first place?
We are unsure if we grasp the unifying concern behind these remarks. For the reviewer’s point on the dual-task nature of our paradigm, please consider our answer above. Perhaps it is important to note that we do not (and would never) claim that foveal prediction is the only mechanism underlying visual continuity. We believe that multiple mechanisms, including but not limited to pre-saccadic shifts of attention, predictive remapping of attention pointers and the perception of intra-saccadic signals interact and jointly contribute to visual continuity. It appears highly conceivable that, like most processes in biological systems, motor and perceptual performances are subject to fluctuations. We argue that saccade latencies as well as the magnitude of foveal prediction constitute read-outs of these variations. We also suggest that those read-outs are innately correlated beyond their common moderator of, perhaps, attentional state; we have previously presented clear evidence for a link between eye movement preparation and foveal prediction (Kroell & Rolfs, 2022; Figure 2). To the best of our judgment, we consider it reasonable that the effectiveness of movement-contingent perceptual processes varies with the effectiveness (in programming or execution) of the very movement motivating them. We present evidence for this assumption in our submission. We would also like to make clear that we do not assume our vision to fail entirely, even if every single well-known mechanism of visual continuity were to break down at once. Upon saccade landing, the visual system receives reliable visual input. Nonetheless, the visual system has undeniably developed mechanisms to optimize this process. We believe foveal prediction to rank among them.
Small question: is it just me or does the data in general seem to be too excessively smoothed?
We did not apply any smoothing to either the analysis or visualization of our data in the initial manuscript.
Every observer completed a large number of trials (mean n = 2841 trials per observer; total trial number > 25,500), which likely contributes to the clarity of our data. To inspect the oscillatory pattern of enhancement in a more temporally resolved fashion (in response to the reviewer’s point above), we applied a moving window analysis in this revision. Due to overlapping window borders, this analysis introduces a certain degree of smoothing. Nonetheless, data patterns are comparable to the time course with only few non-overlapping time bins (Figure 3B; second row). In general, we have described all steps of our analysis routine extensively in the Methods section and will make our data publicly available upon publication of the Reviewed Preprint.
General comment: it is important to include line numbers in manuscripts, to help reviewers point to specific parts of the text when writing their comments. Otherwise, the peer review process is rendered unnecessarily complicated for the reviewers.
We apologize and have added line numbers.
www.biorxiv.org
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
In this work, the authors investigate the functional difference between the most commonly expressed form of PTH, and a novel point mutation in PTH identified in a patient with chronic hypocalcemia and hyperphosphatemia. The value of this mutant form of PTH as a potential anabolic agent for bone is investigated alongside PTH(1-84), which is a current anabolic therapy. The authors have achieved the aims of the study. Their conclusion that this suggests a "new path of therapeutic PTH analog development" seems unfounded; the benefit of this PTH variant is not clear, but the work is still interesting.
The work does not identify why the patient with this mutation has hypocalcemia and hyperphosphatemia; this was not the goal of the study, but the data is useful for helping to understand it.
Thank you for your valuable feedback. In this study, we confirmed that <sup>R25C</sup>PTH can form a dimer, and our in vivo experiments in the mouse model demonstrated that dimeric <sup>R25C</sup>PTH can stimulate bone formation similarly to normal PTH. Furthermore, patients with the <sup>R25C</sup>PTH mutation, who have been exposed to high levels of this variant over an extended period, were reported to have high bone mineral density. Based on these observations, we hypothesized that dimeric <sup>R25C</sup>PTH might have potential as a new therapeutic PTH analog, particularly as a bone anabolic agent. However, we acknowledge that it is premature to make definitive claims regarding its therapeutic utility. Thus, we are currently conducting follow-up research to further investigate the subsignaling pathway changes induced by dimeric <sup>R25C</sup>PTH and their impact on bone metabolism.
Moreover, to fully understand the patient’s symptoms, it is crucial to determine the form in which <sup>R25C</sup>PTH exists in vivo. While our in vitro experiments demonstrated that <sup>R25C</sup>PTH is secreted primarily in its dimeric form, we do not yet know whether this dimeric structure is maintained in vivo. We are actively conducting experiments to analyze the circulating form of <sup>R25C</sup>PTH in patients through blood sample collection (Andersen et al., 2022; Lee et al., 2015). Should the mutation predominantly exist in its monomeric form in vivo, this would align with clinical findings reported by Lee et al. (2015), which could help explain the patient’s hypocalcemia and hyperphosphatemia. However, if <sup>R25C</sup>PTH primarily exists in its dimeric form, additional research will be necessary to uncover the underlying mechanisms. Based on our experimental results, the dimeric <sup>R25C</sup>PTH exhibits a reduced binding affinity to PTH1R compared to the monomeric form. Furthermore, our in vitro experiments revealed that dimeric <sup>R25C</sup>PTH induces lower levels of cAMP production upon PTH1R activation. Accordingly, we can assume that this reduction in receptor signaling is likely to account for the impaired regulation of calcium and phosphate in patients with the mutation. However, despite this diminished signaling in calcium and phosphate homeostasis, dimeric <sup>R25C</sup>PTH was still capable of promoting bone formation at levels comparable to wild-type PTH. This apparent paradox warrants further investigation, and we are actively pursuing studies to elucidate how the dimeric form exerts its effects on bone metabolism.
References
Andersen, S. L., Frederiksen, A. L., Rasmussen, A. B., Madsen, M., & Christensen, A. R. (2022). Homozygous missense variant of PTH (c.166C>T, p.(Arg56Cys)) as the cause of familial isolated hypoparathyroidism in a three-year-old child. J Pediatr Endocrinol Metab, 35(5), 691-694. https://doi.org/10.1515/jpem-2021-0752
Lee, S., Mannstadt, M., Guo, J., Kim, S. M., Yi, H. S., Khatri, A., Dean, T., Okazaki, M., Gardella, T. J., & Juppner, H. (2015). A Homozygous [Cys25]PTH(1-84) Mutation That Impairs PTH/PTHrP Receptor Activation Defines a Novel Form of Hypoparathyroidism. J Bone Miner Res, 30(10), 1803-1813. https://doi.org/10.1002/jbmr.2532
Strengths:
The work is novel, as it describes the function of a novel, naturally occurring, variant of PTH in terms of its ability to dimerise, to lead to cAMP activation, to increase serum calcium, and its pharmacological action compared to normal PTH.
Weaknesses:
(1) The use of very young, 10-week-old mice as a model of postmenopausal osteoporosis remains a limitation of this study, but this is now quite clearly described as a limitation, including justifying the use of the primary spongiosa as a measurement site.
We appreciate the reviewer’s comment.
(2) Methods have been clarified. It is still necessary to properly define the micro-CT threshold in mg HA/cm<sup>3</sup>. I think it might be about 200 mg HA/cm<sup>3</sup> in this study.
Thank you for your insightful comment. To address this, we used a hydroxyapatite (HA) phantom with HA content ranging from 0 to 1200 mg/cm<sup>3</sup>, with calibration points at 0, 50, 200, 800, 1000, and 1200 mg CaHA/cm<sup>3</sup>, to measure grayscale values via µ-CT. Based on these measurements, the trabecular bone BMD in our study was determined to range from 100 to 200 mg/cm<sup>3</sup>.
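A minimal sketch of how such a phantom calibration is commonly used, assuming a linear relation between grayscale and HA density. The phantom densities are those listed above; the grayscale values are hypothetical placeholders, not measured data, and the function names are introduced only for this example.

```python
import numpy as np

# Nominal phantom densities from the response (mg HA/cm^3).
PHANTOM_DENSITY = np.array([0.0, 50.0, 200.0, 800.0, 1000.0, 1200.0])

# Hypothetical mean grayscale value of each phantom insert (illustration only).
PHANTOM_GRAY = np.array([20.0, 25.0, 40.0, 100.0, 120.0, 140.0])

def fit_calibration(gray, density):
    """Least-squares line: density = slope * gray + intercept."""
    slope, intercept = np.polyfit(gray, density, deg=1)
    return slope, intercept

def gray_to_bmd(gray, slope, intercept):
    """Convert grayscale values to mineral density (mg HA/cm^3)."""
    return slope * np.asarray(gray, dtype=float) + intercept

def bmd_to_gray(bmd, slope, intercept):
    """Invert the calibration, e.g. to express a density threshold in grayscale."""
    return (np.asarray(bmd, dtype=float) - intercept) / slope

slope, intercept = fit_calibration(PHANTOM_GRAY, PHANTOM_DENSITY)
# Grayscale value that would correspond to a 200 mg HA/cm^3 threshold.
threshold_gray = bmd_to_gray(200.0, slope, intercept)
```

Inverting the fitted line is what allows a segmentation threshold to be reported in mg HA/cm<sup>3</sup> rather than in scanner-specific grayscale units.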
Author response image 1.
(3) The apparent contradiction between the cortical thickness data (where there is no difference between the two PTH formulations) and the mechanical testing data (where there is a difference) remains unresolved. It is still not clear whether there is a material defect in the bone, which can be partially assessed by reporting the 3-point bending test, corrected for the diameters of the bone (i.e. as stress / strain curves).
Thank you for your comment. First, we ensured that the bones sampled during the experiment showed no defects, and we carefully separated the femur bones from the mice to preserve their integrity. In the 3-point bending test, PTH treatment significantly increased the maximum load of the femur bone compared to the OVX-control group. Additionally, the maximum load in the PTH treatment group was significantly greater than that observed in the PTH dimer group. Furthermore, structural factors influencing bone strength, such as the periosteal perimeter and the endocortical bone perimeter, were also increased in the PTH treatment group compared to the PTH dimer group (data only for reviewer).
Author response image 2.
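The geometry correction the reviewer asks for can be sketched with standard beam theory for three-point bending, where stress is F·L·c/(4·I) and strain is 12·c·d/L². This is an illustration, not the authors’ analysis: it assumes the femoral midshaft can be approximated as a hollow circular tube, and all dimensions in the usage note are hypothetical.

```python
import math

def hollow_circle_inertia(outer_d, inner_d):
    """Second moment of area of a hollow circular section (mm^4)."""
    return math.pi * (outer_d**4 - inner_d**4) / 64.0

def bending_stress(force, span, c, inertia):
    """Stress at the outer fibre: sigma = F * L * c / (4 * I), in MPa for N and mm."""
    return force * span * c / (4.0 * inertia)

def bending_strain(displacement, span, c):
    """Strain at the outer fibre: epsilon = 12 * c * d / L**2 (dimensionless)."""
    return 12.0 * c * displacement / span**2
```

For the same load and span, a bone with a larger outer diameter has a larger second moment of area and therefore experiences lower stress. A difference in maximum load can thus reflect geometry rather than material properties, which is why stress/strain curves help to separate the two.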
(4) It is also puzzling that both dimeric and monomeric PTH lead to a reduction in total bone area (cross sectional area?). This would suggest a reduction in bone growth. This should be discussed in the work.
In our experiment, the data showed an increase in cortical bone area in the PTH treatment group, but not in the PTH dimer treatment group. However, both dimeric and monomeric PTH treatments resulted in a reduction in total tissue area. We added revised sentences on page 13, line 317 and page 14, line 333 as follows:
“In addition, the data showed an increase in cortical bone area (Ct.Ar) in the PTH treatment group but not in the PTH dimer treatment group. However, both dimeric and monomeric PTH treatments reduced total tissue area (Tt.Ar), suggesting potential effects on bone growth in the width of mice or humans.”
“This study has several limitations. First, it is urgently necessary to determine whether dimeric <sup>R25C</sup>PTH is present in human patient serum. Second, TRAP staining showed an inhibitory effect of PTH treatment on the primary spongiosa area. However, the secondary spongiosa, which more accurately reflects bone remodeling (55), was not examined due to the barely detectable bone in this area in OVX-induced osteoporosis mouse models. Third, it is unclear whether similar bone phenotypes exist between human <sup>R25C</sup>PTH patients and dimeric <sup>R25C</sup>PTH-treated mice, particularly regarding low bone strength. Although the dimeric <sup>R25C</sup>PTH-treated group showed higher cortical BMD compared to WT-Sham or PTH groups, there was no difference in bone strength compared to the osteoporotic mouse model. Fourth, our study demonstrated that PTH or <sup>R25C</sup>PTH treatment decreased circumferential length, which could affect bone growth in width. However, whether this phenotype is also observed in patients treated with PTH or <sup>R25C</sup>PTH remains uncertain.”
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
The study presents important findings on inositol-requiring enzyme (IRE1α) inhibition on diet-induced obesity (overnutrition) and insulin resistance where IRE1α inhibition enhances thermogenesis and reduces the metabolically active and M1-like macrophages in adipose tissue. The evidence supporting the conclusions is convincing but can be enhanced with information/data on the validity, specificity, selectivity, and toxicity of the IRE1α inhibitor and supported with more detail on the mechanisms by which adipose tissue macrophages influence adipocyte metabolism. The work will be of interest to cell biologists and biochemists working in metabolism, insulin resistance, and inflammation.
We thank the editors for the assessment and appreciation of our findings in this study. In the revision, we have added information on the validity, selectivity, and toxicity of the IRE1α inhibitor. In addition, we discussed the likely contribution of suppressing the metabolically activated proinflammatory macrophage population in adipose tissue to the reversal of adipose remodeling and to thermogenesis. We have also improved the manuscript significantly throughout the text and figures, following the reviewers’ recommendations.
Public Reviews:
Reviewer #1 (Public review):
First, the authors confirm the up-regulation of the main genes involved in the three branches of the Unfolded Protein Response (UPR) system in diet-induced obese mice in AT, observations that have been extensively reported before. Not surprisingly, IRE1a inhibition with STF led to an amelioration of the obesity and insulin resistance of the animals. Moreover, non-alcoholic fatty liver disease was also improved by the treatment. More novel are their results in terms of thermogenesis and energy expenditure, where IRE1a seems to act via activation of brown AT. Finally, mice treated with STF exhibited significantly fewer metabolically active and M1-like macrophages in the AT compared to those under vehicle conditions. Overall, the authors conclude that targeting IRE1a has therapeutical potential for treating obesity and insulin resistance.
The study has some strengths, such as the detailed characterization of the effect of STF in different fat depots and a thorough analysis of macrophage populations. However, the lack of novelty in the findings somewhat limits the study’s impact on the field.
We thank the reviewer for the appreciation of our findings and the comments about the novelty. Regarding novelty, we would emphasize several novel aspects of this manuscript. First, as the reviewer correctly pointed out, we discovered that IRE1 inhibition by STF activates brown AT and promotes thermogenesis. We also show for the first time that IRE1 inhibition not only significantly attenuated the newly discovered CD9+ ATMs and the “M1-like” CD11c+ ATMs but also diminished the M2 ATMs. These discoveries are important and novel. In obesity, it was originally proposed that ATMs undergo M1/M2 polarization from an anti-inflammatory M2 state to a classical pro-inflammatory M1 state. It was further reported that IRE1 deletion improves thermogenesis by boosting the M2 population, which then synthesizes and secretes catecholamines to promote thermogenesis. It is now known that M2 macrophages do not synthesize catecholamines or promote thermogenesis. In this study, we discovered that IRE1 inhibition doesn’t increase (but instead decreases) the M2 population and that IRE1 inhibition promotes thermogenesis likely by suppressing pro-inflammatory macrophage populations, including the M1-like ATMs and, most importantly, the newly identified metabolically active macrophages, given that ATM inflammation has been reported to suppress thermogenesis. Second, this study presents the first characterization of the relationship between the more classical M1-like ATMs and the newly discovered metabolically active ATMs, showing that the CD11c+ M1-like ATMs largely overlap with, yet are not identical to, the CD9+ ATMs in the eWAT under HFD. Third, although upregulation of ER stress response genes in the adipose tissues of diet-induced obese mice has been extensively reported, this doesn’t necessarily mean that targeting IRE1a or ER stress can reverse existing insulin resistance and obesity. It is not uncommon for a therapy to fail to yield the expected effect.
For instance, amyloid plaques are a hallmark of Alzheimer’s disease (AD), and interventions that prevent or reverse beta-amyloid deposition were expected to prevent progression of, or even reverse, cognitive impairment in AD patients. However, clinical trials of such therapies have been disappointing. In essence, experimental demonstration of the effectiveness or feasibility of any potential therapeutic target is a first step toward any future clinical implementation.
Reviewer #2 (Public review):
The manuscript by Wu et al demonstrated that IRE1a inhibition mitigated insulin resistance and other comorbidities through increased energy expenditure in DIO mice. In this reviewer's opinion, this timely study has high significance in the field of metabolism research for the following reasons.
(1) The authors' findings are significant and may offer a new therapeutic target to treat metabolic diseases, including diabetes, obesity, NAFLD, etc.
(2) The authors carefully profiled the ATMs and examined the changes in gene expression after STF treatment.
(3) The authors presented evidence collected from both systemic indirect calorimetry and individual tissue gene expression to support the notion of increased energy expenditure.
Overall, the authors have presented sufficient background in a clear and logically organized structure, clearly stated the key question to be addressed, used the appropriate methodology, produced significant and innovative main findings, and made a justified conclusion.
We thank the reviewer for the appreciation of our work.
Reviewer #3 (Public review):
Summary:
The manuscript by Wu D. et al. explores an innovative approach to immunometabolism and obesity by investigating the potential of targeting macrophage Inositol-requiring enzyme 1α (IRE1α) in cases of overnutrition. Their findings suggest that pharmacological inhibition of IRE1α could influence key aspects such as adipose tissue inflammation, insulin resistance, and thermogenesis. Notable discoveries include the identification of High-Fat Diet (HFD)-induced CD9+ Trem2+ macrophages and the reversal of metabolically active macrophages' activity with IRE1α inhibition using STF. These insights could significantly impact future obesity treatments.
Strengths:
The study's key strengths lie in its identification of specific macrophage subsets and the demonstration that inhibiting IRE1α can reverse the activity of these macrophages. This provides a potential new avenue for developing obesity treatments and contributes valuable knowledge to the field.
Weaknesses:
The research lacks an in-depth exploration of the broader metabolic mechanisms involved in controlling diet-induced obesity (DIO). Addressing this gap would strengthen the understanding of how targeting IRE1α might fit into the larger metabolic landscape.
Impact and Utility:
The findings have the potential to advance the field of obesity treatment by offering a novel target for intervention. However, further research is needed to fully elucidate the metabolic pathways involved and to confirm the long-term efficacy and safety of this approach. The methods and data presented are useful, but additional context and exploration are required for broader application and understanding.
We thank the reviewer for appreciating the strengths of our manuscript. In particular, we appreciate the reviewer’s recommendation to explore the broader metabolic landscape, such as the effect of IRE1 inhibition on non-adipose tissue macrophages and metabolism. We agree that doing so would broaden the therapeutic potential of IRE1 inhibition to a wider range of metabolic disorders, and we will pursue these explorations in future studies.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
A list of recommendations for the authors is presented below:
(1) Please, update the literature review to include more recent studies relevant to the topic.
We thank the reviewer for the suggestion. We have added more references from recent studies.
(2) Please, provide a detailed explanation of how STF functions, including potential off-target effects or issues related to specificity.
We thank the reviewer for the suggestion. STF is a small-molecule inhibitor designed to selectively inhibit the RNase activity of IRE1a. Once IRE1a is activated (e.g., in obesity), its RNase domain initiates the unconventional splicing of the transcription factor X-box binding protein 1 (XBP1) mRNA and the Regulated IRE1-Dependent Decay (RIDD) of microRNAs, which is detrimental if prolonged. IRE1a RNase inhibitors, including STF, engage the RNase active site of IRE1a with high affinity and specificity by exploiting a shallow complementary pocket through pi-stacking interactions with His910 and Phe889 and an essential Schiff base interaction between the aldehyde moiety of the inhibitor and the side-chain amino group of Lys907 (Sanches et al., NComm 2014, PMID: 25164867). This specific, high-affinity binding blocks IRE1a RNase activity, preventing the splicing of XBP1 mRNA and RIDD. As IRE1a has been shown to be activated in multiple tissues under various pathological conditions and to be responsible for their progression, inhibition of IRE1a by pharmacological agents, including STF, has great potential for the treatment of various pathological disorders. Several studies have reported that STF shows no overt toxicity when administered systemically (Madhavan et al., 2022, PMID 35105890; Herlea-Pana et al., 2021, PMID 34675883; Papandreou et al., 2011, PMID 21081713; Tufanli et al., 2017, PMID 28137856).
(3) Lines 263-266 require a reference.
We thank the reviewer for the suggestion. A reference has been added.
(4) Stromal vascular fraction (SVF) also contains a significant amount of preadipocytes and stem cells, not only macrophages, which might affect the conclusions reached by the authors.
We thank the reviewer for the comment. It is true that SVF consists of multiple cell types, including endothelial cells, macrophages, preadipocytes, and various stem cell populations. In HFD-induced obesity, adipose tissue undergoes significant remodeling, and the percentage of macrophages in the SVF of obese adipose tissue increases significantly relative to other cell types. In our studies, SVFs from adipose tissues of obese mice were isolated, cultured, and treated with STF overnight. We observed that IRE1 RNase activity in SVFs was inhibited by STF treatment, and that the ATM population and the expression of pro-inflammatory genes were downregulated by STF. Given the short-term treatment, the most parsimonious interpretation of the data is that STF acts directly on ATMs. However, we note that we cannot completely rule out the possibility that effects of STF on other cell types influence the ATMs and inflammatory gene expression. As such, we have modified our conclusion from “these results indicate that STF acts directly on ATMs to regulate inflammation” to “these results indicate that STF likely acts directly on ATMs to regulate inflammation”.
(5) Figures 1A and G: It is common practice to present the XBP1s/XBP1u ratio; consider using this standard measure.
We thank the reviewer for the comment. Both ways of presenting XBP1 mRNA splicing appear in the literature. A number of papers (for instance, PMID 25018104, Cell 2014; PMID 23086298, NCB 2012) used the XBP1s/(XBP1s+XBP1u) ratio. We preferred this presentation because it shows spliced XBP1 (XBP1s) as a fraction of total XBP1 mRNA (XBP1s+XBP1u).
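The two presentations carry the same information and can be converted into each other, as the following sketch shows. The function names and the band intensities in the usage note are hypothetical, introduced only for illustration.

```python
def spliced_fraction(xbp1s, xbp1u):
    """XBP1s / (XBP1s + XBP1u): spliced share of total XBP1 mRNA, between 0 and 1."""
    total = xbp1s + xbp1u
    if total <= 0:
        raise ValueError("total XBP1 signal must be positive")
    return xbp1s / total

def spliced_to_unspliced_ratio(xbp1s, xbp1u):
    """XBP1s / XBP1u: the ratio suggested by the reviewer; unbounded above."""
    if xbp1u <= 0:
        raise ValueError("unspliced XBP1 signal must be positive")
    return xbp1s / xbp1u

def fraction_from_ratio(ratio):
    """Convert XBP1s/XBP1u into XBP1s/(XBP1s+XBP1u)."""
    return ratio / (1.0 + ratio)
```

For example, hypothetical band intensities of 30 (spliced) and 70 (unspliced) give a spliced fraction of 0.3 and a spliced-to-unspliced ratio of about 0.43. The fraction stays bounded even when the unspliced band is very faint, whereas the ratio grows without limit.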
(6) Figure 1F: please indicate the type of AKT phosphorylation assessed.
We thank the reviewer for the comment. We have added Ser473 as the phosphorylation site in both the figure legend and the figure.
(7) Figures 2E-H: please clearly indicate the specific fat depots analyzed in each figure.
We thank the reviewer for the comment. We have added the information in the figure legends and figures.
(8) Figures 1I and 3A, and Supplementary Figures 6D-E: please include a quantification analysis of the images presented.
We thank the reviewer for the suggestion. We have added quantification of the images.
(9) In Figure 3D the image corresponding to the merge for the STF condition is a duplication of the control, please correct this.
We thank the reviewer for pointing this out. We have replaced it with the correct image.
(10) Figures 4B-F: please provide individual data points in the graphs to show variability and sample distribution.
We thank the reviewer for the suggestion. We have re-plotted the graphs in Fig. 4B-F with the individual data points.
(11) Figure 4I: it is rather unusual to have such a strong signal of UCP1 in ND conditions, please explain.
We thank the reviewer for the comment. We wish to point out that the images were taken from BAT slides. UCP1 is expected to show strong staining in BAT under the ND condition, and this staining is, as expected, weakened under the HFD condition. STF treatment was able to correct the HFD-induced weakening of UCP1 staining in BAT.
(12) Supplementary Figures 2C-D: please provide representative images for better clarity and interpretation.
We thank the reviewer for the comment. The representative images for Supplementary Figures 2C-D are shown in Figures 2C and F. Supplementary Figures 2C-D only show the quantification of adipocyte areas for Figures 2C and F.
(13) Supplementary Table 3 is repeated, please remove.
We thank the reviewer for the comment. We have deleted this repetition.
Reviewer #2 (Recommendations for the authors):
The manuscript can be further strengthened with more clarification on the following points.
(1) The use of IRE1a pharmacological inhibitor STF-083010 (STF) needs to be validated. How was the dose determined? Were there any dose-dependent studies? Under the current dosing regimen, what are the specificity, selectivity, and toxicity of STF? Also, were the serine/threonine kinase and RNase activities measured in the adipocytes and ATMs of the animals dosed with the compound? What's the PK data?
We thank the reviewer for the comments. In the animal study, we used STF at 10 mg/kg by intraperitoneal injection. This dose was adopted from several recent studies (Madhavan et al., 2022, PMID 35105890; Herlea-Pana et al., 2021, PMID 34675883; Papandreou et al., 2011, PMID 21081713; Tufanli et al., 2017, PMID 28137856), in which STF treatment showed beneficial effects in the respective disease models. STF didn’t compromise cell viability or induce any other toxicity at the doses or concentrations used in these studies (Papandreou I, et al., 2011; Upton JP, et al., 2012; Lerner AG, et al., 2012; Kemp KL, et al., 2013; Cross BC, et al., 2012). In our study, we didn’t observe any apparent toxicity in mice at this dose. Importantly, we did observe that STF inhibited IRE1 RNase activity in adipose tissues (Fig. 1G, Fig. S1D) and ATMs (Fig. 6Q, Fig. S8C, G, I) of the animals at this dose. As IRE1 inhibitors, including STF, have been extensively examined and shown to have no effect on the kinase function of IRE1 (Cross et al., 2012, PMID: 22315414; Tufanli et al., 2017, PMID 28137856), we didn’t assay IRE1 kinase activity. Additionally, as the compound has been administered in several animal models with significant beneficial effects, we assume that adequate pharmacokinetic parameters are achieved with the current dose. Systematic PK studies will be important and necessary in the future if clinical trials are to be considered.
(2) The statistical method for individual panels in each figure needs to be specified.
We thank the reviewer for the suggestion. We have specified the statistical method in the figure legends.
(3) In Figure 1E, there's no difference in fasting insulin levels, though a difference was detected after the glucose load. This suggests an effect on insulin secretion but not insulin sensitivity.
We thank the reviewer for the comments. Fasting insulin levels still differ between the Veh and STF groups, although the difference does not reach statistical significance. Under glucose stimulation, insulin levels showed the same trend, that is, the STF group is lower than the Veh group. Even if insulin levels had shown no difference between the two groups, with or without glucose stimulation, the fact that blood glucose levels at all time points are lower in the STF group than in the Veh group (Fig. 1C) would indicate that insulin sensitivity is improved. In our study, insulin levels were lower in the STF group, yet blood glucose levels were still lowered by STF, further strengthening the notion that STF treatment improves insulin sensitivity. This is further corroborated by the ITT results (Fig. 1D).
(4) Figure 2 and S2A did not show a decrease in BW but rather BW gain. The statement (line 308) needs to be edited. As a result of this, the relative fat mass measurement (% of BW) needs to be presented in addition to Figure 2B.
We thank the reviewer for the comments and suggestions. As shown in Figs. 2A and S2A, we observed a slight decrease in body weight (~2 g reduction) in STF-treated mice, whereas the Veh group gained ~3.5 g, at the end of 4 weeks of treatment. As shown in Fig. 2B, this difference in body weight between the Veh and STF groups was primarily due to a reduction in fat tissue. In the revision, we also added the percentages of fat and lean mass relative to total body weight in Supplemental Fig. 2B, which show a similar trend.
(5) The measurement of blood lipid levels in Figure 3F-H is informative. More importantly, hepatic lipid content needs to be measured.
We thank the reviewer for the comments, with which we agree. As this study focuses on insulin resistance and adipose tissue remodeling, we didn’t go deep into the comorbidities beyond the reported observations. It will be interesting to explore the effects of IRE1 inhibition on the comorbidities of obesity and insulin resistance, including measurement of hepatic lipid content, in future studies.
Minor corrections:
(1) Line 261: "(spliced".
Done. We have corrected it.
(2) Line 334: spell out "PEPCK".
We have added the full name “Phosphoenolpyruvate carboxykinase”. Thanks!
(3) Line 478: please rephrase.
We thank the reviewer for the comment. We have rephrased the sentence as follows: “These results reveal that STF treatment suppresses the adipose tissue inflammation and the accumulation of pro-inflammatory ATM without augmenting (suppressing instead) M2-like ATMs.”
(4) Figure 4L: "pGC1-a".
We thank the reviewer for pointing this out. We have corrected the name.
(5) Figure 4O: missing Y-axis label.
We have added the label. Thanks!
Reviewer #3 (Recommendations for the authors):
The observations presented by Wu D. et al. in the manuscript are potentially interesting and relevant. The current study seeks to build upon previous findings, specifically from the work titled, "Silencing IRE1α using myeloid-specific cre suppresses alternative activation of macrophages and impairs energy expenditure in obesity." By using a pharmacological inhibitor to modulate IRE1α activity in adipose tissue macrophages (ATMs), the authors aim to develop therapeutics that could significantly impact the treatment of obesity and metabolic disease.
The authors have performed some satisfactory experiments related to liver steatosis. However, the manuscript would benefit from a more comprehensive exploration of the mechanisms by which ATMs influence adipocyte metabolism, particularly in epididymal white adipose tissue (eWAT). In particular, the study should investigate how adiposity and lipid droplet size change in response to alterations in lipolysis and adipogenesis, as this could provide insights into how these processes contribute to the amelioration of the obesity phenotype.
Several issues should be addressed to strengthen the manuscript and make the study more convincing. Below are specific comments and recommendations:
Major:
(1) The indirect calorimetric data should be normalized for dependent variables such as body weight, lean mass, and fat mass+ lean mass to accurately interpret the results. The results for 24-hour energy expenditure should be included in Figure 4B-F to provide a more comprehensive analysis. It is recommended to plot bar graphs with all individual data points for the energy expenditure (EE) results shown in Figure 4B-F, to offer a clearer and more detailed presentation of the data (Figure 4B-F).
We thank the reviewer for the comments. Data analysis for indirect calorimetry studies has evolved over the years. One common practice was, and still is, to normalize the data by body weight. However, this approach was deemed improper some years ago (Tschop et al Nature Methods 2012, PMID: 22205519). The Tschop paper also pointed out the shortcomings of normalization by lean mass. Instead, it concludes that “generalized linear model is the most appropriate statistical approach to accommodate discrete (genotype) and continuous (body mass) traits, rather than using a simple division by BW or lean BW”. In our study, we used CalR, an improved generalized linear model (which includes ANOVA and ANCOVA) (Mina et al Cell Metabolism 2018, PMID: 30017358), for all our energy expenditure data analysis (shown in Fig. 4A-E). In the revision, we also included data analysis normalized by BW (Fig. S2F-H’), which shows an even wider difference between the Veh and STF groups than the data shown in Fig. 4A-F. As STF decreased fat mass and had little effect on lean mass, the difference would be more pronounced for normalization by fat mass and by fat mass + lean mass than in the data shown in Fig. 4A-E, and would be similar to the data shown in Fig. 4A-E for normalization by lean mass. In addition, we replotted the graphs in Fig. 4B, D, F-H with the individual data points.
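To illustrate the difference between ratio normalization and a covariate-based model, the sketch below fits a simple ANCOVA-style linear model (energy expenditure as a function of group and body mass) by ordinary least squares. It is not CalR and not the authors’ analysis; the data are synthetic, and the variable names and units are assumptions made for the example.

```python
import numpy as np

def ancova_fit(ee, body_mass, group):
    """Ordinary least squares fit of EE = b0 + b1 * group + b2 * body_mass.

    `group` is coded 0 (vehicle) or 1 (treated). The group coefficient is the
    treatment effect adjusted for body mass, which ratio normalization
    (EE / body mass) does not isolate when the intercept is non-zero.
    """
    ee = np.asarray(ee, dtype=float)
    design = np.column_stack([
        np.ones_like(ee),
        np.asarray(group, dtype=float),
        np.asarray(body_mass, dtype=float),
    ])
    coef, *_ = np.linalg.lstsq(design, ee, rcond=None)
    return {"intercept": coef[0], "group_effect": coef[1], "mass_slope": coef[2]}

# Synthetic example: heavier vehicle mice, lighter treated mice.
mass = np.array([40.0, 42.0, 44.0, 46.0, 34.0, 36.0, 38.0, 40.0])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
# Energy expenditure generated with a known treatment effect of 0.05 (arbitrary units).
ee = 0.2 + 0.01 * mass + 0.05 * group
fit = ancova_fit(ee, mass, group)
```

Because the synthetic data are generated from a known model, the fitted group effect recovers the built-in treatment effect, whereas dividing energy expenditure by body mass would mix the treatment effect with the mass difference between groups.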
(2) At the thermoneutral point (30 °C), the study could benefit from testing the indirect calorimetric models of human energy physiology. Future studies could also explore this to evaluate the implications for drug development.
We agree with the reviewer’s comments. In future studies, it will be very informative to investigate the effects of STF under thermoneutral conditions, which could provide more consistent data on how drugs affect metabolic processes in humans and thereby improve translational research.
(3) The current study missed the opportunity to investigate the effects of STF on non-adipose tissue (non-AT) resident macrophage populations, such as those in bone marrow or lymph-node macrophages. Understanding how STF modulates macrophage metabolism in these contexts would be valuable.
We thank the reviewer for the comments, with which we agree. As this study focuses on insulin resistance and adipose tissue remodeling, we mostly restricted our analysis to adipose tissue macrophage populations. In the future, it would be interesting to investigate the effect of STF on macrophages in non-adipose tissues. This would provide a more comprehensive understanding of STF’s effects on immune cell metabolism and could inform its application in various therapeutic areas.
(4) The study should explore how STF influences the expression of CD9, Trem2, (positive subpopulations), and the secretion of pro-inflammatory cytokines by macrophages, particularly in response to LPS and IFNγ activation in stromal vascular fraction (SVF) cells and bone marrow-derived macrophages (BM-Macrophages).
We appreciate the reviewer’s comments. In obesity, ATMs do not undergo the classical M1/M2 polarization; instead, both M1-like/pro-inflammatory macrophages and M2 macrophages increase drastically. It will be interesting to investigate the effects of STF on the newly identified CD9- and Trem2-positive macrophage subpopulations in SVF and bone marrow macrophages in response to LPS and IFNγ stimulation in the future, although these studies might not faithfully reflect the changes in adipose tissue in obesity, as these stressors typically induce classical M1/M2 polarization.
(5) Additional macrophage gating is necessary to better understand adipose tissue macrophage (ATM) inflammation. Specifically, CD11c−MHC2 low macrophages represent a newly identified inflammatory and dynamic subset in murine adipose tissue. These ATMs accumulate rapidly after ten days of a high-fat diet (HFD) and should increase further with prolonged HFD. For this study, CD11c−MHC2 low ATMs could be subdivided for flow cytometry analysis based on their MHC2 expression, distinguishing them from CD11c−MHC2 high ATMs. All macrophage subtypes categorized here can be studied for metabolic health using Seahorse analysis as well.
We thank the reviewer for the comments. It will be interesting to investigate the effects of STF on the newly identified CD11c−MHC2 low macrophage subpopulation in the future. Future studies can certainly include Seahorse metabolic analysis, which would relate energy metabolism at the cellular level to organismal thermogenesis.
(6) All flow cytometry histograms - are they showing mean fluorescence intensity or cell# per population? Please specify. All flow cytometry dot plots - It would be helpful for readers to see populations plotted as bar graphs next to respective flow plots, as opposed to being shown as supplemental tables. Additionally, labeling dot plots with the parent population from which cells were gated on would also help readers understand faster what we're looking at.
We thank the reviewer for the comments. In the flow cytometry histograms, the y-axis is “normalized to mode”. Normalizing to the mode is often used to compare the distribution of fluorescence intensity between samples. It focuses on the shape of the distribution (with a maximum of 100%) rather than on absolute cell counts, which removes variation caused by different cell numbers or sample sizes and makes it easier to compare populations based on fluorescence intensity. When normalizing to the mode, the highest peak in the histogram is scaled to 100%, and all other values are scaled relative to that peak. This allows multiple histograms to be compared easily, even if the total number of cells (or events) differs between samples.
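To illustrate the scaling described in this response, the following is a minimal Python sketch. It is not the implementation used by the analysis software; the function name `normalize_to_mode` and the bin counts are hypothetical and chosen only for illustration.

```python
def normalize_to_mode(counts):
    """Scale histogram bin counts so that the tallest bin equals 100.

    `counts` is a list of event counts per fluorescence-intensity bin.
    This is an illustrative helper, not part of any cytometry package.
    """
    if not counts:
        return []
    peak = max(counts)
    if peak == 0:
        # No events recorded: return an all-zero histogram.
        return [0.0 for _ in counts]
    return [100.0 * c / peak for c in counts]


# Two hypothetical samples with the same distribution shape
# but a tenfold difference in the number of acquired events.
sample_a = [5, 20, 80, 40, 10]
sample_b = [50, 200, 800, 400, 100]

norm_a = normalize_to_mode(sample_a)
norm_b = normalize_to_mode(sample_b)
```

After scaling, both hypothetical samples share the same profile, which is why this display emphasizes the shape of the distribution and hides differences in total event number.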
(7) The results appear to confuse the actual sample size and p-value. Please carefully review the statistical analyses to ensure that biological replicates are accurately represented. Additionally, include p-values alongside fold change data in the text for clarity.
We thank the reviewer for the comments. We have rechecked the statistical analyses and confirmed that the biological replicates are now properly represented. The exact number of biological replicates for each experiment is now clearly specified in both the methods section and the figure legends.
(8) To further validate the findings, consider using Seahorse analysis at the cellular level in future experiments. This could confirm indirect calorimetric data and thermogenesis responses to cold stimulation.
We thank the reviewer for the comments. Yes, Seahorse analysis at the cellular level will be conducted in future experiments.
(9) Please ensure the use of person-first language, avoiding labels or adjectives that define individuals based on a condition or characteristic.
We thank the reviewer for the comments. We have revised the descriptions to use person-first language.
(10) The manuscript does not demonstrate how STF inhibition of IRE1α in ATM, specifically through CD9 and Trem2, controls diet-induced obesity. This aspect should be further elucidated.
We thank the reviewer for the comment. In this study, we observed that STF inhibits IRE1α RNase activity in SVF and in sorted ATMs as well as in adipose tissue. The improvement in diet-induced obesity can be attributed to IRE1α inhibition in both adipocytes and macrophages, as shown previously by myeloid- and adipocyte-specific knockouts of IRE1α. To conclude whether IRE1α in CD9- and/or Trem2-positive ATMs controls diet-induced obesity, genetic approaches would be needed to delete IRE1α specifically in CD9- and/or Trem2-positive ATMs, which is technically challenging at the moment because no CD9- or Trem2-specific Cre lines are available.
Minor:
(1) Line 43-44: Update terminology to "MASLD" instead of "NAFLD."
We thank the reviewer for pointing this out. We have changed the terminology in the revision.
(2) Line 58-59: Add a reference for the mentioned text.
We thank the reviewer for the comment. We have added a reference to the text in the revision.
(3) Was the antibody used to detect CD9 and Trem2 validated for FACS and other analyses?
We thank the reviewer for the comment. In our studies, we determined CD9 and Trem2 expression through flow cytometry and immunostaining. For flow cytometry, we used PE/Dazzle™ 594 anti-mouse CD9 (BioLegend Cat# 124821, RRID:AB_2800601) and APC-conjugated Trem2 (R&D Systems Cat# FAB17291N, RRID:AB_3646995), both of which were validated for FACS. For immunostaining, we used CD9 (Abcam Cat# ab223052, RRID:AB_2922392) and Trem2 (R&D Systems Cat# MAB17291, RRID:AB_2208679).
(4) Studies were limited to male mice; this should be noted in the title and discussed as a limitation.
We thank the reviewer for the comment. We have modified the wording in the revision.
(5) Ensure all reagents are fully described with preparation details and identifiable numbers for reproducibility and/or submit the FACS protocol to any protocol archives.
We thank the reviewer for the suggestions. Yes, we have modified the wording in the revision.
(6) Provide the correct version numbers for all software used (FlowJo, Prism, etc.).
We thank the reviewer for the suggestions. We have provided the correct version numbers for FlowJo and Prism.
(7) Specify section size (µm) and blocking agent used for eWAT immunofluorescence (Line 207).
We thank the reviewer for the suggestions. We have added this information.
(8) Add gene accession numbers to Supplementary Table 3.
We thank the reviewer for the suggestions. We have added this information.
(9) Figure 2: Clarify HFD and treatment timelines with a schematic diagram.
We thank the reviewer for the suggestions. We have added a schematic diagram in Supplemental Figure 1C.
(10) For histology analysis, the minimum combined data from triplicate images is shown in Figure 2C-2H. For Figures 2E and H, provide complete methods for histology analysis.
We thank the reviewer for the comments. For the histology analysis shown in Figures 2C–2H, we used a minimum of three mice per treatment group. For each mouse, 3–5 images were taken for analysis. All histology analyses were conducted using ImageJ for image quantification, and the data were processed and organized using Excel and GraphPad.
(11) In Figure 3D, the macrophage marker F4/80 stained differently than in Figure 5B; to avoid false-positive staining, show an isotype control to confirm actual staining. For eWAT immunofluorescence (Figures 3D, 5B, 6E), counterstaining is needed in addition to macrophage staining, such as perilipin for adipocytes and phalloidin for total cells.
We thank the reviewer for the comments. Yes, the macrophage marker F4/80 stains differently in Figure 3D than in Figure 5B because the two figures show different tissues: Figure 3D shows liver samples, whereas Figure 5B shows adipose tissue. In the liver, a subset of macrophages is known as Kupffer cells, which have distinct morphology and behavior compared to other tissue-resident macrophages. When the liver is stained for F4/80, the pattern may reflect the specialized role of Kupffer cells, typically showing more diffuse or localized staining around blood vessels and sinusoids. In adipose tissue, macrophages tend to accumulate around dead or dying adipocytes, forming what are known as "crown-like structures" (CLS). The F4/80 staining in adipose tissue therefore shows a more clustered pattern, particularly around areas of fat tissue undergoing remodeling or inflammation. In adipose tissue, clearly defined cells remain visible even without a counterstain such as perilipin, and, importantly, adipocytes are generally much larger than macrophages. We agree that counterstaining would enhance accuracy. In future studies, we will use perilipin staining to make it easier to differentiate adipocytes from other structures and to provide stronger data.
(12) Insert scale bars in the original images for Figures 3D, 4I, 4M, 5B, 6E, S3B, S6D-E, and S7A-B. All images added a scale bar not inserted while acquiring the image or using imaging software.
We thank the reviewer for the suggestions. The scale bars embedded in the images during acquisition are too small to be clearly visible unless the images are enlarged. In the revision, we have manually added scale bars for clarity.
(13) Figure 5E: Please label X-axis as F4/80.
We thank the reviewer for pointing this out. The label has been added in the revision.
(14) Figure 5F: It is specified in the legend that cells were gated on F4/80+CD11b+CD11c+, but there is a CD11c- population shown in the histogram...How is this population appearing if all cells should be CD11c+?
We thank the reviewer for pointing this out. We gated against CD11c in F4/80+CD11b+ population. As such, we have corrected the description in the legend.
(15) Figure 5G: What is the F4/80+CD11b+CD11c-CD206- population gated in quadrants?
We thank the reviewer for the comment. The F4/80+CD11b+CD11c-CD206- population was shown in Figure 5G on the lower left side, with the percentages being 15.7% for ND, 5.54% for Veh-HFD, and 26% for STF-HFD.
(16) Figure 6J: Flow cytometry gates seem slightly misplaced and the sample appears to be overcompensated - were FMOs included in this experiment to establish proper gates? If so, please include.
We thank the reviewer for the comment. We did include Fluorescence Minus One (FMO) controls in the experiment to establish proper gating. We have included this information in the methods section.
(17) Table 1-3: Indicate the number of replicates (n=) used in all tables.
We thank the reviewer for the suggestion. We have provided the specific number of mice used in the study within the figure legends.
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1:
The analysis of the dormancy rates is interesting and offers some intriguing questions related to the higher dormancy rate found for the L2 isolates and lower for the L3 ones. It will be interesting in the future to expand the data generated in this advanced in vitro platform to in vivo studies.
Indeed, an increased dormancy propensity of L2 isolates was previously reported in broth culture and associated with specific genetic polymorphisms. The opposite phenotype observed in the L3 isolates is particularly intriguing and has not been described to date. Hence, we fully agree that it would be very interesting to find out whether these phenotypes are also observed in vivo.
The authors propose that ‘strains exhibiting greater proliferative capacity are more prone to induce macrophage apoptosis, thereby contributing to the extent of the granulomatous response.’ It would be interesting to know what happens if the macrophage apoptotic response is blocked.
This is an interesting suggestion that would deserve a dedicated, comprehensive investigation covering other cell death pathways. Even though the trend is significant, the correlation coefficient for this interaction is rather low, which appears to be largely due to substantial inter-host variability in the apoptotic response of macrophages from individual donors to a given strain. In addition, such blocking experiments may require either performing isolated macrophage infections, which would fall outside the scope of this study, or considering the extent and contribution of apoptosis in other cell subsets.
In contrast to macrophage apoptosis, T cell activation correlated with less replicative bacteria. Are these two findings related, ie, are the granulomas showing more (apoptotic) macrophages the ones with a lower percentage of activated T cells? This would shed light on what distinguishes granulomas that are protective from those that support bacterial growth.
Indeed, a significant negative correlation between macrophage apoptosis induction and T cell activation can be observed, specifically with activated CD4 T cells expressing CD38 (rS = -0.36, p < 0.05) or CD69 (rS = -0.40, p < 0.01). We have added this additional result in the manuscript text (line 217).
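For readers less familiar with the rS statistic quoted in this response, the following is a minimal Python sketch of a Spearman rank correlation, computed as the Pearson correlation of average ranks. It is an illustration only: the helper names and the paired values are hypothetical and are not data from the study, and the sketch does not compute a p-value.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical paired measurements per sample (illustrative values only):
# percentage of apoptotic macrophages and of activated CD4 T cells.
apoptotic_macrophages = [12.0, 18.5, 9.3, 25.1, 15.0, 21.7]
activated_cd4_t_cells = [30.2, 22.4, 35.0, 24.8, 18.9, 27.5]

rho = spearman_rho(apoptotic_macrophages, activated_cd4_t_cells)
```

A negative `rho` indicates that samples ranking higher for one measurement tend to rank lower for the other, which is the direction of the association reported above.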
It would also be interesting to know the functional impact of blocking early CXCL9 or IL1b on the outcome of granulomatous response/bacteria growth.
We have performed the suggested early blocking experiments and added the expected negative effect on granuloma formation upon neutralization of IL-1b (current Fig. 6E) in the revised version of the manuscript, and furthermore discussed the null effect on bacterial growth of the treatment with an anti-CXCL-9 specific antibody (current Fig. 6H).
The authors acknowledge the absence of neutrophils in this model. However, this could be discussed in more detail, as neutrophils play an important part in TB pathogenesis as shown in different models of infection and human TB.
We concur and have expanded on the importance of neutrophils in TB pathogenesis (including references) in the discussion section (line 260).
Related to neutrophils and TB pathogenesis, another important player is type I IFN. The multiplex assay used included IFN-alpha, was this molecule detected? If so, was there any difference in the levels of type I IFN detected among the different infections?
We agree and that is why we had originally included IFN-α in our screen. However, this cytokine remained under the limit of quantification at both studied time points, preventing us from drawing conclusions on the effect of Mtb strain diversity on the secretion of type I IFNs in in vitro granulomas.
Reviewer #2:
In Figure 1b/c, it is not clear what comparisons are being made to give the p-value annotations.
In Figure 2a/b, it is not clear what comparisons are being made to give the p-value annotations.
In Figure 3a, again it is not clear what comparisons are being made to give the p-value annotation.
The p-values formerly shown in the upper left corner of the panels resulted from either Friedman (Figures 1C, 2A and 3A) or Kruskal-Wallis (Figures 1B and 2B) tests and indicated whether there was an overall significant difference between the analyzed groups. To avoid confusion, those values have been removed, leaving only the post-test comparisons between specific groups.
In the results narrative related to Figure 1 (lines 93-103), the authors refer to lineage heterogeneity without providing any objective quantification of this - I suggest they do so, by providing variance or standard deviations.
Thank you very much for this relevant suggestion. We have now included the coefficients of variation as a quantitative measure of the within-lineage heterogeneity in the manuscript (line 97).
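As a reminder of what this measure captures, the following is a minimal Python sketch of a coefficient of variation, here taken as the sample standard deviation divided by the mean and expressed as a percentage. The exact variant used in the manuscript is not stated in this response, so that choice is an assumption, and the lineage labels and growth-rate values are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the coefficient of variation in percent.

    Assumes the sample standard deviation (n - 1 denominator);
    the manuscript may use a different convention.
    """
    if len(values) < 2:
        raise ValueError("at least two values are required")
    m = mean(values)
    if m == 0:
        raise ValueError("undefined when the mean is zero")
    return 100.0 * stdev(values) / m


# Hypothetical growth rates per lineage (illustrative values only).
growth_rates = {
    "lineage_A": [1.10, 1.15, 1.08, 1.12],
    "lineage_B": [0.60, 1.40, 0.95, 1.80],
}

cv_by_lineage = {
    lineage: coefficient_of_variation(values)
    for lineage, values in growth_rates.items()
}
```

Because the measure is scaled by the mean, it allows the spread within lineages to be compared even when their average growth rates differ.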
I also suggest the authors explain what the data points actually represent in this figure - do I assume each data point = cfu from a well of 'granuloma'? Are they all from the same donor PBMC? What is the sample N for each lineage? If the data are not from the same donor PBMC, I think more informative to present the results of paired statistical analyses, stratified by donor cells. In addition, the authors should include a summary table of the demographic characteristics of the donors (at least sex, ethnicity, and age). If the data are derived from a single donor, I'd advocate providing data from at least one further donor.
In the new supplementary figure requested by Reviewer 3 (Figure 1—figure supplement 1, showing the actual CFU data on days 1 and 8 p.i. used to calculate the growth rate), it is now indicated that bacterial load was quantified as CFU per well.
Regarding the number of donors used, as stated in the Material and Methods section (current line 418) and depicted by the four different shapes used when data are grouped by individual infecting strain, all figures in our manuscript have been generated using PBMCs from 4 independent donors. For greater clarity, “n = 4” has now been included in the figure legends. Regarding the statistical analyses, paired statistical analyses stratified by donor were already performed in the original version of the manuscript whenever appropriate.
As stated in the methods section, the buffy coats used for PBMC isolation are anonymized so demographic data are unavailable.
The premise of the analysis in Figure 2C and the results narrative ("This finding suggests that an increased ability to enter dormancy is not necessarily associated with a more pronounced growth phenotype", line 132) is not clear to me. Why would increased dormancy relate to increased growth in the same context? I suggest this analysis be removed.
We apologize for the confusion in our original statement. We have now rephrased it as “This finding suggests that an increased tendency to remain in a metabolically active state is not necessarily associated with a more pronounced growth phenotype”.
In Figure 3b, I think it may be more informative if the data points from the same donor were linked. Likewise in Figure 3c, I'd like to see a donor-paired statistical analysis.
For all figures, the choice of using individual symbols to identify data points from the same donor but not connecting lines was made to provide a neater image. Nevertheless, we have now modified the figure linking the data points from the same donor. The statistical analysis performed is always donor-paired whenever appropriate.
The causal inference suggested in the results narrative between ‘macrophage apoptosis’ and granulomatous response (lines 173-175) is not tested directly by the experiment – I suggest the authors exclude this statement.
Fair point, the statement has been removed.
To what extent have the authors considered whether variation in T cell responses between lineages may be confounded by variation in Mtb reactive T cell frequencies in donor PBMC? Can this be disentangled at all? This should be acknowledged as a potential limitation of the study.
We did characterize the presence of mycobacterial antigen-specific reactive T cells in the PBMCs from the investigated donors. To do so, we performed in vitro stimulations with purified protein derivative (PPD) or an ESAT-6/CFP-10 peptide pool and quantified the frequency of IFN-γ-positive CD4 T cells by flow cytometry. The percentage of IFN-γ-positive CD4 T cells recalled by PPD stimulation ranged from 0.02% to 0.13%, while no ESAT-6/CFP-10 reactive T cells were detected. As such, we can attest that the PBMC donors never encountered Mtb, even though some levels of memory recalled by PPD may be due to cross-reactivity with BCG or pre-exposure to non-tuberculous mycobacteria. We have now added a panel in Figure 5—figure supplement 2 representing the frequency of mycobacteria-specific CD4 T cells and, as suggested, discussed the impact on the extent of the T cell responses observed in granulomas in the revised version of the manuscript. Nevertheless, the observed MTBC strain-specific trends are consistent across the donors, as depicted in Figure 5B and Figure 5—figure supplement 2A-B.
Moreover, the experimental design does not really test cause and effect for the relationship between T cell proliferation/activation and bacterial growth. What is the impact of T-cell depletion from PBMC on bacterial growth?
The increased TB susceptibility of HIV patients demonstrated that T cells play a critical part in the control of Mtb infection. We agree and did envisage such a depletion experiment. However, depleting T cells from PBMCs would imply removing up to 70% of the cells present in the specimen, which would lead to a situation from which results cannot be compared to the original sample and therefore would not be interpretable.
Reviewer #3:
Data presentation:
- In Figure 1 (replication rate), actual cumulative CFU means from each strain for both days 1 and 8 with statistical analysis should be presented as panels in this figure.
Agreed. We are providing the requested representation of the data and the corresponding paired statistical analysis as supplementary material Figure 1—figure supplement 1.
- In Figure 2 (dormancy), a panel comparing the mean number of bacteria that are single positive for either Auramine-O, Nile Red, or are double positive should be included for each strain, with statistical analysis. Representative photomicrographs of phenotypes from the staining should also be included. Electron microscopy could be conducted to compare the presence of intermediate lipid inclusions within organoid-bound mycobacteria.
As requested, percentages of single stained as well as double positive bacilli in each sample are now represented in Figure 2—figure supplement 1. In addition, we have now also followed the request and included a photomicrograph picturing representative Mtb staining phenotypes. Lastly, it would certainly be very elegant to visualize the presence of Mtb lipid inclusions within cellular aggregates by electron microscopy. However, we do not currently have the means for such investigations and the implementation of such a protocol under BSL3 conditions appears unrealistic in the context of this study.
- In Figure 3 (granulomatous response), the number, circularity, and size of immune aggregates are presented as "granuloma score" in which the mean ratio of size to circularity is divided by the number of inclusions. To their credit, in Supplementary Figure 2, the authors provide the data in a straightforward manner. However, the granuloma score metric is reduced as the number of observed "granulomas" increases, which is counterintuitive. Additionally, circularity is not a definitive aspect of human granulomas (Wells et al., Am J Respir Crit Care Med, 2021, PMID: 34015247). I am skeptical that the "granuloma score" is an accurate predictor of granulomatous inflammation. Is there precedent for this metric in the literature? If so, a reference should be provided. A high magnification inset of 1 representative granuloma from each strain should be included in Figure 3A.
As requested, insets of a representative average granuloma for each strain have been included in Figure 3A. The formulation of the “granuloma score” has no precedent and cannot be referenced. By doing so, we meant to integrate within one single parameter the visual differences represented in the current Figure 3—figure supplement 2. We intentionally sought to assign the highest score to the massive aggregation that some strains promote, in contrast to strains that trigger several small, dispersed and diffuse aggregates.
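To make the behavior of such a composite score concrete, the following is a minimal Python sketch based on the reviewer's paraphrase of the metric (the mean ratio of aggregate size to circularity, divided by the number of aggregates). The authors' exact computation is not given in this response, so the formula, the function name `granuloma_score`, and all values are assumptions for illustration only.

```python
def granuloma_score(aggregates):
    """Score one well from a list of (area, circularity) pairs.

    Follows the reviewer's description: mean(area / circularity)
    divided by the number of aggregates. Illustrative only.
    """
    if not aggregates:
        return 0.0
    ratios = []
    for area, circularity in aggregates:
        if circularity <= 0:
            raise ValueError("circularity must be positive")
        ratios.append(area / circularity)
    mean_ratio = sum(ratios) / len(ratios)
    return mean_ratio / len(aggregates)


# Hypothetical wells (arbitrary area units, circularity between 0 and 1).
one_massive_aggregate = [(10000.0, 0.8)]
several_small_aggregates = [(500.0, 0.8)] * 5

score_massive = granuloma_score(one_massive_aggregate)
score_dispersed = granuloma_score(several_small_aggregates)
```

Under this assumed formula, a single large aggregate scores far higher than several small ones, which matches the stated intent of the score and also shows why it falls as the number of aggregates rises, the property the reviewer found counterintuitive.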
- In Figure 4 (macrophage apoptosis), a panel showing the percentage of dual Annexin V and 7-AAD positive cells should be included to provide the reader with the relative scope of ongoing apoptotic vs necrotic/secondary necrotic death in the model. If the data is readily available, including a control of uninfected PBMCs would also allow the reader to evaluate donor-dependent differences of in vitro cell death at baseline.
No significant differences were observed in the percentage of dual Annexin V- and 7-AAD-positive macrophages (necrosis/secondary necrosis) between the MTBC strains at this time-point. Nevertheless, we have disclosed this result in the revised manuscript as Figure 4—figure supplement 2.
- In Figures 5 and 6 (lymphocyte activation and soluble mediator secretion), panels showing unscaled data should be included. Panels depicting the unscaled immunoassay protein readings (pg/mL) by strain for CXCL9, granzyme B, and TNF with statistical analysis should be included in Figure 6.
As requested, unscaled lymphocyte activation and soluble mediator data have been included as Figure 5—figure supplement 2 and Figure 6—figure supplement 1, respectively (replacing former supplementary figures 5 and 7). In addition, the updated Figure 6G panel now depicts correlation analysis with the unscaled cytokine concentrations.
The DosR-regulon:
The authors hypothesize that differences in the prevalence of the dormancy metrics (acid-fastness or lipid inclusion prevalence) are due to strain-specific increases in expression of the DosR regulon within the model's hypoxic conditions (lines 107-114, 126-127). The claim that their model is equipped to evaluate dosR-dependent mycobacterial phenotypes was also previously proposed (Arbués et al., 2021) and should be tested. A comparison of the dosR-dependent gene expression of each strain in PBMC aggregates and broth culture by qRT-PCR would test this idea at a very basic level.
We agree. Actually, a similar request was made during the revision of our first in vitro granuloma study for which such qPCR data were generated and presented in Fig. 1 D (PMID: 32069329). In addition, the work of Kapoor et al., who originally developed the in vitro granuloma model also demonstrated the induction of most of the DosR regulated genes by qPCR (PMID: 23308269). We trust that the reviewer will agree that this does not need to be repeated.
The modern Beijing lineage strain L2C:
The authors claim (Line 101-102) that the results of Figure 1 "confirm the higher virulence propensities of strains from modern lineages". From the data presented, it appears that strain L2C (Modern-Beijing) dominates the modern vs ancestral and inter/intra-lineage phenotypes of replication, dormancy, and apoptosis. Are significant differences between modern and ancestral lineages or between strains simply a facet of the distinct profile of L2C? Do the statistical differences disappear when the L2C group is excluded?
Indeed, among the modern lineages’ isolates, L2C exhibits a hypervirulent profile in terms of bacterial replication. However, the difference between modern and ancestral strains remains statistically significant when L2C is excluded from the analysis (p = 0.002). That is also the case when we analyze the proportion of dormant bacteria. Exclusion of L2C strain results in a Kruskal-Wallis overall p = 0.005, and p = 0.0002 when we compare L2 vs. L3. Lastly, regarding the percentage of apoptotic macrophages, if we use L2B (instead of L2C) to compare, the difference is still significant vs. L1A (p = 0.008) although there is no longer a trend for L2A (p = 0.1).
"Dormancy":
Dormancy is definitively a non-replicative state, where bacterial growth is absent. The authors' findings and claims appear to be incompatible with that definition, which they acknowledge (Lines 130-135). The lack of correlation between growth and dormancy in their model is supported with reference to Figure 2C, a Spearman's analysis of dormancy ratio with growth rate (inclusive of all strains under consideration). The figure supports a model where "dormancy" and "growth rate" are disjunct but also appears to show high "dormancy" accompanying increasing "growth" in the L2C group. How are strains able to grow if they are in a non-replicative state? Are the "growth rate" assays actually measures of survival? Are there different rates of infectivity? Are the bacteria growing cellularly in the serum-rich ECM, etc. etc? We need to see the hard CFU and Nile Red, and Auramine-O data to contextualize these findings. Alternatively, could the accumulation of inclusions in the model not be a reliable dormancy metric (Fines et al., BioRxiv [Preprint], 2023, PMID: 37609245)?
We fully agree. The Nile red profiles are always relative and only depict the proportion of the population that has entered a dormant state. Nevertheless, dormancy can be dynamic and bacteria may swiftly resuscitate in that model. Furthermore, and as depicted in Figure 2—figure supplement 1, despite showing an increased tendency to enter a dormant-like state, a considerable population of lineage 2 bacilli still remains metabolically active and in a replicative state. The referred preprint is very interesting and we will follow it up closely.
Specificity of responses to PBMC aggregation:
The authors claim that their results "reveal a broad spectrum of granulomatous responses" (Line 73) but do not show any aggregation specificity of PBMC responses beyond the model's intrinsic metrics of area and circularity. To establish that their phenotypes such as lymphocyte activation, cytokine release, cell death, or mycobacterial acid-fastness/lipid inclusion prevalence, are aspects of the granulomatous response the authors could infect PBMCs from the same donors with the same strains and perform the same assays using established Mtb-PBMC models in which the cells do not aggregate. This would answer many important questions, for example, does the rate of macrophage infection account for variability in apoptosis percentage? Phagocytosis assay and quantification of stained intracellular mycobacteria within recently infected PBMCs could be conducted to determine if phenotypes are an aspect of granulomatous aggregation or due to strain-specific differences in cell-intrinsic macrophage immunity. It would also be very informative to know what percentage of PBMCs and mycobacteria are granuloma-bound in the ECM.
We are not aware of Mtb-PBMC models in which the cells do not aggregate. We previously compared PBMC infection models in the presence or absence of the collagen matrix and cells also spontaneously coalesced around infection foci (PMID: 34603299). Regarding the last point, the melting step of the collagen matrix requires enzymatic digestion and pipetting that dislocate the aggregates. Accordingly, we cannot distinguish the bacteria that would remain within the matrix from those replicating within cellular aggregates. However, we did resolve this question by demonstrating that the bacteria were not able to grow in the absence of cells in this culture condition (Supplementary material, PMID: 34603299).
Minor recommendations
- The term TNF-a should be replaced with TNF throughout the manuscript.
We acknowledge that the term TNF-a can be interchangeable with TNF. However, we chose to use the TNFα terminology to differentiate it from lymphotoxin α, which is also referred to as TNF-β.
- The authors cite studies conducted in murine and NHP models to support the claim that "understanding of immune protective traits in TB remains insufficient and yet dominated by data from mouse and non-human primate studies" (Lines 63-64) but ignore an abundance of data from other in vivo and in vitro models that have provided numerous valuable insights in the field of TB immunology. This line should be revised or omitted.
For us, the term “dominate” implies that these models are widely used, not that they are the only ones. Other models indeed provided additional relevant data. We are citing the lung-on-chip model of McKinney’s lab and the in vitro granuloma model of Elkington’s lab (line 66). We would be very happy to include more references upon further specifications even though we cannot build an extensive review here.
- The authors claim that their model "encompasses, with the exception of neutrophils, all immune cell types involved in TB" (Lines 67-68). To support this claim, they should provide additional references or data demonstrating that the PBMC aggregates include eosinophils, mast cells, dendritic cells, yolk-sac-derived alveolar macrophages, and Langhans giant cells.
With the aim of providing more accurate and detailed information regarding the cell types present in the model, the sentence has been reformulated as: “The model encompasses all PBMC-derived cell types involved in TB immune responses, but lacks granulocytes (i.e. neutrophils, eosinophils, basophils and mast cells)” (line 260). Notably, the presence of multinucleated giant cells was reported in Kapoor’s paper describing the in vitro granuloma model for the first time (PMID: 23308269).
- As an additional note, the title can be improved and made more broadly accessible by revising the use of the acronyms CXCL9, granzyme B, and TNF-α.
To render the title more broadly accessible, we propose to replace the listed acronyms with “soluble immune mediators”, but we remain open to more appropriate and specific suggestions.
Answers to the reviewers’ public comments
Reviewer #1:
First of all, we would like to thank the reviewers for their feedback and suggestions to improve our manuscript. To strengthen the findings of our study, we have performed and added results from IL-1β and CXCL9 blocking experiments evaluating the impact on the granulomatous response and bacterial load, respectively. In the revised version of the manuscript, we discuss the null effect of anti-CXCL9 antibody treatment on bacterial growth and the potential reason behind it, and we now report a negative effect on the magnitude of granuloma formation upon neutralization of IL-1β, which the correlation analysis had initially suggested.
Reviewer #2:
The revised version of our manuscript now incorporates all the points detailed in the private answers to the reviewer, including clarifications on the statistical tests performed, additional supplementary materials to transparently disclose the raw data behind the normalization approach, as well as flow cytometry data on the immune memory status of the blood donors. In addition, and as stated in the answer to reviewer #1, to test causal relationships between some host and pathogen traits, we have now performed IL-1β and CXCL9 blocking experiments and provide the data and their interpretation.
Reviewer #3:
We are thankful and concur with these constructive comments and insights. We have now consistently revisited the statistics in the figures to improve clarity and included new supplementary figures reporting the raw data that were missing in the initial version of the manuscript. In addition, and as mentioned in the answers to reviewers #1 and #2, we have now performed and added IL-1β and CXCL9 blocking experiments to test causal relationship between specific host and pathogen traits. In particular, we are now reporting a negative effect on the magnitude of granuloma formation upon neutralization of IL-1β that the correlation analysis had initially suggested.
More specifically, regarding the point that our method for bacterial collection calls into question whether all Mtb plated for CFU assay resided within granulomatous aggregates, we previously reported that Mtb growth strictly required the presence of human cells in our culture conditions (Supplementary material, Arbués et al, 2021, PMID: 34603299). In the presence of cells, our microscopy read-out does allow us to observe extra-cellular growth if infections are carried on beyond an 8-day limit, which we applied in the current study to exclude this particular caveat.
Concerning the apparently conflicting observation that those strains displaying an increased tendency to enter a dormant-like state are the ones exhibiting the highest replication rates, we would like to point out that a considerable population of bacilli still remains metabolically active and in a replicative state. For instance, and as depicted in Figure 2—figure supplement 1, despite showing an increased tendency to enter a dormant-like state, a considerable population of lineage 2 bacilli does remain metabolically active. Moreover, dormancy can be dynamic and bacteria may swiftly resuscitate.
Regarding the mentioned limitations of our study that we have discussed in the revised version of our manuscript, we fully concur that PBMC-based in vitro granuloma models lack tissue structure as well as some important stromal and immune cellular players. Nevertheless, we and others demonstrated the particular relevance of the 3-dimensional infection approach within a matrix of collagen and fibronectin by providing mechanistic insights into Mtb resuscitation previously associated with treatment with various immunomodulatory drugs (Arbués et al., 2020, PMID: 32069329; Tezera et al., 2020, PMID: 32091388).
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This manuscript describes the impact of modulating signaling by a key regulatory enzyme, Dual Leucine Zipper Kinase (DLK), on hippocampal neurons. The results are interesting and will be important for scientists interested in synapse formation, axon specification, and cell death. The methods and interpretation of the data are solid, but the study can be further strengthened with some additional studies and controls.
We greatly appreciate the thorough review and thoughtful suggestions from the reviewers and editors on our original manuscript. We provide point-by-point responses below. We added new studies on P10 mice and controls as suggested, and revised figures and text for clarification. The revised manuscript includes three new supplemental figures; major text revisions are copied under each response.
Reviewer #1 (Public Review):
Summary:
In this work, Ritchie and colleagues explore functional consequences of neuronal over-expression or deletion of the MAP3K DLK that their labs and others have strongly implicated in both axon degeneration, neuronal cell death, and axon regeneration. Their recent work in eLife (Li, 2021) showed that inducible over-expression of DLK (or the related LZK) induces neuronal death in the cerebellum. Here, they extend this work to show that inducible over-expression in Vglut1+ neurons also kills excitatory neurons in hippocampal CA1, but not CA3. They complement this very interesting finding with translatomics to quantify genes whose mRNAs are differentially translated in the context of DLK over-expression or knockout, the latter manipulation having little to no effect on the phenotypes measured. The authors note that several genes and pathways are differentially regulated according to whether DLK is over-expressed or knocked out. They note DLK-dependent changes in genes related to synaptic function and the cytoskeleton and ultimately relate this in cultured neurons to findings that DLK over-expression negatively impacts synapse number and changes microtubules and neurites, though with a less obvious correlation.
Strengths:
This work represents a conceptual advance in defining DLK-dependent changes in translation. Moreover, the finding that DLK may differentially impact neuronal death will become the basis for future studies exploring whether DLK contributes to differential neuronal susceptibility to death, which is a broadly important topic.
We thank the reviewer for the comments on the value of our work.
Weaknesses:
This seems like two works in parallel that the authors have not yet connected. First is that DLK affects the translation of an interesting set of genes, and second, that DLK(OE) kills some neurons, disrupts their synapses, and affects neurite growth in culture.
Specific questions:
(1) Is DLK effectively knocked out? The authors reference the floxxed allele in their 2016 work (PMID: 27511108), however, the methods of this paper say that the mouse will be characterized in a future publication. Has this ever been published? The major concern is that here the authors show that Cre-mediated deletion results in a smaller molecular weight protein and the maintenance of mRNA levels.
We apologize for the out-of-date citation of the DLK(cKO)<sup>fl/fl</sup> mice. The DLK(cKO)<sup>fl/fl</sup> mice have been published in (Li et al., 2021; Saikia et al., 2022); excision of the floxed exon was verified using several Cre drivers (Pv-Cre, AAV-Cre, and VGlut1-Cre in this study). The floxed exon contains the initiation ATG and 148 amino acids. By western blot analysis using antibodies against C-terminal peptides of DLK on cerebellar extracts (in Li et al., 2021) and hippocampal extracts (this study), the full-length DLK protein was significantly reduced (Fig 1A-B); DLK is expressed in other hippocampal cells, in addition to glutamatergic neurons, which explains the remaining full-length DLK detected.
Our Ribo-seq of VGlut1-Cre; DLK(cKO)<sup>fl/fl</sup> detected remaining Dlk mRNAs lacking the floxed exon (Fig.S1C), which contain several candidate ATGs at amino acid 223 and beyond (Fig.S1C1). We detected a very faint band for smaller molecular weight proteins on western blots only when the membrane was imaged with a 5X longer exposure using Pico PLUS Chemiluminescent Substrate (Thermo Scientific, 34580) and a Licor Odyssey XF Imager (revised Fig. S1B). This smaller molecular weight protein might be produced using any of the candidate ATGs, but would represent an N-terminal truncated DLK protein lacking the ATP binding site and ~1/4 of the kinase domain, i.e. not a functional kinase.
The revised manuscript has an updated citation for DLK(cKO)<sup>fl/fl</sup>. Revised Fig.S1B includes images of a western blot under normal exposure vs longer exposure using anti-DLK antibodies. New Fig.S1C1 shows the effects of the floxed exon on DLK.
(2) Why does DLK(OE) not kill CA3 neurons? The phenomenon is clear but there is no link to gene expression changes. In fact, the highlighted transcript in this work, Stmn4, changes in a DLK-dependent manner in CA3.
We agree that this is a very interesting question not answered by our gene expression analysis. While we verified that Stmn4 expression levels correlate with DLK levels, we do not think that increased Stmn4 per se in DLK(iOE) is a major factor accounting for CA1 death vs CA3 survival. Several published studies have also reported regulation of Stmn4 mRNAs in other cell types, in the contexts of cell death (Watkins et al., 2013; Le Pichon et al., 2017) and axon regeneration and cytoskeleton disruption (Asghari Adib et al., 2024; DeVault et al., 2024; Hu et al., 2019; Shin et al., 2019). As Stmns have significant expression and functional redundancy, conventional knockdown or overexpression of an individual Stmn generally does not lead to detectable effects on cellular function. As CA3 neurons are widely known for their dense connections and show resilience to NMDA-mediated neurotoxicity (Sammons et al., 2024; Vornov et al., 1991), we speculate that the differential vulnerability of CA1 and CA3 under DLK(iOE) reflects both intrinsic properties, such as gene expression, and circuit connectivity.
In the revised manuscript, we have included the following statement on pg 18:
‘While our data does not pinpoint the molecular changes explaining why CA3 would show less vulnerability to increased DLK, we may speculate that DLK(iOE) induced signal transduction amplification may differ in CA1 vs CA3. CA1 genes appear to be more strongly regulated than CA3 genes, consistent with our observation that increased c-Jun expression in CA1 is greater than that in CA3. Other parallel molecular factors may also contribute to resilience of CA3 neurons to DLK(iOE), such as HSP70 chaperones, different JNK isoforms, and phosphatases, some of which showed differential expression in our RiboTag analysis of DLK(iOE) vs WT (shown in File S2. WT vs DLK(iOE) DEGs). Together with other genes that show dependency on DLK, the DLK and Jun regulatory network contributes to the regional differences in hippocampal neuronal vulnerability under pathological conditions.’
Further we state in ‘Limitation of our study’ on pg 20:
‘Our analysis also does not directly address why CA3 neurons are less vulnerable to increased DLK expression. Future studies using cell-type specific RiboTag profiling and other methods at a refined time window will be required to address how DLK dependent signaling interacts with other networks underlying hippocampal regional neuron vulnerability to pathological insults.’
We hope our data will stimulate continued interest in testable hypotheses for future studies.
(3) Why are whole hippocampi analyzed to IP ribosome-associated mRNAs? The authors nicely show a differential effect of DLK on CA1 vs CA3, but then - at least according to their methods - lyse whole hippocampi to perform IP/sequencing. Their data are therefore a mix of cells where DLK does and does not change cell death. The key issue is whether DLK does/does not have an effect based on the expression changes it drives.
At the time of planning the Ribo-Tag experiment several years ago, we focused on the hippocampal glutamatergic neurons. Due to technical difficulty in micro-dissecting individual hippocampal regions from this early timepoint, we opted to use whole hippocampi to isolate ribosome-associated mRNAs. We agree with the reviewer that it is important to sort out DLK-dependent general gene expression changes vs those specific to a particular cell type where DLK impacts its survival. With emerging CA1, CA3 and other cell-type specific Cre drivers and advanced RNAseq technology, we hope that our work will stimulate broad interest in these questions in future studies.
In the revised manuscript, we have included a new analysis comparing our Vglut1-RiboTag profiling (P15) with CamK2-RiboTag (for CA1) and Grik4-RiboTag (for CA3) (P42) published in Traunmüller et al., 2023 (GSE209870). We find that >80% of the top ranked genes in their CamK2-RiboTag (for CA1) and Grik4-RiboTag (for CA3) were detected in our VGlut1-RiboTag (revised methods and Supplemental Excel File S3). CA1-enriched genes tended to be expressed at higher levels in DLK(cKO), compared to control, whereas CA3-enriched genes showed a less significant correlation with DLK expression levels. Additionally, many genes known to specify CA1 fate do not show significant downregulation in DLK(iOE). This analysis, along with other data in our manuscript, is consistent with the idea that DLK does not regulate neuronal fate.
In the revised manuscript, we presented this additional analysis in Fig. S6K-L, and expanded text description on page 9:
‘Additionally, we compared our Vglut1-RiboTag datasets with CamK2-RiboTag and Grik4-RiboTag datasets from 6-week-old wild type mice reported by (Traunmüller et al., 2023; GSE209870). We defined a list of genes enriched in CamK2-expressing CA1 neurons relative to Grik4-expressing CA3 neurons (CA1 genes), and those enriched in Grik4-expressing CA3 neurons (CA3 genes) (File S3). When compared with the entire list of Vglut1-RiboTag profiling in our control and DLK(cKO), we found CA1 genes tended to be expressed more in DLK(cKO) mice, compared to control (Fig.S6K), while CA3 genes showed a slight enrichment in control though the trend was less significant, and were less clustered towards one genotype (Fig.S6L). Moreover, many CA1 genes related to cell-type specification, such as FoxP1, Satb2, Wfs1, Gpr161, Adcy8, Ndst3, Chrna5, Ldb2, Ptpru, and Ntm, did not show significant downregulation when DLK was overexpressed. These observations imply that DLK likely specifically down-regulates CA1 genes both under normal conditions and when overexpressed, with a stronger effect on CA1 genes, compared to CA3 genes. Overall, the informatic analysis suggests that decreased expression of CA1 enriched genes may contribute to CA1 neuron vulnerability to elevated DLK, although it is also possible that the observed down-regulation of these genes is a secondary effect associated with CA1 neuron degeneration’.
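The overlap check described above (what fraction of the top ranked genes in a reference dataset are also detected in another dataset) can be sketched as follows. This is only an illustration under stated assumptions: the function name `top_n_overlap`, the placeholder gene identifiers, and the case-insensitive matching are hypothetical and are not taken from the authors' methods or data.

```python
def top_n_overlap(reference_ranked, detected_genes, n):
    """Return the fraction of the top-n reference genes found in detected_genes.

    reference_ranked: gene symbols ordered from most to least enriched.
    detected_genes:   any iterable of gene symbols detected in the second dataset.
    Matching ignores letter case (an assumption made for this sketch).
    """
    top = reference_ranked[:n]
    if not top:
        raise ValueError("reference list is empty or n is 0")
    detected = {gene.upper() for gene in detected_genes}
    hits = sum(1 for gene in top if gene.upper() in detected)
    return hits / len(top)


# Hypothetical placeholder identifiers, not real results.
reference_ranked = ["geneA", "geneB", "geneC", "geneD", "geneE"]
detected_genes = {"GeneA", "geneB", "geneC", "geneE", "geneX", "geneY"}

fraction = top_n_overlap(reference_ranked, detected_genes, n=5)
print(f"{fraction:.0%} of the top-ranked reference genes were detected")
```

A fraction above a chosen threshold (the response quotes >80%) would indicate that the two profiles capture largely the same highly ranked genes, although it says nothing about the direction or size of expression changes.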
(4) Is the subtle decrease in synapse number (Bassoon/Homer co-loc.) in the DLK (OE) simply a function of neurons (and their synapses, presumably) having died? At the P15 time point that the authors choose because cell death is minimal, there is still a ~25% reduction in CA1 thickness (Figure 2B), which is larger than the ~15% change in synapses (Figure 5H) they describe.
We thank the reviewer for the question. To address this, we have analyzed synapses in the CA1 region at P10 in DLK(iOE) mice when there was no detectable loss of neurons. At P10, we did not detect significant changes in Bassoon, Homer1, or colocalized puncta in CA1 (Fig.S11A-F). In P15 DLK(iOE) mice, Homer1 puncta were slightly smaller (Fig.5L) and showed a significant decrease in CA1 SR (Fig.5I).
In the revised manuscript we have also redone our statistical analysis of synapses, using mice rather than ROIs (revised Fig. 5), as recommended by R3. We also analyzed synapses in CA3, and found no significant differences in P10 or P15 (Fig.S12). We would interpret the data to mean that the effects of DLK(OE) on synapses in CA1 may represent an early step in neuronal death. We hope that future studies will shed clarity on this question.
Reviewer #2 (Public Review):
This manuscript describes the impact of deleting or enhancing the expression of the neuronal-specific kinase DLK in glutamatergic hippocampal neurons using clever genetic strategies, which demonstrates that DLK deletion had minimal effects while overexpression resulted in neurodegeneration in vivo. To determine the molecular mechanisms underlying this effect, ribotag mice were used to determine changes in active translation which identified Jun and STMN4 as DLK-dependent genes that may contribute to this effect. Finally, experiments in cultured neurons were conducted to better understand the in vivo effects. These experiments demonstrated that DLK overexpression resulted in morphological and synaptic abnormalities.
Strengths:
This study provides interesting new insights into the role of DLK in the normal function of hippocampal neurons. Specifically, the study identifies:
(1) CA1 vs CA3 hippocampal neurons have differing sensitivity to increased DLK signaling.
(2) DLK-dependent signaling in these neurons is similar to but distinct from the downstream factors identified in other cell types, highlighted by the identification of STMN4 as a downstream signal.
(3) DLK overexpression in hippocampal neurons results in signaling that is similar to that induced by neuronal injury.
The study also provides confirmatory evidence that supports previously published work through orthogonal methods, which adds additional confidence to our understanding of DLK signaling in neurons. Taken together, this is a useful addition to our understanding of DLK function.
We thank the reviewer for careful reading and positive comments.
Weaknesses:
There are a few weaknesses that limit the impact of this manuscript, most of which are pointed out by the authors in the discussion. Namely:
(1) It is difficult to distinguish whether the changes in the translatome identified by the authors are DLK-dependent transcriptional changes, DLK-dependent post-transcriptional changes or secondary gene expression changes that occur as a result of the neurodegeneration that occurs in vivo. Additional expression analysis at earlier time points could be one method to address this concern.
We appreciate the reviewer’s comment, and have performed new analysis on c-Jun and p-c-Jun levels in CA1, CA3, and DG in P10 DLK(OE) mice. Our data suggest that in CA3 elevations in p-c-Jun and c-Jun occur separately from cell death in a DLK-dependent manner, though the high elevation of both p-c-Jun and c-Jun in CA1 correlates with cell death.
The data is presented in revised Fig.S7A,B, and described in revised text on pg 9-10:
‘In control mice, glutamatergic neurons in CA1 had low but detectable c-Jun immunostaining at P10 and P15, but reduced intensity at P60; those in CA3 showed an overall low level of c-Jun immunostaining at P10, P15 and P60; and those in DG showed a low level of c-Jun immunostaining at P10 and P15, and an increased intensity at P60 (Fig.S7A,C,E). In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice at P10 when no discernable neuron degeneration was seen in any regions of hippocampus, only CA3 neurons showed a significant increase of immunostaining intensity of c-Jun, compared to control (Fig.S7A). In P15 mice, we observed further increased immunostaining intensity of c-Jun in CA1, CA3, and DG, with the strongest increase (~4-fold) in CA1, compared to age-matched control mice (Fig.S7C). The overall increased c-Jun staining is consistent with RiboTag analysis.’
Also, on pg.10:
In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice, we observed increased p-c-Jun positive nuclei in CA1 at P10, and strong increase in CA1 (~10-fold), CA3 (~6-fold), and DG (~8-fold) at P15 (Fig.S7B,D).
(2) Related to the above, it is difficult to conclusively determine from the current data whether the changes in synaptic proteins observed in vivo are a secondary result of neuronal degeneration or a primary impact on synapse formation. The in vitro studies suggest this has the potential to be a primary effect, though the difference in experimental paradigm makes it impossible to determine whether the same mechanisms are present in vitro and in vivo.
We appreciate the comment, which is related to R1 point 4. We have performed further analysis and revised the text on pg.12 with the following text:
‘To assess effects of DLK overexpression on synapses, we immunostained hippocampal sections from both P10 and P15, with age-matched littermate controls. Quantification of Bassoon and Homer1 immunostaining revealed no significant differences in CA1 SR and CA3 SR and SL in P10 mice of Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> and control (Fig.S11A-F, S12A-J). In P15, Bassoon density and size in CA1 SR were comparable in both mice (Fig 5G, H, K), while Homer1 density and size were reduced in DLK(iOE) (Fig.5G,I, L). Overall synapse number in CA1 SR was similar in DLK(iOE) and control mice (Fig.5J). Similar analysis on CA3 SR and SL detected no significant difference from control (Fig.S12M-V).’
We would interpret the data to mean that the effects of DLK(OE) on synapses in CA1 may represent an early step in neuronal death. We hope that future studies will shed clarity on this question.
Additionally, to address whether the same mechanisms are present in vitro, we have performed further analysis on cultured hippocampal neurons. As described in the Methods, we made hippocampal neuron cultures from P1 pups of the following crosses:
For control: Vglut1<sup>Cre/+</sup> X Rosa26<sup>tdT/+</sup>
For DLKcKO: Vglut1<sup>Cre/+</sup>;DLK(cKO)<sup>fl/fl</sup> X Vglut1<sup>Cre/+</sup>;DLK(cKO)<sup>fl/fl</sup>;Rosa26<sup>tdT/+</sup>
For DLKiOE: H11-DLK<sup>iOE/iOE</sup> X Vglut1<sup>Cre/+</sup>;Rosa26<sup>tdT/+</sup>
Dissociated cells from a given litter were pooled into the same culture. Because there were different proportions of neurons with our genotype of interest in each culture, it is not simple to know whether DLK was causing significant cell death.
On pg 13, we stated our observation:
‘We did not notice an obvious effect of DLK(iOE) or DLK(cKO) on neuron density in cultures at DIV2. To assess neuronal type distribution in our cultures, we immunostained DIV14 neurons with antibodies for Satb2, as a CA1 marker (Nielsen et al., 2010), and Prox1, as a marker of DG neurons (Iwano et al., 2012). We did not observe significant differences in the proportion of cells labeled with each marker in DLK(cKO) or DLK(iOE) cultures (Fig.S13E). These data are consistent with the idea that DLK signaling does not have a strong role in neuron-type specification both in vivo and in vitro’.
(3) The phenotype of DLK cKO mice is very subtle (consistent with previous reports) and while the outcome of increased DLK levels is interesting, the relevance to physiological DLK signaling is less clear. What does seem possible is that increased DLK may phenocopy other neuronal injuries but there are no real comparisons to directly address this in the manuscript. It would be helpful for the authors to provide this analysis as well as a table with all of the translational changes along with fold changes.
Thank you for the suggestion. The fold changes of genes showing significantly altered expression in DLK(cKO) and DLK(iOE) are provided in the excel files (Supplementary excel File S1 WT vs DLK(cKO) DEGs and File S2. WT vs DLK(iOE) DEGs, highlighted columns B and F).
On pg 6, we revised the text as follows to include a comparison of DLK levels under other physiological conditions with those in our mice:
‘Several studies have reported that DLK protein levels increase under a variety of conditions, including optic nerve crush (Watkins et al., 2013), NGF withdrawal (~2 fold) (Huntwork-Rodriguez et al., 2013; Larhammar et al., 2017), and sciatic nerve injury (Larhammar et al., 2017). Induced human neurons show increased DLK abundance about ~4 fold in response to ApoE4 treatment (Huang et al., 2019). Increased expression of DLK can lead to its activation through dimerization and autophosphorylation (Nihalani et al., 2000)’.
And,
‘Additional analysis at the mRNA level (supplemental excel, File S2. WT vs DLK(iOE) DEGs) and at the protein level (Fig.S8E) suggest that the increase in DLK abundance was around 3 times the control level. The localization patterns of DLK protein appeared to vary depending on region of hippocampus and age of animals in both control and Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice (Fig.S3C).’
In Discussion, we state (pg. 16): ‘The levels of DLK in our DLK(iOE) mice model appear comparable to those reported under traumatic injury and chronic stress.’
(4) For the in vivo experiments, it is unclear whether multiple sections from each animal were quantified for each condition. More information here would be helpful and it is important that any quantification takes multiple sections from each animal into account to account for natural variability.
We apologize that this was unclear in the original manuscript.
In the revised methods, under Confocal imaging and quantification (pg 33), we stated: “For brain tissue, three sections per mouse were imaged with a minimum of three mice per genotype for data analysis.”
In revised figure legends, we made it clear that multiple sections from each animal have been used for quantification in all instances, i.e. “Each dot represents averaged thickness from 3 sections per mouse, N≥4 mice/genotype per timepoint.”
In Fig.1F-H: “Each dot represents averaged intensity from 3 sections per mouse”
In Fig.S3B “Data points represent individual mice, averages taken across 3 sections per mouse”
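The quantification scheme described in these legends (several sections per animal collapsed into one value per mouse, so that each mouse rather than each section or ROI is the unit of analysis) can be sketched as below. This is a minimal illustration, not the authors' pipeline: the record layout, the function name `per_mouse_means`, and all numeric values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def per_mouse_means(records):
    """Collapse section-level measurements into one mean value per mouse.

    records: iterable of (genotype, mouse_id, value) tuples, one per section.
    Returns {genotype: {mouse_id: mean_value}} so that downstream statistics
    use N = number of mice, not number of sections.
    """
    by_mouse = defaultdict(lambda: defaultdict(list))
    for genotype, mouse_id, value in records:
        by_mouse[genotype][mouse_id].append(value)
    return {
        genotype: {mouse_id: mean(values) for mouse_id, values in mice.items()}
        for genotype, mice in by_mouse.items()
    }


# Hypothetical measurements: three sections per mouse (arbitrary units).
records = [
    ("control", "m1", 10.0), ("control", "m1", 12.0), ("control", "m1", 11.0),
    ("control", "m2", 9.0), ("control", "m2", 9.0), ("control", "m2", 12.0),
    ("iOE", "m3", 6.0), ("iOE", "m3", 8.0), ("iOE", "m3", 7.0),
]

summary = per_mouse_means(records)
for genotype, mice in summary.items():
    print(genotype, "N mice =", len(mice), "per-mouse means =", mice)
```

The per-mouse means would then feed into whichever statistical test is appropriate; averaging first avoids treating sections from the same animal as independent samples.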
Reviewer #3 (Public Review):
Dr Jin and colleagues revisit DLK and its established multifactorial roles in neuronal development, axonal injury, and neurodegeneration. The ambitious aim here is to understand the DLK-dependent gene network in the brain and, to pursue this, they explore the role of DLK in hippocampal glutamatergic neurons using conditional knockout and induced overexpression mice. They produce evidence that dorsal CA1 and dentate gyrus neurons are vulnerable to elevated expression of DLK, while CA3 neurons appear unaffected. Then they identify the DLK-dependent translatome featured by conserved molecular signatures and cell-type specificity. Their evidence suggests that increased DLK signaling is associated with possible STMN4 disruptions to microtubules, among else. They also produce evidence on cultured hippocampal neurons showing that expression levels of DLK are associated with changes in neurite outgrowth, axon specification, and synapse formation. They posit that downstream translational events related to DLK signaling in hippocampal glutamatergic neurons are a generalizable paradigm for understanding neurodegenerative diseases.
Strengths
This is an interesting paper based on a lot of work and a high number of diverse experiments that point to the pervasive roles of DLK in the development of select glutamatergic hippocampal neurons. One should applaud the authors for their work in constructing sophisticated molecular cre-lox tools and their expert Ribotag analysis, as well as technical skill and scholarly treatment of the literature. I am somewhat more skeptical of interpretations and conclusions on spatial anatomical selectivity without stereological approaches and also going directly from (extremely complex) Ribotag profiling patterns to relevance based on immunohistochemistry and no additional interventions to manipulate (e.g. by knocking down or blocking) their top Ribotag profile hits. Also, it seems to this reviewer that major developmental claims in the paper are based on gene translational profiling dependent on DLK expression, not DLK activation, despite some evidence in the paper that there is a correlation between the two. Therefore, observed patterns and correlations may or may not be physiologically or pathologically relevant. Generalizability to neurodegenerative diseases is an overreach not justified by the scope, approach, and findings of the paper.
We thank the reviewer for the encouraging and constructive comments on the manuscript.
Weaknesses and Suggestions:
The authors state that the rationale for the translatomic studies is "to gain molecular understanding of gene expression associated with DLK in glutamatergic neurons" and to characterize the "DLK-dependent molecular and cellular network". However, a problem with the experimental design is the selection of an anatomical region at a time point featured by active neurodegeneration. Therefore, it is not straightforward that the differentially expressed genes or pathways caused by DLK overexpression changes could be due to processes related to neurodegeneration. Indeed, the authors find enrichment of signals related to pathways involved in extracellular matrix organization, apoptosis, unfolded protein responses, the complement cascade, DNA damage responses, and depletion of signals related to mitochondrial electron transport, etc., all of which could be the consequence of neurodegeneration regardless of cause. A more appropriate design to discover DLK-dependent pathways might be to look at a region and/or a time point that is not confounded by neurodegeneration.
We appreciate the reviewer’s comment. We included our thoughts in ‘Limitation of the study’ (pg 20):
‘Future studies using cell-type specific RiboTag profiling and other methods at a refined time window will be required to address how DLK dependent signaling interacts with other networks underlying hippocampal regional neuron vulnerability to pathological insults.’
In a related vein, the authors ask "if the differentially expressed genes associated with DLK(iOE) might show correlation to neuronal vulnerability" and, to answer this question, they select the set of differentially expressed genes after DLK overexpression and assess their expression patterns in various regions under normal conditions. It looks to me that this selection is already confounded by neurodegeneration which could be the cause for their downregulation. Therefore, such gene profiles may not be directly linked to neuronal vulnerability. A similar issue also relates to the conclusion that "...the enrichment of DLK-dependent translation of genes in CA1 suggests that the decreased expression of these genes may contribute to CA1 neuron vulnerability to elevated DLK".
We agree with the reviewer’s concern that it is difficult to separate neurodegenerative consequences from changes caused by DLK solely based on our translatomics studies on P15 DLK(iOE) mice. As noted in our responses to reviewer 1 (point 4) and reviewer 2 (point 1), we have included a new analysis of P10 mice (Fig.S7A,B), when neurons did not show detectable signs of degeneration.
We consider that several lines of evidence support the idea that some differentially expressed genes in DLK(iOE) vs control are likely specific to increased DLK signaling.
First, the genes identified in DLK(iOE) vs control represent a small set of genes (260), which is comparable to other DLK dependent datasets (Asghari Adib et al., 2024) but shows cell-type specificity.
Second, our analysis using rank-rank hypergeometric overlap (RRHO) detects a significant correlation between upregulated genes from DLK(iOE) vs downregulated genes in DLK(cKO), and vice versa, suggesting that expression of a similar set of genes depends on DLK (Fig.3C, S6C-E). Consistently, GO term analysis using the list of genes coordinately regulated by DLK, derived from our RRHO analysis, identifies GO terms for up- and downregulated genes similar to those obtained using DLK(iOE)-RiboTag data alone. SynGO analysis of DLK(iOE) regulated genes and DLK(cKO) regulated genes also identified similar synaptic processes among the significantly regulated genes (Fig.3F and S6J).
Third, we performed additional analysis comparing our Vglut1-RiboTag dataset with CamK2-RiboTag and Grik4-RiboTag datasets from 6-week-old wild type mice reported by (Traunmüller et al., 2023; GSE209870). We observed >80% overlap among the top ranked genes (revised Methods). We described this analysis on pg 9 and Fig. S6K-L (and Supplemental Excel File S3):
‘Additionally, we compared our Vglut1-RiboTag datasets with CamK2-RiboTag and Grik4-RiboTag datasets from 6-week-old wild type mice reported by (Traunmüller et al., 2023; GSE209870). We defined a list of genes enriched in CamK2-expressing CA1 neurons relative to Grik4-expressing CA3 neurons (CA1 genes), and those enriched in Grik4-expressing CA3 neurons (CA3 genes) (File S3). When compared with the entire list of Vglut1-RiboTag profiling in our control and DLK(cKO), we found CA1 genes tended to be expressed more in DLK(cKO) mice, compared to control (Fig.S6K), while CA3 genes showed a slight enrichment in control though the trend was less significant, and were less clustered towards one genotype (Fig.S6L). Moreover, many CA1 genes related to cell-type specification, such as FoxP1, Satb2, Wfs1, Gpr161, Adcy8, Ndst3, Chrna5, Ldb2, Ptpru, and Ntm, did not show significant downregulation when DLK was overexpressed. These observations imply that DLK likely specifically down-regulates CA1 genes both under normal conditions and when overexpressed, with a stronger effect on CA1 genes, compared to CA3 genes. Overall, the informatic analysis suggests that decreased expression of CA1 enriched genes may contribute to CA1 neuron vulnerability to elevated DLK, although it is also possible that the observed down-regulation of these genes is a secondary effect associated with CA1 neuron degeneration.’
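For readers unfamiliar with RRHO, the overlap test behind the second point above can be sketched as follows. This is a simplified, one-sided illustration under stated assumptions, not the pipeline used for Fig.3C: the function names `hypergeom_overlap_pvalue` and `rrho_grid` are hypothetical, both ranked lists are assumed to contain the same genes, and published RRHO implementations additionally report signed, multiple-testing-corrected values.

```python
from math import comb


def hypergeom_overlap_pvalue(list_a, list_b, universe_size):
    """P(overlap >= observed) if two gene sets were drawn at random from one universe."""
    a, b = set(list_a), set(list_b)
    k = len(a & b)                      # observed overlap
    n_a, n_b = len(a), len(b)
    total = comb(universe_size, n_b)    # all ways to draw set b
    # Upper tail of the hypergeometric distribution, from the observed overlap upward.
    tail = sum(
        comb(n_a, i) * comb(universe_size - n_a, n_b - i)
        for i in range(k, min(n_a, n_b) + 1)
    )
    return tail / total


def rrho_grid(ranked_a, ranked_b, step):
    """Overlap p-value for every pair of top-of-list cutoffs (assumes the same genes in both lists)."""
    universe = len(ranked_a)
    cutoffs = range(step, universe + 1, step)
    return {
        (i, j): hypergeom_overlap_pvalue(ranked_a[:i], ranked_b[:j], universe)
        for i in cutoffs
        for j in cutoffs
    }
```

To probe the opposite-direction relationship described above (up in one condition, down in the other), one of the two ranked lists would be reversed before calling `rrho_grid`.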
To understand the role and relevance of the DLK overexpression model, there should be a discussion of to what extent it corresponds to endogenous levels of DLK expression or DLK-MAPK pathway activation under baseline or pathological conditions.
We appreciate the suggestion, which is similar to R2 point 3. We have revised the text and discussion to include how DLK levels may be altered in other physiological conditions vs our mice.
Pg. 6: ‘Several studies have reported that DLK protein levels increase under a variety of conditions, including optic nerve crush (Watkins et al., 2013), NGF withdrawal (~2 fold) (Huntwork-Rodriguez et al., 2013; Larhammar et al., 2017), and sciatic nerve injury (Larhammar et al., 2017). Induced human neurons show increased DLK abundance about ~4 fold in response to ApoE4 treatment (Huang et al., 2019). Increased expression of DLK can lead to its activation through dimerization and autophosphorylation (Nihalani et al., 2000)’.
And,
‘Additional analysis at the mRNA level (supplemental excel, File S2. WT vs DLK(iOE) DEGs) and at the protein level (Fig.S8E) suggest that the increase in DLK abundance was around 3 times the control level. The localization patterns of DLK protein appeared to vary depending on region of hippocampus and age of animals in both control and Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice (Fig.S3C).’
In Discussion (pg. 16): ‘The levels of DLK in our DLK(iOE) mice model appear comparable to those reported under traumatic injury and chronic stress.’
The authors posit that "dorsal CA1 neurons are vulnerable to elevated DLK expression, while neurons in CA3 appear largely resistant to DLK overexpression". This statement assumes that DLK expression levels start at a similar baseline among regions. Do the authors have any such data? Ideally, they should show whether DLK expression and p-c-Jun (as a marker of downstream DLK signaling) are the same or different across regions in both WT and overexpression mice. For example, what are the DLK/p-c-Jun expression levels in regions other than CA1 in Supplementary Figures 2-3 and how do they compare with each other? Normalization to baseline for each region does not allow such a comparison. Also, in Supplementary Figure 6, analyses and comparisons between regions are done at a time point when degeneration has already started. Ideally, these should be done at P10.
We thank the reviewer for raising these points. In the revised manuscript we have included protein expression analysis of DLK (Fig S3), c-Jun, and p-c-Jun at P10 (Fig. S7).
We provide quantification of DLK immunostaining intensity in CA1 and CA3 in Fig.S3D,E and find roughly comparable levels between regions.
Pg. 6: ‘Additional analysis at the mRNA level (supplemental excel, File S2. WT vs DLK(iOE) DEGs) and at the protein level (Fig.S8E) suggest that the increase in DLK abundance was around 3 times the control level. The localization patterns of DLK protein appeared to vary depending on region of hippocampus and age of animals in both control and Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice (Fig.S3C).’
We provided our quantifications without normalization to baseline in each region for c-Jun and p-c-Jun, and revised the text accordingly:
Pg. 9-10: ‘In control mice, glutamatergic neurons in CA1 had low but detectable c-Jun immunostaining at P10 and P15, but reduced intensity at P60; those in CA3 showed an overall low level of c-Jun immunostaining at P10, P15 and P60; and those in DG showed a low level of c-Jun immunostaining at P10 and P15, and an increased intensity at P60 (Fig.S7A,C,E). In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice at P10 when no discernable neuron degeneration was seen in any regions of hippocampus, only CA3 neurons showed a significant increase of immunostaining intensity of c-Jun, compared to control (Fig.S7A). In P15 mice, we observed further increased immunostaining intensity of c-Jun in CA1, CA3, and DG, with the strongest increase (~4-fold) in CA1, compared to age-matched control mice (Fig.S7C). The overall increased c-Jun staining is consistent with RiboTag analysis’.
Pg. 10: ‘In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice, we observed increased p-c-Jun positive nuclei in CA1 at P10, and strong increase in CA1 (~10-fold), CA3 (~6-fold), and DG (~8-fold) at P15 (Fig.S7B,D).’
Illustration of proposed selective changes in hippocampal sector volume needs to be very carefully prepared in view of the substantial claims on selective vulnerability. In 2A under P15 and especially P60, it is difficult to see the difference - this needs lower magnification and a lot of care that anteroposterior levels are identical because hippocampal sector anatomy and volumes of sectors vary from level to level. One wonders if the cortex shrinks, too. This is important.
Thank you for raising this point. We have provided images showing the anteroposterior level in Fig.S2A-C. We also noticed that the cortex in DLK(iOE) mice becomes thinner, along with expansion of the ventricles in some animals at later timepoints (Fig.S2C).
One cannot be sure that there is selective death of hippocampal sectors with DLK overexpression versus, say, rearrangement of hippocampal architecture. One may need stereological analysis, otherwise this substantial claim appears overinterpreted.
We appreciate the comment.
In the revised manuscript, we included a new supplemental figure (Fig. S2) showing lower magnification images of coronal sections, and used cautionary wording, such as ‘CA3 is less vulnerable, compared to CA1’, to minimize the impression of over-interpretation. By NeuN staining at P10, P15, and P60, we did not observe a detectable difference in overall hippocampal architecture, apart from the noted cell death in CA1 and DG and the associated thinning of each of the layers. At 46 weeks, some animals showed differences in the overall shape of dorsal hippocampus, though this appeared to reflect a disproportionately large CA3 region compared to other regions (Fig S2). Increased GFAP staining (Fig.S5A-C) was detected in CA1 but not in CA3, and microglia by IBA1 staining (Fig.S5E) also displayed less reactivity in CA3, compared to CA1. Thus, based on NeuN staining, GFAP staining, IBA1 staining and analysis of the differentially regulated genes, we infer that the effect of DLK(iOE) in CA1 differs from its effect in CA3.
Is the GFAP excess reflective of neuroinflammation? What do microglial markers show? The presence of neuroinflammation does not bode well with apoptosis. Speaking of which, TUNEL in one cell in Supplementary Figure 4E is not strong evidence of a more widespread apoptotic event in CA1.
We have included staining data for the microglia marker IBA1. Both GFAP and IBA1 showed evidence of reactivity, particularly in the CA1 region (Fig.S5A-E), supporting the differential vulnerability of different regions, though whether cell death is primarily due to apoptosis is unclear.
We agree that our data of sparse TUNEL staining at P15 (Fig S5F,G) do not rule out that other mechanisms of cell death also occur. We have included this in our limitations (pg.20): “While we find evidence for apoptosis, other forms of cell death may also occur.”
In several places in the paper (as illustrated in Figure 4B, Supplementary Figure 2B, etc.): the unit of biological observation in animal models is typically not a cell, but an organism, in which averaged measures are generated. This is a significant methodological problem because it is not easy to sample neurons without involving stereological methods. With the approach taken here, there is a risk that significance may be overblown.
We appreciate the reviewer’s point. We quantified RNAscope signal in the same region across animals, blinded to genotype when possible. We revised the graphs to show mean values for individual mice in Fig.4B, 4C, and Fig.S3B (previously Fig.S2B).
Other Comments and Questions:
Supplementary Figure 9: The authors state that data points are shown for individual ROIs - ideally, they should also show averages for biological replicates. Can the authors confirm that statistical analyses are based on biological replicates (mice) and not ROIs?
We have revised the graphs to show averages from individual mice in Fig.5B-D, Fig.5E-F (previously Fig.S9G-I), Fig.5H-J, Fig.5K-L (previously Fig.S9J-L), and Fig.S10B,C,E,F (previously Fig.S9B,C,E,F). The statistical analyses are based on biological replicates (mice), not ROIs.
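The distinction between ROIs and biological replicates can be illustrated with a minimal sketch. It is not the authors' analysis code; the data values, the animal identifiers, and the function name `per_animal_means` are hypothetical. The idea is simply to collapse all ROI measurements from one mouse into a single mean before any statistical test, so that the sample size equals the number of animals.

```python
from collections import defaultdict
from statistics import mean


def per_animal_means(roi_measurements):
    """Collapse (animal_id, value) ROI measurements to one mean value per animal."""
    by_animal = defaultdict(list)
    for animal_id, value in roi_measurements:
        by_animal[animal_id].append(value)
    return {animal_id: mean(values) for animal_id, values in by_animal.items()}


# Hypothetical data: three ROIs from each of two mice.
rois = [
    ("m1", 1.0), ("m1", 2.0), ("m1", 3.0),
    ("m2", 4.0), ("m2", 6.0), ("m2", 8.0),
]
animal_means = per_animal_means(rois)

# The n used for statistics is the number of animals (2), not the number of ROIs (6).
n_for_statistics = len(animal_means)
```

Treating each ROI as an independent observation would inflate n and can overstate significance, which is the concern raised by the reviewer.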
For in vitro experiments, what is the effect of DLK overexpression on neuronal viability and density? Could these variables confound effects on synaptogenesis/synapse maturation?
As described in the Methods, we made hippocampal neuron cultures from P1 pups of the following crosses:
For control: Vglut1<sup>Cre/+</sup> X Rosa26<sup>tdT/+</sup>
For DLKcKO: Vglut1<sup>Cre/+</sup>;DLK(cKO)<sup>fl/fl</sup> X Vglut1<sup>Cre/+</sup>;DLK(cKO)<sup>fl/fl</sup>;Rosa26<sup>tdT/+</sup>
For DLKiOE: H11-DLK<sup>iOE/iOE</sup> X Vglut1<sup>Cre/+</sup>;Rosa26<sup>tdT/+</sup>
Dissociated cells from a given litter were pooled into the same culture. Because the proportion of neurons with the genotype of interest differed between cultures, it is difficult to determine whether DLK was causing significant cell death.
On pg 13, we stated our observation:
‘We did not notice an obvious effect of DLK(iOE) or DLK(cKO) on neuron density in cultures at DIV2. To assess neuronal type distribution in our cultures, we immunostained DIV14 neurons with antibodies for Satb2, as a CA1 marker (Nielsen et al., 2010), and Prox1, as a marker of DG neurons (Iwano et al., 2012). We did not observe significant differences in the proportion of cells labeled with each marker in DLK(cKO) or DLK(iOE) cultures (Fig.S13E). These data are consistent with the idea that DLK signaling does not have a strong role in neuron-type specification both in vivo and in vitro’.
We cannot rule out that variable factors in our cultures confound effects on synaptogenesis/synapse maturation, and we hope that future studies will clarify this.
Correlations between c-jun expression and phosphorylation are extremely important and need to be carefully and convincingly documented. I am a bit concerned about Supplementary Figure 6 images, especially 6B-CA1 (no difference between control and KO, too small images) and 6D (no p-c-Jun expression at all anywhere in the hippocampus at P15?).
At P10, P15, and P60 we stained for p-c-Jun using the Rabbit monoclonal p-c-Jun (Ser73) (D47G9) antibody from Cell Signaling (cat# 3270) at a 1:200 dilution and imaged using an LSM800 confocal microscope with a 20x objective. We observed p-c-Jun to be quite low generally in control animals. We have replaced the images in Fig.S7F (previously S6D), and adjusted the brightness/contrast to enable better visualization of the low signal in Fig.S7B,D,F (previously Fig.S6B,D).
We revised our text to present the data carefully as stated above:
Pg. 9-10: ‘In control mice, glutamatergic neurons in CA1 had low but detectable c-Jun immunostaining at P10 and P15, but reduced intensity at P60; those in CA3 showed an overall low level of c-Jun immunostaining at P10, P15 and P60; and those in DG showed a low level of c-Jun immunostaining at P10 and P15, and an increased intensity at P60 (Fig.S7A,C,E). In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice at P10 when no discernable neuron degeneration was seen in any regions of hippocampus, only CA3 neurons showed a significant increase of immunostaining intensity of c-Jun, compared to control (Fig.S7A). In P15 mice, we observed further increased immunostaining intensity of c-Jun in CA1, CA3, and DG, with the strongest increase (~4-fold) in CA1, compared to age-matched control mice (Fig.S7C). The overall increased c-Jun staining is consistent with RiboTag analysis’.
Pg. 10: ‘In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice, we observed increased p-c-Jun positive nuclei in CA1 at P10, and strong increase in CA1 (~10-fold), CA3 (~6-fold), and DG (~8-fold) at P15 (Fig.S7B,D).’
Recommendations for the authors:
Several major and minor reservations were raised. The major issues are the need for more information about the over-expression of DLK and a need to extrapolate to an in vivo condition with DLK. A considerable amount of useful information is presented with some very nicely done experiments but it is not yet a coherent or integrated story. The lack of impact of DLK overexpression in some neurons is perhaps the most impactful observation of the study and would be great to have more information around the differential transcriptional/signaling response in these cell types. There is also a need for more experimental details and to address several questions about the mouse genetic and translatome analysis. They are valid concerns that require attention by the authors.
We thank the editors and reviewers for their thoughtful evaluation and suggestions. We hope that the editors and reviewers find that the new data and text changes in our revised manuscript, along with the above point-by-point response, have addressed the concerns and strengthened our findings.
Minor points:
(1) The authors state that deletion of DLK has no effect on CA1 at 1yr, however, the image of CA1 in Figure S1D shows substantially fewer NeuN+ neurons. Is this a representative field of view?
We have re-examined the images and observed no effect on hippocampal morphology at 1 yr. We have now included representative images in the revised Fig S1D.
(2) Is the DLK protein section staining in Figure 2C a real signal? The staining looks like speckles and is purely somatic. Axonal staining is widely expected based on the literature and the authors' own work. There should be a specificity control.
To our knowledge, axonal staining of DLK reported in the literature is mostly based on cultured DRG neurons. In addition to the reported axonal localization, DLK is present in the cell soma, near the Golgi (Hirai et al., 2002), and in the post-synaptic density (Pozniak et al., 2013).
In the revised manuscript, we addressed this point by including controls with no primary antibody, and by using an antibody against the closely related kinase, LZK. These additional data are shown in Fig.S3C,D (previously Fig.S2C), supporting that DLK protein staining represents real signal. At P10 and P15, DLK immunostaining around CA3 showed axonal staining of the mossy fibers, as well as staining in the soma and dendritic layers (Fig.S3C,D). A similar pattern was also seen in primary cultured neurons (Fig 6A).
(3) The protein expression of DLK in the transgenic overexpressor (Figure S7C) looks, to the resolution of this blot, to be at least 50kD heavier than 'WT' DLK. Can the authors explain this discrepancy?
The Cre-induced DLK(iOE) transgene has T2A and tdTomato in frame at the C-terminus of DLK. It is known that T2A ‘self-cleavage’ is often incomplete. DLK-T2A-tdTomato would be about 50 kD bigger than WT DLK. We now include the transgene design in revised Fig S1D, and also state in the figure legend of Fig.S8C (previously S7C) that ‘Larger molecular weight band of DLK in Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> would match the predicted molecular weight of DLK-T2A-tdTomato if T2A-peptide induced ‘self-cleavage’ due to ribosomal skipping is ineffective (Fig.S1D).’
(4) Expression changes in DLK affect various aspects of neurites in CA1 cultures (Figure 6), and changes in DLK also modestly affect STMN4 (and 2, perhaps indirectly) levels (Figure S7C), but there is no indication that DLK acts via STMN4 to cause these changes. It is not clear what to make of these data. Of note, Stmn4 levels change in response to DLK in CA3, without DLK affecting cell death in this region.
We appreciate and agree with the comment. Other studies (Asghari Adib et al., 2024; DeVault et al., 2024; Hu et al., 2019; Larhammar et al., 2017; Le Pichon et al., 2017; Shin et al., 2019; Watkins et al., 2013) reported expression changes in Stmn4 mRNAs in other cell types and cellular contexts, which appeared to depend on DLK. Hippocampal neurons express multiple Stmns (Fig.S8A). While we present our analysis of the effects of DLK dosage on Stmn4, and also Stmn2, we do not think that DLK-induced changes in Stmn4 expression per se are a major factor underlying CA1 cell death vs CA3 survival.
In the revised manuscript, we addressed this point in ‘Limitation of our study’ (pg 20):
‘Additional experiments will be needed to elucidate in vivo roles of STMN4 and its interaction with other STMNs’.
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
The role of enteric glial cells in regulating intestinal mucosal functions at a steady state has been a matter of debate in recent years. Enteric glial cell heterogeneity and related methodological differences likely underlie the contrasting findings obtained by different laboratories. Here, Prochera and colleagues used Plp1-CreERT2 driver mice to deplete the majority of enteric glia from the gut. They found that glial loss has very limited effects on the transcriptome of gut cells 11 days after tamoxifen treatment (used to induce DTA expression), and by extension - more specifically, has only minimal impact on cells of the intestinal mucosa. Interestingly, in the colon (where Paneth cells are not present) they did observe transcriptomic changes related to Paneth cell biology. Although no overt gene expression alterations were found in the small intestine - also not in Paneth cells - morphological, ultrastructural, and functional changes were detected in the Paneth cells of enteric glia-depleted mice. In addition, and possibly related to Paneth cell dysfunction, enteric glia-depleted mice also show alterations in intestinal microbiota composition.
In their analyses of enteric glia from existing single-cell transcriptomic data sets, it is stated that these come from 'non-diseased' humans. However, the data on the small intestine is obtained from children with functional gastrointestinal disorders (Zheng 2023). Data on colonic enteric glia was obtained from colorectal cancer patients (Lee 2020). Although here the cells were isolated from non-malignant regions, saying that the large intestines of these patients are nondiseased is probably an overstatement.
In the Zheng et al. dataset, “functional GI disorders” refers to biopsies from children that do not have any histopathologic evidence of digestive disease. The children do, however, have at least one GI symptom that prompted a diagnostic endoscopy with biopsies, leading to the designation of “functional” disorder. Given that diagnostic endoscopies are invasive procedures that necessitate anesthesia, obtaining biopsies from asymptomatic children without any clinical indication would not be allowable per most institutional review boards, leading the authors of that study to use these samples as a control group. We had thus used the “non-diseased” label to encompass these samples as well as those from the unaffected regions of large intestine from colorectal cancer patients. We now recognize, however, that this label could be misleading, so we have revised the Results and Figure Legends to more accurately reflect details of control tissue origin for this and the Lee et al. (2020) datasets. Per the reviewer’s suggestion, we have removed the term “non-diseased”.
Another existing dataset including human mucosal enteric glia of healthy subjects is presented in Smillie et al (2019). It would be interesting to see how the current findings relate to the data from Smillie et al.
Per the reviewer’s suggestion, we have now added an analysis of the Smillie et al. dataset in Supp. Fig. 1B. This dataset derives from colonic mucosal biopsies from 12 healthy adults (8480 stromal cells) and 18 adults with ulcerative colitis (10,245 stromal cells from inflamed bowel segments and 13,147 from uninflamed), all between the ages of 20-77 years. These data show that SOX10, PLP1, and S100B are selectively expressed within the putative glial cluster from colonic mucosa of both healthy adults and individuals with ulcerative colitis, whereas GFAP is not detected (Supp. Fig. 1B). These observations are consistent with our observations from the two other human datasets already included in our manuscript in Fig. 1 and Supp. Fig. 1.
The time between enteric glia depletion and analyses (mouse sacrifice) must be a crucial determinant of the type of effects, and the timing thereof. In the current study 11 days after tamoxifen treatment was chosen as the time point for analyses, which is consistent with earlier work by the lab using the same model (Rao et al 2017). What would happen when they wait longer than 11 days after tamoxifen treatment? Data, not necessarily for all parameters, on later time points would strengthen the manuscript significantly.
This is an excellent question, particularly given the longer-lived nature of Paneth cells relative to other epithelial cell types. As detailed in our previous study, Cre<sup>+</sup> mice in the Plp1CreER-DTA model are well-appearing and indistinguishable from their Cre-negative control littermates through 11dpt. Unfortunately, a limitation of the model is that beyond 11dpt, Cre<sup>+</sup> mice become anorexic, lose body weight, and have signs of neurologic debility such as hindlimb weakness and uncoordinated gait. These deficits are overt by 14dpt and likely due to targeting Plp1<sup>+</sup> glia outside the gut, such as Schwann cells and oligodendrocytes (as described in another study which used a similar model to study demyelination in the central nervous system, PMID: 20851998). Given these CNS effects and that starvation is well known to affect Paneth cell phenotypes (PMIDs: 1167179, 21986443), we elected not to examine timepoints beyond 11dpt. Technological advances that enable more selective cell depletion will allow study of chronic effects of enteric glial loss in the future.
The authors found transcriptional dysregulation related to Paneth cell biology in the colon, where Paneth cells are normally not present. Given the bulk RNA sequencing approach, the cellular identity in which this shift is taking place cannot be determined. However, it would be useful if the authors could speculate on which colonic cell type they reckon this is happening in.
Per the reviewer’s suggestion, we have added a paragraph to the Discussion addressing one plausible hypothesis to explain this observation. Paneth-like cells have been described in the large intestine and are known, particularly in humans, to express markers typical of Paneth cells, such as lysozyme and defensins (PMID: 27573849, 31753849). These cells could represent the source of the Paneth cell-like transcriptional signature observed in our model. Alternatively, ectopic expression of Paneth cell-associated genes in the colon has been documented in certain pathological conditions, such as colorectal cancer models (e.g., PMID: 15059925), where changes in the local microenvironment appear to trigger activation of Paneth cell genes. Similar, yet unidentified changes in our model could potentially underlie the transcriptional dysregulation related to Paneth cell biology observed here.
On the other hand, enteric glia depletion was found to affect Paneth cells structurally and functionally in the small intestine, where transcriptional changes were initially not identified. Only when performing GSEA with the in silico help of cell type-specific gene profiles, differences in Paneth cell transcriptional programs in the small intestine were uncovered. A comment on this discrepancy would be helpful, especially for the non-bioinformatician readers among us.
Standard differential gene expression (DEG) analysis of the effects of glial loss revealed significant differences only in the colon, and even then, only a handful of genes were changed. These changes were not accompanied by corresponding changes at the protein level, at least as detectable by IHC. In the small intestine, there were no significant differences by standard DEG thresholds. Unlike DEG analysis, gene set enrichment analysis (GSEA) provides a significance value based on whether a higher-than-chance number of genes change in a uniform direction, without requiring that the magnitude of change in any individual gene be significant. Therefore, GSEA detected that a significant number of genes in the curated Paneth cell gene list exhibited a positive fold change in the bulk RNA sequencing data. This prompted us to examine Paneth cells and other epithelial cell types in more detail by IHC, functional and ultrastructural analyses, which all converged on the observation that Paneth cells were relatively selectively disrupted in the epithelium of glial depleted mice.
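For non-bioinformatician readers, the contrast with per-gene testing can be made concrete with a minimal sketch of the running-sum statistic that underlies GSEA. This is a simplified, unweighted version with a gene-set permutation test, written under stated assumptions: it is not the software or parameters used in the study, the function names `enrichment_score` and `permutation_pvalue` are hypothetical, and standard GSEA additionally weights hits by the ranking metric and normalizes scores across gene sets.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum score: the largest deviation from zero along the ranked list."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, of the ranked genes")
    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up at each gene in the set, step down at every other gene.
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def permutation_pvalue(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Compare the observed score with scores of random gene sets of the same size."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, gene_set)
    size = len(set(gene_set) & set(ranked_genes))
    extreme = sum(
        abs(enrichment_score(ranked_genes, set(rng.sample(ranked_genes, size))))
        >= abs(observed)
        for _ in range(n_perm)
    )
    return observed, (extreme + 1) / (n_perm + 1)
```

Here `ranked_genes` would be all genes ordered by fold change (glia-depleted vs control), so a gene set whose members cluster near the top yields a large positive score even when no single gene passes a DEG threshold.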
From looking at Figure 3B it is clear that Paneth cells are not the only epithelial cell type affected (after less stringent in silico analyses) by enteric glial cell depletion. Although the authors show that this does not translate into ultrastructural or numerical changes of most of these cell types, this makes one wonder how specific the enteric glia - Paneth cell link is. Besides possible indirect crosstalk (via neurons), it is not clear if enteric glia more closely associate with Paneth cells as compared to these other cell types. Immunofluorescence stainings of some of these cells in the Plp1-GFP mice would be informative here.
Enteric glia have long been reported to closely associate with crypts, the sites of residence for Paneth cells and intestinal stem cells (PMID: 7043279, 16423922). Consistent with these reports, our observations from Plp1-eGFP mice confirm that enteric glia often appose the entire base of small intestinal crypts (see Author response image 1 below). Given this reproducible observation, we did not pursue histological quantification to compare preferential glial apposition to specific epithelial cell types. Enteric glia have been reported to form close associations with enteroendocrine cells as well (PMID: 24587096), which is not surprising because these cells are highly innervated; however, our analyses did not reveal changes in the abundance and morphology of these cells or other epithelial cell types.
Author response image 1.
(A) Immunohistochemical staining of a small intestinal cross-section from a Vil1<sup>Cre</sup>Rosa26<sup>tdTomato/+</sup> Plp1<sup>eGFP</sup> transgenic mouse in which enteric glia are labeled with green fluorescent protein (GFP) and intestinal epithelial cells are labeled with tdTomato. (B) Mucosal glia closely associate with epithelial cells in intestinal crypts. Scale bar – 20µm.
The authors mention IL-22 as a possible link, but do Paneth cells express receptors for transmitters commonly released by enteric glia? Maybe they can have a look at putative cell-cell interactions by mapping ligand-receptor pairs in the scRNAseq datasets they used.
Beyond IL-22R, it is established that Paneth cells express receptors for secreted WNT proteins, which enteric glia have been shown to express (PMID: 34727519). This interaction could potentially be involved in glial regulation of Paneth cells, but mice lacking glia do not exhibit the same phenotypes as mouse models with disrupted WNT signaling. For example, animals lacking the WNT receptor Frizzled-5 in Paneth cells have mislocalization of Paneth cells to the villi (PMID: 15778706), which we do not readily observe in Plp1CreER-DTA mice. Furthermore, while mucosal enteric glia have been proposed as a source of WNT ligands, this role has been specifically attributed to GFAP+ cells, which may or may not be glia in the mucosa. Moreover, several other cell types in the mucosa around crypts have also been identified as significant sources of WNT ligands (PMID: 16083717, 22922422). We have now added these ideas to the Discussion.
Per the reviewer’s suggestion to use bioinformatics to explore other potential ligand-receptor pairings that might underlie glial regulation of Paneth cells, we conducted a CellPhoneDB analysis focused on these two cell types with a collaborator. This analysis highlighted a handful of potential ligand-receptor interactions, but none of these pathways could be clearly linked to the observed Paneth cell phenotype. Furthermore, virtually all the candidate interactions were not specific to glia, with the candidate ligands expressed by many other more abundant cell types in the mucosa. For these reasons, we decided not to include this analysis in the revised manuscript.
Previously the authors showed that enteric glia regulation of intestinal motility is sex-dependent (Rao et al 2017). While enteric glia depletion caused dysmotility in female mice, it did not affect motility in males. For this reason, most experiments in the current study were conducted in male mice only. However, for the experiments focusing on the effect of enteric glia depletion on host-microbiome interactions and intestinal microbiota composition both male and female mice were used. In Figure 8A male and female mice are distinctly depicted but this was not done for Figure 8C. Separate characterization of the microbiome of male and female mice would have helped to figure out how much intestinal dysmotility (in females) contributes to the effect on gut microbial composition. This is an important exercise to confirm that the effect on the microbiome is indeed a consequence of altered Paneth cell function, as suggested by the authors (in the results and discussion, and in the abstract).
In our microbiome analysis, we initially analyzed males and females separately but did not observe significant differences between the two sexes. Thus, we merged the data to increase the statistical power of the genotype comparisons. It was an oversight on our part to not label the datapoints by sex as we did for the other data in the manuscript. We have now revised the figures related to microbiome characterization (Fig. 5D-E and Supp. Fig. 8C) to indicate the sexes of the mice used. Stratifying the data by sex within-sample revealed no major sex-specific differences in microbiome diversity or enriched/depleted biomarkers in the core genotype-dependent observations.
In this context, it would also be interesting to compare the bulk sequencing data after enteric glia depletion between female and male mice.
Our bulk sequencing analysis of the effects of glial loss was conducted in male mice only in order to assess the effects independent of colonic dysmotility, a phenotype observed only in female Plp1CreER-DTA animals (PMID: 28711628). Given that we found rather muted transcriptional changes in male mice, we chose not to perform subsequent transcriptional analyses in female mice, further reasoning that any changes identified would most likely be attributable to dysmotility rather than direct glial effects. Future studies focusing on sex differences in the small intestine, where motility in the Plp1CreER-DTA model is unaffected by glial loss, could provide additional insights, especially in light of the recently reported sex differences in the gene expression and activity levels of enteric glia in the myenteric plexus (PMID: 34593632, 38895433).
Reviewer #1 (Recommendations For The Authors):
- Intro 2nd paragraph: please add to the sentence: "They found no major defects in epithelial properties AT STEADY STATE (or during homeostasis)."
Revised as suggested.
- There seems to be a word missing in the 2nd sentence of paragraph 2 on page 4. "... but xxx consistent...".
Reviewed and there were no missing words.
- In the 2nd paragraph on page 8, when discussing GFAP expression in IBD patients, a reference is missing. Also, here it should be GFAP, not Gfap (in italics).
Revised as suggested.
Reviewer #2 (Public Review):
This is an excellent and timely study from the Rao lab investigating the interactions of enteric glia with the intestinal epithelium. Two early studies in the late 1990s and early 2000s had previously suggested that enteric glia play a pivotal role in control of the intestinal epithelial barrier, as their ablation using mouse models resulted in severe and fatal intestinal inflammation. However, it was later identified that these inflammatory effects could have been an indirect product of the transgenic mouse models used, rather than due to the depletion of enteric glia. In previous studies from this lab, the authors had identified expression of PLP1 in enteric glia, and its use in CRE driver lines to label and ablate enteric glia.
In the current paper, the authors carefully examine the role of enteric glia by first identifying that PLP1-creERT2 is the most useful driver to direct enteric glial ablation, in terms of the number of glial cells targeted, their proximity to the intestinal epithelium, and the relevance for human studies (GFAP expression is rather limited in human samples in comparison). They examined gene expression changes in different regions of the intestine using bulk RNA-seq following ablation of enteric glia by driving expression of diphtheria toxin A (PLP1-creERT2;Rosa26-DTA). Alterations in gene expression were observed in different regions of the gut, with specific effects in different regions. Interestingly, while there were gene expression changes in the epithelium, there were limited changes to the proportions of different epithelial cell types identified using immunohistochemistry in control vs glial-ablated mice. The authors then focused on the investigation of Paneth cells in the ileum, identifying changes in the ultrastructural morphology and lysozyme activity. In addition, they identified alterations in gut microbiome diversity. As Paneth cells secrete antimicrobial peptides, the authors conclude that the changes in gut microbiome are due to enteric glia-mediated impacts on Paneth cell activity.
Overall, the study is excellent and delves into the different possible mechanisms of action, including the investigation of changes in enteric cholinergic neurons innervating the intestinal crypts. The use of different CRE drivers to target enteric glial cells has led to varying results in the past, and the authors should be commended on how they address this in the Discussion.
We thank the reviewer for this positive feedback.
Reviewer #2 (Recommendations For The Authors):
I have a few minor comments:
Changes in bacterial diversity - the authors make a very compelling case that changes in the proportions of various intestinal microbiome species were impacted by the change in Paneth cell secretions resulting from the depletion of enteric glia. Another potential mechanism of action could be alterations in gut motility resulting from loss of enteric glia. It appears that faecal samples were collected from both male and female mice, and hence changes in colonic motility could be involved. This should be addressed in the Results and Discussion.
We agree with the reviewer that GI dysmotility could influence microbial composition. To address this, we initially analyzed microbiome data separately for male and female mice, because only female Plp1CreER-Rosa26DTA mice exhibit dysmotility. We found no significant sex-specific differences in microbiome composition, however, which suggested to us that dysmotility was unlikely to be the primary driver of the observed microbial changes. Based on these findings, we opted to combine data from male and female mice in our final microbiome analysis. We have now revised the Results, Discussion, and Methods sections to clarify this.
Supplementary Figure 2: it would be helpful to include some labels of landmarks on the tissues, and arrows pointing to immunoreactive cells.
We have added labels and arrows to images in Supplementary Figure 2 per the reviewer’s suggestion.
Figure 4B: It's hard to tell the difference in ultrastructural morphology of the Paneth cells between Cre- and Cre+ mice in the EM images. Heterogeneous granules (PG) seem to be labelled in cells from both genotypes of mice. Some outlines of cells or arrows pointing to errant granules would be helpful.
We have added arrows indicating errant granules to images in Figure 4 per the reviewer’s suggestion.
Reviewer #3 (Public Review):
In this study, Prochera, et al. identify PLP1+ cells as the glia that most closely interact with the gut epithelium and show that genetic depletion of these PLP1+ glia in mice does not have major effects on the intestinal transcriptome or the cellular composition of the epithelium. Enteric glial loss, however, causes dysregulation of Paneth cell gene expression that is associated with morphological disruption of Paneth cells, diminished lysozyme secretion, and altered gut microbial composition.
Overall, the authors need to first prove whether the Plp1CreER Rosa26DTA/+ mice system is viable.
In previous work, we discovered that the gene Plp1 is broadly expressed by enteric glia and, within the mouse intestine, is quite specific to glial cells (PMID: 26119414). We characterized the Plp1CreER mouse line as a genetic tool in detail in this initial study. Then in a subsequent manuscript, we used Plp1CreER-DTA mice to genetically deplete enteric glia and study the consequences on epithelial barrier integrity, crypt cell proliferation, enteric neuronal health and gastrointestinal motility (PMID: 28711628). In this second study, we performed extensive validation of the Plp1CreER-DTA mouse model including detailed quantification of glial depletion in the small and large intestines across the myenteric, intramuscular and mucosa compartments by immunohistochemical (IHC) staining of whole tissue segments to sample thousands of cells. We found that the majority of S100B<sup>+</sup> enteric glia were depleted within 5 days in both sexes, including more than 88% loss of mucosal glia, and that this loss was stable at 3 subsequent timepoints (7, 9 and 14 days post-tamoxifen induction of Cre activity). Glial loss was further confirmed by IHC for GFAP in the myenteric plexus, and by ultrastructural analysis of the small intestine to ensure cell depletion rather than simply loss of marker expression. Our group was the first to use this model to study enteric glia, and since then similar models and our key observations have been replicated by other groups (PMID: 33282743, 34550727). Thus, we consider this model to be well established.
Also, most experimental systems have been evaluated by immunohistochemistry, scRNAseq, and electron microscopy, but need quantitative statistical processing.
RNA-sequencing and microbiome analyses are inherently quantitative (Figures 1A-B, Supp. Figure 1, Figure 2, Supp. Figure 4A, Figure 3A-B, Supp. Figure 5, Figure 5, and Supp. Figure 8C). Virtually all our other observations are also supported by quantitative analysis including analysis of mucosal glial markers (Fig. 1C-D), validation of Paneth cell transcript expression in the colon (Supp. Fig. 4B), measurement of epithelial cell type composition (Figure 3C, D), assessment of crypt innervation (Supp. Fig. 7E), and measurement of bacteria-to-crypt distance (Supp. Fig. 8A-B). The only observation that was not quantified was that of morphological abnormalities of Paneth cells. Given the inherently low sampling rate of EM studies, we felt that functional assays (explant secretion assays, effects on microbial composition) would be more meaningful for interrogation of a potential Paneth cell phenotype and thus elected to focus our quantitative analyses on those functional assays rather than further histological measurements.
In addition, the value of the paper would be enhanced if the significance of why the phenotype appeared in the large intestine rather than the small intestine when PLP1 is deficient for Paneth cells is clarified.
Please see detailed response to Reviewer 1 that addresses this comment and the corresponding addition to the Discussion.
Major Weaknesses:
(1) Supplementary Figure 2; Cannot be evaluated without quantification.
Supplemental Figure 2 shows qualitative IHC observations that were highly reproducible across all the subjects indicated for each marker and align well with the quantitative transcriptional data from human subjects shown in Figure 1 and Supplemental Figure 1. The DAB staining in Supplemental Figure 2 could theoretically be quantified by staining intensity or counting cell number but we felt this would be arbitrary and difficult to achieve in a meaningful way with a single chromogen. The DAB reaction is associated with a non-linear relationship between amount of an antigen and staining intensity, especially at higher levels (PMID: 16978204, 19575836), because it is not a direct conjugate and relies upon an enzymatic reaction. The amplification step required for DAB staining using Horseradish Peroxidase (HRP) introduces variability, particularly with cytoplasmic markers and in complex tissue structures like the plexuses, where proteins are distributed throughout the glial network. Counting cell number also would not lead to fair comparisons between markers because while SOX10 shows a clear nuclear signal suitable for quantification, the other markers are all membrane or cytoplasmic proteins, making accurate counting nearly impossible in dense ganglia. Finally, quantifying cell number in 5-micron paraffin sections which have major differences in sampling from one subject to another in terms of presence of ganglia and ganglia size, would also make this prone to inaccuracy. Given these limitations and the robust qualitative data we have shown that aligns completely with the quantitative transcriptional analyses, we respectfully disagree with the reviewer’s comment.
(2) Figure 2A; Is Plp1CreER Rosa26DTA/+ mice system established correctly? S100B immunohistology picture is not clear. A similar study is needed for female Plp1CreER Rosa26DTA/+ mice. What is the justification for setting 5 dpt, 11 dpt? Any consideration of changes to organs other than the intestine? Wouldn't it be clearer to introduce Organoid technology?
Please see the detailed response to the first comment. The Plp1CreER-DTA mouse model is well established and there are detailed experimental justifications for the 5 and 11 dpt timepoints as well as the focus on male mice for RNA-sequencing analyses. As described in our previous work (PMID: 28711628), Plp1<sup>+</sup> cells throughout the animal would be affected, including Schwann cells and oligodendrocytes, which is why we limit our analyses to the first 11 dpt, when there are fewer confounding variables. The S100B immunohistology picture in Figure 2A was intended to be a schematic graphical representation of the paradigm of glial loss, not a data figure. Extensive validation of glial loss in this model was shown in our previous study. To improve clarity, we have now enlarged the picture for the reader.
Regarding the suggestion to use organoid technology, standard intestinal epithelial organoids do not incorporate any elements of the enteric nervous system (ENS), which is the focus of this study. Some groups have made heroic efforts to incorporate ENS components into intestinal organoids by introducing neural crest progenitor cells and grafting the hybrid organoids under the renal capsule in mice (example PMID: 27869805); but these studies are still limited, and it remains unclear how much the preparations reflect functional, natively innervated intestine. Our ex vivo explant assay preserves native ENS-epithelial interactions, providing a more effective model for studying the relationship between enteric glia and Paneth cells.
(3) Figure 2B; Need an explanation for the 5 genes that were altered in the colon. Five genes should be evaluated by RT-qPCR. Why was there a lack of change in the duodenum and ileum?
While RT-qPCR validation of differentially expressed genes was once common practice, especially with microarray data, there is now robust evidence for strong correlations between RNA sequencing (RNAseq) results and RT-qPCR measurements of gene expression (PMID: 26208977, 28484260). Notably, Rajkumar et al. (PMID: 26208977) demonstrated that RNAseq analyzed using DESeq2 (a method which we employed in our study) yields highly accurate results. They reported a 0% false positive rate and a 100% positive predictive value for DESeq2, rendering additional RT-qPCR validation redundant. We only performed RT-qPCR analysis of colonic Lyz1 expression because our IHC analyses failed to show any ectopic expression of the protein in the colons of Cre<sup>+</sup> mice (Supp. Figure 4D) and we wished to validate the gene expression change seen by RNAseq in an independent cohort to be absolutely sure. Per the detailed response to Reviewer 1, we do not have a mechanistic explanation for why there is selective transcriptional induction of Paneth cell-related genes in the colon upon glial depletion. We have elaborated on this in the revised Discussion.
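For readers less familiar with the two metrics cited from Rajkumar et al., the following minimal sketch shows how a false positive rate and a positive predictive value are computed from validation counts. The function names and the counts are hypothetical placeholders chosen only for illustration; they are not taken from Rajkumar et al. or from our data.

```python
def false_positive_rate(fp: int, tn: int) -> float:
    """FPR = FP / (FP + TN): fraction of truly unchanged genes called as changed."""
    return fp / (fp + tn)


def positive_predictive_value(tp: int, fp: int) -> float:
    """PPV = TP / (TP + FP): fraction of called genes that validate as changed."""
    return tp / (tp + fp)


# Hypothetical counts (assumption, for illustration only):
# 40 called genes that validate, 0 called genes that fail validation,
# 960 unchanged genes correctly left uncalled.
tp, fp, tn = 40, 0, 960

fpr = false_positive_rate(fp, tn)
ppv = positive_predictive_value(tp, fp)
print(f"FPR = {fpr:.0%}, PPV = {ppv:.0%}")  # prints "FPR = 0%, PPV = 100%"
```

With zero false positives, the false positive rate is 0% and the positive predictive value is 100% regardless of the other counts, which is the situation in which an independent RT-qPCR check adds little information.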
(4) Supplementary Figure 3; Top 3 genes should be evaluated by RT-qPCR.
Given that none of the changes included in Supplementary Figure 3 for the duodenum or ileum reach the standard threshold for statistical significance and in view of the findings by Rajkumar, et al. (2015) described above, we don’t believe that evaluating expression of these genes by RT-qPCR would be informative in interpreting these negative results.
(5) Supplementary Figure 4B, C, and D; Why not show analysis in the small intestine?
We chose to focus on the colon for this analysis because this was the only region of the intestine that exhibited statistically significant differences in transcriptional profiles as assessed by DEG.
(6) Supplementary Figure 4D; Cannot be evaluated without quantification.
As shown in the representative images, no LYZ1 or DEFA5 signal was detected in the colons of Cre<sup>-</sup> or Cre<sup>+</sup> mice (n=3 mice per genotype; >100 crypts/mouse assessed), though it was readily detectable in the ileums of both genotypes. We have now added the number of crypts assessed to the figure legend.
(7) Figure 3D; Cannot be evaluated without quantification.
Please see Fig. 3C for quantification of each cell type marker shown in Figure 3D.
(8) Supplementary Figure 5B and C; Top 3 genes should be evaluated by RT-qPCR.
Please see detailed explanation to comments #3 and #4 above.
(9) Supplementary Figure 6; Top 3 genes should be evaluated by RT-qPCR.
This comment was likely made in error because Supplementary Fig. 6 does not show any gene expression data.
(10) Figure 4A; Cannot be evaluated without quantification.
We appreciate the reviewer’s comment here and strived very hard to add quantification of the Paneth cell granule phenotype seen by light microscopy to our study. IHC for LYZ1 is typically the gold standard for assessment of Paneth cell granules by light microscopy. In our hands, however, we encountered persistent issues with IHC for this protein. While it very reproducibly detected Paneth cells with sufficient specificity to enable quantification of number of immunoreactive cells (as shown in Figure 3C), it did not enable quantification of granule morphology because it consistently exhibited diffuse staining throughout the cell (see Author response image 2 below). This appearance persisted regardless of extensive titration of fixation parameters (time, temperature, fixative supplier, 10% NBF vs 4% PFA), tissue preparation (fixed as intact tubes versus “swiss-rolls”), permeabilization conditions, operator, antibody used, and other variables. Upon subsequently surveying the literature, it seems that similar diffuse staining patterns for LYZ1 have been observed by numerous other groups and this may simply be an experimental limitation.
Author response image 2.
Representative IHC images showing LYZ1 staining optimization. Ileal tissues from 8-10-week-old mice were prepared as either 'swiss-rolls' (A-D) or tubes (E-F) and fixed using different protocols: 10% neutral buffered formalin (NBF) from Epredia (#5710-LP) (A-B, E), 10% NBF from G-Biosciences (#786-1057) (C-D), or 4% paraformaldehyde (PFA) from VWR (#100503-917) (F). Fixations were conducted at room temperature (A, C) or at 4°C (B, D-F). Diffuse cytoplasmic LYZ1 staining is observed within Paneth cells, regardless of conditions of tissue preparation.
As an alternative approach to detecting Paneth cell granules, we tried UEA-I lectin staining. This labeling approach was sufficient to reveal qualitative differences in Paneth granule morphology in Cre<sup>+</sup> mice, as shown in Fig. 4A. However, the transient nature of this lectin labeling made it very difficult to systematically quantify granule morphology in a blinded manner, as we did for our other analyses. Given these persistent challenges, we decided to present qualitative data on morphology by two orthogonal approaches (UEA-I staining by light microscopy and ultrastructure by EM) and focus on functional read-outs for quantitative analyses (explant secretion assays and microbiome analyses). In aggregate, we feel that these data provide robust and complementary evidence of the observed phenotype from independent experimental approaches.
(11) Figure 4D; Cannot be evaluated without quantification.
This comment was likely made in error because there is no Figure 4D.
(12) Additional experiments on in vivo infection systems comparing Plp1CreER Rosa26DTA/+ mice and controls would be great.
We agree that in vivo infection experiments would be very interesting to pursue, particularly given the potential role of Paneth cells in innate immunity. These studies are beyond the scope of the current manuscript, but we hope to report on them in the future.
Reviewer #3 (Recommendations For The Authors):
Patients with inflammatory bowel disease (IBD); UC or CD.
Revised per reviewer suggestion.
www.biorxiv.org
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Summary:
The article by Piersma et al. aims to reduce the complex process of NK cell licensing to the action of a single inhibitory receptor for MHC class I. This is achieved using a mouse strain lacking all of the Ly49 receptors expressed by NK cells and inserting the Ly49a gene into the Ncr1 locus, leading to expression on the majority of NK cells.
Strengths:
The mouse model used represents a precise deletion of all NK-expressed genes within the Ly49 cluster. The re-introduction of the Ly49a gene into the Ncr1 locus allows expression by most NK cells. Convincing effects of Ly49a expression on in vitro activation and in vivo killing assay are shown.
Weaknesses:
The choice of Ly49a provides a clear picture of H-2D<sup>d</sup> recognition by this Ly49. It would be valuable to perform additional studies investigating Ly49c and Ly49i receptors for H-2b. This is of interest because there are reports indicating that Ly49c may not be a functional receptor in B6 mice due to strong cis interactions.
We agree with the reviewer that it will be important to extend our findings to H-2b haplotypes with individual cognate Ly49 receptors (Ly49C and Ly49I). While these experiments are the subject of our ongoing studies, they are beyond the scope of the current manuscript considering the significant time, effort and cost to generate these new Ly49C and Ly49I knockin mice.
This work generates an excellent mouse model for the study of NK cell licensing by inhibitory Ly49s that will be useful for the community. It provides a platform whereby the functional activity of a single Ly49 can be assessed.
Reviewer #2 (Public review):
Piersma et al. continue to work on deciphering the role and function of Ly49 NK cell receptors. This manuscript shows that a single inhibitory Ly49 receptor is sufficient to license NK cells and eliminate MHC-I-deficient target cells in mice. In short, they refined the mouse model ∆Ly49-1 (Parikh et al., 2020) into the Ly49KO model in which all Ly49 genes are disrupted. Using this model, they confirmed that NK cells from Ly49KO mice cannot be licensed, produce lower levels of IFN-gamma, and cannot reject MHC-I-deficient cells. To study the effect of a single Ly49 receptor in the function of NK cells, the authors backcrossed Ly49KO mice to H-2D<sup>d</sup> transgenic KODO (D8-KODO) Ly49A knock-in mice in which a single inhibitory Ly49A receptor that recognizes H-2D<sup>d</sup> ligands is expressed. By doing so, they demonstrate that a single inhibitory Ly49 receptor expressed by all NK cells is sufficient for licensing and missing-self killing.
While the results of the study are largely consistent with the conclusions, it is important to address some discrepancies. For instance, in the title of Figure 1, the authors state that NK cells in Ly49KO mice compared to WT mice have a less mature phenotype, which is not consistent with the corresponding text in the Results section (lines 170-171) that states there is no difference in maturation. These differences are not evident in Figure 1, panel D. It is crucial to acknowledge these inconsistencies to ensure a comprehensive understanding of the research findings.
We thank the reviewer for pointing this out. We have corrected the figure legend title to: “Mice generated to lack all NK-related Ly49 molecules using CRISPR have NK cells that display alterations in select surface molecules.”
In the legend of Figure 2, the text related to panel C indicates the use of dyes to label the splenocytes, and CFSE, CTV, and CTFR were mentioned. However, only CTV and CTFR are shown on the plots and mentioned in the corresponding text in the Results section. Similarly, in the legend of Figure 4, which is related to panel C, the authors write that splenocytes were differentially labeled with CFSE and CTV as indicated; however, in Figure 4C and the Results section text, there is no mention of CFSE.
We thank the reviewer for pointing out these inconsistencies. We did label target cells with CFSE to distinguish them from host cells. To clarify, we have done the following:
We have removed CFSE from figure legends of Figure 2 and 4.
We included the following on CFSE labeling in the Materials and Methods section: “Target splenocytes were additionally labeled with CFSE to identify transferred target splenocytes from host cells.”
The authors should clarify why they assume that KLRG1 expression is influenced by the expression of inhibitory Ly49 receptors and not by manipulations on chromosome 6, where the genes for both KLRG1 and Ly49 receptors are located.
The effect on KLRG1 expression is phenocopied in the Ly49A KI mice (on a Ly49 KO background). The Ly49A KI allele is encoded in the Ncr1 locus, which is located on chromosome 7 and not on chromosome 6 where KLRG1 is located, thus excluding involvement of cis-regulatory elements encoded by the Ly49 locus on chromosome 6.
We have clarified this in the discussion section (lines 350-358):
“The Ly49 gene family as well as Klrg1 is located within the NKC on chromosome 6 (Yokoyama and Plougastel, 2003) …. expression of only Ly49A, encoded in the Ncr1 locus located on chromosome 7, in Ly49KO mice on a H-2D<sup>d</sup> background restored KLRG1 expression”
However, a better explanation for the possible influence of other inhibitory NK cell receptors still needs to be included. In the study by Zhang et al. (doi: 10.1038/s41467-019-13032-5), the authors showed the synergized regulation of NK cell education by the NKG2A receptor and the specific Ly49 family members. Although in this study, Piersma and colleagues show the control of MHC-I deficient cells by Ly49A+ NKG2A- NK cells in Figure 4, this receptor is not mentioned in the Results or in the Discussion section, so its role in this story needs to be clarified. Therefore, the reader would benefit from more information regarding NKG2A receptor and NKG2A+/- populations in their results.
We agree with the reviewer that it is important to describe our results in the context of other inhibitory receptors. To clarify the role of NKG2A and potentially other inhibitory receptors we have made the following improvements to our manuscript:
We discuss the role of NKG2A in the discussion section, which now includes (lines 259-266):
“While our results did not interrogate licensing by inhibitory receptors outside of the Ly49 receptor family, such as has been reported for NKG2A (Anfossi et al., 2006; Zhang et al., 2019), they do demonstrate that expression of Ly49A without other Ly49 family members can mediate NK cell licensing. Moreover, we found that Ly49 receptors are required and sufficient for missing-self rejection under steady-state conditions. However, these observations do not rule out involvement of other inhibitory receptors under specific inflammatory conditions. For example, NKG2A contributes to rejection of missing-self targets in poly(I:C)-treated mice (Zhang et al., 2019).”
We also added the following to the result section (lines 179-182):
NKG2A has been implicated in NK cell licensing by the non-classical MHC-I molecule Qa1 (Anfossi et al., 2006), to eliminate potential confounding effects by this interaction, effector functions of NKG2A- NK cells were evaluated as described before (Bern et al., 2017).
Reviewer #3 (Public review):
Summary:
In this study, Piersma et al. successfully generated a mouse model with all Ly49 genes knocked out, resulting in the complete absence of Ly49 receptor expression on the cell surface. The absence of Ly49 expression led to the loss of NK cell education/licensing and consequently, a failure in responsiveness against missing-self target cells. The experimental work and findings are partially overlapping with the previous work by Zhang et al. (2019), who also performed knockout of the entire Ly49 locus in mice and demonstrated that loss of NK responsiveness was due to the removal of inhibitory, and not activating Ly49 genes. The authors demonstrate the restoration of NK cell licensing by knocking in a single Ly49 gene, Ly49A, in a mouse expressing the H-2D<sup>d</sup> ligand for this receptor, which is a novel and important finding.
Strengths:
The authors established a novel mouse model enabling them to have a clean and thorough study on the function of Ly49 on NK cell licensing. Also, by knocking in a single Ly49, they were able to investigate the function of a given Ly49 receptor excluding the "contamination" of co-expression of any other Ly49 genes. Their idea and method were novel though the mouse model was somehow genetically similar to a previous study. The experiment design and data interpretation were logically clear and the evidence was solid.
Weaknesses:
The paper is very poorly written and confusing. The authors should be more accurate in the usage of terminology, provide more details on experimental procedures, and revise much of the text to improve clarity and coherence. A thorough revision aiming to clarify the paper would be helpful.
We regret that the manuscript was confusing to the reviewer. We have made thorough revisions to the different sections, which we hope will improve the clarity of the manuscript.
We have made changes to all sections of the manuscript, including the title. These revisions include improved clarity on description of NK cell licensing and consistent usage throughout the manuscript, per the reviewer recommendations. We hope that all our improvements help the clarity of the manuscript.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
I was confused by lines 262-270 in the discussion. The data from Hanke et al. is presented as contradictory to the observation that Ly49s bind more efficiently to H2-Kb than -Db, but they showed that Ly49c/i did not bind Kb-deficient cells, supporting the preferred binding to Kb.
We have clarified this issue and the paragraph now reads: “This is further supported by early studies using Ly49 transfectants binding to Con A blasts showing that Ly49C and Ly49I can bind to H-2D<sup>b</sup>-deficient but not H-2K<sup>b</sup>-deficient cells (Hanke et al., 1999), despite the caveat of testing binding to cells overexpressing Ly49s in these studies.”
Reviewer #2 (Recommendations for the authors):
The authors' conclusion that one type of inhibitory Ly49 receptor expressed on NK cells is sufficient for successful licensing and rejection of missing self-cells is a significant step forward. However, it would be beneficial to complement this with additional data. For instance, exploring the role of a single inhibitory Ly49 receptor responsible for licensing in a mouse model with a different haplotype (e.g. Ly49C or Ly49I on H-2b MHC I haplotype in C57BL/6J mice) could provide valuable insights and open new avenues for research in the field.
We agree with the reviewer that it will be important to extend our findings to additional MHC-I haplotypes with single cognate Ly49 receptors. While these experiments are the subject of our ongoing studies, they are beyond the scope of the current manuscript considering the significant effort, time, and cost to generate these new Ly49C and Ly49I knockin mice.
Reviewer #3 (Recommendations for the authors):
Specific issues that should be addressed are as follows:
(1) The title of the paper: "Expression of a single inhibitory Ly49 receptor is sufficient to license NK cells for effector functions" is ambiguous. When I first read the title, I thought the authors meant that only a single Ly49 molecule on the NK cell surface was necessary to induce licensing. It might be better to replace "single inhibitory receptor" with "single member of Ly49 receptor family".
We have changed the title to: “Expression of a single inhibitory member of the Ly49 receptor family is sufficient to license NK cells for effector functions”
(2) In the abstract, introduction, and results, the authors distinguish "licensing" and "rejection of missing-self targets" as two distinct phenomena. An example includes Abstract, lines 51-53: "Herein, we showed mice lacking expression of all Ly49s were unable to reject missing-self target cells in vivo, were defective in NK cell licensing, and displayed lower KLRG1 on the surface of NK cells". Similarly, the title of the second subsection of the Results states: "Ly49-deficient NK cells are defective in licensing and rejection of cognate MHC-I deficient target cells" (line 176). In these instances, it seems that by "licensing", they mean only response to plate-bound anti-NK1.1 stimulation and not a response to missing-self targets. Alternatively, in the first paragraph of the Discussion, it sounds as if licensing includes both anti-NK1.1 and missing-self responses (lines 258-260): "...NK cells were fully licensed in terms of their functional phenotype, including the capacity to be activated by an activation receptor in vitro and efficient rejection of MHC-I deficient target cells in vivo". Please define the terms and use the terms consistently throughout the paper.
We were the first to describe the term licensing and have defined this as acquisition of NK cell functional competence by self-MHC molecules (Kim et al., 2005), which is characterized by increased NK cell effector functions to activating signals. Thus, licensed NK cells are prevented from attacking normal MHC-I<sup>+</sup> cells by the same self-MHC-I-specific receptor that conferred licensing, while unlicensed NK cells without appropriate Ly49 receptors are functionally incompetent.
To clarify, we made changes throughout the manuscript, including the following:
Lines 91-101:
“In addition to effector function in missing-self, Ly49 receptors that recognize their cognate MHC-I ligands are involved in licensing or education of NK cells to acquire functional competence. NK cell licensing is characterized by potent effector functions including IFNγ production and degranulation in response to activation receptor stimulation (Elliott et al., 2010; Kim et al., 2005). Like missing-self recognition, inhibitory Ly49s require SHP-1 for NK cell licensing which interacts with the ITIM-motif encoded in the cytosolic tail of inhibitory Ly49s (Bern et al., 2017; Kim et al., 2005; Viant et al., 2014). Moreover, lower expression of SHP-1, particularly within the immunological synapse, is associated with licensed NK cells (Schmied et al., 2023; Wu et al., 2021). Thus, inhibitory Ly49s have a second function that licenses NK cells to self-MHC-I thereby generating functionally competent NK cells but it has not been possible to exclude contributions from other co-expressed Ly49s.”
Lines 268-271 (previously 258-260):
“Yet the NK cells were fully licensed in terms of IFNγ production and degranulation in vitro and efficiently rejected MHC-I deficient target cells in vivo. Thus, a single Ly49 receptor is capable to confer the licensed phenotype and missing-self rejection in vitro and in vivo.”
Lines 309-312:
“In conclusion, these data show that expression of a single inhibitory Ly49 receptor is necessary and sufficient to license NK cells and mediate missing self-rejection under steady state conditions in vivo.”
(3) Introduction, lines 76-79. Please provide the C57BL/6 MHC-I genotype. It is difficult to follow the text here without this information. In general, please provide information to help the reader who may not be working in this precise field.
We thank the reviewer for pointing this out. We have included this and the lines now read: “For example, in the C57BL/6 background, Ly49C and Ly49I can recognize H-2<sup>b</sup> MHC-I molecules that include H-2K<sup>b</sup> and H-2D<sup>b</sup>, while Ly49A and Ly49G cannot recognize H-2<sup>b</sup> molecules and instead they recognize H-2<sup>d</sup> alleles.”
(4) Introduction, lines 85-97. Please use commas: "...the MHC-I specificities of other Ly49s have been primarily studied with MHC tetramers containing human b2m, which is not recognized by Ly49A, on cells overexpressing Ly49s" in order to clarify the sentence.
Commas have been added as suggested by the reviewer.
(5) Introduction, lines 91-101. The whole paragraph starting with the following sentence does not make sense and should be re-written. "In addition to effector function in missing-self, when inhibitory Ly49 receptors recognize their cognate MHC-I ligands in vivo, they license or educate NK cells for potent effector functions including IFNγ production and degranulation in response to activation receptor stimulation".
We regret that this paragraph was not clear to the reviewer. We have changed this paragraph to:
“In addition to effector function in missing-self, Ly49 receptors that recognize their cognate MHC-I ligands are involved in licensing or education of NK cells to acquire functional competence. NK cell licensing is characterized by potent effector functions including IFNγ production and degranulation in response to activation receptor stimulation (Elliott et al., 2010; Kim et al., 2005). Like missing-self recognition, inhibitory Ly49s require SHP-1 for NK cell licensing which interacts with the ITIM-motif encoded in the cytosolic tail of inhibitory Ly49s (Bern et al., 2017; Kim et al., 2005; Viant et al., 2014). Moreover, lower expression of SHP-1, particularly within the immunological synapse, is associated with licensed NK cells (Schmied et al., 2023; Wu et al., 2021). Thus, inhibitory Ly49s have a second function that licenses NK cells to self-MHC-I thereby generating functionally competent NK cells but it has not been possible to exclude contributions from other co-expressed Ly49s.”
(6) Results, line 181. Please edit: "...MHC-I-deficient H-2K<sup>b</sup> x H-2D<sup>b</sup> deficient (KODO) mice".
This sentence now reads “... NK cells from H-2K<sup>b</sup> and H-2D<sup>b</sup> double deficient (KODO) mice”
(7) Results, line 192. Please re-word the following phrase: "missing-self is dominated by H-2K<sup>b</sup> in the C57BL/6 background", as it is unclear. Do you mean that H-2K<sup>b</sup> is protected from lysis as opposed to H-2D<sup>b</sup>?
We thank the reviewer for pointing this out, line 192 now reads: “..missing-self recognition in the C57BL/6 background depends on the absence of H-2K<sup>b</sup> rather than H-2D<sup>b</sup>.”
(8) Please briefly describe the Ncr1-Ly49A knockin procedure so that the reader understands the link between NKp46 and Ly49A expression without going to the earlier paper. Also, it needs to be mentioned that Ncr1 is the gene encoding NKp46.
Lines 201-205 now read: “To investigate the potential of a single inhibitory Ly49 receptor on mediating NK cell licensing and missing-self rejection, the Ly49KO mice were backcrossed to H-2D<sup>d</sup> transgenic KODO (D8-KODO) Ly49A KI mice that express Klra1 cDNA encoding the inhibitory Ly49A receptor in the Ncr1 locus encoding NKp46 and its cognate ligand H-2D<sup>d</sup> but not any other classical MHC-I molecules (Parikh et al., 2020).”
In the Materials and Methods section, the following has been added (lines 324-326):
“In Ly49A KI mice the stop codon of Ncr1 encoding NKp46 is replaced with a P2A peptide-cleavage site upstream of the Ly49A cDNA, while maintaining the 3’ untranslated region.”
(9) Figure 4C, legend. There is no CFSE staining in this experiment. Please correct.
We did label target cells with CFSE to distinguish them from host cells. To clarify this, we have done the following:
We have removed CFSE from the legends of Figures 2 and 4.
We included the following on CFSE labeling in the Materials and Methods section (lines 377-379): “Target splenocytes were additionally labeled with CFSE to identify transferred target splenocytes from host cells.”
(10) Discussion, lines 262-270. This paragraph sounds as if data by Hanke et al. does not agree with the data presented in the paper. On the contrary, Hanke et al. demonstrate that Ly49C and Ly49I detectably bind to H-2K<sup>b</sup>, but poorly to H-2D<sup>b</sup>, supporting observations shown in Figure 2C.
We have clarified this issue and the paragraph now reads: “This is further supported by early studies using Ly49 transfectants binding to Con A blasts showing that Ly49C and Ly49I can bind to H-2D<sup>b</sup>-deficient but not H-2K<sup>b</sup>-deficient cells (Hanke et al., 1999), despite the caveat of testing binding to cells overexpressing Ly49s in these studies.”
www.biorxiv.org
Author response:
The following is the authors’ response to the current reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
Zanetti et al use biophysical and cellular assays to investigate the interaction of the birnavirus VP3 protein with the early endosome lipid PI3P. The major novel finding is that association of the VP3 protein with an anionic lipid (PI3P) appears to be important for viral replication, as evidenced through a cellular assay on FFUs.
Strengths:
Support previously published claims that VP3 associates with early endosome membrane, potentially through binding to PI3P. The finding that mutating a single residue (R200) critically affects early endosome binding and that the same mutation also inhibits viral replication suggests a very important role for this binding in the viral life cycle.
Weaknesses:
The manuscript is relatively narrowly focused: the specifics of the bi-molecular interaction between the VP3 of an unusual avian virus and a host cell lipid (PIP3). Further, the affinity of this interaction is low and its specificity relative to other PIPs is not tested, leading to questions about whether VP3-PI3P binding is relevant.
Regarding the manuscript’s focus, we challenge the notion that studying a single bi-molecular interaction makes the scope of the paper overly narrow. This interaction—between VP3 and PI3P—plays a critical role in the replication of the birnavirus, which is the central theme of our work. Moreover, identifying and understanding such distinct interactions is a fundamental aspect of molecular virology, as they shed light on the precise mechanisms that viruses exploit to hijack the host cell machinery. Consequently, far from being narrowly focused, we believe our work contributes to the broader understanding of host-pathogen interactions.
As for the low affinity of the VP3-PI3P interaction, we argue that this is not a limitation but rather a biologically relevant feature. As discussed in the manuscript, the moderate strength of this interaction is likely critical for regulating the turnover rate of VP3/endosomal PI3P complexes, which in turn could optimize viral replication efficiency. A stronger affinity might trap VP3 on the endosomal membrane, whereas weaker interactions might reduce its ability to efficiently target PI3P. Thus, the observed affinity may reflect a fine-tuned balance that supports the viral life cycle.
With regard to specificity, we emphasize that in the context of the paper, we refer to biological specificity, which is not necessarily the same as chemical specificity. The binding of VP3 to early endosomes is “biologically” preconditioned by the distribution of PI3P within the cell. PI3P is predominantly localized in endosomal membranes, which “biologically precludes” interference from other PIPs due to their distinct cellular distributions. Moreover, while early endosomes also contain other anionic lipids, our work demonstrates that among these, PI3P plays a distinctive role in VP3 binding. This highlights its functional relevance in the context of early endosome dynamics.
Reviewer #3 (Public review):
Summary:
Infectious bursal disease virus (IBDV) is a birnavirus and an important avian pathogen. Interestingly, IBDV appears to be a unique dsRNA virus that uses early endosomes for RNA replication that is more common for +ssRNA viruses such as for example SARS-CoV-2. This work builds on previous studies showing that IBDV VP3 interacts with PIP3 during virus replication. The authors provide further biophysical evidence for the interaction and map the interacting domain on VP3.
Strengths:
Detailed characterization of the interaction between VP3 and PIP3 identified R200D mutation as critical for the interaction. Cryo-EM data show that VP3 leads to membrane deformation.
We thank the reviewer for the feedback.
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
Zanetti et al. use biophysical and cellular assays to investigate the interaction of the birnavirus VP3 protein with the early endosome lipid PI3P. The major novel finding is that the association of the VP3 protein with an anionic lipid (PI3P) appears to be important for viral replication, as evidenced through a cellular assay on FFUs.
Strengths:
Supports previously published claims that VP3 may associate with early endosomes and bind to PI3P-containing membranes. The claim that mutating a single residue (R<sub>200</sub>) critically affects early endosome binding and that the same mutation also inhibits viral replication suggests a very important role for this binding in the viral life cycle.
Weaknesses:
The manuscript is relatively narrowly focused: one bimolecular interaction between a host cell lipid and one protein of an unusual avian virus (VP3-PI3P). Aspects of this interaction have been described previously. Additional data would strengthen claims about the specificity and some technical issues should be addressed. Many of the core claims would benefit from additional experimental support to improve consistency.
Indeed, our group has previously described aspects of the VP3-PI3P interaction, as indicated in lines 100-105 of the manuscript. In this manuscript, however, we present previously unreported biochemical and biophysical details of how VP3 associates with early endosomes, showing that it interacts directly with PI3P. Additionally, we have now identified a critical residue in VP3, R<sub>200</sub>, for binding to PI3P and its key role in the viral life cycle. Furthermore, the molecular dynamics simulations helped us propose a mechanism by which VP3 associates with PI3P in early endosomes. This constitutes a significant step forward in our understanding of how these "non-canonical" viruses replicate.
We have now incorporated new experimental and simulation data and have carefully revised the manuscript in accordance with the reviewers’ recommendations. We are confident that these improvements have further strengthened the manuscript.
Reviewer #2 (Public Review):
Summary:
Birnavirus replication factories form alongside early endosomes (EEs) in the host cell cytoplasm. Previous work from the Delgui lab has shown that the VP3 protein of the birnavirus strain infectious bursal disease virus (IBDV) interacts with phosphatidylinositol-3-phosphate (PI3P) within the EE membrane (Gimenez et al., 2018, 2020). Here, Zanetti et al. extend this previous work by biochemically mapping the specific determinants within IBDV VP3 that are required for PI3P binding in vitro, and they employ in silico simulations to propose a biophysical model for VP3-PI3P interactions.
Strengths:
The manuscript is generally well-written, and much of the data is rigorous and solid. The results provide deep knowledge into how birnaviruses might nucleate factories in association with EEs. The combination of approaches (biochemical, imaging, and computational) employed to investigate VP3-PI3P interactions is deemed a strength.
Weaknesses:
(1) Concerns about the sources, sizes, and amounts of recombinant proteins used for co-flotation: Figures 1A, 1B, 1G, and 4A show the results of co-flotation experiments in which recombinant proteins (control His-FYVE v. either full length or mutant His VP3) were either found to be associated with membranes (top) or non-associated (bottom). However, in some experiments, the total amounts of protein in the top + bottom fractions do not appear to be consistent in control v. experimental conditions. For instance, the Figure 4A western blot of His-2xFYVE following co-flotation with PI3P+ membranes shows almost no detectable protein in either top or bottom fractions.
Liposome-based methods, such as the co-flotation assay, are well-established and widely regarded as the preferred approach for studying protein-phosphoinositide interactions. However, this approach is rather qualitative, as density gradient separation reveals whether the protein is located in the top fractions (bound to liposomes) or the bottom fractions (unbound). Our quantifications aim to demonstrate differences in the bound fraction between liposome populations with and without PI3P. Given the setting of the co-flotation assays, each protein-liposome system [2xFYVE-PI3P(-), 2xFYVE-PI3P(+), VP3-PI3P(-), or VP3-PI3P(+)] is assessed separately, and even if the experimental conditions are homogeneous, it is not surprising to observe differences in the protein level between different experiments. Indeed, the revised version of the manuscript includes membranes with more similar band intensities, as depicted in the new versions of Figures 1 and 4.
Reading the paper, it was difficult to understand which source of protein was used for each experiment (i.e., E. coli or baculovirus-expressed), and this information is contradicted in several places (see lines 358-359 v. 383-384). Also, both the control protein and the His-VP3-FL proteins show up as several bands in the western blots, but they don't appear to be consistent with the sizes of the proteins stated on lines 383-384. For example, line 383 states that His-VP3-FL is ~43 kDa, but the blots show triplet bands that are all below the 35 kDa marker (Figures 1B and 1G). Mass spectrometry information is shown in the supplemental data (describing the different bands for His-VP3-FL) but this is not mentioned in the actual manuscript, causing confusion. Finally, the results appear to differ throughout the paper (see Figures 1B v. 1G and 1A v. 4A).
Thank you for pointing out these potentially confusing points in the previous version of the manuscript. Indeed, we were able to produce recombinant VP3 from two sources: baculovirus and Escherichia coli. Initially, we opted for the baculovirus system, based on evidence from previous studies showing that it was suitable for ectopic expression of VP3. Subsequently, we successfully produced VP3 using Escherichia coli. On the other hand, the fusion proteins His-2xFYVE and GST-2xFYVE were only produced in the prokaryotic system, also following previously reported evidence. We confirmed that VP3, produced in either system, exhibited similar behavior in our co-flotation and bio-layer interferometry (BLI) assays. However, the co-flotation and BLI assays shown in Figs. 1 and 4 were performed using the His-VP3 FL, His-VP3 FL R<sub>200</sub>D and His-VP3 FL DCt fusion proteins produced from the corresponding baculoviruses. We have clarified this in the revised version of our manuscript. Please see lines 430-432.
Additionally, we have made clear that the His-VP3 FL protein purification yielded four distinct bands, and we confirmed their VP3 identity through mass spectrometry in the revised version of the manuscript. Please, see lines 123-124.
Finally, we replaced membranes for Figs. 4A and 1G (left panel) with those with more similar band intensities. Please, see the new version of Figures 1 and 4.
(2) Possible "other" effects of the R<sub>200</sub>D mutation on the VP3 protein. The authors performed mutagenesis to identify which residues within patch 2 on VP3 are important for association with PI3P. They found that a VP3 mutant with an engineered R<sub>200</sub>D change (i) did not associate with PI3P membranes in co-floatation assays, and (ii) did not co-localize with EE markers in transfected cells. Moreover, this mutation resulted in the loss of IBDV viability in reverse genetics studies. The authors interpret these results to indicate that this residue is important for "mediating VP3-PI3P interaction" (line 211) and that this interaction is essential for viral replication. However, it seems possible that this mutation abrogated other aspects of VP3 function (e.g., dimerization or other protein/RNA interactions) aside from or in addition to PI3P binding. Such possibilities are not mentioned by the authors.
The arginine at position 200 of VP3 is not located in any of the protein regions associated with its other known functions: VP3 has a dimerization domain located in the second helical domain, where different amino acids across the three helices form a total of 81 interprotomeric close contacts; however, R<sub>200</sub> is not involved in these contacts (Structure. 2008 Jan;16(1):29-37, doi:10.1016/j.str.2007.10.023); VP3 has an oligomerization domain mapped within the 42 C-terminal residues of the polypeptide, i.e., the segment of the protein composed of the residues at positions 216-257 (J Virol. 2003 Jun;77(11):6438–6449, doi: 10.1128/jvi.77.11.6438-6449.2003); VP3’s ability to bind RNA is facilitated by a region of positively charged amino acids, identified as P1, which includes K<sub>99</sub>, R<sub>102</sub>, K<sub>105</sub>, and K<sub>106</sub> (PLoS One. 2012;7(9):e45957, doi: 10.1371/journal.pone.0045957). Furthermore, our findings indicate that the R<sub>200</sub>D mutant retains a folding pattern similar to the wild-type protein, as shown in Figure 4B. Taken together, these observations lead us to conclude that the loss of replication capacity of R<sub>200</sub>D viruses results from an impaired, or even abolished, VP3-PI3P interaction.
We agree with the reviewer that this is an important point and have accordingly addressed it in the Discussion section of the revised manuscript. Please, see lines 333-346.
(3) Interpretations from computational simulations. The authors performed computational simulations on the VP3 structure to infer how the protein might interact with membranes. Such computational approaches are powerful hypothesis-generating tools. However, additional biochemical evidence beyond what is presented would be required to support the authors' claims that they "unveiled a two-stage modular mechanism" for VP3-PI3P interactions (see lines 55-59). Moreover, given the biochemical data presented for R<sub>200</sub>D VP3, it was surprising that the authors did not perform computational simulations on this mutant. The inclusion of such an experiment would help tie together the in vitro and in silico data and strengthen the manuscript.
We acknowledge that the wording used in the previous version of the manuscript may have overstated the "unveiling" of the two-stage binding mechanism of VP3. Our intention was to propose a potential mechanism that is consistent with both the biophysical experiments and the molecular simulations. In the revised version of the manuscript, we have tempered these claims and framed them more appropriately.
Regarding the simulations for the R<sub>200</sub>D VP3 mutant, these simulations were indeed performed and included in the original manuscript as part of Figure S14 in the Supplementary Information. However, we realize that this was not sufficiently emphasized in the main text, an oversight on our part. We have now revised the manuscript to highlight these results more clearly.
Additionally, to further strengthen the connection between experimental and simulation trends, we have now included a new figure in the Supplementary Information (Figure S15). This figure depicts the binding energy of VP3 ΔNt and two of its mutants, VP3 ΔNt R<sub>200</sub>D and VP3 ΔNt P2 Mut, as a function of salt concentration. The results show that as the number of positively charged residues in VP3 is systematically reduced, the binding of the protein to the membrane becomes weaker. The effect is more pronounced at lower salt concentrations, which highlights the weight of electrostatic forces on the adsorption of VP3 on negatively charged membranes. Please, see Supplementary Information (Figure S15).
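As a rough illustration of why electrostatic effects grow at low salt (this is a textbook estimate, not the simulation protocol used in the manuscript), the Debye screening length for a 1:1 electrolyte in water at 25 °C is approximately 0.304/√I nm, with the ionic strength I in mol/L. The sketch below uses a hypothetical function name and arbitrary example concentrations to show how the screening length shrinks as salt increases.

```python
import math


def debye_length_nm(ionic_strength_molar: float) -> float:
    """Approximate Debye length (nm) for a 1:1 electrolyte in water at 25 °C.

    Uses the standard approximation lambda_D ~ 0.304 / sqrt(I), with I in mol/L.
    Hypothetical helper for illustration only; not taken from the manuscript.
    """
    if ionic_strength_molar <= 0:
        raise ValueError("ionic strength must be positive")
    return 0.304 / math.sqrt(ionic_strength_molar)


# Arbitrary example concentrations (assumed, not the values used in Figure S15).
for conc in (0.05, 0.15, 0.5):
    print(f"{conc:.2f} M -> Debye length ~ {debye_length_nm(conc):.2f} nm")
```

A longer screening length at low salt means that charged residues and anionic lipids interact over a greater distance, which is consistent with the stronger salt dependence described above.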
Reviewer #3 (Public Review):
Summary:
Infectious bursal disease virus (IBDV) is a birnavirus and an important avian pathogen. Interestingly, IBDV appears to be a unique dsRNA virus that uses early endosomes for RNA replication that is more common for +ssRNA viruses such as for example SARS-CoV-2.
This work builds on previous studies showing that IBDV VP3 interacts with PIP3 during virus replication. The authors provide further biophysical evidence for the interaction and map the interacting domain on VP3.
Strengths:
Detailed characterization of the interaction between VP3 and PIP3 identified R<sub>200</sub>D mutation as critical for the interaction. Cryo-EM data show that VP3 leads to membrane deformation.
Weaknesses:
The work does not directly show that the identified R<sub>200</sub> residues are directly involved in VP3-early endosome recruitment during infection. The majority of work is done with transfected VP3 protein (or in vitro) and not in virus-infected cells. Additional controls such as the use of PIP3 antagonizing drugs in infected cells together with a colocalization study of VP3 with early endosomes would strengthen the study.
In addition, it would be advisable to include a control for cryo-EM using liposomes that do not contain PIP3 but are incubated with HIS-VP3-FL. This would allow ruling out any unspecific binding that might not be detected on WB.
The authors also do not propose how their findings could be translated into drug development that could be applied to protect poultry during an outbreak. The title of the manuscript is broad and would improve with rewording so that it captures what the authors achieved.
In previous work from our group, we demonstrated the crucial role of the VP3 P2 region in targeting early endosomal membranes and in viral replication. This included the use of PI3K inhibitors to deplete PI3P, which showed that both the control RFP-2xFYVE and VP3 lost their ability to associate with early endosomal membranes and that the production of infective viral progeny was reduced (J Virol. 2018 May 14;92(11):e01964-17, doi: 10.1128/jvi.01964-17; J Virol. 2021 Feb 24;95(6):e02313-20, doi: 10.1128/jvi.02313-20). In the present work, to further characterize the role of R<sub>200</sub> in binding to early endosomes and for viral replication, we show that: i) the transfected VP3 R<sub>200</sub>D protein loses the ability to bind to early endosomes in immunofluorescence assays (Figure 2E and Figure 3); ii) the recombinant His-VP3 FL R<sub>200</sub>D protein loses the ability to bind to liposomes PI3P(+) in co-flotation assays (Figure 4A); and, iii) the mutant virus R<sub>200</sub>D loses replication capacity (Figure 4C).
Regarding the cryo-electron microscopy observation, we verified that there is no binding of gold particles to liposomes PI3P(-) when they are incubated solely with the gold-particle reagent, or when they are pre-incubated with the gold-particle reagent with either His-2xFYVE or His-VP3 FL. We have incorporated a new panel in Figure 1C showing a representative image of these results. Please, see lines 143-144 in the revised version of our manuscript and our revised version of Figure 1C.
We have replaced the title of the manuscript with a more specific one. Our current title is "On the Role of VP3-PI3P Interaction in Birnavirus Endosomal Membrane Targeting".
Regarding the question of how our findings could be translated into drug development, indeed, VP3-PI3P binding constitutes a good potential target for drugs that counteract infectious bursal disease. However, we did not mention this idea in the manuscript, first because it is somewhat speculative and second because infected farms do not implement any specific treatment. The control is based on vaccination.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Critical issues to address:
(1) The citations in the important paragraph on lines 101-5 are not identifiable. These references are described as showing that VP3 is associated with EEs via P2 and PI3P, which is basically what this paper also shows. The significant advance here is unclear.
We apologize for this mistake. These citations are identifiable in the revised version of the manuscript (lines 100-105). As mentioned before, in this manuscript we present previously unreported biochemical and biophysical details of how VP3 associates with early endosomes, showing that it interacts directly with PI3P. Additionally, we have now identified a critical residue in VP3 P2, R<sub>200</sub>, for binding to PI3P and its key role in the viral life cycle. Furthermore, the molecular dynamics simulations helped us propose a mechanism by which VP3 associates with PI3P in early endosomes. This constitutes a significant step forward in our understanding of how these "non-canonical" viruses replicate.
(2) Even if all the claims were to be clearly supported through major revamping, authors should make the significance of knowing that this protein binds to early endosomes through PI3P more clear?
Thank you for the recommendation, which aligns with a similar suggestion from Reviewer #2. In response, we have revised the significance paragraph to emphasize the mechanistic aspects of our findings. Please refer to lines 62–67 in the revised manuscript.
(3) Flotation assay shows binding, but this is not quantitative. An estimate of a Kd would be useful. BLI experiments suggest that half of the binding disappears at 0.5 mM, implying a very low binding affinity.
We agree with the reviewer that our biophysical and molecular simulation results suggest a specific but weak interaction of VP3 with PI3P-bearing membranes. Indeed, the previous version of the manuscript already contained a paragraph on this point. Please see lines 323-332 in the revised version of the manuscript.
From a biological point of view, a low binding affinity of VP3 for the endosomes may constitute an advantage for the virus, in the sense that its traffic through the endosomes may be short lived during its infectious cycle. Indeed, VP3 has been demonstrated to be a "multifunctional" protein involved in several processes of the viral cycle (detailed in lines 84-90), and in our laboratory we have shown that the Golgi complex and the endoplasmic reticulum are organelles where further viral maturation occurs. Taking all of this into account, a high binding affinity of VP3 for endosomes could result in the protein becoming trapped on the endosomal membrane, potentially hindering the progression of the viral infection within the host cell.
(4) There are some major internal inconsistencies in the data: Figure 1B quantifies VP3-FL T/B ratio ~4 (which appears inconsistent with the image shown, as the T lanes are much lighter than the B) whereas apparently the same experiment in Figure 1G shows it to be ~0.6. With the error bars shown, these results would appear dramatically different from each other, despite supposedly measuring the same thing. The same issue with the FYVE domain between Figures 1A and 4A.
We appreciate the reviewer’s comment, as it made us aware of an error in Figure 1B. There, the mean value for the VP3-FL Ts/B ratio is 3.0786 for liposomes PI3P(+) and 0.4553 for liposomes PI3P(-) (please see the new bar graph in Figure 1B). The error may have occurred because, given the significance of these experiments, we performed multiple rounds of quantification in search of the most suitable procedure for our observations, which led to a mix-up of data sets. Even so, these corrected values may still seem inconsistent, given that the T lanes are much lighter than the B lanes for VP3-FL in the image shown. Flotation assays are quite labor-intensive and, at least in our experience, yield fairly variable results in terms of quantification. To illustrate this point, the following image shows the three experiments conducted for Figure 1B, where it is clear that, despite producing visually distinct images, all three yielded the same qualitative observation. For Figure 1B, we chose to present the results from experiment #2. However, all three experiments contributed to the Ts/B ratio of 3.0786 for His-VP3 FL, which may account for the apparent inconsistency when focusing solely on the image in Figure 1B.
Author response image 1.
We acknowledge that, at first glance, some inconsistencies may appear in the results, and we have thoroughly discussed the best approach for quantification. However, we believe the observations are robust in terms of reproducibility and reliable, as the VP3-PI3P interaction was consistently validated by comparison with liposomes lacking PI3P, where no binding was observed.
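To make the point about averaging concrete, here is a minimal sketch of how a top/bottom (T/B) ratio could be summarized across replicate flotation experiments. The function names are hypothetical and the band intensities are made-up placeholders, not our densitometry data; the sketch only shows that a single representative blot can differ visibly from the mean of all replicates.

```python
from statistics import mean, stdev


def top_bottom_ratio(top: float, bottom: float) -> float:
    """Ratio of band intensity in the top (liposome-bound) fractions to the bottom (unbound) fractions."""
    if bottom <= 0:
        raise ValueError("bottom intensity must be positive")
    return top / bottom


def summarize(experiments):
    """Return (mean, sample standard deviation) of the T/B ratio over replicates."""
    ratios = [top_bottom_ratio(top, bottom) for top, bottom in experiments]
    return mean(ratios), stdev(ratios)


# Placeholder (top, bottom) intensities for three replicates; values are invented.
replicates = [(300.0, 100.0), (150.0, 100.0), (450.0, 100.0)]
avg, sd = summarize(replicates)
print(f"mean T/B = {avg:.2f}, SD = {sd:.2f}")
```

In this invented example the second replicate alone has a ratio well below the mean, even though all three replicates show more protein in the top fractions than in the bottom.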
(5) Comparison of PA (or PI) to PI3P at the same molar concentration is inappropriate because PI3P has at least double charge. The more interesting question about specificity would be whether PI45P2 (or even better PI35P2) binds or not. Without this comparison, no claim to specificity can be made.
For us, "specificity" refers to the requirement of a phosphoinositide in the endosomal membrane for VP3 binding. Phosphoinositides have a conspicuous distribution among cellular compartments, and knowing that VP3 associates with early endosomes, our specificity assays aimed to demonstrate that PI3P is strictly required for the binding of VP3. To validate this, we used PI (lacking the phosphate group) and PA (lacking the inositol group) despite their similar charges. In spite of the potential chemical interactions between VP3 and various phosphoinositides, our experimental results suggest that the virus specifically targets endosomal membranes by binding to PI3P, a phosphoinositide present only in early endosomes.
That said, we agree with the reviewer’s point and consider it appropriate to soften our specificity claim in the manuscript as follows: “We observed that His-VP3 FL bound to liposomes PI3P(+), but not to liposomes PA or PI, reinforcing the notion that a phosphoinositide is required since neither a single negative charge nor an inositol ring are sufficient to promote VP3 binding to liposomes (SI Appendix, Fig. S2)” (Lines 136-139).
(6) In the EM images, many of the gold beads are inside the vesicles. How do they cross the membranes?
They do not cross the membrane. Our EM images are two-dimensional projections, meaning that the gold particles located on top or beneath the plane appear to be inside the liposome.
(7) Images in Figure 2D are very low quality and do not show the claimed difference between any of the mutants. All red signal looks basically cytosolic in all images. It is not clear what criteria were used for the quantification in Figure 2E. The same issue is in Figure 2E, where no red WT puncta are observable at all. Consistently, there is minimal colocalization in the quantification in Figure S3, which appears to show no significant differences between any of the mutants, in direct contradiction to the claim in the manuscript.
We apologize for the poor quality of panels in Figures 2D and 2E. Unfortunately, this was due to the PDF conversion of the original files. Please, check the high-quality version of Figure 2. As suggested by reviewers #2 and #3, we have incorporated zoomed panels, which help the reader to better see the differences in distribution.
As mentioned in the legend to Figure 2, the quantification in Figure 2D was performed by calculating the percentage of cells with a punctate red fluorescent signal (showing VP3 distribution) for each protein. The data were then normalized to the P2 WT protein, which is the VP3 wild type.
Figure S3 certainly shows a tendency which positively correlates with the results shown in Figure 3, where we used FYVE to detect PI3P on endosomes and observed significantly less co-localization when VP3 bears its P2 region all reversed or lacks the R<sub>200</sub>
(8) The only significant differences in colocalization are in Figure 3B, whose images look rather dramatically different from the rest of the manuscript, leading to some concern about repeatability. Also, it is unclear how colocalization is quantified, but this number typically cannot be above 1. Finally, it is unclear what is being colocalized here: with three fluorescent components, there are 3 possible binary colocalizations and an additional ternary colocalization.
We thank the reviewer for pointing out these aspects of Figure 3. The experiments for Figure 3B were conducted by a collaborator abroad, who handled the purified GST-2xFYVE that recognizes endogenous PI3P, while the rest of the cell biology experiments were conducted in our laboratory in Argentina. This is why the images are aesthetically different. We have made an effort to homogenize their appearance in the revised version of the manuscript. Please see the new version of Figure 3.
For quantification of the co-localization of VP3 and EGFP-2xFYVE (Figure 3A), the Manders M2 coefficient was calculated from approximately 30 cells per construct and experiment. The M2 coefficient, which reflects co-localization of signals, is defined as the ratio of the total intensity of magenta image pixels for which the intensity in the blue channel is above zero to the total intensity in the magenta channel. The JACoP plugin was used to determine M2. For VP3 puncta co-distributing with EEA1 and GST-FYVE (Figure 3B), the number of puncta co-distributing for the three signals was manually determined from approximately 40 cells per construct and experiment, per 200 µm². We understand that the Manders and Pearson coefficients, typically ranging between 0 and 1, are the most commonly used measures to quantify co-localizing immunofluorescent signals; however, this “manual” method has been used and validated in previously published manuscripts [Figures 3 and 7 from (Morel et al., 2013); Figure 7 in (Khaldoun et al., 2014); and Figure 4 in (Boukhalfa et al., 2021)].
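As an illustration of the M2 definition given above, the following minimal Python sketch computes the coefficient for two small intensity grids. It is not the JACoP implementation; the function name `manders_m2`, the `threshold` parameter, and the toy pixel values are assumptions introduced only for this example.

```python
def manders_m2(magenta, blue, threshold=0.0):
    """Manders M2 as defined in the text: the summed magenta intensity of
    pixels whose blue intensity exceeds `threshold`, divided by the total
    magenta intensity. Inputs are 2D lists of equal shape (hypothetical API)."""
    m = [value for row in magenta for value in row]
    b = [value for row in blue for value in row]
    if len(m) != len(b):
        raise ValueError("both channels must contain the same number of pixels")
    total = sum(m)
    if total == 0:
        return float("nan")  # no magenta signal, so the ratio is undefined
    overlap = sum(mv for mv, bv in zip(m, b) if bv > threshold)
    return overlap / total


# Toy 2x2 images with made-up intensities, for illustration only.
magenta_channel = [[10, 0], [30, 60]]
blue_channel = [[5, 0], [0, 7]]
m2 = manders_m2(magenta_channel, blue_channel)
```

In this toy case, the magenta pixels that overlap a non-zero blue pixel carry 70 of the 100 total intensity units, so the ratio is bounded by 1 by construction, which is the property the reviewer refers to.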
(9) SegA/B plasmids are not introduced, and it is not clear what these are or how this assay is meant to work. Where are the foci forming units in the images of Figure 4C? How does this inform on replication? Again, this assay is not quantitative, which is essential here: does the R<sub>200</sub> mutant completely kill activity (whatever that is here)? Or reduce it somewhat?
We apologize for the missing information. Segments A and B are the components of the IBDV reverse genetics system. For their construction, we used a modification of the system described by Qi and coworkers (Qi et al., 2007), in which the full-length sequences of the IBDV RNA segments A and B, flanked by a hammerhead ribozyme at the 5’-end and the hepatitis delta ribozyme at the 3’-end, were expressed under the control of an RNA polymerase II promoter within the plasmids pCAGEN.Hmz.SegA.Hdz (SegA) and pCAGEN.Hmz.SegB.Hdz (SegB). For this specific experiment we generated a third plasmid, pCAGEN.Hmz.SegA.R<sub>200</sub>D.Hdz (SegA.R<sub>200</sub>D), harboring a mutant version of segment A cDNA containing the R<sub>200</sub>D substitution. QM7 cells were then transfected with the plasmids SegA, SegB or SegA.R<sub>200</sub>D alone (as controls) or with a mixture of plasmids SegA+SegB (wild-type situation) or SegA.R<sub>200</sub>D+SegB (mutant situation). At 8 h post-transfection (p.t.), when new viruses have been able to assemble from the two RNA segments, the cells were recovered and re-plated onto fresh non-transfected cells to reveal the presence (or absence) of infective viruses. At 72 h post-plating, the generation of foci forming units (FFUs) was revealed by Coomassie staining. As expected, single transfections of SegA, SegB or SegA.R<sub>200</sub>D did not produce FFUs and, as shown in Figure 4C, the transfection of SegA+SegB produced detectable FFUs (the three circles in the upper panel) while no FFUs (the three circles in the lower panel) were detected after the transfection of SegA.R<sub>200</sub>D+SegB. This system is quantitative, since the FFUs detected 72 h post-plating can be quantified by simply counting them. However, since no FFUs were detected after the transfection of SegA.R<sub>200</sub>D+SegB, as evidenced by a complete monolayer of cells stained blue, we saw no point in quantifying.
In turn, this drastic observation indicates that viruses bearing the VP3 R<sub>200</sub>D mutation lose their replication ability (are “dead”), demonstrating the crucial role of this residue in the infectious cycle.
We agree with the reviewer that a better explanation was needed in the manuscript, so we have incorporated a paragraph in the results section of our revised version of the manuscript (lines 209-219).
(10) Why pH 8 for simulation?
The Molecular Theory calculations were performed at pH 8 for consistency with the experimental conditions used in our biophysical assays. These biophysical experiments were also performed at pH 8, following the conditions established in the original study where VP3 was first purified for crystallization (DOI: 10.1016/j.str.2007.10.023).
(11) There is minimal evidence for the sequential binding model described in the abstract. The simulations do not resolve this model, nor is truly specific PI3P binding shown.
In response to your concerns, we would like to emphasize that our simulations provide robust evidence supporting the two most important aspects of the sequential binding model: 1) Membrane Approach: In all simulations, VP3 consistently approaches the membrane via its positively charged C-terminal (Ct) region. 2) PI3P Recruitment: Once the protein is positioned flat on the membrane surface, PI3P is unequivocally recruited to the positively charged P2 region. The enrichment of PI3P in the proximity of the protein is clearly observed and has been quantified via radial distribution functions, as detailed in the manuscript and supplementary material.
While we understand that opinions may vary on the sufficiency of the data to fully validate the model, we believe the results offer meaningful insights into the proposed binding mechanism. That said, we acknowledge that the specificity of VP3 binding may not be restricted solely to PI3P but could extend to phosphoinositides in general. To address this, we performed the new set of co-flotation experiments which are discussed in detail in our response to point 5.
Reviewer #2 (Recommendations For The Authors):
(1) Line 1: Consider changing the title to better reflect the mostly biochemical and computational data presented in the paper: "Mechanism of Birnavirus VP3 Interactions with PI3P-Containing Membranes". There are no data to show hijacking by a virus presented.
We appreciate this recommendation, which was also expressed by reviewer #3. We also thank the reviewer for the suggested title. We have replaced the title of the manuscript with a more specific one. Our current title is
"On the Role of VP3-PI3P Interaction in Birnavirus Endosomal Membrane Targeting".
(2) Lines 53-54 and throughout: Consider rephrasing "demonstrate" to "validate" to give credit to Gimenez et al., 2018, 2022 for discovery.
Thanks for the suggestion. We have followed it accordingly. Please see line 52 from our revised version of the manuscript.
(3) Line 56-59 and throughout: Consider tempering and rephrasing these conclusions that are based mostly on computational data. For example, change "unveil" to "suggest" or another term.
We have now modified the wording throughout the manuscript.
(4) The abstract could also emphasize that this study sought to map the residues within VP3 that are important for PI3P interaction.
Thanks for the suggestion. We have followed it accordingly. Please, see lines 53-55 from our revised version of the manuscript.
(5) Lines 63-69: This Significance paragraph seems tangential. The findings in this paper aren't at all related to the evolutionary link between birnaviruses and positive-strand RNA viruses. The significance of the work for me lies in the deep biochemical/biophysical insights into how a viral protein interacts with membranes to nucleate its replication factory.
We have re-written the significance paragraph highlighting the mechanistic aspect of our findings. Please, see lines 62-67 in our revised version of the manuscript.
(6) Line 74: Please define "IDBV" abbreviation.
We apologize for the missing information. We have defined the IBDV abbreviation in our revised version of the manuscript (please, see line 73).
(7) Line 88: Please define "pVP2" abbreviation.
We apologize for the missing information. We have defined the pVP2 abbreviation in our revised version of the manuscript (please, see line 87).
(8) Lines 101-105: Please change references (8, 9, 10) to be consistent with the rest of the manuscript (names, year).
We apologize for this mistake. These citations are identifiable and consistent in the revised version of the manuscript (lines 100-105).
(9) Line 125: For a broad audience, consider explaining that recombinant His-2xFYVE domain is known to exhibit PI3P-binding specificity and was used as a positive control.
Thanks for the recommendation. We have incorporated a brief explanation supporting the use of His-2xFYVE as a positive control in our revised version of the manuscript. Please, see lines 127-129.
(10) Lines 167-171: The quantitative data in Figure S3 shows that there was a non-significant co-localization coefficient of the R<sub>200</sub>D mutant. For transparency, this should be stated in the Results section when referenced.
We agree with this recommendation. We have clearly mentioned it in the revised version of the manuscript. Please, see lines 177-179. Also, we have referred to this fact when introducing the assays performed using the purified GST-2xFYVE, shown in Figure 3. Please, see lines 182-184.
(11) Lines 156 and 173: These Results section titles have nearly identical wording. Consider rephrasing to make it distinct.
We agree with the reviewer’s observation. In fact, we did this on purpose, as wordplay, but we understand that it could come across as an awkward redundancy. So, in the revised version of the manuscript, both titles are:
Role of VP3 P2 in the association of VP3 with the EE membrane (line 163).
VP3 P2 mediates VP3-PI3P association to EE membranes (line 182).
(12) Line 194: Is it alternatively possible that the R<sub>200</sub>D mutant lost its capacity to dimerize, and that in turn impacted PI3P interaction?
Thanks for the relevant question. VP3 was crystallized and its structure reported in (Casañas et al., 2008) (DOI: 10.1016/j.str.2007.10.023). In that report, the authors showed that the two VP3 subunits associate in a symmetrical manner by using the crystallographic two-fold axes. Each subunit contributes 30% of its total surface to form the dimer, with 81 interprotomeric close contacts, including polar bonds and van der Waals contacts. The authors identified the group of residues involved in these interactions, among which R<sub>200</sub> is not included. Additionally, the authors determined that the interface of the VP3 dimer in crystals is biologically meaningful (not due to the crystal packing).
To confirm that the lack of binding was not due to misfolding of the mutant, we compared the circular dichroism spectra of the mutant and wild-type proteins, without detecting significant differences (shown in Figure 4B). These observations do not exclude the possibility mentioned by the reviewer but constitute, we believe, solid evidence to validate our observations.
(13) Lines 231-243: Consider changing verbs to past tense (i.e., change "is" to "was") for the purposes of consistency and tempering.
Thanks for the recommendation, we have proceeded as suggested. Please, see lines 249-262 in our revised version of the manuscript.
(14) Lines 306-308: Is there any information about whether it is free VP3 (v. VP3 complexed in RNP) that binds to membrane? I am just trying to wrap my head around how these factories form during infection.
Thanks for pointing this out. We first observed that in infected cells, all the components of the RNPs [VP3, VP1 (the viral polymerase) and the dsRNA] were associated with the endosomes. Since it had already been established by then that VP3 "wraps" the dsRNA within the RNPs (Luque et al., 2009) (DOI: 10.1016/j.jmb.2008.11.029), we reasoned that VP3 was most probably leading this association. We confirmed this by studying the distribution of ectopically expressed VP3, which was also endosome-associated. These results were published in (Delgui et al., 2013) (DOI: 10.1128/jvi.03152-12).
Thus, in our subsequent studies, we have worked with both infection-derived and ectopically expressed VP3 to advance in elucidating the mechanism by which VP3 hijacks the endosomal membranes and its relevance for viral replication, reported in this current manuscript.
(15) Lines 320-334: This last paragraph discussing evolutionary links between birnaviruses and positive-strand RNA viruses seems tangential and distracting. Consider reducing or removing.
Thanks for highlighting this aspect of our work. It may be difficult to follow but, in the context of other evidence reported for the Birnaviridae family of viruses, we strongly believe that there is an evolutionary aspect to the observation that these dsRNA viruses replicate in association with membranous organelles, a hallmark of +RNA viruses. However, we agree with the reviewer that this might not be the main point of our manuscript, so we reduced this paragraph accordingly. Please, see lines 358-367 in our revised version of the manuscript.
(16) Lines 322-324: Change "RdRd" to "RdRp" if keeping paragraph.
Thanks. We have corrected this mistake in lines 360 and 361.
(17) Figures 1A, 1B, and throughout: Again, please check and explain protein sizes and amounts. This would improve the clarity of the manuscript.
All our flotation assays were performed using 1 mM concentration of purified protein in a final volume of 100 mL (mentioned in M&M section). The complete fusion protein His-2xFYVE (shown in Figs. 1A and 4A left panel) is encoded by a 954 bp sequence and contains 317 residues (~35 kDa). The complete fusion protein His-VP3 FL (shown in Figs. 1B and 1G left panel) is encoded by an 861 bp sequence and contains 286 residues (~32 kDa). The complete fusion protein His-VP3 ΔCt (shown in Fig. 1G, right panel) is encoded by a 753 bp sequence and contains 250 residues (~28 kDa). The complete fusion protein His-VP3 FL R<sub>200</sub>D (shown in Fig. 4A right panel) is encoded by an 861 bp sequence and contains 286 residues (~32 kDa). This latter information was incorporated in our revised version of the manuscript. Please, see lines 381-382, 396-397 and 399-400 from the M&M section, and lines in the corresponding figure legends.
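The relationship between the sequence lengths, residue counts and approximate masses quoted above can be checked with a short Python sketch. Two assumptions are made here that are not stated in the response: that each quoted length includes one stop codon, and that an average residue mass of about 110 Da is an acceptable rule of thumb. The function names and dictionary keys are hypothetical.

```python
# Assumed average mass per amino-acid residue (rule of thumb; the real mass
# depends on the actual amino-acid composition of each construct).
AVG_RESIDUE_MASS_DA = 110.0


def residues_from_orf(length_bp, includes_stop_codon=True):
    """Number of residues encoded by an open reading frame of `length_bp`."""
    if length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = length_bp // 3
    return codons - 1 if includes_stop_codon else codons


def approx_mass_kda(n_residues):
    """Rough molecular mass in kDa from the residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# Sequence lengths (bp) as quoted in the response; the keys are labels only.
orf_lengths_bp = {"2xFYVE": 954, "VP3_FL": 861, "VP3_Ct_deletion": 753}
residue_counts = {name: residues_from_orf(bp) for name, bp in orf_lengths_bp.items()}
masses_kda = {name: approx_mass_kda(n) for name, n in residue_counts.items()}
```

Under these assumptions the residue counts match those quoted (317, 286 and 250), and the estimated masses fall within about 1 kDa of the stated ~35, ~32 and ~28 kDa.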
(18) Figures 1B and 1G show different results for PI3P(+) membranes. I see protein associated with the top fraction in 1B, but I don't see any such result in 1G.
As already mentioned, liposome-based methods, such as the co-flotation assay, are well-established and widely regarded as the preferred approach for studying protein-phosphoinositide interactions. However, this approach is rather qualitative, as density gradient separation reveals whether the protein is located in the top fractions (bound to liposomes) or the bottom fractions (unbound). Our quantifications aim to demonstrate differences in the bound fraction between liposome populations with and without PI3P. Given the setting of the co-flotation assays, each protein-liposome system [2xFYVE-PI3P(-), 2xFYVE-PI3P(+), VP3-PI3P(-), or VP3-PI3P(+)] is assessed separately, and even if the conditions are homogeneous, it’s not surprising to observe differences in the protein level between each one. Indeed, the revised version of the manuscript includes a new membrane for Figure 1G, in which His-VP3 FL associated with the top fraction is clearer. Please, see the new version of Figure 1G.
(19) Figure 1C: Please include cryo-EM images of the liposome PI3P(-) variables to assess the visual differences of the liposomal membranes under these conditions.
Thanks for the recommendation. It has been verified that there is no binding of gold particles to liposomes PI3P(-) when they are incubated solely with the gold-particle reagent, or when they are pre-incubated with the gold-particle reagent with either His-2xFYVE or His-VP3 FL. We have incorporated a new panel in Figure 1C showing a representative image of these results. Please, see lines 143-144 in the revised version of our manuscript and our revised version of Figure 1C.
(20) Figures 2D, 2E, and 3A: The puncta are not obvious in these images. Consider adding Zoomed panels.
We apologize for this aspect of Figures 2 and 3, also highlighted by reviewer #1. We believe that this was due to the low quality resulting from the PDF conversion of the original files. For Figure 3A, we have homogenized its aspect with those from 3B. Regarding Figure 2, we have incorporated zoomed panels, as suggested. Please, see the revised versions of both Figures.
(21) Figure 4A: There is almost no protein in the control PI3P(+) blot. Why? Also, the quantification shows no significant membrane association for this control. This result is different from Figure 1A and very confusing (and concerning).
We apologize for the confusion. We replaced the membranes for Figure 4A (left panel) with ones showing band intensities more similar to those in Figure 1A. Please see our new version of Figure 4. It is true that the quantification shows no significant difference in the association to liposomes PI3P(+) compared to liposomes PI3P(-); this is due, once more, to the intrinsic lack of homogeneity of co-flotation assays. However, the experiment shown in Figure 4A is a redundant control (already shown in Figure 1A), and we believe that the new membrane is qualitatively eloquent.
Reviewer #3 (Recommendations For The Authors):
(1) Overall, the title is general and does not summarize the study. I recommend making the title more specific. The current title is better suited for a review as opposed to a research article. This study provides further biophysical details on the interaction. This should be reflected in the title.
We appreciate this recommendation, which was also expressed by reviewer #2. We have chosen a new title for the manuscript: “On the Role of VP3-PI3P Interaction in Birnavirus Endosomal Membrane Targeting”.
(2) References 8,9,10 are important but they were not correctly cited in the work, this should be corrected.
We apologize for this mistake. These citations are identifiable in our revised version of the manuscript. See lines 100-105.
(3) Flotation experiments and cryo-EM convincingly show that VP3 binds to membranes in a PIP3-dependent manner. However, it would be advisable to include a control for cryo-EM using liposomes that do not contain PIP3 but are incubated with HIS-VP3-FL. This would allow us to rule out any unspecific binding that might not be detected on WB.
Thanks for the advice, also given by reviewer #2. We confirmed that no gold particles were bound on liposomes PI3P(-) even when incubated with the Ni-NTA reagent alone or pre-incubated with His-2xFYVE or His-VP3 FL. We have incorporated a new panel to Figure 1C showing a representative image of these results. Please, see lines 143-144 in the revised version of the manuscript and see the revised version of Figure 1C.
(4) It is not clear what is the difference between WB in B and WB in G. Figure 1G seems to show the same experiment as shown in B, is this a repetition? In both cases, plots next to WBs show quantification with bars, do they represent STD or SEM? Legend A mentions significance p>0.01 (**) but the plot shows ***. This should be corrected.
The Western blot membrane in Figure 1B shows the result of a co-flotation assay using the His-VP3 FL protein, while the Western blot membrane in Figure 1G (left panel) shows a co-flotation assay using the His-VP3 FL protein as a positive control. In other words, in 1B the His-VP3 FL protein is the subject of the experiment, while in 1G (left panel) it is the co-flotation positive control for His-VP3 ΔCt. The bar plots next to the Western blots show the quantification as the mean and the STD. Thanks for highlighting the inconsistency in the significance notation. We have now corrected it in the revised version of the manuscript.
(5) It would be useful to indicate positively charged residues and P2 on the AF2 predicted structure in Fig 1.
These are indicated in panels A and B of Figure 2.
(6) Figure 1 legend: Change cryo-fixated liposomes to cryo-fixation or better to "liposomes were vitrified". There is a missing "o" in the cry-fixation in the methods section.
Thanks for the recommendation. We have modified the Figure 1 legend to "liposomes were vitrified" (line 758), and fixed the word cryo-fixation in the methods section (line 512).
(7) Figure 2B. It is not clear how the punctated phenotype was unbiasedly characterized (Figure 2D). I see no difference in the representative images. Magnified images should be shown. This should be measured as colocalization (Pearson's and Mander's coefficient) with an early endosomal marker Rab5. Perhaps this figure could be consolidated with Figure 3.
Unfortunately, the lack of clarity in Figure 2D was due to the PDF conversion of the original files. Please, observe the high-quality original image above in response to reviewer #1, where we have additionally included zoomed panels, as also suggested by the other reviewers. For quantification of the co-localization of VP3 and either EGFP-Rab5 or EGFP-2xFYVE, the Manders M2 coefficient was calculated from approximately 30 cells per construct and experiment; the results were shown in Figure S3 and Figure 3A, respectively, in our previous version of the manuscript.
(8) PIP3 antagonist drugs should be used to further substantiate the results. If PIP3 specifically recruits VP3, this interaction should be abolished in the presence of PIP3 drug and VP3 should show a diffused signal.
We certainly agree with this point. These experiments were performed and the results were reported in (Gimenez et al., 2020). Briefly, in that work, we blocked the synthesis of PI3P in QM7-VP3 cells, a stable QM7 cell line overexpressing VP3, with either the pan-PI3Kinase (PI3K) inhibitor LY294002, or the specific class III PI3K Vps34 inhibitor Vps34-IN1. In Figure 4, we showed that 98% of the cells treated with these inhibitors had the biosensor GFP-2xFYVE dissociated from EEs, evidencing the depletion of PI3P in EEs (Figure 4A). In QM7-VP3 cells, we showed that the depletion of PI3P by either inhibitor caused the dissociation of VP3 from EEs and the disaggregation of VP3 puncta toward a cytosolic distribution (Figure 4B). Moreover, since this observation was crucial for our hypothesis, these results were further confirmed with an alternative strategy to deplete PI3P in EEs. We employed a system to inducibly hydrolyze endosomal PI3P through rapamycin-induced recruitment of the PI3P-myotubularin 1 (MTM1) to endosomes in cells expressing MTM1 fused to the FK506 binding protein (FKBP) and the rapamycin-binding domain fused to Rab5, using the fluorescent proteins mCherry-FKBP-MTM1 and iRFP-FRB-Rab5, as described in (Hammond et al., 2014). These results, shown in Figures 5, 6 and 7 in the same manuscript, further reinforced the notion that PI3P mediates and is necessary for the association of VP3 protein with EEs.
(9) The authors should show the localization of VP3 in IBDV-infected cells and treat cells with PI3P antagonists. The fact that R<sub>200</sub> is not rescued does not necessarily mean that this is because of the failed interaction with PI3P. As the authors wrote in the discussion: VP3 bears multiple essential roles during the viral life cycle (line 305).
Indeed, after confirming that VP3 lost its endosome-associated localization upon treatment of the cells with PI3P antagonists, we demonstrated that depletion of PI3P significantly reduced the production of IBDV progeny. To this end, we used two approaches: the inhibitor Vps34-IN1 and an siRNA against Vps34. In both cases, we observed a significantly reduced production of IBDV progeny (Figures 9 and 10). Specifically related to the reviewer’s question, the localization of VP3 in IBDV-infected cells treated with PI3P antagonists was shown and quantified in Figure 9a.
(10) Could you provide adsorption-free energy profiles and MD simulations also for the R<sub>200</sub> mutant?
Following the reviewer’s suggestion, we have added a new figure to the supplementary information (Figure S15). Instead of presenting a full free-energy profile for each protein, we focused on the adsorption free energy (i.e., the minimum of the adsorption free-energy profile) for VP3 ΔNt and its mutants, VP3 ΔNt R<sub>200</sub>D and VP3 ΔNt P2 Mut, as a function of salt concentration. The aim was to compare the adsorption free energy of the three proteins and evaluate the effect of electrostatic forces on it, which become increasingly screened at higher salt concentrations. As shown in the referenced figure, reducing the number of positively charged residues from VP3 ΔNt to VP3 ΔNt P2 Mut systematically weakens the protein’s binding to the membrane. This effect is particularly pronounced at lower salt concentrations, underscoring the importance of electrostatic interactions in the adsorption of the negatively charged VP3 onto the anionic membrane.
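To give a sense of scale for the screening effect mentioned above, the sketch below uses the textbook Debye-Hückel estimate of the screening length for a 1:1 electrolyte in water at 25 °C, approximately 0.304/√I nm with I in mol/L. This is only a back-of-the-envelope illustration and is not the Molecular Theory calculation used in the manuscript; the function name and the list of salt concentrations are assumptions chosen for the example.

```python
import math


def debye_length_nm(ionic_strength_molar):
    """Approximate Debye screening length (nm) for a 1:1 electrolyte in
    water at 25 degrees C, using the standard estimate 0.304 / sqrt(I)."""
    if ionic_strength_molar <= 0:
        raise ValueError("ionic strength must be positive")
    return 0.304 / math.sqrt(ionic_strength_molar)


# Illustrative monovalent salt concentrations in mol/L (hypothetical values).
salt_concentrations = [0.01, 0.05, 0.15, 0.5]
screening_lengths = [debye_length_nm(c) for c in salt_concentrations]
```

The screening length shrinks as the salt concentration rises, so charge-charge attraction between basic residues and an anionic membrane acts over shorter distances at high salt, consistent with the trend described for the adsorption free energy.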
(11) Liposome deformations in the presence of VP3 are interesting (Figure 6G), were these also observed in Figure 1C?
Good question. The liposome deformations in the presence of VP3 shown in Figure 6G were a robust observation since, as mentioned, they were detectable in 36% of the liposomes PI3P(+), while they were completely absent in PI3P(-) liposomes. Unfortunately, however, the same deformations were not detectable in the experiments performed using gold particles shown in Figure 1C. In this regard, we think that the gold-particle incubation procedure itself, or even the presence of the gold particles in the images, might somehow “mask” the deformation effect.
Bibliography
Boukhalfa A, Roccio F, Dupont N, Codogno P, Morel E. 2021. The autophagy protein ATG16L1 cooperates with IFT20 and INPP5E to regulate the turnover of phosphoinositides at the primary cilium. Cell Rep 35:109045. doi:10.1016/j.celrep.2021.109045
Casañas A, Navarro A, Ferrer-Orta C, González D, Rodríguez JF, Verdaguer N. 2008. Structural Insights into the Multifunctional Protein VP3 of Birnaviruses. Structure 16:29–37. doi:10.1016/j.str.2007.10.023
Delgui LR, Rodriguez JF, Colombo MI. 2013. The Endosomal Pathway and the Golgi Complex Are Involved in the Infectious Bursal Disease Virus Life Cycle. J Virol 87:8993–9007. doi:10.1128/JVI.03152-12
Gimenez MC, Issa M, Sheth J, Colombo MI, Terebiznik MR, Delgui LR. 2020. Phosphatidylinositol 3-Phosphate Mediates the Establishment of Infectious Bursal Disease Virus Replication Complexes in Association with Early Endosomes. J Virol 95:e02313-20. doi:10.1128/jvi.02313-20
Hammond GRV, Machner MP, Balla T. 2014. A novel probe for phosphatidylinositol 4-phosphate reveals multiple pools beyond the Golgi. J Cell Biol 205:113–126. doi:10.1083/jcb.201312072
Khaldoun SA, Emond-Boisjoly MA, Chateau D, Carrière V, Lacasa M, Rousset M, Demignot S, Morel E. 2014. Autophagosomes contribute to intracellular lipid distribution in enterocytes. Mol Biol Cell 25:118. doi:10.1091/mbc.E13-06-0324
Luque D, Saugar I, Rejas MT, Carrascosa JL, Rodríguez JF, Castón JR. 2009. Infectious Bursal Disease Virus: Ribonucleoprotein Complexes of a Double-Stranded RNA Virus. J Mol Biol 386:891–901. doi:10.1016/j.jmb.2008.11.029
Morel E, Chamoun Z, Lasiecka ZM, Chan RB, Williamson RL, Vetanovetz C, Dall’Armi C, Simoes S, Point Du Jour KS, McCabe BD, Small SA, Di Paolo G. 2013. Phosphatidylinositol-3-phosphate regulates sorting and processing of amyloid precursor protein through the endosomal system. Nat Commun 4:1–13. doi:10.1038/ncomms3250
Qi X, Gao Y, Gao H, Deng X, Bu Z, Wang Xiaoyan, Fu C, Wang Xiaomei. 2007. An improved method for infectious bursal disease virus rescue using RNA polymerase II system. J Virol Methods 142:81–88. doi:10.1016/j.jviromet.2007.01.021
www.biorxiv.org
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
Hua et al show how targeting amino acid metabolism can overcome Trastuzumab resistance in HER2+ breast cancer.
Strengths:
The authors used metabolomics, transcriptomics and epigenomics approaches in vitro and in preclinical models to demonstrate how trastuzumab-resistant cells utilize cysteine metabolism.
Thank you for your valuable comments. We would like to extend our appreciation for your efforts. Your constructive suggestions will help improve our research.
Weaknesses:
However, there are some key aspects that need to be addressed.
Major:
(1) Patient Samples for Transcriptomic Analysis: It is unclear from the text whether tumor tissues or blood samples were used for the transcriptomic analysis. This distinction is crucial, as these two sample types would yield vastly different inferences. The authors should clarify the source of these samples.
Thank you for your valuable comments. In the transcriptomic analysis, we included data from HER2-positive breast cancer patients who received trastuzumab in the I-SPY2 trial (GSE181574). Tumor tissues were used in this dataset.
(2) The study only tested one trastuzumab-resistant and one trastuzumab-sensitive cell line. It is unclear whether these findings are applicable to other HER2-positive tumor cell lines, such as HCC1954. The authors should validate their results in additional cell lines to strengthen their conclusions.
Thank you for your valuable comments. We agree that exploring multiple cell lines would make our research findings more comprehensive. This is a limitation of our study, and we will continue to improve our design and methods in future experiments.
(3) Relevance to Metastatic Disease: Trastuzumab resistance often arises in patients during disease recurrence, which is frequently associated with metastasis. However, the mouse experiments described in this paper were conducted only in the primary tumors. This article would have more impact if the authors could demonstrate that the combination of Erastin or cysteine starvation with trastuzumab can also improve outcomes in metastasis models.
Thank you for your valuable comments. We agree with your suggestions. The exploration of metastatic disease would make our research more meaningful and help better address clinical key issues. In our future studies, we will continue to investigate the association between the invasive and metastatic capabilities of trastuzumab resistant HER2 positive breast cancer and cysteine metabolism.
Minor:
(1) The figures lack information about the specific statistical tests used. Including this information is essential to show the robustness of the results.
Thank you for your valuable comments. We will include the statistical information in our figure legends.
(2) Figure 3K Interpretation: The significance asterisks in Figure 3K do not specify the comparison being made. Are they relative to the DMSO control? This should be clarified.
Thank you for your valuable comments. We will clarify the comparison information in our figure legends.
Reviewer #2 (Public review):
In this manuscript, Hua et al. proposed SLC7A11, a protein facilitating cellular cystine uptake, as a potential target for the treatment of trastuzumab-resistant HER2-positive breast cancer. If this claim holds true, the finding would be of significance and might be translated to clinical practice. Nevertheless, this reviewer finds that the conclusion was poorly supported by the data.
Notably, most of the data (Figures 2-6) were based on two cell lines - JIMT1 as a representative of trastuzumab-resistant cell line, and SKBR3 as a representative of trastuzumab sensitive cell line. As such, these findings could be cell-line specific while irrelevant to trastuzumab sensitivity at all. Furthermore, the authors claimed ferroptosis simply based on lipid peroxidation (Figure 3). Cell viability was not determined, and the rescuing effects of ferroptosis inhibitors were missing. The xenograft experiments were also suspicious (Figure 4). The description of how cysteine starvation was performed on xenograft tumors was lacking, and the compound (i.e., erastin) used by the authors is not suitable for in vivo experiments due to low solubility and low metabolic stability. Finally, it is confusing why the authors focused on epigenetic regulations (Figures 5 & 6), without measuring major transcription factors (e.g., NRF2, ATF4) which are known to regulate SLC7A11.
To sum up, this reviewer finds that the most valuable data in this manuscript is perhaps Figure 1, which provides unbiased information concerning the metabolic patterns in trastuzumab-sensitive and primary resistant HER2-positive breast cancer patients.
Thank you for your valuable comments. We agree with your suggestions. Your feedback will help enhance the quality of our research.
(1) Our research was mainly conducted in JIMT1 (trastuzumab resistant) and SKBR3 (trastuzumab sensitive), and this is a limitation of our study. The experimental validation using different cell lines will make our research findings more persuasive. In our future research, we will continuously optimize experimental design and methods to make our findings more comprehensive.
(2) The detection of ferroptosis in our research was mainly performed by evaluating the lipid peroxidation. Experiments measuring cell viability and rescuing effects would help provide more evidence.
(3) In xenograft experiments, cysteine starvation was performed by feeding a cysteine-free diet. The drug dissolution and other conditions were optimized by referring to previous relevant literature. We will clarify these details in our article.
(4) Epigenetic modifications have been recognized as crucial factors in drug resistance formation. An increasing number of studies have emphasized the importance of epigenetic changes in regulating the abnormal expression of oncogenes and tumor suppressor genes related to drug resistance. Currently, the role of epigenetic changes in the development of trastuzumab resistance in HER2-positive breast cancer is still being explored. We tried to investigate the dysregulation of histone modifications and DNA methylation in trastuzumab-resistant HER2-positive breast cancer. Our findings indicated that targeting H3K4me3 and DNA methylation could decrease SLC7A11 expression and induce ferroptosis. This provides more evidence for exploring trastuzumab resistance mechanisms. We will provide a more detailed discussion in the article.
We would like to extend our appreciation for your constructive suggestions and continue to improve our research in future experiments.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
In this manuscript, the authors report that GPR55 activation in presynaptic terminals of Purkinje cells decreases GABA release at the PC-DCN synapse. The authors use an impressive array of techniques (including highly challenging presynaptic recordings) to show that GPR55 activation reduces the readily releasable pool of vesicles without affecting presynaptic AP waveform and presynaptic Ca2+ influx. This is an interesting study, which is seemingly well-executed and proposes a novel mechanism for the control of neurotransmitter release. However, the authors' main conclusions are heavily, if not solely, based on pharmacological agents that more often than not demonstrate affinity at multiple targets. Below are points that the authors should consider in a revised version.
We thank the reviewer for the encouraging comments, and will fully address the reviewer’s concerns as detailed below.
Major points:
(1) There is no clear evidence that GPR55 is specifically expressed in presynaptic terminals at the PC-DCN synapse. The authors cited Ryberg 2007 and Wu 2013 in the introduction, mentioning that GPR55 is potentially expressed in PCs. Ryberg (2007) offers no such evidence, and the expression in PC suggested by Wu (2013) does not necessarily correlate with presynaptic expression. The authors should perform additional experiments to demonstrate the presynaptic expression of GPR55 at PC-DCN synapse.
We agree with the reviewer’s concern that the present manuscript lacks evidence for the localization of GPR55 at PC axon terminals. Honestly, our previous attempt to immunolabel GPR55 did not work well. We now realize that different antibodies are commercially available, and we are going to test them. In the revised manuscript, we hope to present immunocytochemical images showing GPR55 at the terminals of PCs.
(2) The authors' conclusions rest heavily on pharmacological experiments, with compounds that are sometimes not selective for single targets. Genetic deletion of GPR55 would be a more appropriate control. The authors should also expand their experiments with occlusion experiments, showing if the effects of LPI are absent after AM251 or O-1602 treatment. In addition, the authors may want to consider AM281 as a CB1R antagonist without reported effects at GPR55.
We thank the reviewer for pointing out the essential issue regarding the specificity of GPR55 activation in our study. Regarding direct manipulation of GPR55, such as genetic deletion, we will try acute knock-down of its expression, considering the possibility of compensation, which sometimes occurs when a complete knock-out is performed. In addition, following the reviewer’s suggestion, we will examine whether the effects of LPI and AM251 occlude each other, and also perform control experiments showing the lack of CB1R involvement.
(3) It is not clear how long the different drugs were applied, and at what time the recordings were performed during or following drug application. It appears that GPR55 agonists can have transient effects (Sylantyev, 2013; Rosenberg, 2023), possibly due to receptor internalization. The timeline of drug application should be reported, where IPSC amplitude is shown as a function of time and drug application windows are illustrated.
As suggested, the timing and duration of drug application will be indicated together with the time course of changes of IPSC amplitudes. This change will make things much clearer. Thank you for the suggestion.
(4) A previous investigation of the role of GPR55 in the control of neurotransmitter release is neither cited nor discussed: Sylantyev et al. (2013, PNAS, Cannabinoid- and lysophosphatidylinositol-sensitive receptor GPR55 boosts neurotransmitter release at central synapses). Similarities and differences should be discussed.
We are really sorry for missing this important study in discussion and citation. In the revised version, of course, we will discuss their findings and our data.
Minor point:
(1) What is the source of LPI? What isoform was used? The multiple isoforms of LPI have different affinities for GPR55.
We are sorry for the insufficient explanation of the LPI used in our study. We used LPI derived from soy (Merck, catalog #L7635) that was estimated to contain 58% C16:0 and 42% C18:0 or C18:2 LPI. This information will be added to the Materials and Methods in the revised manuscript.
Reviewer #2 (Public review):
Summary:
This paper investigates the mode of action of GPR55, a relatively understudied type of cannabinoid receptor, in presynaptic terminals of Purkinje cells. The authors use demanding techniques of patch clamp recording of the terminals, sometimes coupled with another recording of the postsynaptic cell. They find a lower release probability of synaptic vesicles after activation of GPR55 receptors, while presynaptic voltage-dependent calcium currents are unaffected. They propose that the size of a specific pool of synaptic vesicles supplying release sites is decreased upon activation of GPR55 receptors.
Strengths:
The paper uses cutting-edge techniques to shed light on a little-studied, potentially important type of cannabinoid receptor. The results are clearly presented, and the conclusions are for the most part sound.
We are really happy to hear the encouraging comments from the reviewer.
Weaknesses:
The nature of the vesicular pool that is modified following activation of GPR55 is not definitively characterized.
During revision, we will perform further analysis and additional experiments to obtain deeper insights into the vesicle pools affected by GPR55 as much as possible.
Reviewer #3 (Public review):
Summary:
Inoshita and Kawaguchi investigated the effects of GPR55 activation on synaptic transmission in vitro. To address this question, they performed direct patch-clamp recordings from axon terminals of cerebellar Purkinje cells and fluorescent imaging of vesicular exocytosis utilizing synapto-pHluorin. They found that exogenous activation of GPR55 suppresses GABA release at Purkinje cell to deep cerebellar nuclei (PC-DCN) synapses by reducing the readily releasable pool (RRP) of vesicles. This mechanism may also operate at other synapses.
Strengths:
The main strength of this study lies in combining patch-clamp recordings from axon terminals with imaging of presynaptic vesicular exocytosis to reveal a novel mechanism by which activation of GPR55 suppresses inhibitory synaptic strength. The results strongly suggest that GPR55 activation reduces the RRP size without altering presynaptic calcium influx.
We thank the reviewer for the positive evaluation on our conclusions.
Weaknesses:
The study relies on the exogenous application of GPR55 agonists. It remains unclear whether endogenous ligands released due to physiological or pathological activities would have similar effects. There is no information regarding the time course of the agonist-induced suppression. There is also little evidence that GPR55 is expressed in Purkinje cells. This study would benefit from using GPR55 knockout (KO) mice. The downstream mechanism by which GPR55 mediates the suppression of GABA release remains unknown.
We agree with the reviewer on all of the points raised as weaknesses. Most issues will be made much clearer by the additional experiments and analysis described above in response to the issues raised by the other reviewers. Whether endogenous ligands for GPR55 cause the synaptic depression, and what its downstream mechanism is, are very important issues. We are going to discuss these points in the revised manuscript and would like to work on them in future studies.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Reviewer #1:
In their paper entitled "Combined transcriptomic, connectivity, and activity profiling of the medial amygdala using highly amplified multiplexed in situ hybridization (hamFISH)" Edwards et al. present a new method designated as hamFISH (highly amplified multiplexed in situ hybridization) that enables sequential detection of ≤32 genes using multiplexed branched DNA amplification. As proof-of-principle, the authors apply the new technique - in conjunction with connectivity, and activity profiling - to the medial amygdala (MeA) of the mouse, which is a critical nucleus for innate social and defensive behaviors.
As mentioned by Edwards et al., hamFISH could prove beneficial as an affordable alternative to other in situ transcriptomic methods, including commercial platforms, that are resource-intensive and require complex analysis pipelines. Thus, the authors envision that the method they present could democratize in situ cell-type identification in individual laboratories.
The data presented by Edwards et al. is convincing. The authors use the appropriate and validated methodology in line with the current state-of-the-art. The paper makes a strong case for the benefits of hamFISH when combining transcriptomics studies with connectivity tracing and immediate early gene-based activity profiling. Notably, the authors also discuss the caveats and limitations of their study/approach in an open and transparent manner.
In its current state, the manuscript touches upon a number of most intriguing, yet rather preliminary findings. For example, the roles of inhibitory neuron cluster i3 or of the selective and apparently MeA neuron-specific projections (Figure 3 - Figure Supplement 2D) remain elusive. As it is the authors' prime intent to provide "a proof-of-principle example of overlaying transcriptomic types, projection, and activity in a behaviorally relevant manner and demonstrates the usefulness of hamFISH in multiplexed in situ gene expression profiling", such studies might be beyond the scope of the present manuscript. The absence of such more in-depth hypothesis-based analysis, however, prevents an even more enthusiastic overall assessment.
We thank the reviewer for their positive assessment and agree that further studies are needed to explore and understand the MeA circuit further.
Reviewer #2:
The authors describe the development and implementation of hamFISH, a sensitive multiplexed ISH method. They leverage a pre-existing scRNA-seq dataset for the MeA to design 32 probes that combinatorically represent MeA neuronal populations - ~80% of MeA neurons express three of these markers. Using these markers to assess the spatial organization of the MeA, the authors identify a novel population of Ndnf+ projection neurons and characterize their connectivity with anterograde and retrograde labeling. They additionally combine hamFISH with CTB labeling of three principal MeA projection sites to show that 75% of MeA neurons have only a single projection target. Finally, they engage adult male mice in encounters with other adult males (aggression), females (mating), and pups (infanticide), followed by hamFISH and c-fos labeling to relate cell identity to behavior. Their overall conclusion is that hamFISH-defined cell types are broadly active to multiple sensory stimuli. However, the data presented are not sufficient to conclude that no selectivity exists within the MeA. A weakness of the study is that the selected hamFISH genes contain only Lhx6 as a lineage-marking transcription factor. Instead, the authors predominately use neuropeptides as markers. Genes such as Tac1, Cartpt, Adcyap1, Calb1, and Gal are expressed throughout the MeA, and many other brain regions; they are not restricted to a single transcriptomic cell type and they do not denote any developmental origins. By design, the panel has low cell type specificity as all MeA neurons express at least three of the genes. Therefore, the authors' conclusions may not hold with a more stringent classification of cell type or cell identity.
We agree with the reviewer that a deeper level of cell type classification may reveal the selectivity of cell types that may have been missed. The design of our hamFISH bridge-readout probes allows modification to be compatible with a barcoded readout system such as MERFISH, which would substantially increase the number of genes that can be included in the gene panel. This would, however, increase the complexity of the analysis pipeline and reduce throughput, but would be a potential avenue to explore to define MeA cell types at a deeper level. An advantage of hamFISH is the ease of including and reading out alternative gene panels. For example, one panel could examine developmental-lineage-specific genes. Overall, our panel captures the highest hierarchical level (similar to the subclass level of the Allen taxonomy) of MeA transcriptomic types, based on published data available at the time of our gene panel design. Genes including Tac1, Cartpt, Adcyap1, Calb1, and Gal are expressed in specific patterns within the MeA and are useful for classification. In the original manuscript, we also included our rationale for dropping Foxp2, a lineage-specific marker gene in the MeA.
Reviewer #3:
In this manuscript, Edwards et al. describe hamFISH, a customizable and cost-efficient method for performing targeted spatial transcriptomics. hamFISH utilizes highly amplified multiplexed branched DNA amplification, and the authors extensively describe hamFISH development and its advantages over prior variants of this approach.
The authors then used hamFISH to investigate an important circuit in the mouse brain for social behavior, the medial amygdala (MeA). To develop a hamFISH probe set capable of distinguishing MeA neurons, the authors mined published single-cell RNA-sequencing datasets of the MeA, ultimately creating a panel of 32 hamFISH probes that mostly cover the identified MeA cell types. They evaluated over 600,000 MeA cells and classified neurons into 16 inhibitory and 10 excitatory types, many of which are spatially clustered. The authors combined hamFISH with viral and other circuit tracer injections to determine whether the identified MeA cell populations sent and/or received unique inputs from connected brain regions, finding evidence that several cell types had unique patterns of input and output. Finally, the authors performed hamFISH on the brains of male mice that were placed in behavioral conditions that elicit aggressive, infanticidal, or mating behaviors, finding that some cell populations are selectively activated (as assessed by c-fos mRNA expression) in specific social contexts.
Strengths:
(1) The authors developed an optimized tissue preparation protocol for hamFISH and implemented oligopools instead of individually synthesized oligonucleotides to reduce costs. The branched DNA amplification scheme improved smFISH signal compared to previous methods, and multiple variants provide additional improvements in signal intensity and specificity. Compared to other spatial transcriptomics methods, the pipeline for imaging and analysis is streamlined and is compatible with other techniques like fluorescence-based circuit tracing. This approach is cost-effective and has several advantages that make it a valuable addition to the list of spatial transcriptomics toolkits.
(2) Using 31 probes, hamFISH was able to detect 16 inhibitory and 10 excitatory neuron types in the MeA subregions, including the vast majority of cell types identified by other transcriptomics approaches. The authors quantified the distributions of these cell types along the anterior-posterior, dorsal-ventral, and medial-lateral axes, finding spatial segregation among some, but not all, MeA excitatory and inhibitory cell types. The authors additionally identified a class of inhibitory neurons expressing Ndnf (and a subset of these that express Chrna7) that project to multiple social chemosensory circuits.
(3) The authors combined hamFISH with MeA input and output mapping, finding cell-type biases in the projections to the MPOA, BNST, and VMHvl, and inputs from multiple regions.
(4) The authors identified excitatory and inhibitory cell types, and patterns of activity across cell types, that were selectively activated during various social behaviors, including aggression, mating, and infanticide, providing new insights and avenues for future research into MeA circuit function.
Weaknesses:
(1) Gene selection for hamFISH is likely to still be a limiting factor, even with the expanded (32-probe) capacity. This may have contributed to the lack of ability to identify sexually dimorphic cell types (Figure S2B). This is an expected tradeoff for a method that has major advantages in terms of cost and adaptability.
We recognise that the 32-plex gene detection might not be sufficient to address key questions in the transcriptomic organization of innate social behavior circuits, and that the study fell short of addressing more quantitative gene expression differences between sexes. Detecting sexually dimorphic gene expression likely requires a more targeted approach, because the dimorphism lies in quantitative expression differences rather than binary expression of marker genes, and the gene panel needs to be specifically configured for this purpose.
(2) Adaptation of hamFISH, for example, to adapt it to other brain regions or tissues, may require extensive optimization.
We have successfully performed hamFISH on at least two other mouse brain regions without needing to optimize further, suggesting that compatibility with other mouse brain regions is not an issue. We recognise, however, that optimization of hamFISH may be required for its application in other types of tissue or species. Human brain tissue, for example, typically suffers from high autofluorescence and different tissue preparation methods may need to be employed. We note that the amplification by hamFISH signal boost with v2 amplifiers may be useful to this end.
(3) Pairing this method with behavioral experiments is likely to require further optimization, as c-fos mRNA expression is an indirect and incomplete survey of neuronal activity (e.g. not all cell types upregulate c-fos when electrically active). As such, there is a risk of false negative results that limit its utility for understanding circuit function.
We acknowledge that c-fos is not the only readout of neuronal activity and that a panel of immediate early genes would allow a more comprehensive readout of activity-dependent gene expression. We fully agree that immediate early gene induction is an indirect readout of neural activity, and alternative methods such as in vivo physiology would provide a complementary insight into the selectivity of MeA neuron responses.
(4) The limited compatibility of hamFISH with thicker tissue samples and lack of optical sectioning introduce additional technical limitations. For example, it would be difficult to densely sample larger neural circuits using serial 20 micron sections. Also, because the imaging modality is not clear from the methods, it is difficult to know whether the analysis methods introduce the risk of misattributing gene expression to overlapping cells.
We agree that the use of hamFISH as described here is restricted to thin (<20 um) sections. We have shown, however, that our encoding probe and bridge-readout probe design are compatible with HCR-based mRNA detection, which is compatible with thicker sections. Regarding the misattribution of gene expression to overlapping cells in the z-axis, we used epifluorescence microscopy with 14x 500 nm z-steps to collect our raw data and generate maximum intensity projections for further analysis. Because of the thin sections (10 um) used for the imaging, the overlap between cells in z is expected to be minimal. Regarding throughput, we agree that hamFISH is likely not suitable for brain-wide questions that require large volume coverage, but its major advantage is that it allows routine use of low-level multiplexing for targeted brain areas.
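As an illustration of the projection step mentioned above (and not the authors' actual analysis pipeline), a maximum intensity projection collapses a z-stack into a single image by keeping, at each pixel, the brightest value across all z-planes. The following is a minimal Python sketch under the assumption that the stack is a nested list indexed as [z][y][x]; the function name and the toy values are hypothetical.

```python
def max_intensity_projection(stack):
    """Collapse a z-stack (list of 2D planes, indexed [z][y][x]) into one 2D image."""
    if not stack:
        raise ValueError("stack must contain at least one z-plane")
    height, width = len(stack[0]), len(stack[0][0])
    for plane in stack:
        if len(plane) != height or any(len(row) != width for row in plane):
            raise ValueError("all z-planes must share the same shape")
    # For each (y, x) pixel keep the brightest value seen across all z-planes.
    return [
        [max(plane[y][x] for plane in stack) for x in range(width)]
        for y in range(height)
    ]


# Toy stack: 3 z-planes of 2x2 pixels (illustrative values, not real data).
toy_stack = [
    [[1, 5], [0, 2]],
    [[4, 3], [7, 1]],
    [[2, 6], [0, 9]],
]
projection = max_intensity_projection(toy_stack)
```

Because the projection discards depth information, signals from cells stacked along z would merge into the same pixels, which is why the thinness of the sections matters for the argument above.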
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
This is a comprehensive study that clearly and deeply investigates the function of GATA6 in human early cardiac development.
Strengths:
This study combines hESC engineering, differentiation, detailed gene expression, genome occupancy, and pathway modulation to elucidate the role of GATA6 in early cardiac differentiation. The work is carefully executed and the results support the conclusions. The use of publicly available data is well integrated throughout the manuscript. The RIME experiments are excellent.
Weaknesses:
Much has been known about GATA6 in mesendoderm development, and this is acknowledged by the authors.
We appreciate the comments and have tried to highlight both the early role of GATA6 in cardiac progenitor biology and the haploinsufficiency relevant to human congenital heart disease, which we believe adds value to other recently published work, including Sharma et al., eLife 2020.
Reviewer #2 (Public review):
Summary:
This manuscript by Bisson et al describes the role of GATA6 to regulate cardiac progenitor cell (CPC) specification and cardiomyocyte (CM) generation using human embryonic stem cells (hESCs). The authors found that GATA6 loss-of-function hESC exhibits early defects in mesendoderm and lateral mesoderm patterning stages. Using RNA-seq and CUT&RUN assays the genes of the Wnt and BMP programs were found to be affected by the loss of GATA6 expression. Modulating Wnt and BMP during early cardiac differentiation can partially rescue CPC and CM defects in GATA6 hetero- and homozygous mutant hESCs.
Strengths:
The studies performed were rigorous and the rationale for the experimental design was logical. The results obtained were clear and supported the conclusions that the authors made regarding the role of GATA6 on Wnt and BMP pathway gene expression.
Weaknesses:
Given the wealth of studies that have been performed in this research area previously, the amount of new information provided in this study is relatively modest. Nevertheless, the results are quite clear and should make a strong contribution to the field.
Likewise for reviewer 2, we appreciate the comments and have tried to highlight both the early role of GATA6 in cardiac progenitor biology as well as the haploinsufficiency for relevance to human congenital heart disease.
Reviewer #3 (Public review):
In this study, Bisson et al. analyzed the role of the GATA6 transcription factor in patterning the early mesoderm and generating cardiomyocytes, using human embryonic stem cell differentiation assays and patient-derived hiPSCs with heart defects associated with mutations in the GATA6 gene. They identified a novel role for GATA6 in regulating genes involved in the WNT and BMP pathways, findings not previously noted in earlier analyses of GATA6 mutant hiPSCs during early cardiac mesoderm specification (Sharma et al., 2020). Modulation of the WNT and BMP pathways may partially rescue early cardiac mesoderm defects in GATA6 mutant hESCs. These results provide significant insights into how GATA6 loss-of-function and heterozygous mutations contribute to heart defects.
I have the following comments:
(1) Throughout the manuscript, Bisson et al. alternate between different protocols to generate cardiomyocytes, which creates some confusion (e.g., Figure 1 vs. Supplemental Figure 2A). The authors should provide a clear justification for using alternative protocols.
We agree and clarified this issue in the revision (p. 6). The reviewer is correct that there are two widely used protocols for directed differentiation of PSCs to cardiac fate. One is a cytokine-based protocol (Fig. 1A) and the other uses small molecules to manipulate the WNT pathway (CHIR protocol, Supplemental Fig. 2B). In our study, we used the CHIR protocol only for experiments in Supplemental Figure 2B-E. Since our data implicated BMP and WNT as mediators of the GATA6-dependent program, we did this mainly to confirm that the phenotype we observed with the cytokine-based protocol was not biased by the differentiation protocol. However, we found the CHIR protocol to be overall relatively inefficient for cardiac differentiation using the parental H1 hESCs and the various isogenic lines. The in vitro cardiac differentiation protocols for hPSCs are known to be variable depending on lines and sometimes require extensive optimization for various media components and concentrations, cell seeding densities, and batch variations for crucial reagents. The cytokine-based protocol we optimized worked most efficiently with our hPSC lines to generate cardiomyocytes, therefore we committed to using it for the bulk of experiments in this study.
(2) The authors should characterise the mesodermal identity and cardiomyocyte subtypes generated with the activin/BMP-induction protocol thoroughly and clarify whether defects in the expression of BMP and WNT-related genes affect the formation of specific cardiomyocyte subtypes in a chamber-specific manner. This analysis is important, as Sharma et al. suggested a role for GATA6 in orchestrating outflow tract formation, and Bisson et al. similarly identified decreased expression of NRP1, a gene involved in outflow tract septation, in their GATA6 mutant cells.
We agree it is important that the mesodermal identities are quite thoroughly characterized.
For example, Fig. 2 (K+P+, Brachyury, EOMES), Fig. 3G&H (lateral mesoderm, cardiac mesoderm RNAseq & GSEA comparing datasets from Koh et al.). The capacity of the cytokine-based protocol to generate both FHF and SHF derived sub-types has been rigorously evaluated by Keller and colleagues, which we now cite (Yang et al. 2022). Since the null cells do not generate CMs, chamber specific subtypes cannot be evaluated; whether the GATA6 heterozygous mutants are biased is an interesting question. Indeed, the top GO term identified by CUT&RUN analysis for GATA6 at day 2 of differentiation is outflow tract morphogenesis, which is consistent with the interpretation by Sharma et al., but implicates this program at a much earlier developmental stage, long before cardiomyocyte differentiation. We think this is one of the most important findings of our study and appreciate the chance to highlight this in the revision (p. 9, 17). When we evaluated chamber-specificity for differentiated cardiomyocytes, we did not find significant differences, as indicated for the reviewer in the panel below (day 20 of differentiation). Since our study focuses on early stages of progenitor specification rather than cardiomyocyte differentiation, we agree that a more rigorous analysis would be of value, and indicated this as a limitation of our current study (p. 18).
Author response image 1.
(3) The authors developed an iPSC line derived from a congenital heart disease (CHD) patient with an atrial septal defect and observed that these cells generate cTnnT+ cells less efficiently. However, it remains unclear whether atrial cardiomyocytes (or those localised specifically at the septum) are being generated using the activin/BMP-induction protocol and the patient-derived iPSC line.
As indicated above, our study is focused on cardiac progenitor specification, and we found similar differences with the patient-derived iPSC-CMs compared to using hESC heterozygous targeted mutants. While we did not note any major differences in expression of cardiomyocyte markers, whether the mutants show any biases toward sub-types of cardiomyocytes is an interesting question to be pursued in subsequent work.
(4) The authors should also justify the necessity of using the patient-derived line to further analyse GATA6 function.
This is a good point, and as suggested we provided the justification (p. 5-6). This is the first patient-derived iPSC line published with a heterozygous GATA6 mutation along with an isogenic mutation-corrected control generated for cardiac directed differentiation. Patients with congenital heart disease (CHD) associated with GATA6 mutations are typically heterozygous (also true for many other CHD variants; presumably homozygous null embryos would not survive). It is important to query if phenotypes found using targeted mutations in hESCs (or iPSCs) model the human disease, since the patient cells (or the hESCs) likely have additional genetic variants that might interact with the GATA6 mutation. The fact that both types of heterozygous cells (patient-derived iPSCs and targeted hESCs) generate similar defects in CM differentiation provides evidence supporting the use of these human cellular models to study the genetic and cellular basis for congenital heart disease. This is particularly important, since other models, such as heterozygous mice, do not show such phenotypes.
(5) Figure 3 suggests an enrichment of paraxial mesoderm genes in the context of GATA6 loss-of-function, which is intriguing given the well-established role of GATA6 in specifying cardiac versus pharyngeal mesoderm lineages in model organisms. Could the authors expand their analysis beyond GO term enrichment to explore which alternative fates GATA6 mutant cells may acquire? Additionally, how does the potential enrichment of paraxial mesoderm, rather than pharyngeal mesoderm, relate to the initial mesodermal induction from their differentiation protocol? Could the authors also rule out the possibility of increased neuronal cell fates?
We need to interpret our in vitro differentiation data cautiously in relation to what has been shown in vivo, since we are unlikely to be reproducing all the complex signaling taking place in the embryo. Yet we do see modest increases in gene expression levels including signatures of paraxial mesoderm and ECM/mesenchymal at days 2 or 3 of differentiation in the GATA6 mutant cells. Therefore, we now include a heatmap showing enriched paraxial mesoderm gene expression in the mutant cells, new Fig. 3I (see page 10).
A caveat of this result is that the cells are being differentiated toward cardiac fate, so a bias for alternative fates might be suppressed. We modified the protocol to favor paraxial fate by adding CHIR at day 2 (rather than XAV) and performing qPCR assays at day 3. We found this successfully induced paraxial mesoderm gene expression, but equally in wildtype, heterozygous, and null cells, so we do not feel it warrants highlighting further.
Recommendations for the authors:
Reviewing Editor (Recommendations for the authors):
Incorporation of marker analysis for various stages of iPSC to CM differentiation (mesoderm, cardiac progenitor, CM subtypes) would increase the significance and support for the findings presented. Further data on the link (direct or indirect) between GATA6 and Wnt/BMP signalling would also add to the significance of this study. A number of textual changes/clarifications are also suggested to improve the manuscript.
We appreciate the feedback and provide responses for issues raised for markers, direct or indirect interactions, and textual changes/clarifications in the following sections. As indicated above, we did not find obvious alterations in cardiac subtypes, but since our study is focused on early progenitor specification, this is an interesting question that we think should be more rigorously evaluated in subsequent work.
Reviewer #1 (Recommendations for the authors):
Minor details:
(1) On p6 "Principal component analysis (PCA) showed that the cells derived from each genotype were well separated from each other (Supplemental Figure 2C)". All genotypes should be in one PCA plot to better evaluate the three genotypes.
We prepared the new plot as suggested, presented as new Supplemental Fig. 2C.
(2) p10: "Chia et al.22 and found a significantly decreased enrichment in GATA6-/- cells relative to WT at day 2" decreased enrichment of what? Direct target genes?
Thank you for catching this. Yes, the text was changed to indicate a “decreased enrichment in GATA6-/- cells relative to WT at day 2 for putative direct GATA6 target genes.”
Reviewer #2 (Recommendations for the authors):
Overall, this is an interesting study that addresses the early developmental roles of GATA6 on cardiac differentiation. While the identification of Wnt and BMP pathway genes to be involved in GATA6 regulation is not entirely unexpected, the authors do bring forth some useful knowledge that helps to further elucidate the mechanism of pre-cardiac mesoderm regulation. Some suggestions for improvement are included below -
Major points:
(1) Since the loss of Gata6 in this study is global (either as heterozygous or homozygous), it is likely that the very early requirement of Gata6 (e.g. mesodermal stage of differentiation) is responsible for the cardiac transcriptional phenotype observed and not due to a specific role of Gata6 in the cardiac lineage, which would need to be addressed using conditional knock out of Gata6 in hPSC model. The authors should be more explicit when discussing the results as disruption of mesodermal differentiation leading to loss of downstream cardiac lineage cells. For example, I would change the title "GATA6 loss-of-function impairs CM differentiation" to "GATA6 loss-of-function impairs mesodermal (or mesodermal lineage) differentiation" and show the changes in cardiac progenitor cells genes (Isl1, Tbx1, Hand1, and BAF60c/Smarcd3) in addition to cardiomyocyte genes but no change in mesodermal (e.g. Brachyury, T, Eomes, Mesp1/2, etc) genes.
We agree with the reviewer’s interpretation. The title for the section was changed as suggested. In Fig. 1, we show changes in cardiac progenitor cell genes (Isl1, Hand1, and BAF60c/Smarcd3) while not seeing changes in mesodermal genes in Fig. 2 (e.g. Brachyury, Eomes, Mesp1/2). We note that the defect may be specific to cardiac (or anterior lateral) mesoderm, as the ability to express paraxial mesoderm markers was not impaired.
(2) The use of NKX2.5, TBX5, TBX20, and GATA4 as markers for CPC is not ideal. These markers are also expressed in differentiated cardiomyocytes. ISL1 or TBX1 for second heart field progenitors and HAND1 or BAF60c/Smarcd3 for first heart field progenitors would be ideal.
As suggested, we included an additional day 6 qPCR panel (new Fig. 1E) to evaluate the heart field progenitor markers.
(3) Much of the findings described in this study have been known in the field including the requirement of Wnt and BMP to induce mesodermal and subsequently cardiomyocyte differentiation. The key new information here is that Gata6 knockout disrupts Wnt and BMP signaling. It would help to further validate experimentally some of the Wnt and BMP genes as either direct or indirect targets of Gata6 using reporter assays.
While reporter assays are feasible and do provide relevant outputs, we feel that the use of any one or even several response elements in a reporter assay adds relatively little value compared to comprehensive analysis of bona fide network components. To address the reviewer's concern, we have included profiling heat maps for WNT and BMP pathway components to more rigorously and specifically evaluate the disruption in the signaling networks caused by loss of GATA6. Proving direct targets of endogenous genes is challenging, but we mapped many binding peaks for GATA6 to putative enhancers of WNT/BMP pathway genes (based on histone marks). We provide a list of these genes (new Fig. 4F) and distinguish these from WNT/BMP pathway genes that were not bound by GATA6 yet are down-regulated in the GATA6 mutant cells and are likely to be indirect targets (p. 12).
Minor points:
(1) Figures 1 and 2 - in the figure legend the labels w2, w4, m2, m5, m11, and m14 should be explained as the name of the clones of targeted hESC.
The legends were edited to provide this information.
(2) Supplemental Figure 3A - the resolution of the FACS plot is suboptimal.
We apologize and have corrected the plot resolution in the revised manuscript.
(3) Supplemental Table 1 - it's intriguing that amongst all the SWI/SNF factors, the one that is known to be cardiac-specific (SMARCD3) did not come up in the GATA6-RIME-enriched proteins. Is this a reflection of the early stage in which GATA6 plays a role in development (e.g. mesendoderm development but not precardiac mesoderm development when SMARCD3 is expressed)?
We agree and have noted this feature in the revised manuscript (p. 17). We note that SMARCD3 is expressed in the RNA-seq data as early as day 2. Although speculative, it may be that GATA6 primarily interacts with SWI/SNF complexes prior to the role for SMARCD3 in cardiac specification.
Reviewer #3 (Recommendations for the authors):
(1) Figures 3G and 3H, as well as others, have resolution issues. The gene names are unreadable, and higher-resolution images should be provided.
We apologize for the resolution issues and these have been fixed in the revised version.
(2) In their early manipulation of the WNT and BMP pathways (Figure 6A), it is unclear whether the activin/BMP protocol shown in Figure 1A was used. If this is the case, the authors should compare their results to a wild-type + DOX EV condition for consistency.
We clarified in the revision (Fig. 6A) that all the experiments in Fig. 6 use the cytokine protocol. In the revised figure, we included the wild-type + DOX EV condition as suggested.
(3) In Figures 6C and 6D, the authors should include an analysis of a wild-type isogenic line under their new CHIR/LB condition for comparison.
As suggested, we included the WT isogenic line in the comparison. For Fig. 6C these are shown on a separate graph because the Y-axis values are very different. Note that the CHIR/LB treatments that improve mutant cell differentiation impact the WT cells in the opposite manner.
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
The study by Pudlowski et al. investigates how the intricate structure of centrioles is formed by studying the role of a complex formed by delta- and epsilon-tubulin and the TEDC1 and TEDC2 proteins. For this, they employ knockout cell lines, EM, and ultrastructure expansion microscopy as well as pull-downs. Previous work has indicated a role of delta- and epsilon-tubulin in triplet microtubule formation. Without triplet microtubules centriolar cylinders can still form, but are unstable, resulting in futile rounds of de novo centriole assembly during S phase and disassembly during mitosis. Here the authors show that all four proteins function as a complex and knockout of any of the four proteins results in the same phenotype. They further find that mutant centrioles lack inner scaffold proteins and contain an extended proximal end including markers such as SAS6 and CEP135, suggesting that triplet microtubule formation is linked to limiting proximal end extension and formation of the central region that contains the inner scaffold. Finally, they show that mutant centrioles seem to undergo elongation during early mitosis before disassembly, although it is not clear if this may also be due to prolonged mitotic duration in mutants.
Strengths:
Overall this is a well-performed study, well presented, with conclusions mostly supported by the data. The use of knockout cell lines and rescue experiments is convincing.
Weaknesses:
In some cases, additional controls and quantification would be needed, in particular regarding cell cycle and centriole elongation stages, to make the data and conclusions more robust.
We thank the reviewer for these comments and have improved our analyses of these as detailed below.
Reviewer #2 (Public Review):
Summary:
In this article, the authors study the function of TEDC1 and TEDC2, two proteins previously reported to interact with TUBD1 and TUBE1. Previous work by the same group had shown that TUBD1 and TUBE1 are required for centriole assembly and that human cells lacking these proteins form abnormal centrioles that only have singlet microtubules that disintegrate in mitosis. In this new work, the authors demonstrate that TEDC1 and TEDC2 depletion results in the same phenotype with abnormal centrioles that also disintegrate into mitosis. In addition, they were able to localize these proteins to the proximal end of the centriole, a result not previously achieved with TUBD1 and TUBE1, providing a better understanding of where and when the complex is involved in centriole growth.
Strengths:
The results are very convincing, particularly the phenotype, which is the same as previously observed for TUBD1 and TUBE1. The U-ExM localization is also convincing: despite a signal that's not very homogeneous, it's clear that the complex is in the proximal region of the centriole and procentriole. The phenotype observed in U-ExM on the elongation of the cartwheel is also spectacular and opens the question of the regulation of the size of this structure. The authors also report convincing results on direct interactions between TUBD1, TUBE1, TEDC1, and TEDC2, and an intriguing structural prediction suggesting that TEDC1 and TEDC2 form a heterodimer that interacts with the TUBD1-TUBE1 heterodimer.
Weaknesses:
The phenotypes observed in U-ExM on cartwheel elongation merit further quantification, enabling the field to appreciate better what is happening at the level of this structure.
We thank the reviewer for these comments and have improved our analyses of cartwheel elongation as detailed below.
Reviewer #3 (Public Review):
Summary:
Human cells deficient in delta-tubulin or epsilon-tubulin form unstable centrioles, which lack triplet microtubules and undergo a futile formation and disintegration cycle. In this study, the authors show that human cells lacking the associated proteins TEDC1 or TEDC2 have these identical phenotypes. They use genetics to knock out TEDC1 or TEDC2 in p53-negative RPE-1 cells and expansion microscopy to structurally characterize mutant centrioles. Biochemical methods and AlphaFold-multimer prediction software are used to investigate interactions between tubulins and TEDC1 and TEDC2.
The study shows that mutant centrioles are built only of A tubules, which elongate and extend their proximal region, fail to incorporate structural components, and finally disintegrate in mitosis. In addition, they demonstrate that delta-tubulin or epsilon-tubulin and TEDC1 and TEDC2 form one complex and that TEDC1 and TEDC2 can interact independently of tubulins. Finally, they show that the localization of four proteins is mutually dependent.
Strengths:
The results presented here are mostly convincing, the study is exciting and important, and the manuscript is well-written. The study shows that delta-tubulin, epsilon-tubulin, TEDC1, and TEDC2 function together to build a stable and functional centriole, significantly contributing to the field and our understanding of the centriole assembly process.
Weaknesses:
The ultrastructural characterization of TEDC1 and TEDC2 obtained by U-ExM is inconclusive. Improving the quality of the signals is paramount for this manuscript.
We thank the reviewer for these comments and have improved our imaging of TEDC1 and TEDC2 localization, as detailed below.
Recommendations for the authors:
Reviewing Editor (Recommendations For The Authors):
The reviewers agreed that the conclusions are largely supported by solid evidence, but felt that improving the following aspects would make some of the data more convincing:
(1) The UExM localizations of TEDC1/2 are not very convincing and the reviewers suggest to complement these with alternative super-resolution approaches (e.g. SIM) and/or different labeling techniques such as pre-expansion labeling using STAR red/orange secondaries (also robust for SIM and STED), use of the Halo tag, different tag antibodies, etc
We thank the reviewers for these recommendations and have adapted two of these strategies to improve our imaging of TEDC1 and TEDC2 localization. First, we used an alternative super-resolution approach, a Yokogawa CSU-W1 SoRA confocal scanner (resolution = 120 nm) and imaged cells grown on coverslips (not expanded). We found that TEDC1 and TEDC2 localize to procentrioles and the proximal end of parental centrioles (Fig 2 – Supplementary Figure 1a, b). Second, we used a recently described expansion gel chemistry (Kong et al., Methods Mol Biol 2024) combined with Abberior Star red and orange secondary antibodies. This technique resulted in robust signal at centrosomes and in the cytoplasm and indicated that TEDC1 and TEDC2 localize near the centriole walls of procentrioles and the proximal region of parental centrioles, near CEP44 (Fig 2 – Supplementary Figure 1c, d). These results complement and support our initial observations (Fig 2C, D) and we have edited the text to reflect this (lines 157-163). We also note that these Flag tag and V5 tag primary antibodies are specific and have little background signal in all applications (Fig 2 – Supplementary Fig 1E-J), while other commercially available antibodies against these tags did exhibit non-specific signal.
(2) The cell cycle classifications of centrioles would strongly benefit, apart from a better description, from adding quantifications of average centriole length at a given stage based on tubulin staining (not acTub).
We thank the reviewers for these recommendations. We have added an improved description of our cell cycle analyses (lines 234-237). We have also added new analyses for centriole length as measured by staining with alpha-tubulin (Fig 4 – Supp 3 and Fig 4 – Supp 4). We find that in all mutants, acetylated tubulin elongates along with alpha-tubulin in a similar way as in control centrioles.
Reviewer #1 (Recommendations For The Authors):
Specific points:
(1) The introduction is a bit oddly structured. About halfway through it summarizes what is going to be presented in the study, giving the impression that it is about to conclude, but then continues with additional, detailed introduction paragraphs. Overall, the authors may also want to consider making it more concise.
We thank the reviewer for these suggestions and have shortened and restructured the introduction for clarity and conciseness.
(2) The text should explain to the non-expert reader why endogenous proteins are not detected and why exogenously expressed, tagged versions are used. Related to this, the authors state overexpression, but what is this assessment based on? Does expression at the endogenous level also rescue? At least by western blot, these questions should be addressed.
In the text, we have added clarification about why endogenous proteins were not detected for immunofluorescence (lines 149-151). To quantify the overexpression, we have added Western blots of TEDC1 and TEDC2 to Fig 1 – Supplementary Figure 1E,F. We note that endogenous levels of both proteins are very low, and the rescue constructs are overexpressed 20 to 70 fold above endogenous levels.
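The fold-overexpression estimate described above can be illustrated with a short calculation. This is only a sketch of one common way to derive such a ratio from Western blot band intensities, assuming each band is first normalized to a loading control from the same lane; the function name, the normalization scheme, and the numbers in the usage note are hypothetical and are not taken from the manuscript.

```python
def fold_over_endogenous(rescue_band, rescue_loading, endo_band, endo_loading):
    """Estimate fold overexpression of a rescue construct from band intensities.

    Assumption (not stated in the response): each band intensity is divided by
    the loading-control intensity of its own lane before the ratio is taken.
    """
    values = (rescue_band, rescue_loading, endo_band, endo_loading)
    if any(v <= 0 for v in values):
        raise ValueError("all band intensities must be positive")
    rescue_norm = rescue_band / rescue_loading  # rescue signal per unit loading
    endo_norm = endo_band / endo_loading        # endogenous signal per unit loading
    return rescue_norm / endo_norm
```

For instance, with placeholder intensities of 5000 (rescue) and 250 (endogenous) and equal loading-control signal in both lanes, the function returns 20.0, i.e. a 20-fold overexpression.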
(3) The figures should clearly indicate when tagged proteins are used and detected. Currently, this info is only found in the legends but should be in the figure panels as well.
We have made these changes to the figure panels in Fig 2, Fig 2 – Supp 1, and Fig 3.
(4) I could not find a description and reference to Figure 2 Supplement 2 and 3.
We have replaced these supplements with new supplementary figures for TEDC1 and TEDC2 localization (Fig 2 – Supp 1).
(5) The multiple bands including unspecific (?) bands should be labeled to guide the reader in the western blots.
We have labeled nonspecific bands in our Western blots with asterisks (Fig 1 – Supp 1, Fig 3).
(6) The AlphaFold prediction suggests that TUBD1 can bind to the TED complex in the absence of TUBE1; can this be shown? This would be a nice validation of the predicted architecture of the complex. I also missed a bit of a discussion of the predicted architecture. How could it be linked to triplet microtubule formation? Is the latest AlphaFold version 3 adding anything to this analysis?
In our pulldown experiments, we found that TUBD1 cannot bind to TEDC1 or TEDC2 in the absence of TUBE1 (Fig 3C, D, IB: TUBD1). We performed this experiment with three biological replicates and found the same result. It is possible that TUBD1 and TUBE1 form an intact heterodimer, similar to alpha-tubulin and beta-tubulin, and this will be an exciting area of future research.
We have added new analysis from AlphaFold3 (Fig 3 – Supp 1B). AlphaFold3 predicts a structure similar to that predicted by AlphaFold Multimer.
We have also added additional discussion about the AlphaFold prediction to the text (lines 220-222, 365-367). Thanks to the reviewer for pointing out this oversight.
(7) I suggest briefly explaining in the text how cells and centrioles at different cell cycle stages were identified. I found some info in the legend of Figure 1, but no info for other figures or in the text. Related to this, how are procentrioles defined in de novo formation? There is no parental centriole to serve as a reference.
We have added a brief explanation of the synchronization and identification in lines 234-237. We have also clarified the text regarding de novo centrioles, and now term these “de novo centrioles in the first cell cycle after their formation” (lines 271-272).
(8) Related to point 7: using acetylated tubulin as a universal length and width marker seems unreliable since it is a PTM. The authors should use general tubulin staining to estimate centriole dimensions, or at least establish that acetylated tubulin correlates well with the overall tubulin signal in all mutants.
We have added two supplementary data figures (Fig 4 – supp 3 and Fig 4 – supp 4) in which we co-stain control and mutant centrioles with alpha-tubulin. We found that acetylated tubulin marked mutant centrioles well and as alpha-tubulin length increased, acetylated tubulin length also increased.
(9) Presence and absence of various centriolar proteins. These analyses lack a clear reference for the precise centriole elongation stage. This is particularly problematic for proteins that are recruited at specific later stages (such as inner scaffold proteins). The staining should be correlated with centriole length measurements, ideally using general tubulin staining.
As described for point 8, we have added two supplementary data figures in which we co-stain control and mutant centrioles with alpha-tubulin and found that acetylated tubulin also increases as overall tubulin length increases in all mutants. We note that inner scaffold proteins are absent in all our mutant centrioles at all stages of the cell and centriole cycle, as also previously reported for POC5 in Wang et al., 2017.
Reviewer #2 (Recommendations For The Authors):
Here's a list of points I think could be improved:
- As the authors previously published, the centriole appears to have a smaller internal diameter than mature centrioles. Could the authors measure to see if the phenotype is identical? Is the centriole blocked in the bloom phase (Laporte et al. 2024)?
We have added an additional supplementary figure (Fig 4 – supp 5) to show that mutant centrioles have smaller diameters than mature centrioles, as we previously reported for the delta-tubulin and epsilon-tubulin mutant centrioles by EM. We thank the reviewers for the additional question of the bloom phase. Given the comparatively smaller number of centrioles we analyzed in this paper compared to Laporte et al (50 to 80 centrioles per condition here, versus 800 centrioles in Laporte et al), it is difficult to definitively conclude whether there is a block in bloom phase. This would be an interesting area for future research.
- The images of the centrioles in EM are beautiful. Would it be possible to apply a symmetrisation on it to better see the centriolar structures? For example, is the A-C linker present?
We thank the reviewer for this excellent suggestion. Using centrioleJ, we find that the A-C linker is absent from mutant centrioles. The symmetrized images have been added to Fig 1 – Supplementary Fig 2, and additional discussion has been added to the text (lines 143-144, lines 368-374).
- How many EM images were taken? Did the centrioles have 100% A-microtubule only or sometimes with B-MT?
For TEM, we focused on centrioles that were positioned to give perfect cross-section images of the centriolar microtubules, and thus did not take images of off-angle or rotated centrioles. Given the difficulty of this experiment (centrioles are small structures within the cell, centrosomes are single-copy organelles, and off-angle centrioles were not imaged), we were lucky to image 3 centrioles that were in perfect cross-section – 2 for Tedc1-/- and 1 for Tedc2-/-. Our images indicate that these centrioles only have A-tubules (Fig 1 – Supp Fig 2).
- In Figure 2 - it would be preferable to write TEDC2-flag or TEDC1-flag and not TEDC2/1.
We have made this change.
- It seems that Figures 2C and D aren't cited, and some of the data in the supplemental data are not described in the main text.
We have replaced these supplements with new supplementary figures for TEDC1 and TEDC2 localization (Fig 2 – Supp 1).
- The signal in U-ExM with the anti-Flag antibody is heterogeneous. Did the authors test several anti-FLAG antibodies in U-ExM?
We tested several anti-Flag and anti-V5 antibodies for our analyses, and chose these because they have little background signal in all applications (Fig 2 – Supplementary Fig 1E-J). Other commercially available antibodies against these tags did exhibit non-specific signal.
- The AlphaFold prediction is difficult to interpret, the authors should provide more views and the PDB file.
We have added 2 additional views of the AlphaFold prediction in Fig 3 – Supp 1A.
- In general, but particularly for Figure 4: the length doesn't seem to be divided by the expansion factor, it is therefore difficult to compare with known EM dimensions. Can the authors correct the scale bars?
We have corrected the scale bars for all figures to account for the expansion factor.
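The correction discussed here amounts to dividing every post-expansion measurement by the expansion factor so that lengths can be compared with known EM dimensions. A minimal sketch of that arithmetic follows; the function names and the example values are hypothetical, and the actual expansion factor used in the study is not given in this response.

```python
def to_pre_expansion_nm(measured_nm, expansion_factor):
    """Convert a length measured in the expanded gel to its pre-expansion size."""
    if expansion_factor <= 0:
        raise ValueError("expansion_factor must be positive")
    return measured_nm / expansion_factor


def scale_bar_label_nm(bar_length_px, pixel_size_nm, expansion_factor):
    """Return the pre-expansion length (nm) that a scale bar should be labeled with.

    bar_length_px: scale bar length in image pixels
    pixel_size_nm: physical pixel size of the acquired (expanded) image
    expansion_factor: linear expansion factor of the gel (assumed to be measured)
    """
    return to_pre_expansion_nm(bar_length_px * pixel_size_nm, expansion_factor)
```

As an illustration with placeholder values, a structure that measures 2000 nm in an expanded sample with a 4-fold linear expansion corresponds to 500 nm before expansion.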
- Concerning Gamma-tubulin that is "recruited to the lumen of centrioles by the inner scaffold, had localization defects in mutant centrioles. However, we were unable to reliably detect gamma-tubulin within the lumen of control or de novo-formed centrioles in S or G2-phase (Figure 4 - Supplement 1E), and thus were unable to test this hypothesis". In Laporte et al 2024, Gamma-tubulin arrives later than the inner scaffold and only on mature centrioles, so this result appears to be in line with previous observation. However, the authors should be able to detect a proximal signal under the microtubules of the procentriole, is this the case?
We agree that this is an exciting question. However, in our expansion microscopy staining, we frequently observe that gamma-tubulin surrounds centrioles, corresponding to its role in the pericentriolar material (PCM). In our hands, we find it difficult to distinguish between centriolar gamma-tubulin at the base of the A-tubule from gamma-tubulin within the PCM.
- In the signal elongation of SAS-6, STIL, CEP135, CPAP, and CEP44, would it be possible to quantify the length of these signals (with dimensions divided by the expansion factor for comparison with known TEM distances)?
We have quantified the lengths of SAS-6 and CEP135 in new Fig 4 – Supp 3 and Fig 4 – Supp 4.
- The authors observe that centrin is present, but only as a SFI1 dot-like localization (which is another protein that would be interesting to look at), and not an inner scaffold localization. Can the authors elaborate? These results suggest that the distal part is correctly formed with only a microtubule singlet.
We agree with the reviewer’s interpretation that the centriole distal tip is likely correctly formed with only singlet microtubules, as both distal centrin and CP110 are present. We have added this point to the discussion (line 415).
-The authors observe that CPAP is elongated, but CPAP has two locations, proximal and distal. Is it distal or proximal elongation? Is the proximal signal of CPAP longer than that of CEP44 in the mutants? The authors discuss that the elongation could come from overexpression of CPAP, but here it seems that the centriole is not overlong, just the structures around the cartwheel.
We thank the reviewer for this point. It is difficult for us to conclude whether the proximal or distal region is extended in the mutants, as our mutant centrioles lack a visible separation between these two regions. It would be interesting to probe this question in the future by testing whether subdomains of CPAP may be differentially regulated in our mutants.
Reviewer #3 (Recommendations For The Authors):
It isn't apparent to me what was counted in Figure 1C. Were all centrioles (mother centrioles and procentrioles) counted? Where is the 40% in control cells coming from? Can this set of data be presented differently?
We apologize for the confusion. In this figure, all centrioles were counted. We have updated the figure legend for clarity. We performed this analysis in a similar way as in Wang et al., 2017 to better compare phenotypes.
Figure 2C and the text lines 182-187: The ultrastructural characterization of TEDC1 and TEDC2 suffers from the low quality of the TEDC1 and TEDC2 signals obtained post-expansion. In comparison with robust low-resolution immunosignal, it appears that most of the signal cannot be recovered after expansion. Another sub-resolution imaging method to re-analyze TEDC1 and TEDC2 localization would be essential. The same concern applies to Figures 2 - Supplement 2 and 3. Also, Figure 2 - Supplement 2 and Supplement 3 do not seem to be cited.
We thank the reviewer for these recommendations. As also mentioned above, we used an alternative super-resolution approach, a Yokogawa CSU-W1 SoRA confocal scanner (resolution = 120 nm), and found that TEDC1 and TEDC2 localize to procentrioles and the proximal end of parental centrioles (Fig 2 – Supplementary Figure 1a, b). Second, we used a recently described expansion gel chemistry (Kong et al., Methods Mol Biol 2024) combined with Abberior Star red and orange secondary antibodies. This technique resulted in robust signal at centrosomes and in the cytoplasm and indicated that TEDC1 and TEDC2 localize near the centriole walls of procentrioles and the proximal region of parental centrioles, near CEP44 (Fig 2 – Supplementary Figure 1c, d). These stainings complement and support our initial observations (Fig 2C, D) and we have edited the text to reflect this (lines 157-163). We have also removed the supplementary figures that were uncited in the text.
TUBD1 and TUBE1 form a dimer and TEDC2 and TEDC1 can interact. Any speculation as to why TEDC2 does not pull down both TUBE1 and TUBD1?
We apologize for the confusion. TEDC2 does pull down both TUBE1 and TUBD1 (Fig 3D, pull-down, second column, Tedc2-V5-APEX2 rescuing the Tedc2-/- cells pulls down TUBD1, TUBE1, and TEDC1).
Figure 4A and B. The authors use acetylated tubulin to determine the length of procentrioles in the S and G2 phases. However, procentrioles are not acetylated on their distal ends in these cell cycle phases (as the authors also mention further in the text). Why has alpha tubulin not been used since it works well in U-ExM? The average size of the control, G2 procentrioles, seems too small in Figure 4A and not consistent with other imaging data (for instance, in Figure 4 - Supplement 1 C, Cp110, and CPAP staining). There is no statistical analysis in Figure 4A.
We have added two supplementary data figures (Fig 4 – supp 3 and Fig 4 – supp 4) in which we co-stain control and mutant centrioles with alpha-tubulin. We found that acetylated tubulin correlates well with overall tubulin signal in all mutants. We have added statistical analysis to the figure legend of Fig 4A.
Lines 260 - 262: "These results indicate that centrioles with singlet microtubules can elongate to the same length as controls, and therefore that triplet microtubules are not essential for regulating centriole length." It is hard to agree with this statement. Mutant procentrioles show aberrantly elongated proximal signals of several tested proteins. In addition, in lines 326 - 328, the authors state that "Together, these results indicate that centrioles lacking compound microtubules are unable to properly regulate the length of the proximal end."
We thank the reviewer and have clarified the statement to state that these results indicate that centrioles with singlet microtubules can elongate to the same overall length as control centrioles in G2 phase.
Line 353: The authors suggest that elongated procentriole structure in mitosis may represent intermediates in centriole disassembly. Another interpretation, more in line with the EM data from Wang et al., 2017, would be that these mutant procentrioles first additionally elongate before they disassemble in late mitosis. The aberrant intermediate structure concept would need further exploration. For instance, anti-alpha/beta-tubulin antibodies could be used to investigate centriole microtubules.
We apologize for the confusion and have edited this section for clarity (lines 341-343): “We conclude that in our mutant cells, centrioles elongate in early mitosis to form an aberrant intermediate structure, followed by fragmentation in late mitosis.”
References need to be included in lines 122, 277, 279.
We have added these references.
Line 281: Add references PMID: 30559430 and PMID: 32526902.
We have added these references (lines 265-266).
Line 289: "Moreover, our results suggest that centriole glutamylation is a multistep process, in which long glutamate side chains are added later during centriole maturation." This does not seem like an original observation. For instance, see PMID: 32526902.
We have added this reference (lines 273-274).
Author response:
Reviewer 1:
(1) Provide RMSD and DALI scores to show how similar the AlphaFold-predicted structures of BrrG are to other anti-termination factors. This should be done for Fig1B and also for Suppl. Fig 1 to support the claim that BrrG, GafA, GafZ, Q21 share structural features.
In the revised manuscript we will provide RMSD and DALI scores.
(2) Throughout the manuscript, flow cytometry data of gfp expression was used and shown as a single replicate. Korotaev et al wrote in the legends that error bars are shown (that is not true for e.g. Figs. 3, 4, and 5). It is difficult for reviewers/readers to gauge how reliable their experiments are.
As stated in the manuscript, all flow cytometry experiments were performed in triplicate. In the revised manuscript we will include the two replicates not presented in the main figures as supplementary information.
(3) I am unsure how ChIP-seq in Fig. 2A was performed (with anti-FLAG or anti-HA antibodies? I cannot tell from the Materials & Methods). More importantly, I did not see the control for this ChIP-seq experiment. If a FLAG-tagged BrrG was used for ChIP-seq, then a WT non-tagged version should be used as a negative control (not sequencing INPUT DNA), this is especially important for anti-terminator that can co-travel with RNA polymerase. Please also report the number of replicates for ChIP-seq experiments.
Fig. 2A presents a coverage plot from the ChIP-Seq of ∆brrG +pTet:brrG-3xFLAG (N’). A replicate of this N-terminally tagged construct will be added as supplementary data in the revised version. As anticipated by the referee, we had used ∆brrG +pTet:brrG (untagged) as control.
(4) Korotaev et al mentioned that BrrG binds to DNA (as well as to RNA polymerase). With the availability of existing ChIP-seq data, the authors should be able to locate the DNA-binding element of BrrG, this additional information will be useful to the community.
We will mine the ChIP-Seq data to define the BrrG binding site as closely as possible and include the analysis in the revised version of the manuscript.
(5) Mutational experiments to break the potential hairpin structure are required to strengthen the claim that this putative hairpin is the potential transcriptional terminator.
We did not claim that the identified hairpin is a terminator but rather suggested it as a candidate terminator. We agree with the referee that the proposed experiment would be necessary to definitively prove its terminator function. However, our primary aim was to demonstrate that BrrG acts as a processive antiterminator, which we have shown by replacing the putative terminator with a well-characterized synthetic terminator that BrrG successfully overcame. Therefore, we prefer not to conduct the proposed experiment and will instead further tone down our conclusions regarding the putative terminator function of the identified hairpin structure.
Reviewer 2:
(1) The authors wrote "GTAs are not self-transmitting because the DNA packaging capacity of a GTA particle is too small to package the entire gene cluster encoding it" (page 3). I thought that at least the Bartonella capsid gene cluster should be self-transmissible within the 14 kb packaged DNA (https://doi.org/10.1371/journal.pgen.1003393, https://doi.org/10.1371/journal.pgen.1000546). This was also concluded by Lang et al (https://doi.org/10.1146/annurev-virology-101416-041624). In this case the presented results would have important implications. As the gene cluster and the anti-terminator required for its expression are separated on the chromosome, it would not be possible to transfer an active GTA gene cluster, although the DNA coding for the genes required for making the packaging agent itself, theoretically fits into a BaGTA particle. Could the authors comment on that? I think it would be helpful to add the sizes of the different gene clusters and the distance between them in Fig. 2A. The ROR amplified region spans 500kb, is the capsid gene cluster within this region?
We thank the reviewer for bringing up this interesting point. The bgt cluster (capsid cluster) is approximately 9.2 kb in size and could feasibly be packaged in its entirety into a GTA particle. In contrast, the ror gene cluster, which encodes the antiterminator BrrG, is approximately 20 kb in size—exceeding the packaging limit of GTA particles—and is separated from the bgt cluster by approximately 35 kb. Consequently, if the bgt cluster is transferred via a GTA particle into a recipient host that does not encode the ror gene cluster, the bgt cluster would not be expressed.
(2) Another side-note regarding the introduction: On page three the authors write: "GTAs encode bacteriophage-like particles and in contrast to phages transfer random pieces of host bacterial DNA". While packaging is not specific, certain biases in the packaging frequency are observed in both studied GTA families. For Bartonella this is ROR. In the two GTA-producing strains D. shibae and C. crescentus origin and terminus of replication are not packaged and certain regions are overrepresented (https://doi.org/10.1093/gbe/evy005, https://doi.org/10.1371/journal.pbio.3001790). Furthermore, D. shibae plasmids are not packaged but chromids are. I think the term "random" does not properly describe these observations. I would suggest using "not specific" instead.
We thank the reviewer for this suggestion and will adjust the wording accordingly.
(3) Page 5: Remove "To address this". It is not needed as you already state "To test this hypothesis" in the previous sentence.
We will adjust the wording accordingly.
(4) I think the manuscript would greatly benefit from a summary figure to visualize the Q-like antiterminator-dependent regulatory circuit for GTA control and its four components described on pages 15 and 16.
We thank the reviewer for this valuable suggestion and will include a summary figure illustrating the deduced regulatory mechanism in the revised manuscript.
(5) Page 17: It might be worth noting that GafA is highly conserved among GTAs in Rhodobacterales (https://doi.org/10.3389/fmicb.2021.662907) and so, probably, is its regulatory integration into the ctrA network (https://doi.org/10.3389/fmicb.2019.00803). It's an old mechanism. It would also be interesting to know if it is a common feature of the two archetypical GTAs that the regulator is not part of the cluster itself.
We agree with the points raised by the reviewer and will address them in the revised manuscript. Specifically, we will highlight the high conservation of GafA in GTAs across Rhodobacterales and its regulatory integration within the ctrA network. Additionally, we will analyze whether the GafA-like antitermination regulator is typically located outside the regulated gene cluster, as we have demonstrated for BrrG of BaGTA in the Bartonellae.
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
In this work, Huang et al used SMRT sequencing to identify methylated nucleotides (6mA, 4mC, and 5mC) in Pseudomonas syringae genome. They show that the most abundant modification is 6mA and they identify the enzymes required for this modification as when they mutate HsdMSR they observe a decrease of 6mA. Interestingly, the mutant also displays phenotypes of change in pathogenicity, biofilm formation, and translation activity due to a change in gene expression likely linked to the loss of 6mA. Overall, the paper represents an interesting set of new data that can bring forward the field of DNA modification in bacteria.
Thank you for your valuable feedback on our paper exploring the impact of 6mA modification in P. syringae.
Major Concerns:
Most of the authors' data concern the Psph pathovar. I am not sure that the authors' conclusions are supported by the two other pathovars they used in the initial 2 figures. If the authors want to broaden their conclusions to Pseudomonas syringae and not restrict them to Psph, the authors should have stronger methylation data using replicates. Additionally, they should discuss why Pss is so different from Pst and Psph. Could they do a blot to confirm it is really the case and not a sequencing artifact? Is the change of methylation during bacterial growth conserved between the pathovars? The authors should obtain mutants in the other pathovars to see if they have the same phenotype. The authors have a nice set of data concerning Psph but the broadening of the results to other pathovars requires further investigation.
We appreciate the reviewer’s insightful comments. While the majority of our data focuses on Psph, we recognize the importance of validating these findings in Pss and Pst. To this end, we have performed additional experiments using dot blot and mutant construction to strengthen our conclusions in other pathovars.
We agree that we should discuss why Pss is different from Psph and Pst. We performed a dot blot assay using genomic DNA from Pss and Pst, presented in Figure S5A. We also compared the 6mA modification levels of Pss and Pst in different growth phases. As shown in Figure S5A, the change of methylation during bacterial growth is conserved in Pst. The change was not obvious in Pss, which might be due to the lack of a type I R-M system.
“In accordance with previous studies showing that growth conditions affect the bacterial methylation status, we applied dot blot experiments using the same amount of DNA (1 μg) from these three P. syringae strains to detect the 6mA levels during both logarithmic and stationary phases. The results revealed that 6mA levels in the stationary phase were much higher compared to the logarithmic phase in Psph and Pst, but showed no significant change in Pss. Additionally, we found that during the stationary phase, 6mA methylation levels in Psph and Pst were higher than those in Pss. These findings were consistent with the MTase prediction for these three strains, since Pss does not harbor any type I R-M systems, which are important for 6mA modification in bacteria.”
Please see Figure S5A and Lines 220-228 in the revised manuscript.
We also tried to construct an HsdM mutant in Pst to explore whether the influence of 6mA methylation was conserved in P. syringae, but it failed after multiple attempts. We did not construct a Pss mutant because no type I R-M system was predicted, and few methylation sites were identified via SMRT-seq in this strain. Therefore, we overexpressed HsdM in Pst instead. We have performed additional experiments in WT and the HsdM overexpression strains, including dot blot and growth curve assays.
Please see Figures S5B-C and Lines 228-232 in the revised manuscript.
The authors should include proper statistical analysis of their data. A lot of terms are descriptive but not supported by a deeper analysis to sustain the conclusions. For example, in Figure 4E, we do not know if the overlap is significant or not. Are DEGs more overlapping to 6mA sites than non-DEGs? Here is a non-exhaustive list of terms that need to be supported by statistics: different level (L145), greater conservation (L162), significant conservation (L165), considerable similarity (L175), credible motifs (L189), Less strong (L277) and several "lower" and "higher" throughout the text.
Thank you for the insightful feedback. We have made the following revisions in the manuscript to ensure that the terms are more precise and do not require statistical significance testing.
(1) Statistical analysis: We have added statistical tests for the overlap between DEGs and 6mA sites in Figure 4E. We performed Fisher's exact test and found that the overlap was not significant (p > 0.05). Neither DEGs nor non-DEGs overlapped significantly with 6mA sites. Please see Figure 4E and Lines 261-262.
“Less strong” was used to indicate the influence of HsdM on biofilm formation in Figure 5D. All figures with “*” labels were analyzed using two-tailed Student's t-tests, with “*” indicating a significant change (p < 0.05).
(2) Revised wording: For terms used to describe comparisons, we have revised the wording to be clearer and ensure that the terminology used did not imply the need for statistical significance testing when not required. For example:
“Different level” has been removed. Please see Lines 143-144.
“Greater conservation” has been revised to “more conserved functional terms”. Please see Lines 161-162.
“Significant conservation” has been revised to “notable conservation”. Please see Line 165.
“Credible motifs” has been revised to “identified motifs”. Please see Line 186.
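For readers who want to reproduce the kind of overlap test described in point (1) above, the following is a minimal sketch of a one-sided Fisher's exact test on a 2x2 table, written with the Python standard library only. It assumes the "Fisher test" in the response refers to Fisher's exact test on gene counts. The function name and the example counts are hypothetical and are not taken from the manuscript.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test (enrichment) for the table [[a, b], [c, d]].

    a: DEGs carrying a 6mA site      b: DEGs without a 6mA site
    c: non-DEGs carrying a 6mA site  d: non-DEGs without a 6mA site
    Returns P(X >= a) under the hypergeometric null of no association.
    """
    n = a + b + c + d          # all genes
    n_deg = a + b              # genes drawn (DEGs)
    n_meth = a + c             # genes carrying a 6mA site
    total = comb(n, n_deg)     # number of ways to choose the DEG set
    upper = min(n_deg, n_meth)
    # Sum the probabilities of tables at least as extreme as the observed one.
    tail = sum(
        comb(n_meth, k) * comb(n - n_meth, n_deg - k)
        for k in range(a, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only (not the values behind Figure 4E).
p_value = fisher_exact_greater(40, 360, 450, 4150)
print(f"one-sided p = {p_value:.3f}")
```

A two-sided variant, or a library routine such as SciPy's `fisher_exact`, may be preferable in practice; the one-sided form is shown here because the question asked was whether DEGs are enriched for 6mA sites.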
The authors performed SMRT sequencing of the delta hsdMSR showing a reduction of 6mA. Could they include a description of their results similar to Figures 1-2. How reduced is the 6mA level? Is it everywhere in the genome? Does it affect other methylation marks? This analysis would strengthen their conclusions.
Yes, we agree. We have provided additional analysis and descriptions to strengthen the conclusions in response to these valuable comments. We determined the three types of methylation sites in the HsdMSR mutant strain and compared the overlapping genes within these modification patterns. Specifically, we focused on the 6mA sites in Psph WT, the HsdMSR mutant, and the HsdM motif CAGCN<sub>(6)</sub>CTC. As expected, we found that almost all of the 6mA sites lost in the ΔhsdMSR strain were from the motif CAGCN<sub>(6)</sub>CTC. We also noticed that the 5mC and 4mC sites in the mutant were relatively similar to those in WT, and the slight difference might be caused by sequencing errors. Overall, we propose that HsdMSR catalyzes only the 6mA located on the motif CAGCN<sub>(6)</sub>CTC and does not affect other 6mA sites or other modification types.
Please see Figures S4D-E and Lines 212-218 in the revised manuscript.
In Figure 6E to conclude that methylation is required on both strands, the authors are missing the control CAGCN6CGC construct otherwise the effect could be linked to the A on the complementary strand.
Thank you for your valuable suggestions. We have provided the control result on the complementary strand. Please see Figure 6C. The new result supports the conclusion that 6mA methylation regulates gene transcription based on methylation on both strands.
Please see Figure 6C and Lines 329-330 in the revised manuscript.
Reviewer #2 (Public Review):
In the present manuscript, Huang et al. revealed the significant roles of the DNA methylome in regulating virulence and metabolism within Pseudomonas syringae, with a particular focus on the HsdMSR system in this model strain. The authors used SMRT-seq to profile the DNA methylation patterns (6mA, 5mC, and 4mC) in three P. syringae strains (Psph, Pss, and Psa) and displayed the conservation among them. They further identified the type I restriction-modification system (HsdMSR) in P. syringae, including its specific motif sequence. The HsdMSR participated in the process of metabolism and virulence (T3SS & Biofilm formation), as demonstrated through RNA-seq analyses. Additionally, the authors revealed the mechanisms of the transcriptional regulation by 6mA. Strictly from the point of view of the interest of the question and the work carried out, this is a worthy and timely study that uses third-generation sequencing technology to characterize the DNA methylation in P. syringae. The experimental approaches were solid, and the results obtained were interesting and provided new information on how epigenetics influences the transcription in P. syringae. The conclusions of this paper are mostly well supported by data, but some aspects of data analysis and discussion need to be clarified and extended.
Thank you for your positive feedback and recognition of the importance of our study. We appreciate the suggestions for further clarification and extension of some aspects of data analysis and discussion. We added further analysis of the SMRT-seq result of the ΔhsdMSR and overexpressed HsdM in Pst to provide more information on conservation. We added these contents to the discussion in the revised manuscript. Please see Figure 6C and Figure S5.
Reviewer #3 (Public Review):
Summary:
The article by Huang et al. presents an in-depth study on the role of DNA methylation in regulating virulence and metabolism in Pseudomonas syringae, a model phytopathogenic bacterium. This comprehensive research utilized single-molecule real-time (SMRT) sequencing to profile the DNA methylation landscape across three model pathovars of P. syringae, identifying significant epigenetic mechanisms through the Type-I restriction-modification system (HsdMSR), which includes a conserved sequence motif associated with N6-methyladenine (6mA). The study provides novel insights into the epigenetic mechanisms of P. syringae, expanding the understanding of bacterial pathogenicity and adaptation. The use of SMRT sequencing for methylome profiling, coupled with transcriptomic analysis and in vivo validation, establishes a robust evidence base for the findings.
Strengths:
The results are presented clearly, with well-organized figures and tables that effectively illustrate the study's findings.
Weaknesses:
It would be helpful to add more details, especially in the methods, which make it easy to evaluate and enhance the manuscript's reproducibility.
Thank you for the positive evaluation of our study, as well as the constructive feedback provided. We have added more details in methods for RNA-seq analysis and Ribo-seq analysis. Please see Lines 484-515.
“Briefly, bacteria were cultured to an OD<sub>600</sub> of 0.4, at which point chloramphenicol was added to a final concentration of 100 µg/mL for 2 minutes. Cells were then pelleted and washed with pre-chilled lysis buffer [25 mM Tris-HCl, pH 8.0; 25 mM NH4Cl; 10 mM MgOAc; 0.8% Triton X-100; 100 U/mL RNase-free DNase I; 0.3 U/mL Superase-In; 1.55 mM chloramphenicol; and 17 mM GMPPNP]. The pellet was resuspended in lysis buffer, followed by three freeze-thaw cycles using liquid nitrogen. Sodium deoxycholate was then added to a final concentration of 0.3% before centrifugation. The resulting supernatant was adjusted to 25 A260 units and mixed with 2 mL of 500 mM CaCl<sub>2</sub> and 12 µL MNase, making up a total volume of 200 µL. After the digestion, the reaction was quenched with 2.5 mL of 500 mM EGTA. Monosomes were isolated using Sephacryl S400 MicroSpin columns, and RNA was purified using the miRNeasy Mini Kit (Qiagen). rRNA was removed using the NEBNext rRNA Depletion Kit, and the final library was constructed with the NEBNext Small RNA Library Prep Kit. For each sample, ribosome footprint reads were mapped to the Psph 1448A reference genome, and the translational efficiency was calculated by dividing the normalized Ribo-seq counts by the normalized RNA counts. Two biological replicates were performed for all experiments.”
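The translational efficiency calculation described at the end of the quoted Methods text can be sketched as follows. This is an illustrative sketch only: it assumes the Ribo-seq and RNA-seq counts have already been normalized (the normalization scheme is not stated in the response), and the function name and gene identifiers are hypothetical.

```python
def translational_efficiency(ribo_counts, rna_counts):
    """Per-gene TE = normalized Ribo-seq count / normalized RNA-seq count.

    Both arguments map gene IDs to already-normalized counts. Genes with
    no RNA-seq signal are skipped because the ratio is undefined for them.
    """
    te = {}
    for gene, ribo in ribo_counts.items():
        rna = rna_counts.get(gene, 0.0)
        if rna <= 0:
            continue  # avoid division by zero; TE cannot be estimated
        te[gene] = ribo / rna
    return te


# Hypothetical normalized counts for three placeholder genes.
ribo = {"geneA": 20.0, "geneB": 5.0, "geneC": 1.0}
rna = {"geneA": 10.0, "geneB": 10.0}
print(translational_efficiency(ribo, rna))
```

In practice, TE values are often compared between strains on a log2 scale and averaged across biological replicates; those steps are omitted here.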
Recommendations For The Authors:
Reviewer #1 (Recommendations For The Authors):
I would recommend the authors limit their manuscript to Psph pathovar and include statistical analysis supporting their conclusions.
Thank you for your suggestion.
Minor
• L104: "significantly" please add a p-value and explain the analysis.
Sorry for the confusion. We have added the p-value and explained the analysis in the Methods section. The p-value used for SMRT-seq was the modification quality value (QV) score, which was used to call the modified bases A (QV = 50) and C (QV = 100). Please see Lines 452-454.
• Figures 1B, D, F, and Figure 2A: make the Venn diagram to scale
Yes, we have revised.
• L110-111: missing p-value to say that the authors observe a bigger overlap in Pst than Psph as they observed more modified sites in Pst
Sorry for the confusion. We said it had a bigger overlap in Pst because the number 17.7 was bigger than the number of 15 in Psph. To avoid misunderstanding, we revised the wording to “more genes equipped with all three modification types were detected in Pst than Psph”. Please see Lines 110-111.
• L112: missing description of their Pss analysis (IPD, sites...)
We have added the information for Pss in the revised manuscript.
“Additionally, the methylome atlas of Pss revealed a lower incidence of methylation than those of Psph and Pst, particularly in terms of 6mA modifications, with only 457 significant 6mA sites detected under the same threshold (IPD > 1.5), while a total of 2,853 and 1,438 methylation sites were detected as 5mC and 4mC, respectively”. Please see Lines 114-116.
• L118: "modification" to "modified "
We have revised. Please see Line 119.
• L120: "modification sites" to "modified nucleotides"
We have revised. Please see Line 121.
• L142: correct the title "Methylated genes revealed highly functional conservation among three P. syringae strains" maybe to "Methylated genes are functionally conserved among ..."
We have revised. Please see Line 142.
• Figure 2C: not easy to read and interpret
Sorry for the confusion. Figure 2C revealed the significantly enriched functional pathways in GO and KEGG databases among three P. syringae strains. The specific names of each pathway were listed on the left, and each column with dots indicated the number of genes within one kind of methylation in one of three P. syringae strains. The larger the size, the bigger the number.
We have revised the legend of Figure 2C. Please see Lines 575-579.
“The dot plot revealed the significantly enriched functional pathways in GO and KEGG databases among three P. syringae strains. The specific names of each pathway were listed on the left, and each column with dots indicated the number of genes within one kind of methylation in one of three P. syringae strains. The size of the dots indicates the number of related genes.”
• Figure 6B-C: what is the difference between B 24h and C?
Figure 6B showed the expression difference between WT and the mutant over 24 hours, whereas Figure 6C showed only the 24-hour time point. To avoid repetition, we have removed Figure 6C.
• Figure 6C-D: if the same maybe remove Figure 2C
We have removed Figure 6D.
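As a side note on the QV thresholds cited in the response to the L104 comment above: if the modification QV follows the usual Phred convention (QV = -10 · log10(p)), which is an assumption here rather than something stated in the response, the thresholds can be converted to p-values as in the sketch below. The function names are hypothetical.

```python
import math


def qv_to_pvalue(qv):
    """Convert a Phred-scaled quality value to a p-value: p = 10^(-QV/10)."""
    return 10 ** (-qv / 10)


def pvalue_to_qv(p):
    """Convert a p-value back to a Phred-scaled quality value."""
    return -10 * math.log10(p)


# Thresholds quoted in the response: QV = 50 for A, QV = 100 for C.
for base, qv in (("A", 50), ("C", 100)):
    print(f"{base}: QV {qv} corresponds to p = {qv_to_pvalue(qv):.0e}")
```

Under this assumption, both thresholds are far stricter than the conventional p < 0.05 cutoff.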
Reviewer #2 (Recommendations For The Authors):
The manuscript could be improved by addressing the following concerns:
(1) In line 146: How to understand the percentage conserved in "more than two of the strains"?
Sorry for the confusion. We intended to indicate the pattern that was conserved in two or three strains. We have revised it to: “Notably, about 25% to 45% of methylated genes were conserved in two and three strains”. Please see Line 145.
(2) In line 178: Five conserved sequence motifs should be replaced by "Six conserved sequence motifs".
We have revised. Please see Line 176.
(3) In Figure 2B, specify the C1, C2 and C3. "m6A" should be replaced by "6mA".
Yes, we have revised.
(4) In Figure S2, "m6A" should be replaced by "6mA".
Yes, we have revised.
(5) In line 212, please add references for the previous studies showing that growth conditions affect bacterial methylation status.
Thank you for your suggestion. We have added the relevant references (Gonzalez and Collier, 2013), (Krebes et al., 2014), (Sanchez-Romero and Casadesus, 2020).
(6) In line 217, "illustrate" should be "illustrated".
Yes, we have revised. Please see Line 210.
(7) There are some genes colored in grey, revealing bigger differences between the two strains than those related to ribosomal protein, T3SS, and alginate synthesis in Fig. 4A. Do they have important functional roles as well?
Thank you for your suggestion. Apart from genes related to ribosomal proteins, T3SS, and alginate synthesis, a total of 116 genes showed bigger differences (|Log<sub>2</sub>FC| > 2). Among these genes, 31 were annotated as hypothetical proteins and 4 as transcription factors with unknown functions, and the remaining genes mostly encoded metabolism-related enzymes. These enzymes might contribute to the growth defects in ΔhsdMSR. We added this information in the revised manuscript. Please see Lines 249-254.
(8) The authors should discuss what will be the potential signals or factors that can regulate the activity of HsdMSR. In other words, what can decide the extent of methylation through activating or suppressing the expression of HsdMSR?
Thank you for your valuable suggestion. We have added this to the Discussion. Please see Lines 404-415.
“Apart from the established roles of 6mA and HsdMSR in P. syringae, certain signals or factors may influence HsdMSR expression. For instance, we confirmed that the growth phase affects methylation levels in P. syringae. Previous studies have shown that increased temperatures can reduce methylation levels, as observed in PAO1 (Doberenz et al., 2017). These findings suggest that HsdMSR expression may be responsive to both intrinsic cellular states and extrinsic environmental conditions. To further explore potential upstream TFs regulating the expression of HsdMSR, we searched for TF-binding sites in the HsdMSR promoter using our own databases (Fan et al., 2020; Shao et al., 2021; Sun et al., 2024). As a result, we found three candidate TFs (PSPPH_0061, PSPPH_3268, and PSPPH_3504) that might directly bind and regulate HsdMSR expression. Future studies on these TFs and their interactions with the HsdMSR promoter would help clarify the regulatory network governing HsdMSR activity.”
Reviewer #3 (Recommendations For The Authors):
(1) Some figures contain dense information, which may be overwhelming for readers. Streamlining the legend for Figure 1 and resizing the Venn diagrams within it could enhance clarity and visual appeal.
Thank you for your suggestion. We have scaled all the Venn plots in the revised version.
(2) Incorporating a discussion about the role of the restriction-modification (RM) system in bacterial defense against phage infection into the discussion section could enrich the manuscript's context and relevance.
Thank you for your valuable suggestion. We have added this to the Discussion. Please see Lines 416-427.
“RM systems are known for their intrinsic role as innate immune systems against phage infection and are present in around 90% of bacterial genomes (Oliveira et al., 2014). RM systems protect bacteria by recognizing and degrading foreign phage DNA via methylation-specific sites and restriction endonucleases (REases) (Loenen et al., 2014). In addition, other phage-resistance systems are similar to RM systems but carry extra genes. One is called the phage growth limitation (Pgl) system, which modifies and cleaves phage DNA. However, the Pgl system only modifies the phage DNA in the first infection cycle and cleaves phage DNA in subsequent infections, which serves as a warning to neighboring cells (Hampton et al., 2020; Hoskisson et al., 2015). To counteract RM and RM-like systems, phages have evolved strategies, including unusual modifications such as hydroxymethylation, glycosylation, and glucosylation. They can also encode their own MTases to protect their DNA or employ strategies to evade restriction systems and other anti-RM defenses (Iida et al., 1987; Murphy et al., 2013; Vasu and Nagaraja, 2013).”
(3) In line 152: What is the importance of the mentioned example of Cro/CI family TF?
Thank you for your comments. Cro and CI are important TFs present in phages. The interaction between Cro and CI affects bacterial immunity status in enterohemorrhagic Escherichia coli (EHEC) strains (Jin et al., 2022). RM systems are a kind of phage-defense system, and hence we mentioned this example here. We have added this description in the revised manuscript. Please see Lines 152-154.
References:
(1) Doberenz, S., Eckweiler, D., Reichert, O., Jensen, V., Bunk, B., Sproer, C., Kordes, A., Frangipani, E., Luong, K., Korlach, J., et al. (2017). Identification of a Pseudomonas aeruginosa PAO1 DNA Methyltransferase, Its Targets, and Physiological Roles. mBio 8. 10.1128/mBio.02312-16.
(2) Fan, L., Wang, T., Hua, C., Sun, W., Li, X., Grunwald, L., Liu, J., Wu, N., Shao, X., Yin, Y., et al. (2020). A compendium of DNA-binding specificities of transcription factors in Pseudomonas syringae. Nat Commun 11, 4947. 10.1038/s41467-020-18744-7.
(3) Gonzalez, D., and Collier, J. (2013). DNA methylation by CcrM activates the transcription of two genes required for the division of Caulobacter crescentus. Mol Microbiol 88, 203-218. 10.1111/mmi.12180.
(4) Hampton, H.G., Watson, B.N., and Fineran, P.C. (2020). The arms race between bacteria and their phage foes. Nature 577, 327-336.
(5) Hoskisson, P.A., Sumby, P., and Smith, M.C. (2015). The phage growth limitation system in Streptomyces coelicolor A3(2) is a toxin/antitoxin system, comprising enzymes with DNA methyltransferase, protein kinase and ATPase activity. Virology 477, 100-109.
(6) Iida, S., Streiff, M.B., Bickle, T.A., and Arber, W. (1987). Two DNA antirestriction systems of bacteriophage P1, darA, and darB: characterization of darA− phages. Virology 157, 156-166.
(7) Jin, M., Chen, J., Zhao, X., Hu, G., Wang, H., Liu, Z., and Chen, W.-H. (2022). An engineered λ phage enables enhanced and strain-specific killing of enterohemorrhagic Escherichia coli. Microbiology Spectrum 10, e01271-01222.
(8) Krebes, J., Morgan, R.D., Bunk, B., Sproer, C., Luong, K., Parusel, R., Anton, B.P., Konig, C., Josenhans, C., Overmann, J., et al. (2014). The complex methylome of the human gastric pathogen Helicobacter pylori. Nucleic Acids Res 42, 2415-2432. 10.1093/nar/gkt1201.
(9) Loenen, W.A., Dryden, D.T., Raleigh, E.A., Wilson, G.G., and Murray, N.E. (2014). Highlights of the DNA cutters: a short history of the restriction enzymes. Nucleic Acids Res 42, 3-19.
(10) Murphy, J., Mahony, J., Ainsworth, S., Nauta, A., and van Sinderen, D. (2013). Bacteriophage orphan DNA methyltransferases: insights from their bacterial origin, function, and occurrence. Appl Environ Microb 79, 7547-7555.
(11) Oliveira, P.H., Touchon, M., and Rocha, E.P. (2014). The interplay of restriction-modification systems with mobile genetic elements and their prokaryotic hosts. Nucleic Acids Res 42, 10618-10631.
(12) Sanchez-Romero, M.A., and Casadesus, J. (2020). The bacterial epigenome. Nature reviews. Microbiology 18, 7-20. 10.1038/s41579-019-0286-2.
(13) Shao, X., Tan, M., Xie, Y., Yao, C., Wang, T., Huang, H., Zhang, Y., Ding, Y., Liu, J., Han, L., et al. (2021). Integrated regulatory network in Pseudomonas syringae reveals dynamics of virulence. Cell Rep 34, 108920. 10.1016/j.celrep.2021.108920.
(14) Sun, Y., Li, J., Huang, J., Li, S., Li, Y., Lu, B., and Deng, X. (2024). Architecture of genome-wide transcriptional regulatory network reveals dynamic functions and evolutionary trajectories in Pseudomonas syringae. bioRxiv, 2024.2001. 2018.576191.
(15) Vasu, K., and Nagaraja, V. (2013). Diverse functions of restriction-modification systems in addition to cellular defense. Microbiol Mol Biol Rev 77, 53-72. 10.1128/MMBR.00044-12.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Summary:
The authors present a modelling study to test the hypothesis that horizontal gene transfer (HGT) can modulate the outcome of interspecies competition in microbiomes, and in particular promote bistability in systems across scales. The premise is a model developed by the same authors in a previous paper where bistability happens because of a balance between growth rates and competition for a mutual resource pool (common carrying capacity). They show that introducing a transferrable element that gives a "growth rate bonus" expands the region of parameter space where bistability happens. The authors then investigate how often (in terms of parameter space) this bistability occurs across different scales of complexity, and finally under selection for the mobile element (framed as ABR selection).
Strengths:
The authors tackle an important, yet complex, question: how do different evolutionary processes impact the ecology of microbial ecosystems? They do a nice job at increasing the scales of heterogeneity and asking how these impact their main observable: bistability.
We thank the reviewer for recognizing the potential value of our analysis. We are also grateful for the constructive comments and suggestions on further analyzing the influence of the model structure and the associated assumptions. We have fully addressed the raised issues in the updated manuscript and below.
Weaknesses:
The authors' starting point is their interaction LV model, and the manuscript then explores how this model behaves under different scenarios. Because the structure of the model and the underlying assumptions essentially dictate these outcomes, I would expect to see much more focus on how these two aspects relate to the specific scenarios that are discussed. For example:
A key assumption is that the mobile element conveys a multiplicative growth rate benefit (1+lambda). However, the competition between the species is modelled as a factor gamma that modulates the competition for overall resource and thus appears in the saturation term (1+ S1/Nm + gamma2*S2/Nm). This means that gamma changes the perceived abundance of the other species (if gamma > 1, then from the point of view of S1 it looks like there are more S2 than there really are). Most importantly, the relationship between these parameters dictates whether or not there will be bistability (as the authors state).
This decoupling between the transferred benefit and the competition can have different consequences. One of them is that - from the point of view of the mobile element - the mobile element competes at different strengths within the same population compared to between. To what degree introducing such a mobile element modifies the baseline bistability expectation thus strongly depends on how it modifies gamma and lambda.
Thus, this structural aspect needs to be much more carefully presented to help the reader follow how much of the results are just trivial given the model assumptions and which have more of an emergent flavour. From my point of view, this has an important impact on helping the reader understand how the model that the authors present can contribute to the understanding of the question "how microbes competing for a limited number of resources stably coexist". I do appreciate that this changes the focus of the manuscript from a presentation of simulation results to more of a discussion of mathematical modelling.
We thank the reviewer for the insightful suggestions. We agree with the reviewer that the model structure and the underlying assumptions need to be carefully discussed, in order to understand the generality of the theoretical predictions. In particular, the reviewer emphasized that how HGT affects bistability might depend on how mobile genetic elements modified growth rates and competition. In the main text, we have shown that when mobile genes only influence species growth rates, HGT is expected to promote multistability (Fig. 1 and 2). However, when mobile genes modify species interactions, the effect of HGT on multistability is dependent on how mobile genes change competition strength (Fig. 3a to f). When mobile genes increase competition, HGT promotes multistability (Fig. 3c and e). In contrast, when mobile genes relax competition, HGT is expected to reduce multistability (Fig. 3d and f).
In light of the reviewer’s comments, we have further generalized the model structure by accounting for the scenario where mobile genes simultaneously modify growth rates and competition. The effect of mobile genes on growth rates is represented by the magnitude of the 𝜆 values, and the influence on competition is described by another parameter, 𝛿. By varying these two parameters, we can evaluate how the model structure and the underlying assumptions affect the baseline expectation. We performed additional simulations with broad ranges of 𝜆 and 𝛿 values. In particular, we analyzed whether HGT would promote the likelihood of bistability in two-species communities compared with the scenario without gene transfer (Fig. 3g-i). Our results suggested that: (1) with or without HGT, reducing 𝜆 (increasing neutrality) promotes bistability; (2) with HGT, increasing 𝛿 promotes bistability; (3) compared with the population without HGT, gene transfer promotes bistability when 𝛿 is zero or positive, but reduces bistability when 𝛿 is strongly negative. These results agree with the reviewer’s comment that the baseline bistability expectation depends on how HGT modifies gamma and lambda. In the updated manuscript, we have thoroughly discussed how the model structure and the underlying assumptions can influence the predictions (line 238-253).
We further expanded our analysis by calculating how other parameters, including competition strength, growth rate ranges, and death/dilution rate, would affect the multistability of communities undergoing horizontal gene transfer (Fig. S2, S3, S9, S10, S11, S12, S13, S15). Together with the results presented in the first draft, these analyses enable a more comprehensive understanding of how different mechanisms, including but not limited to HGT, collectively shape community multistability. In the updated manuscript, the reviewer can see the change of focus from exploring the effects of HGT to a more thorough discussion of the mathematical model. The revised text, highlighted in blue, and the supplemented figures reflect this change.
Reviewer #2 (Public review):
Summary:
In this work, the authors use a theoretical model to study the potential impact of Horizontal Gene Transfer on the number of alternative stable states of microbial communities. For this, they use a modified version of the competitive Lotka-Volterra model (which accounts for the effects of pairwise, competitive interactions on species growth) that incorporates terms for the effects of both an added death (dilution) rate acting on all species and the rates of horizontal transfer of mobile genetic elements, which can in turn affect species growth rates. The authors analyze the impact of horizontal gene transfer in different scenarios: bistability between pairs of species, multistability in communities, and a modular structure in the interaction matrix to simulate multiple niches. They also incorporate additional elements to the model, such as spatial structure to simulate metacommunities and modification of pairwise interactions by mobile genetic elements. In almost all these cases, the authors report an increase in either the number of alternative stable states or the parameter region (e.g. growth rate values) in which they occur.
In my opinion, understanding the role of horizontal gene transfer in community multistability is a very important subject. This manuscript is a useful approach to the subject, but I'm afraid that a thorough analysis of the role of different parameters under different scenarios is missing in order to support the general claims of the authors. The authors have extended their analysis to increase their biological relevance, but I believe that the analysis still lacks comprehensiveness.
Understanding the origin of alternative stable states in microbial communities and how often they may occur is an important challenge in microbial ecology and evolution. Shifts between these alternative stable states can drive transitions between e.g. a healthy microbiome and dysbiosis. A better understanding of how horizontal gene transfer can drive multistability could help predict alternative stable states in microbial communities, as well as inspire novel treatments to steer communities towards the most desired (e.g. healthy) stable states.
Strengths:
(1) Generality of the model: the work is based on a phenomenological model that has been extensively used to predict the dynamics of ecological communities in many different scenarios.
(2) The question of how horizontal gene transfer can drive alternative stable states in microbial communities is important and there are very few studies addressing it.
We thank the reviewer for the positive comments on the potential novelty and conceptual importance of our work. We are also grateful for the constructive suggestions on the generality and comprehensiveness of our analysis. In particular, we agree with the reviewer that a thorough analysis of the role of different parameters could further improve the rigor of this work. We have fully addressed the raised issues in the updated manuscript and below.
Weaknesses:
(1) There is a need for a more comprehensive analysis of the relative importance of the different model parameters in driving multistability. For example, there is no analysis of the effects of the added death rate on multistability. This parameter has been shown to determine whether a given pair of interacting species exhibits bistability or not (see e.g. Abreu et al 2019 Nature Communications 10:2120). Similarly, each scenario is analyzed for a single value of interspecies interaction strength, with the exception of the case of mobile genetic elements affecting interaction strength, which considers three specific values. Considering heterogeneous interaction strengths (e.g. sampling from a random distribution) could also lead to more realistic scenarios; the authors generally considered that all species pairs interact with the same strength. Analyzing a larger range of growth rate effects of mobile genetic elements would also help generalize the results. In order to achieve a more generic assessment of the impact of horizontal gene transfer in driving multistability, its role should be systematically compared to the effects of the rest of the parameters of the model.
We appreciate the suggestions. For each of the parameters that the reviewer mentioned, we have performed additional simulations to evaluate its importance in driving multistability.
For the added death rate, we have calculated the bistability feasibility of two-species populations under different values of 𝐷. Our results suggested that (1) varying the death rate indeed changed the bistability probability of the system; (2) when the death rate was zero, mobile genetic elements that only modify growth rates would have no effect on the system’s bistability. These results highlighted the importance of the added death rate in driving multistability (Fig. S2, line 136-142).
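The dependence on the death rate can be illustrated with a classical two-species competitive LV model without gene transfer. The sketch below is only an illustration under stated assumptions, not the model used in the manuscript: it assumes the form dS1/dt = mu1·S1·(1 − S1 − gamma12·S2) − D·S1 (and symmetrically for S2), with abundances scaled by the carrying capacity, and it tests bistability through mutual non-invasibility. The function name and argument names are hypothetical.

```python
def is_bistable(mu1, mu2, gamma12, gamma21, D):
    """Mutual non-invasibility test for an assumed two-species LV model.

    Assumed (hypothetical) dynamics, abundances scaled by carrying capacity:
        dS1/dt = mu1*S1*(1 - S1 - gamma12*S2) - D*S1
        dS2/dt = mu2*S2*(1 - S2 - gamma21*S1) - D*S2
    """
    # Monoculture equilibria; a species with mu <= D cannot persist alone.
    s1_alone = 1.0 - D / mu1
    s2_alone = 1.0 - D / mu2
    if s1_alone <= 0 or s2_alone <= 0:
        return False
    # Per-capita growth rate of a rare invader facing the resident monoculture.
    s2_invasion = mu2 * (1.0 - gamma21 * s1_alone) - D
    s1_invasion = mu1 * (1.0 - gamma12 * s2_alone) - D
    # Bistable when neither species can invade the other.
    return s1_invasion < 0 and s2_invasion < 0
```

Under these assumptions, setting D = 0 reduces the condition to gamma12 > 1 and gamma21 > 1, so growth rates drop out; with D > 0 the same pair of interaction strengths can be bistable for equal growth rates and not bistable for unequal ones.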
For the interspecies interaction strength, we first extended our analysis on two-species populations. By calculating the bistability probability under different values of 𝛾, we showed that when interspecies interaction strength was smaller than 1, the influence of HGT on population bistability became weak (Fig. S3, line 143-147). We also considered heterogeneous interaction strengths in multispecies communities, by randomly sampling 𝛾<sub>ij</sub> values from uniform distributions. While our results suggested that the heterogeneous distribution of 𝛾<sub>ij</sub> didn’t fundamentally change the main conclusion, the mean value and variance of 𝛾<sub>ij</sub> affected the influence of HGT on multistability. The effect of HGT on community multistability becomes stronger when the mean value of 𝛾<sub>ij</sub> is larger than 1 and the variance of 𝛾<sub>ij</sub> is small (Fig. S12, line 190-196).
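A heterogeneous interaction matrix of the kind described above could be drawn as follows. This is a minimal sketch, not the authors' code: the function name, the parameterization by mean and width, and the choice to fix the diagonal (self-inhibition) at 1 are all assumptions made for illustration. For a uniform distribution of a given width, the variance is width²/12, so the width controls the variance mentioned in the text.

```python
import random

def sample_interaction_matrix(n_species, mean, width, seed=None):
    """Draw off-diagonal gamma_ij from Uniform(mean - width/2, mean + width/2).

    Hypothetical helper: the diagonal is fixed at 1.0, which assumes that
    self-inhibition is normalized to one.
    """
    rng = random.Random(seed)
    low, high = mean - width / 2.0, mean + width / 2.0
    gamma = [[1.0] * n_species for _ in range(n_species)]
    for i in range(n_species):
        for j in range(n_species):
            if i != j:
                gamma[i][j] = rng.uniform(low, high)
    return gamma
```

Passing a fixed `seed` makes a sampled community reproducible, which helps when comparing runs with and without gene transfer on the same interaction matrix.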
We also analyzed different ranges of growth rate effects of mobile genetic elements. In particular, we sampled 𝜆<sub>ij</sub> values from uniform distributions with given widths. Greater width led to a larger range of growth rate effects. We used five-species populations as an example and tested different ranges. Our results suggested that multistability was more feasible when the growth rate effects of MGEs were small. The qualitative relationship between HGT and community multistability was not dependent on the range of growth rate effects (Fig. S13, line 197-205).
(2) The authors previously developed this theoretical model to study the impact of horizontal gene transfer on species coexistence. In this sense, it seems that the authors are exploring a different (stronger interspecies competition) range of parameter values of the same model, which could potentially limit novelty and generality.
We appreciate the comment. In a previous work (PMID: 38280843), we developed a theoretical model that incorporated the horizontal gene transfer process into the classic LV framework. This model provides opportunities to investigate the role of HGT in different open questions of microbial ecology. In the previous work, we considered one fundamental question: how competing microbes coexist stably. In this work, however, we focused on a different problem: how alternative stable states emerge in complex communities. While the basic theoretical tool that we applied in the two works was similar, the scientific questions, application contexts and implications of our analysis were largely different. The novelty of this work arose from the fact that it revealed the conceptual linkage between alternative stable states and a ubiquitous biological process, horizontal gene transfer. This linkage was largely unknown in previous studies. Exploring such a linkage naturally required us to consider stronger interspecies competition, which in general would diminish coexistence but give rise to multistability. We believe that the analysis performed in this work provides novel and valuable insights for the field of microbial ecology.
With all the supplemented simulations that we carried out in light of all the reviewers’ comments, we believe the updated manuscript also provides a unified framework to understand how different biological processes collectively shape the multistability landscape of complex microbiota undergoing horizontal gene transfer. The comprehensive analyses performed and the diverse scenarios considered in this study also contribute to the novelty and generality of this work.
(3) The authors analyze several scenarios that, in my opinion, naturally follow from the results and parameter value choices in the first sections, making their analysis not very informative. For example, after showing that horizontal gene transfer can increase multistability both between pairs of species and in a community context, the way they model different niches does not bring significantly new results. Given that the authors showed previously in the manuscript that horizontal gene transfer can impact multistability in a community in which all species interact with each other, one might expect that it will also impact multistability in a larger community made of (sub)communities that are independent of (not interacting with) each other, which is the proposed way of modelling niches. A similar argument can be made regarding the analysis of (spatially structured) metacommunities. It is known that, for small enough dispersal rates, space can promote regional diversity by enabling each local community to remain in a different stable state. Therefore, in conditions in which horizontal gene transfer drives multistability, it will also drive regional diversity in a metacommunity.
Thanks. Based on the reviewer’s comments, we have moved Fig. 3 and 4 to Supplementary Information. In the updated manuscript, we have focused more on analyzing the roles of different parameters in shaping community multistability.
(4) In some cases, the authors consider that mobile genetic elements can lead to ~50% growth rate differences. In the presence of an added death rate, this can be a relatively strong advantage that makes the fastest grower easily take over their competitors. It would be important to discuss biologically relevant examples in which such growth advantages driven by mobile genetic elements could be expected, and how common such scenarios might be.
We appreciate the suggestion. Mobile genetic elements can drive large growth rate differences when they encode adaptive traits like antibiotic resistance (line 197-198).
We also analyzed different ranges of growth rate effects of mobile genetic elements, by sampling 𝜆<sub>ij</sub> values from uniform distributions with given widths. Our results suggested that multistability was more feasible when the fitness effects of MGEs were small (Fig. S13b). The qualitative relationship between HGT and community multistability was not dependent on the range of growth rate effects (Fig. S13a and b). We discussed these results in line 197-205 of the updated main text.
Reviewer #3 (Public review):
Hong et al. used a model they previously developed to study the impact of horizontal gene transfer (HGT) on microbial multispecies communities. They investigated the effect of HGT on the existence of alternative stable states in a community. The model most closely resembles HGT through the conjugation of incompatible plasmids, where the transferred genes confer independent growth-related fitness effects. For this type of HGT, the authors find that increasing the rate of HGT leads to an increasing number of stable states. This effect of HGT persists when the model is extended to include multiple competitive niches (under a shared carrying capacity) or spatially distinct patches (that interact in a grid-like fashion). Instead, if the mobile gene is assumed to reduce between-species competition, increasing HGT leads to a smaller region of multistability and fewer stable states. Similarly, if the mobile gene is deleterious an increase in HGT reduces the parameter region that supports multistability.
This is an interesting and important topic, and I welcome the authors' efforts to explore these topics with mathematical modeling. The manuscript is well written and the analyses seem appropriate and well-carried out. However, I believe the model is not as general as the authors imply and more discussion of the assumptions would be helpful (both to readers + to promote future theoretical work on this topic). Also, given the model, it is not clear that the conclusions hold quite so generally as the authors claim and for biologically relevant parameters. To address this, I would recommend adding sensitivity analyses to the manuscript.
We thank the reviewer for agreeing that our work addressed an important topic and was well conducted. We are also grateful for the suggestion on sensitivity analysis, which is very helpful for improving the rigor and generality of our conclusions. All the raised issues have been fully addressed in the updated manuscript and below.
Specific points
(1) The model makes strong assumptions about the biology of HGT, that are not adequately spelled out in the main text or methods, and will not generally prove true in all biological systems. These include:
a) The process of HGT can be described by mass action kinetics. This is a common assumption for plasmid conjugation, but for phage transduction and natural transformation, people use other models (e.g. with free phage that adsorb to all populations and transfer in bursts).
b) A subpopulation will not acquire more than one mobile gene, subpopulations can not transfer multiple genes at a time, and populations do not lose their own mobilizable genes. [this may introduce bias, see below].
c) The species internal inhibition is independent of the acquired MGE (i.e. for p1 the self-inhibition is by s1).
These points are in addition to the assumptions explored in the supplementary materials, regarding epistasis, the independence of interspecies competition from the mobile genes, etc. I would appreciate it if the authors could be more explicit in the main text about the range of applicability of their model, and in the methods about the assumptions that are made.
We are grateful for the reviewer’s suggestions. In the main text and methods of the updated manuscript, we have made clear the assumptions underlying our analysis. For point (a), we have clarified that our model primarily focuses on plasmid transfer dynamics (line 74, 101, 517). Therefore, the process of HGT can be described by mass action kinetics, which is commonly assumed for plasmid transfer (line 537-538). For point (b), our model allows a cell to acquire more than one mobile gene. Please see our response to point (3) for details. We have also made it clear that we assumed the populations would not lose their own mobile gene completely (line 526-527). For point (c), we have also clarified it in the updated manuscript (line 111-112, 527-528).
We have also performed a series of additional simulations to show the range of applicability of our model. In particular, we discuss the role of other mechanisms, including interspecies interaction strength, the growth rate effects of MGEs, MGE epistasis and microbial death rates in shaping the multistability of microbial communities undergoing HGT. These results were provided in Fig. S2, S3, S9, S10, S11, S12, S13 and S15.
(2) I am not surprised that a mechanism that creates diversity will lead to more alternative stable states. Specifically, the null model for the absence of HGT is to set gamma to zero, resulting in pij=0 for all subpopulations (line 454). This means that a model with N^2 classes is effectively reduced to N classes. It seems intuitive that an LV-model with many more species would also allow for more alternative stable states. For a fair comparison, one would really want to initialize these subpopulations in the model (with the same growth rates - e.g. mu1(1+lambda2)) but without gene mobility.
We appreciate the insightful comments. The reviewer was right that in our model HGT created additional subpopulations in the community. However, with or without HGT, we calculated the species diversity and multistability based on the abundances of the 𝑁 species (s<sub>i</sub> in our model), instead of all the p<sub>ij</sub> subpopulations. Therefore, although there exist more ‘classes’ in the model with HGT, the number of ‘classes’ considered when we calculated community diversity and multistability was equal. In light of the reviewer’s suggestion, we have also performed additional simulations, where we initialized the subpopulations in the model with nonzero abundances. Our results suggested that initializing the p<sub>ij</sub> subpopulations with non-zero abundances didn’t change the main conclusion (Fig. S11, line 188-189).
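One way a count of alternative stable states based on species-level abundances could be made is to integrate the dynamics from several initial conditions and count the distinct final abundance vectors. The sketch below is a hedged illustration only: it uses a plain competitive LV model with dilution and no gene transfer, forward-Euler integration, and hypothetical function names, none of which are taken from the manuscript.

```python
def simulate(s0, mu, gamma, D, dt=0.01, steps=10000):
    """Forward-Euler integration of an assumed LV model with dilution:
    dS_i/dt = mu_i*S_i*(1 - sum_j gamma_ij*S_j) - D*S_i, with gamma_ii = 1."""
    s = list(s0)
    n = len(s)
    for _ in range(steps):
        rates = [
            mu[i] * s[i] * (1.0 - sum(gamma[i][j] * s[j] for j in range(n)))
            - D * s[i]
            for i in range(n)
        ]
        # Clamp at zero so abundances never become negative.
        s = [max(s[i] + dt * rates[i], 0.0) for i in range(n)]
    return s

def count_stable_states(initial_conditions, mu, gamma, D, decimals=2):
    """Count distinct final species-abundance vectors after rounding."""
    final_states = set()
    for s0 in initial_conditions:
        final = simulate(s0, mu, gamma, D)
        final_states.add(tuple(round(x, decimals) for x in final))
    return len(final_states)
```

Because only the species totals enter the comparison, the number of compared classes is the same with or without extra subpopulations in the underlying model. In practice the result depends on how many initial conditions are sampled and on the rounding tolerance.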
(3) I am worried that the absence of double gene acquisitions from the model may unintentionally promote bistability. This assumption is equivalent to an implicit assumption of incompatibility between the genes transferred from different species. A highly abundant species with high HGT rates could fill up the "MGE niche" in a species before any other species have reached appreciable size. This would lead to greater importance of initial conditions and could thus lead to increased multistability.
This concern also feels reminiscent of the "coexistence for free" literature (first described here http://dx.doi.org/10.1016/j.epidem.2008.07.001 ) which was recently discussed in the context of plasmid conjugation models in the supplementary material (section 3) of https://doi.org/10.1098/rstb.2020.0478 .
We appreciate the comments. Our model did not assume incompatibility between MGEs transferred from different species. Instead, it allows a cell to acquire more than one MGE. In our model, p<sub>ij</sub> describes the subpopulation of the 𝑖-th species that has acquired the MGE from the 𝑗-th species. Here, p<sub>ij</sub> can overlap with p<sub>ik</sub> (𝑗 ≠ 𝑘). In other words, a cell can belong to p<sub>ij</sub> and p<sub>ik</sub> at the same time. The p<sub>ij</sub> subpopulation is allowed to carry the MGEs from the other species. In the model, we used
to describe the influence of the other MGEs on the growth of p<sub>ij</sub>.
We also thank the reviewer for bringing two papers to our attention. We have cited and discussed these papers in the updated manuscript (line 355-362).
(4) The parameter values tested seem to focus on very large effects, which are unlikely to occur commonly in nature. If I understand the parameters in Figure 1b correctly for instance, lambda2 leads to a 60% increase in growth rate. Such huge effects of mobile genes (here also assumed independent from genetic background) seem unlikely except for rare cases. To make this figure easier to interpret and relate to real-world systems, it could be worthwhile to plot the axes in terms of the assumed cost/benefit of the mobile genes of each species.
Thanks for the comments. In the main text, we presented one set of simulation results that assumed relatively large effects of MGEs on species fitness, as the reviewer pointed out. In the updated manuscript, we have supplemented numerical simulations that considered different ranges of fitness effects, including fitness effects as small as 10% (Fig. S13a). We have also plotted the relationship between community multistability and the assumed fitness effects of MGEs, as the reviewer suggested (Fig. S13b). Our results suggested that multistability was more feasible when the fitness effects of MGEs were small, and changing the range of MGE fitness effects didn’t fundamentally change our main conclusion. These results were discussed in line 197-205 of the updated main text.
Something similar holds for the HGT rate (eta): given that the population of E. coli or Klebsiella in the gut is probably closer to 10^9 than 10^12 (they make up only a fraction of all cells in the gut), the assumed rates for eta are definitely at the high end of measured plasmid transfer rates (e.g. F plasmid transfers at a rate of 10^-9 mL/CFU h-1, but it is derepressed and considered among the fastest - https://doi.org/10.1016/j.plasmid.2020.102489 ). To adequately assess the impact of the HGT rate on microbial community stability it would need to be scanned on a log (rather than a linear) scale. Considering the meta-analysis by Sheppard et al. it would make sense to scan it from 10^-7 to 1 for a community with a carrying capacity around 10^9.
We thank the reviewer for the constructive suggestion. We have carried out additional simulations by scanning the 𝜂 value from 10<sup>-7</sup> to 1. The results suggested that increasing HGT rates started to promote multistability when the 𝜂 value exceeded 10<sup>-2</sup> per hour (Fig. S9, line 337-346). This corresponds to a conjugation efficiency of 10<sup>-11</sup> cell<sup>-1</sup> ∙ hr<sup>-1</sup> ∙ mL when the maximum carrying capacity equals 10<sup>9</sup> cells ∙ mL<sup>-1</sup>, or a conjugation efficiency of 10<sup>-14</sup> cell<sup>-1</sup> ∙ hr<sup>-1</sup> ∙ mL when the maximum carrying capacity equals 10<sup>12</sup> cells ∙ mL<sup>-1</sup>.
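The unit conversion quoted above can be reproduced with a short calculation. The sketch assumes that the conjugation efficiency is the per-hour rate 𝜂 divided by the maximum carrying capacity, which is consistent with the two numerical examples in the response; the function name and the one-point-per-decade grid are hypothetical choices for illustration.

```python
def conjugation_efficiency(eta_per_hour, carrying_capacity_cells_per_ml):
    """Convert a per-hour transfer rate eta into mL cell^-1 hr^-1.

    Assumes efficiency = eta / N_max, where N_max is the maximum carrying
    capacity in cells per mL (an assumption inferred from the quoted numbers).
    """
    return eta_per_hour / carrying_capacity_cells_per_ml

# Logarithmic scan of eta from 1e-7 to 1, one point per decade.
eta_grid = [10.0 ** exponent for exponent in range(-7, 1)]

# Efficiencies along the scan for a carrying capacity of 1e9 cells per mL.
efficiencies = [conjugation_efficiency(eta, 1e9) for eta in eta_grid]
```

A log-spaced grid like `eta_grid` gives equal weight to each order of magnitude, which is the reason the reviewer asked for a logarithmic rather than a linear scan.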
(5) It is not clear how sensitive the results (e.g. Figure 2a on the effect of HGT) are to the assumption of the fitness effect distribution of the mobile genes. This is related to the previous point that these fitness effects seem quite large. I think some sensitivity analysis of the results to the other parameters of the simulation (also the assumed interspecies competition varies from figure to figure) would be helpful to put the results into perspective and relate them to real biological systems.
We appreciate the comments. In light of the reviewer’s suggestion, we have changed the range of the fitness effects and analyzed the sensitivity of our predictions to this range. As shown in Fig. S13, changing the range of MGE fitness effects didn’t alter the qualitative interplay between HGT and community multistability. We have also examined the sensitivity of the results to the strength of interspecies competition (Fig. S3, S10, S12). These results suggested that while the strength of interspecies interactions played an important role in shaping community multistability, the relationship between HGT rate and multistability was not fundamentally changed by varying interaction strength. In addition, we examined the role of death rates (Fig. S2). In the updated manuscript, we discussed the sensitivity of our prediction to these parameters in line 136-147, 190-205, 335-354.
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
Please find below a few suggestions that, in my opinion, could help improve the manuscript.
TITLE
It might not be clear what 'gene exchange communities' are. Perhaps it could be rewritten for more specificity (e.g. '...communities undergoing horizontal gene transfer').
We have updated the title as the reviewer suggested.
ABSTRACT
The abstract could also be edited to improve clarity and specificity. Terms like 'complicating factors' are vague, and enumerating specific factors would be better. The results are largely based on simulations, no analytical results are plotted, so I find that the sentence starting with 'Combining theoretical derivation and numerical simulations' can be a bit misleading.
We appreciate the suggestions. We have enumerated the specific factors and scenarios in the updated abstract (line 18-26). We have also replaced 'Combining theoretical derivation and numerical simulations' with ‘Combining mathematical modeling and numerical simulations’.
INTRODUCTION
- Line 42, please revise this paragraph. The logical flow is not so clear, it seems a bit like a list of facts, but the main message might not be clear enough. Also, it would be good to define 'hidden' states or just rewrite this sentence.
We appreciate the suggestion. In the updated manuscript, we have rewritten this paragraph to improve the logical flow and clarity (line 46-52).
- Line 54, there is little detail about both theoretical models and HGT in this paragraph, and mixing the two makes the paragraph less focused. I suggest to divide into two paragraphs and expand its content. For example, you could explain a bit some relevant implications of MGE.
We appreciate the suggestion. In the updated manuscript, we have divided this paragraph into two paragraphs, focusing on theoretical models and HGT, respectively (line 55-71). In particular, we have added explanations on the implications of MGEs (line 66-69), as the reviewer suggested.
- Line 72, as mentioned in the abstract, it would be better to explicitly mention which confounding factors are going to be discussed.
Thanks for the suggestion. We have rewritten this part as “We further extended our analysis to scenarios where HGT changed interspecies interactions, where microbial communities were subjected to strong environmental selections and where microbes lived in metacommunities consisting of multiple local habitats. We also analyzed the role of different mechanisms, including interspecies interaction strength, the growth rate effects of MGEs, MGE epistasis and microbial death rates in shaping the multistability of microbial communities. These results created a comprehensive framework to understand how different dynamic processes, including but not limited to HGT rates, collectively shaped community multistability and diversity” (line 75-82).
RESULTS
- The basic concepts (line 77) should be explained with more detail, keeping the non-familiar reader in mind. The reader might not be familiar with the concept of bistability in terms of species abundance. Also, note that mutual inhibition does not necessarily lead to positive feedback, as an interaction strength between 0 and 1 might still be considered inhibition. In any case, in Figure 1 it is not obvious how the positive feedback is represented, the caption should explain it. Note that neither the main text nor the caption explains the metaphor of the landscape and the marble that you are using in Figure 1a.
We have rewritten this paragraph to provide more details on the basic concepts (line 86-99). We have removed the statement about ‘mutual inhibition’ to avoid being misleading. We have also updated the caption of Fig. 1a to explain the metaphor of the landscape and the marble (line 389-396).
- In the classical LV model, bistability does not depend on growth rates, but only on interaction strength. Therefore, I think that much of the results are significantly influenced by the added death rate. I believe that if the death rate is set to zero, mobile genetic elements that only modify growth rates will have no effect on the system's bistability. Because of this, I think that a thorough analysis of the role of the added death (dilution) rate and the distribution of growth rates is especially needed.
We are grateful for the reviewer’s insightful comments. In the updated manuscript, we have thoroughly analyzed the role of the added death (dilution) rate on the bistability of communities composed of two species (Fig. S2). Indeed, as the reviewer pointed out, if the death rate equals zero, mobile genetic elements that only modify growth rates will have no effect on the system's bistability. We have discussed the role of death rate in line 136-142 of the updated manuscript.
We have also expanded our analysis of the distribution of growth rates. In particular, we considered different ranges of the growth rate effects of mobile genetic elements, by sampling 𝜆<sub>ij</sub> values from uniform distributions with given widths (Fig. S13). A greater width led to a larger range of growth rate effects. We used five-species populations as an example and tested different ranges.
Our results suggested that multistability was more feasible when the growth rate effects of MGEs were small (Fig. S13b). The qualitative relationship between HGT and community multistability was not dependent on the range of growth rate effects (Fig. S13a). These results are discussed in line 197-205 of the updated manuscript.
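The dependence of bistability on the death (dilution) rate discussed above can be illustrated with a minimal sketch. It assumes a normalized two-species competitive Lotka-Volterra model with a symmetric interaction strength `gamma` and a shared dilution rate `D`, and it omits all HGT and MGE terms; the equations and the function names are illustrative assumptions, not the authors' exact model. Under these assumptions each species alone settles at an effective capacity 1 - D/mu, and both single-species states are stable only when neither can be invaded by the other.

```python
def effective_capacity(mu, D):
    """Single-species steady state of ds/dt = mu*s*(1 - s) - D*s, floored at 0."""
    return max(1.0 - D / mu, 0.0)


def is_bistable(mu1, mu2, gamma, D):
    """Check whether both single-species states are stable (assumed model).

    Assumed equations (no HGT terms):
        ds1/dt = mu1*s1*(1 - s1 - gamma*s2) - D*s1
        ds2/dt = mu2*s2*(1 - s2 - gamma*s1) - D*s2
    """
    k1 = effective_capacity(mu1, D)
    k2 = effective_capacity(mu2, D)
    if k1 == 0.0 or k2 == 0.0:
        return False  # a species that cannot persist alone cannot form a stable state
    # State (k1, 0) resists invasion when mu2*(1 - gamma*k1) - D < 0, i.e. k2 < gamma*k1.
    # The symmetric condition applies to state (0, k2).
    return k2 < gamma * k1 and k1 < gamma * k2


# With D = 0 the outcome depends only on gamma, not on the growth rates.
print(is_bistable(0.3, 0.7, 1.1, 0.0), is_bistable(0.7, 0.3, 1.1, 0.0))
# With D > 0 unequal growth rates can remove bistability at the same gamma.
print(is_bistable(0.5, 0.5, 1.1, 0.2), is_bistable(0.5, 0.3, 1.1, 0.2))
```

In this sketch growth rates enter the criterion only through the ratio D/mu, which is why changes in growth rate have no effect when D is zero.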
- The analysis uses gamma values that, in the absence of an added death rate, render a species pair bistable. Therefore, multistability would be quite expected for a 5 species community. Note that, multistability is possible in communities of more than 2 species even if all gamma values are smaller than 1. Analyzing a wide range of interaction strength distributions would really inform on the relative role of HGT in multistability across different community scenarios.
We are grateful for the reviewer’s suggestion. In light of the reviewer’s comments, in the updated manuscript, we have performed additional analysis by focusing on a broader range of interaction strengths (Fig. S3, S10, S12), especially the gamma values below 1 (Fig. S10). Our results agreed with the reviewer’s notion that multistability was possible in communities of more than 2 species even if all gamma values were smaller than 1 (Fig. S10).
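One common way to count alternative stable states numerically is to integrate the community dynamics from many random initial compositions and count the distinct end points. The sketch below does this for a normalized Lotka-Volterra community with dilution and without any HGT or MGE terms; the equations, the index convention (`gamma[j][i]` is taken as the effect of species j on species i), the integration settings and all function names are assumptions made for illustration, not the authors' implementation.

```python
import random


def simulate(mu, gamma, D, s0, dt=0.1, steps=4000):
    """Forward-Euler integration of the assumed model
    ds_i/dt = mu_i*s_i*(1 - s_i - sum_j gamma[j][i]*s_j) - D*s_i."""
    n = len(mu)
    s = list(s0)
    for _ in range(steps):
        nxt = []
        for i in range(n):
            inhibition = sum(gamma[j][i] * s[j] for j in range(n) if j != i)
            ds = mu[i] * s[i] * (1.0 - s[i] - inhibition) - D * s[i]
            nxt.append(max(s[i] + dt * ds, 0.0))  # keep abundances non-negative
        s = nxt
    return s


def count_stable_states(mu, gamma, D, n_init=20, tol=1e-2, seed=0):
    """Count distinct end states reached from random initial compositions."""
    rng = random.Random(seed)
    found = []
    for _ in range(n_init):
        s0 = [rng.random() for _ in mu]
        end = simulate(mu, gamma, D, s0)
        is_new = all(max(abs(a - b) for a, b in zip(end, f)) >= tol for f in found)
        if is_new:
            found.append(end)
    return len(found)


def uniform_gamma(n, value):
    """Interaction matrix with the same strength for every species pair."""
    return [[0.0 if i == j else value for j in range(n)] for i in range(n)]


# Two species, equal growth rates: strong vs weak mutual inhibition.
print(count_stable_states([0.5, 0.5], uniform_gamma(2, 1.5), D=0.2))
print(count_stable_states([0.5, 0.5], uniform_gamma(2, 0.5), D=0.2))
```

The count is a lower bound: states with small attraction domains can be missed when `n_init` is small, so a scan over interaction strengths would normally use many more initial conditions than this example does.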
- I would recommend the authors extend the analysis of the model used for Figures 1 and 2. Figures 3 and 4 could be moved to the supplement (see my point in the public review), unless the authors extend the analysis to explain some non-intuitive outcomes for niches and metacommunities.
Thanks. In the updated manuscript we have performed additional simulations to extend the analysis in Figure 1 and 2. These results were presented in Fig. S2, S3, S9, S10, S11, S12, and S13. We have also moved Figure 3 and 4 to SI as the reviewer suggested.
- The authors seem to refer to fitness and growth rates as the same thing. This could lead to confusion - the strongest competitor in a species pair could also be interpreted as the fittest species despite being the slowest grower. I think there's no need to use fitness if they refer to growth rates. In any case, they should define fitness if they want to use this concept in the text.
We are grateful for the insightful suggestion. To avoid confusion, we have used ‘growth rate’ throughout the updated manuscript.
- Across the text, the language needs some revision for clarity, specificity, and scientific style. In lines 105 - 109 there are some examples, like the use of 'in a lot of systems', and 'interspecies competitions' (I believe they mean interspecies interaction strengths).
We appreciate the reviewer for pointing them out. We have thoroughly checked the text and made the revisions whenever applicable to improve the clarity and specificity.
- Many plots present the HGT rate on the horizontal axis. Could the authors explain why is it that the rate of HGT is relatively important for the number of alternative stable states? I understand how from zero to a small positive number there is a qualitative change. Beyond that, it shouldn't affect bistability too much, I think. If I am right, then other parameters could be more informative to plot in the horizontal axis. If I am wrong, I think that providing an explanation for this would be valuable.
Thanks. To address the reviewer’s comment, we have systematically analyzed the effects of HGT on community multistability, by scanning the HGT rate from 10<sup>-7</sup> to 10<sup>0</sup> hr<sup>-1</sup>. In communities of two or multiple species, our simulation results showed that multistability gradually increased with HGT rate when HGT rate exceeded 10<sup>-2</sup> hr<sup>-1</sup>. These results, presented in Fig. S9 and discussed in line 337-346, provided a more quantitative relationship between multistability and HGT rate.
While in this work we showed the potential role of HGT in modulating community multistability, our results didn’t exclude the role of the other parameters. Motivated by the comments raised by the reviewers, in the updated manuscript, we have performed additional simulations to analyze the influence of other parameters in shaping community multistability. These parameters include death or dilution rate (Fig. S2), interaction strength (Fig. S3, S9, S10, S11, S12, S14, S15), 𝜆 range (Fig. S13, S15) and 𝛿 value (Fig. 3g, h, i). In many of the supplemented results (Fig. S2b, S3b, S13b, Fig. 3g, 3h and 3i), we have also plotted the data by using these parameters as the x axis. We believe the updated work now provided a more comprehensive framework to understand how different mechanisms, including but not limited to HGT, might shape the multistability of complex microbiota. These points were discussed in line 136-147, 190-205, 238-253, 334-354 of the updated main text.
- My overall thoughts on the case of antibiotic exposure are similar to those of previous sections. Very few of the different parameters of the model are analyzed and discussed. In this case, the authors increased the interaction strength to ~0.4 times higher compared to previous sections. Was this necessary, and why?
Thanks for the comments. In the previous draft, the interaction strength 𝛾=1.5 was tested as an example. Motivated by the reviewer’s comments, in the updated manuscript, we have examined different interaction strengths, including the strength ( 𝛾 = 1.1 ) commonly tested in other scenarios. The prediction equally held for different 𝛾 values (Fig. S15). We have also analyzed different 𝜆 ranges (Fig. S15). These results, together with the analyses presented in the earlier version of the manuscript, suggested the potential role of HGT in promoting multistability for communities under strong selection. The supplemented results were presented in Fig. S15 and discussed in line 293-295 of the updated manuscript.
- Line 195, if a gene encodes for the production of a public good, why would its HGT reduce interaction strength? I can think of the opposite scenario: the gene is a public good, and without HGT there is only one species that can produce it. Let's imagine that the public good is an enzyme that deactivates an antibiotic that is present in the environment, and then the species that produces has a positive interaction with another species in a pairwise coculture. If HGT happens, the second species becomes a producer and does not need the other one to survive in the presence of antibiotics anymore. The interaction can then become more competitive, as e.g. competition for resources could become the dominant interaction.
We are grateful for pointing it out. In the updated manuscript, we have removed this statement.
DISCUSSION
- L 267 "by comparison with empirical estimates of plasmid conjugation rates from a previous study [42], the HGT rates in our analysis are biologically relevant in a variety of natural environments". The authors are using a normalized model and the relevance of other parameter values is not discussed. If the authors want to claim that they are using biologically relevant HGT, they should also discuss whether the rest of the parameter values are biologically relevant. I recommend relaxing this statement about HGT rates.
We appreciate the suggestion. We agree with the reviewer that other parameters including the death/dilution rate, interactions strength and 𝜆 ranges are also important in shaping community multistability. We have performed additional analysis to show the effects of these parameters. In light of the reviewer’s suggestion, we have relaxed this statement and thoroughly discussed the context-dependent effect of HGT as well as the roles of different parameters (line 334-354).
- Last sentence: "Therefore, inhibiting the MGE spread using small molecules might offer new opportunities to reshape the stability landscape and narrow down the attraction domains of the disease states". It is not clear what procedure/technique the authors are suggesting. If they want to keep this statement, the authors should give more details on how small molecules can be/are used to inhibit MGE.
We appreciate the comments. Previous studies have shown that some small molecules, such as unsaturated fatty acids, can inhibit the conjugative transfer of plasmids. By binding the type IV secretion traffic ATPase TrwD, these compounds limit pilus biogenesis and DNA translocation. We have provided more details regarding this statement in the updated manuscript (line 376-379).
METHODS
- Line 439, mu_i should be presented as the maximum 'per capita' growth rate.
We have updated the definition of 𝜇<sub>i</sub> following the suggestion (line 529).
- Line 444, this explanation is hard to follow, please expand it to provide more details. You could provide an example, like explaining that all individuals from S1 have the MGE1 and therefore they have mu_1 = mu_01 ... After HGT, their fitness changes if they get the plasmid from S2, so a term lambda2 appears.
Thanks. In the updated manuscript, we have expanded the explanation by providing an example as the reviewer suggested (line 534-537).
- The normalization assumes a common carrying capacity Nm (Eqs 1-4) and then it's normalized (Eqs. 5-8). It would be better to start from a more general scenario in which each species has a different carrying capacity and then proceed with the normalization.
We appreciate the suggestion. In the updated manuscript, we have started our derivation from the scenario where each species has a different carrying capacity before proceeding with the normalization (section 1 of Methods, line 516-554). The same equations can be obtained after normalization.
- I think that the meaning of kappa (the plasmid loss rate) is not explained in the text.
Thanks for pointing it out. We have explained the meaning of kappa in the updated text (line 108, 154, 539-541, 586-587, 607).
SUPPLEMENT
- Figure S4, what are the different colors in panel b?
In panel b of Fig. S4, the different colors represent the simulation results repeated with randomized growth rates. We have made it clear in the updated SI.
Reviewer #3 (Recommendations for the authors):
(1) Please extend your description of the model, so it is easier to understand for readers who have not read the first paper. Especially the choice to describe the model as species and subpopulations, as opposed to writing it as MGE-carrying and MGE-free populations of each species makes it quite complicated to understand which parameters influence each other.
Thanks for the suggestion. We have extended the model description in the updated manuscript, which provides a more detailed introduction on model configurations and parameter definitions (line 86-99, 101-113, 151-159). We have also updated the Methods to extend the model description.
(2) Please define gamma_ji in equation 13 and eta_jki in equation 14 (how to map the indices onto the assumed directionality of the interaction).
We have defined these two parameters in the updated manuscript (line 584-586, 630-632).
(3) Line 511: please add at the beginning of this paragraph that you are assuming a grid-like arrangement of patches which will be captured by dispersal term H.
We have updated this paragraph to make this assumption clear (line 636-637).
(4) Line 540: "used in our model" (missing a word).
We have corrected it in the updated manuscript.
(5) Currently the analyses looking at the types of growth effects HGT brings (Figures 5-7) feel very "tacked on". These are not just "confounding factors", but rather scenarios that are much more biologically realistic than the assumption of independent effects. I would introduce them earlier in the text, as I think many readers may not trust your results until they know this was considered (+ how it changes the conclusions).
We are grateful for the suggestion. We agree with the reviewer that these biologically realistic scenarios should be introduced earlier in the text. In the updated manuscript, we have moved these analyses forward, as sections 3, 4 and 5. We have also avoided the term “confounding factors”. Instead, in the updated manuscript, we have separated these analyses into different sections, and clearly described each scenario in the section title (line 217-218, 254, 275).
(6) In some places the manuscript refers to HGT, in others to MGE presence (e.g. caption of Figure 6). These are not generally the same thing, as HGT could also occur due to extracellular vesicles or natural transformation etc. Please standardize the nomenclature and make it clearer which type of processes the model describes.
We appreciate the comment. The model in this work primarily focused on the process of plasmid transfer. We have made it clear throughout the main text.
(7) In many figures the y-axis starts at a value other than 0. This is a bit misleading. In addition, I would recommend changing the title "Area of bistability region" to "Area of bistability" or perhaps even "Area of multistability" (since more than two species are considered).
Thanks for the suggestion. We have updated all the relevant figures to make sure that their y-axes start at 0. We have also changed the title “Area of bistability region” to “Area of multistability”, whenever it is applicable.
(8) Figure 7: what are the assumed fitness effects of the mobile genes in the simulation? Which distribution were they drawn from? Please add this info to the figure caption here and elsewhere.
In Figure 7, we explored an extreme scenario of the fitness effects of the mobile genes, where the population was subjected to strong environmental selection and only cells carrying the mobile gene could grow. Therefore, the carriage of the mobile gene changed the species growth rate from 0 to a positive value µ<sub>i</sub>. When calculating the number of stable states in the communities, we randomly drew the µ<sub>i</sub> values from a uniform distribution between 0.3 and 0.7 hr<sup>-1</sup>. We have added this information to the figure caption (line 505-508) and the Methods (line 615-617) of the updated manuscript.
Author response:
The following is the authors’ response to the original reviews.
We thank the editors and reviewers for the comments and suggestions on our manuscript. The main point that we wished to convey in this paper was the concept and the kinetic model that enabled the estimation of nuclear export rate from an image of single mRNAs localised in single cells. By studying the influenza viral transcripts with this model, we report the variation in the mRNA nuclear export rate of the eight viral segments. Of note, the hemagglutinin and neuraminidase mRNAs were the slowest among the eight segments in exiting the nucleus. We agree that the potential mechanism and the biological impact of this observation require further validation, as the reviewers pointed out. We revised our manuscript to describe these points separately (Lines 21-25, Abstract; Lines 86-91, Introduction; Lines 316-320, Results; Lines 372-381, Discussion). We also highlight below the revisions that we made to address the specific points raised by the reviewers.
Influenza viral transcription
The authors used specific settings for their virology experiments and several assumptions regarding their mathematical modelling, so it's extremely important that the reader has the viral life cycle clearly understood before immersing themselves in the results. Thus, a detailed explanation of the viral life cycle, including the kinetics of each step, would be extremely helpful if included in the introduction section. Reviewer #1
We have included the molecular composition of influenza vRNP and the mechanism of viral transcription in the revised manuscript (Lines 46-53).
Line 45: "Eight viral RNA segments are transcribed by the same set of molecular machinery" (Ref. 7). What's known about the arrival of the viral RNA segments in the nucleus? Is it synchronized? The authors will understand that my concern is related to the fact that a differential arrival would indeed impact the transcription and export processes. Reviewer #1
The arrival of eight vRNPs in the nucleus is not synchronised, with each of the eight vRNPs arriving independently (Chou et al. PLOS Pathogens 2013) (Lakadamyali et al, PNAS 2003). This does not compromise our model, as our model estimates the export rate of each mRNA species individually (also please see our response in Model assumption below). This is included in the second paragraph of the Discussion section (Lines 390-400).
Model assumption
Even though I do not have the expertise to assess the authors' mathematical model, I do not doubt its robustness. Even so, I find some virological concerns related to the set-up of their experiments. According to what I understand, the authors performed non-synchronized 2 h-long infections with the WSN strain of influenza A virus. They did this to avoid cRNA production (and cross-reaction of the probes), which they claim to occur "much later than mRNA synthesis". Then they omit the degradation of the mRNAs for their model without giving an explanation for having done so. So, taking all these into account, it seems to me that too many assumptions are made without a strong argument. I understand that they are made in order to simplify their model, but I strongly consider that the model would gain strength if some of these events were experimentally considered. Thus, would it be possible to perform synchronized infections? Would it be possible to empirically demonstrate that cRNA production does not occur within the first 2 hours of infection and/or separate transcription and replication? Would it be possible to incorporate a degradation inhibitor of the mRNAs into their infections? If all these could be achieved, then the results coming out of the mathematical model would be enormously reinforced. Reviewer #1
* The study lacks experimental data that would help support the conclusions. For instance, perturbations are often used to prove a point related to gene expression. An example for Fig. 2 for such an experiment could be to treat the cells with transcription inhibitors (e.g. DRB, 5,6-dichloro-1-beta-D-ribofuranosylbenzimidazole). Preventing transcription leaves only mature RNAs in the nucleus, and then using this system one can compare the export rate of different RNAs. Reviewer #2
We agree that the primary concern in our model was the assumption that mRNA degradation could be omitted. Synchronised infection is not necessary; in fact, non-synchronised infection is preferred, as we explain later in our response. Additionally, the dominance of mRNA production over cRNA production has been documented elsewhere. To address mRNA degradation and validate our model estimation, we performed a time-course measurement using baloxavir. Baloxavir efficiently blocks viral transcription by inhibiting the nuclease activity in PA. DRB, suggested by the reviewer, allows influenza viral transcription and causes viral transcripts to accumulate in the nucleus through unknown mechanisms (Amorim et al. Traffic 2007 and our observation using smFISH, not shown). The additional experiment, now presented in Fig. 5 in the revised manuscript, indicated that mRNA degradation is minimal, and the export rates estimated by our model and by the time-course experiment agreed well for the HA segment. The experiment raised the possibility that the time-course measurement underestimates the export rate of transcripts that exit the nucleus rapidly, such as NP. Real-time imaging of single transcripts would be necessary to directly measure the true nuclear export rate; however, this is beyond the scope of our paper. The new result is now presented in Fig. 5, Supplementary figures 3 and 4, and in the main text (Lines 322-360). An alteration was also made in Line 286 to point the reader to Fig. 5. The Materials and Methods section was updated (Lines 478-482).
We note that our model does not require synchronised infection. Even under synchronised infection, such as incubating cells with the virus at 4°C to facilitate attachment and subsequently shifting to 37°C to allow viral entry, the inherent heterogeneity in vRNP migration to the nucleus still remains. This randomness does not compromise our model; rather, our model exploits this random arrival of each vRNP in each cell in the system. This variation, in turn, generates cells carrying varying amounts of transcripts, enabling the estimation of nuclear export rate. Importantly, more variation ensures the broader distribution of transcript levels, enabling more precise parameter fitting in our model. It is also important to note that our model does not require the correlation between segments. Our model estimates the export rate of each mRNA species individually. These important points were explained in the Discussion section (Lines 390-400).
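The time-course estimate described above can be sketched as a simple fit. The sketch assumes that, once transcription is blocked and degradation is negligible, the nuclear mRNA count follows first-order decay, N(t) = N0 * exp(-k * t), so that k is the negative slope of ln N against time. The function name and the data are hypothetical; the numbers are synthetic and noise-free, not measurements from the study, and the fitting procedure actually used in the manuscript may differ.

```python
import math


def fit_decay_constant(times, counts):
    """Least-squares slope of ln(count) versus time; returns k in 1/time units."""
    logs = [math.log(c) for c in counts]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    sxx = sum((t - t_mean) ** 2 for t in times)
    return -sxy / sxx


# Synthetic illustration only: counts generated from a known k, then recovered.
times = [0.0, 0.5, 1.0, 1.5, 2.0]          # hours after transcription block (assumed)
true_k = 0.8                               # hypothetical decay constant, 1/hr
counts = [100.0 * math.exp(-true_k * t) for t in times]
k = fit_decay_constant(times, counts)
print(round(k, 6))
```

A log-linear fit weights low counts heavily, so with real single-cell data a nonlinear fit or a count-based likelihood may be preferable; the sketch only shows how k relates to the time course.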
* There is no concrete value given for the export rates and what they might mean biologically (e.g. time present/stuck in the nucleus) - Fig. 4D. This leaves the reader in the dark. Reviewer #2
The export rate lambda (previously denoted as k) in our model (Fig. 4) and the decay constant k in the time-course measurement (Fig. 5) represent the proportion of mRNAs exported from the nucleus in an infinitesimal time, defining the nuclear export rate. This has been clarified in the revised manuscript (Lines 314-316), with some alterations to make the parameter use more comprehensible.
- The symbol k previously used in Fig. 4 and the associated equations was replaced throughout with lambda to avoid confusion with the parameter k that is subsequently used for the exponential decay in Fig. 5 in the revised manuscript.
- The Greek letter epsilon (previously used to represent export) was replaced with mu, slightly more common for representing the rate of transport.
- The term “velocity” was consistently replaced with “rate” in the context of the nuclear export (Lines 163, 215, 320, 441).
- The phrase “molar concentrations of mRNAs” was corrected to “molecules of mRNAs” (Line 282).
Also, we have now described our model in two sections: “Conceiving the model” and “Implementing a kinetic model to estimate the nuclear export rate” in the Results. The first section outlines the conceptual framework of the model, and the second focuses on its implementation and the parameter extraction (Lines 227 and 277).
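To connect a rate defined as the proportion of mRNAs exported per infinitesimal time with a time spent in the nucleus, a short worked relation may help. It is a sketch under stated assumptions: N(t) denotes the number of nuclear mRNAs of one segment, transcription has stopped, degradation is negligible, and export is first order with constant k. The symbols N, N_0 and t_{1/2} are introduced here for illustration and are not taken from the manuscript.

```latex
\frac{dN}{dt} = -k\,N
\quad\Longrightarrow\quad
N(t) = N_0\,e^{-kt},
\qquad
t_{1/2} = \frac{\ln 2}{k},
\qquad
\langle t_{\mathrm{nuc}} \rangle = \frac{1}{k}.
```

Under these assumptions a smaller k corresponds to a proportionally longer mean nuclear residence time, which is one way to read a slower export rate for a given segment.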
Applicability of the model
Lines 27-29. "Our framework presented in this study can be widely used for investigating the nuclear retention of nascent transcripts produced in a transcription burst." In my opinion, this is the strongest point of the manuscript: developing a mathematical model to analyze nuclear export retention as a mechanism of protein expression control, which could lay the foundation for further biological processes. The authors revisit this idea in the Discussion section. However, which would be those processes for which the model could be helpful? I consider that a more conspicuous discussion on this topic would broaden the readers' scope, a crucial point under the eLife scope. Reviewer #1
* Could this framework be used to quantify the nuclear export rate of cellular RNAs? According to the explanation in the Discussion, it would seem that this approach is limited to quantifying the export rate of influenza RNAs. Reviewer #2
Our model is not limited to influenza virus infection. It is applicable to systems where transcription is initiated concurrently, such as when stimuli trigger the activation of a certain set of genes for transcription. This makes it particularly valuable for quantifying the nuclear retention of mRNAs in a transcription burst. This point is reiterated in Lines 383-390.
Potential mechanisms for differential nuclear export rate of viral segments
* There is no mechanistic insight in the study. The idea driven by this study is that gene expression is regulated by the RNA export rate. But how is that explained? Is there any molecular pathway or explanation for this model? If the transcripts are ready for export, why do the mRNAs stay inside the nucleus? One option to consider are the export factors. Viral RNAs are exported by different pathways as mentioned (line 362), or by TREX2 (Bhat P et al Nat Comm 2023). The data shows that there is no difference observed in the export rate of different pathways. How about knocking down an important export factor to show how this affects the export rates. Or the opposite, overexpress a certain factor, would this change the nucleus/cytoplasm distribution of the retained RNAs. Reviewer #2
As we discussed in the paper, we are beginning to consider that each viral segment has an intrinsic sequence that determines its nuclear export rate, because previous studies on the export factors do not fully explain the variation in the nuclear export rate observed in our study. As the reviewer suggested, a recent study (Bhat et al. Nature Communications 2023) pointed out exactly such an internal sequence in the HA segment, aligning with our working hypothesis. This point is discussed and their work (Bhat et al. 2023) has been cited in the Discussion section in the revised manuscript (Lines 446-449).
Biological impact of the nuclear retention
The authors mention several times throughout the manuscript that the virus might use the nuclear retention of mRNA for HA and NA to postpone the expression of these antigenic molecules. At this point, I need to admit that a great question mark appeared in my mind, maybe related to the fact that some knowledge is lacking in my analysis. Lines 328-330: "On the other hand, pushing back the expression of viral antigens HA and NA would be beneficial for the virus to delay the host immune response against the infected cells in which the virus is being replicated." As I tend to understand, the host immune response recognizes HA and NA within the viral particle, if so and independently of the time that HA and NA arrive at the virus assembly step, the progeny viral particles that are complete and extruded from the cells would be those awakening the host immunity response. If this is right, how would the delayed export of those proteins from the nucleus (and their late expression) be beneficial for delaying the immune response? I would appreciate an explanation for this point, and if I am wrong, then there could exist a relationship between nuclear export rate and the pathogenicity of different strains of influenza A virus. If so, could the authors challenge their model with additional viral strains showing a differential immune response pattern? A deeper analysis in this direction would greatly strengthen the message in their manuscript. Reviewer #1
* Is the timing of viral protein appearance in accordance with the time the mRNA is exported to the cytoplasm? It is logical that the first mRNA to go to the cytoplasm would be the first to become a protein. Can the authors show that nuclear retention of mRNA would push back the expression of the viral antigens HA and NA? Reviewer #2
Three types of immune reactions are being studied extensively. The first is the humoral immune response, where antibodies target the viral antigens HA and NA on the viral envelope, coating and inactivating the viral particles. The second is the cytotoxic T cell response. There is growing evidence that cytotoxic T cells react against NP, eliciting cross-reaction to a broader range of influenza viral strains. This reaction is not specific to HA and NA, and antigens are processed in the cytoplasm and presented on the MHC. The third is antibody-dependent cellular cytotoxicity (ADCC), where antibodies recognise the viral proteins on the cellular surface (HA and NA) of infected cells, facilitating their elimination by the NK cells. Although protein translation may begin as soon as the first mRNA exits the nucleus, the virus may delay the peak of the antigen production and therefore postpone the NK-mediated ADCC. This specific point, along with references to ADCC in influenza virus infection, has been clarified in the Discussion section (Lines 377-381).
Data analysis and presentation
Lines 99-101. "Viral mRNAs were detected as single diffraction-limited spots in the three-dimensional image stacks, allowing for absolute mRNA quantification (Fig. 1B)". What do the authors mean to say by "absolute mRNA quantification"? Do they refer to the total spots or the total mRNAs? Is it assumed that one spot corresponds to a single mRNA transcript? This is not clear at all for this reviewer, which could be the situation for a potential reader. Since it's the beginning of the story, this should be clearly stated in the manuscript. Reviewer #1
Each spot of fluorescent signal corresponds to a single molecule of viral mRNA. We quantified the absolute number of transcripts in each cell. This is clarified in the revised manuscript (Lines 104-106).
* Line 151: does the baseline change according to the RNA in question? The authors say that the "baseline is defined by the median of the Z distribution of peripheral mRNAs" - it seems that the number 0.731 refers only to one type of RNA (which is not mentioned at all, either in the text or in the legend). Reviewer #2
The baseline was set using the NP mRNAs in the cytoplasm because the NP mRNA showed the widest distribution across the cytoplasm (Line 157).
* Also, what is all the signal that is seen outside the marked cells in Fig. 2B? There seems to be significant background in the field, does this mean much false-positive in the multiplex FISH? If so, then how do the authors know that the staining inside the cells isn't to some degree non-specific? It would be necessary to back this up with some other type of quantitative assay like qRT-PCR. Reviewer #2
The cells were removed from the analysis if the cytoplasmic boundary touched any edge of the field-of-view, while the signals were recovered across the entire field-of-view. This is clarified in the figure legend (Lines 194-195).
Others
* The meaning and explanation for Figure 1H are unclear. Rephrase and make the legend more reader-friendly. Reviewer #2
We made alterations to the legend (Lines 132-134) and the relevant lines in the main text (Lines 148-151).
* Fig. 2E: Is this the total transcript count or only in the nucleus? Would it be possible to find some correlation between the segments if a pair-wise analysis is performed according to nuclear-cytoplasm distribution? Reviewer #2
The total counts are presented. This is clarified in the legend (Lines 199-200).
* Abstract -"A mathematical modelling indicated that the relationship between the nuclear ratio and the total count of mRNAs in single cells is dictated by a proxy for the nuclear export rate." - this sentence is very unclear. Reviewer #2
The sentence was removed in the revised manuscript (Line 21). This removal did not affect the overall meaning in the abstract. We also made an alteration to Line 279 that contained a similar phrase.
* The use of the word "acutely" (lines 16 and 35) is strange. Reviewer #2
They have been removed (now Lines 15, 33).
* Line 157 - "This result indicates that the velocity of viral mRNA export from the nucleus varies according to the viral segments." - not velocity, maybe timing. Reviewer #2
We consistently replaced “velocity” with “rate” (Lines 163, 215, 320, 441).
* Reference for line 41. Reviewer #2
A reference (Waker et al. Trends Microbiol. 2019) has been cited (Line 39).
* Reference for lines 105-106. Reviewer #2
The gene length of each segment was indicated in the sentence (Line 137).
* Line 264- why here is 0.02 M.O.I used compared to line 97 where 2 is used? Reviewer #2
We used M.O.I. of 0.02 to allow for spot quantification over longer periods of observation (Lines 269-270).
* NS1 is expressed at late infection times and might alter the nuclear export of viral mRNAs (line 352). Need to show that indeed it is not expressed in the experiments done here. Reviewer #2
It is not possible to definitively prove that NS1 is not expressed, owing to sensitivity limitations. However, we minimised its impact by investigating at the early time point (Lines 415-416).
* Line 459- 30% formamide? Is this correct or should it be 10%? Reviewer #2
This is correct. The probes used were longer than the others for smFISH. Therefore, we washed away the probes with the stringent condition.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
This paper presents a model of the whole somatosensory non-barrel cortex of the rat, with 4.2 million morphologically and electrically detailed neurons, with many aspects of the model constrained by a variety of data. The paper focuses on simulation experiments, testing a range of observations. These experiments are aimed at understanding how the multiscale organization of the cortical network shapes neural activity.
Strengths:
(1) The model is very large and detailed. With 4.2 million neurons and 13.2 billion synapses, as well as the level of biophysical realism employed, it is a highly comprehensive computational representation of the cortical network.
(2) Large scope of work - the authors cover a variety of properties of the network structure and activity in this paper, from dendritic and synaptic physiology to multi-area neural activity.
(3) Direct comparisons with experiments, shown throughout the paper, are laudable.
(4) The authors make a number of observations, like describing how high-dimensional connectivity motifs shape patterns of neural activity, which can be useful for thinking about the relations between the structure and the function of the cortical network.
(5) Sharing the simulation tools and a "large subvolume of the model" is appreciated.
We thank the reviewer for these comments and are pleased they appreciated these aspects of the work.
Weaknesses:
(1) A substantial part of this paper - the first few figures - focuses on single-cell and single-synapse properties, with high similarity to what was shown in Markram et al., 2015. Details may differ, but overall it is quite similar.
We thank the reviewer for this useful comment and agree that it is important to better highlight the incremental improvements to the model’s low-level physiology. The validity of any model can continuously be improved at all spatial scales and the validity of emergent network activity increases with improved validity at lower levels. For this reason, we felt it was valuable to improve the low-level physiology of the model.
Regarding neuron physiology, we have added the following in Section 2.1 on page 5:
“2.1 Improved modeling and validation of neuron physiology
Similarly to Markram et al. (2015), electrical properties of single neurons were modelled by optimizing ion channel densities in specific compartment-types (soma, axon initial segment (AIS), basal dendrite, and apical dendrite) (Figure 2B) using an evolutionary algorithm (IBEA; Van Geit et al., 2016) so that each neuron recreates electrical features of its corresponding electrical type (e-type) under multiple standardized protocols. Compared to Markram et al. (2015), electrical models were optimized and validated using 1) additional in vitro data, features and protocols, 2) ion channel and electrophysiological data corrected for the liquid junction potential, and 3) stochastic channels (StochKv3) now including inactivation profiles. The methodology and resulting electrical models are described in Reva et al. (2023) (see Methods), and generated quantitatively more accurate electrical activity, including improved attenuation of excitatory postsynaptic potentials (EPSPs) and back-propagating action potentials.”
And page 8:
“The new neuron models saw a 5-fold improvement in generalizability compared to Markram et al. (2015) (Reva et al., 2023).”
We have also made the descriptions of the improvements to synaptic physiology more explicit in Section 2.2 on page 9:
“2.2 Improved modeling and validation of synaptic physiology
The biological realism of synaptic physiology was improved relative to Markram et al. (2015) using additional data sources and by extending the stochastic version of the Tsodyks-Markram model (Tsodyks and Markram, 1997; Markram et al., 1998; Fuhrmann et al., 2002; Loebel et al., 2009) to feature multi-vesicular release, which in turn improved the accuracy of the coefficient of variations (CV; std/mean) of postsynaptic potentials (PSPs) as described in Barros-Zulaica et al. (2019) and Ecker et al. (2020). The model assumes a pool of available vesicles that is utilized by a presynaptic action potential, with a release probability dependent on the extracellular calcium concentration ([Ca2+]o; Ohana and Sakmann, 1998; Rozov et al., 2001; Borst, 2010). Additionally, single vesicles spontaneously release as an additional source of variability with a low frequency (with improved calibration relative to Markram et al. (2015)). The utilization of vesicles leads to a postsynaptic conductance with bi-exponential kinetics. Short-term plasticity (STP) dynamics in response to sustained presynaptic activation are either facilitating (E1/I1), depressing (E2/I2), or pseudo-linear (I3). E synaptic currents consist of both AMPA and NMDA components, whilst I currents consist of a single GABAA component, except for neurogliaform cells, whose synapses also feature a slow GABAB component. The NMDA component of E synaptic currents depends on the state of the Mg2+ block (Jahr and Stevens, 1990), with the improved fitting of parameters to cortical recordings from Vargas-Caballero and Robinson (2003) by Chindemi et al. (2022).”
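The short-term plasticity classes named in the quoted passage (facilitating versus depressing) can be illustrated with a minimal sketch of the classic deterministic Tsodyks-Markram update. This is not the stochastic, multi-vesicular version used in the model, and the function name and all parameter values below are hypothetical, chosen only to show the two regimes.

```python
import math

def tm_amplitudes(spike_times_ms, U, D_ms, F_ms):
    """Relative PSP amplitudes (u * R) for a presynaptic spike train.

    Deterministic Tsodyks-Markram sketch (illustrative, not the paper's code):
      U    : baseline utilization of resources per spike
      D_ms : recovery time constant from depression
      F_ms : decay time constant of facilitation
    """
    u, R = 0.0, 1.0          # utilization and available resources
    last_t = None
    amps = []
    for t in spike_times_ms:
        if last_t is not None:
            dt = t - last_t
            u *= math.exp(-dt / F_ms)                    # facilitation decays
            R = 1.0 - (1.0 - R) * math.exp(-dt / D_ms)   # resources recover
        u += U * (1.0 - u)   # spike-triggered increase in utilization
        amp = u * R          # fraction of resources released by this spike
        R -= amp             # released resources become unavailable
        amps.append(amp)
        last_t = t
    return amps

# Hypothetical parameter sets, for illustration only.
train = [0.0, 50.0, 100.0, 150.0]          # 20 Hz train, in ms
depressing = tm_amplitudes(train, U=0.5, D_ms=800.0, F_ms=10.0)
facilitating = tm_amplitudes(train, U=0.1, D_ms=50.0, F_ms=500.0)
print(depressing)
print(facilitating)
```

With a large `U` and slow recovery the amplitudes shrink across the train, whereas a small `U` with slow facilitation decay makes them grow; the pseudo-linear class would sit between the two.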
(2) Although the paper is about the model of the whole non-barrel somatosensory cortex, out of all figures, only one deals with simulations of the whole non-barrel somatosensory cortex. Most figures focus on simulations that involve one or a few "microcolumns". Again, it is rather similar to what was done by Markram et al., 2015 and constitutes relatively incremental progress.
We thank the reviewer for this comment and have added the following text to the Discussion on page 33 to explain our rationale:
“In keeping with the philosophy of compartmentalization of parameters and continuous model refinement (see Introduction), it was essential to improve validity at the columnar scale (relative to Markram et al. (2015)) as part of demonstrating validity of the full nbS1. Indeed, improved parametrization and validation at smaller scales was essential to parameterizing background input which generated robust nbS1 activity within realistic [Ca<sup>2+</sup>]<sub>o</sub> and firing rate ranges. We view this as a major achievement, as it was unknown whether the model would achieve a stable and meaningful regime at the start of our investigation. Whilst we would have liked to go further, our primary goal was to publish a well characterized model as an open resource that others could use to undertake further in-depth studies. In this regard, we are pleased that the parametrization of the nbS1 model has already been used to study EEG signals (Tharayil et al., 2024), as well as propagation of activity between two subregions (Bolaños-Puchet and Reimann, 2024).”
We also make it clearer in the Introduction on page 4 that the improved validation of the emergent columnar regime was essential to stable activity at the larger scale:
“These initial validations demonstrated that the model was in a more accurate regime compared to Markram et al. (2015) – an essential step before testing more complex or larger-scale validations. For example, under the same parameterization we then observed selective propagation of stimulus-evoked activity to downstream areas, and…”
(3) With a model like this, one has an opportunity to investigate computations and interactions across an extensive cortical network in an in vivo-like context. However, the simulations presented are not addressing realistic specific situations corresponding to animals performing a task or perceiving a relevant somatosensory stimulus. This makes the insights into the roles of cell types or connectivity architecture less interesting, as they are presented for relatively abstract situations. It is hard to see their relationship to important questions that the community would be excited about - theoretical concepts like predictive coding, biophysical mechanisms like dendritic nonlinearities, or circuit properties like feedforward, lateral, and feedback processing across interacting cortical areas. In other words, what do we learn from this work conceptually, especially, about the whole non-barrel somatosensory cortex?
We thank the reviewer for this comment and agree that it would be very interesting to explore such topics. In the Introduction on page 4, we have updated the list of papers which have so far used the model for more in depth studies:
“…propagation of activity between cortical areas (Bolaños-Puchet and Reimann, 2024), the role of non-random connectivity motifs on network activity (Pokorny et al., 2024) and reliability (Egas Santander et al., 2024), the composition of high-level electrical signals such as the EEG (Tharayil et al., 2024), and how spike sorting biases population codes (Laquitaine et al., 2024).”
In the Discussion on page 33 we also add our additional thoughts on this topic:
“Whilst we would have liked to go further, our primary goal was to publish a well characterized model as an open resource that others could use to undertake further in-depth studies. In this regard, we are pleased that the parametrization of the nbS1 model has already been used to study EEG signals (Tharayil et al., 2024), as well as propagation of activity between two subregions (Bolaños-Puchet and Reimann, 2024). Investigation, improvement and validation must be continued at all spatial scales in follow up papers with detailed description, figures and analysis, which cannot be covered in this manuscript. Each new study increases the scope and validity of future investigations. In this way, this model and paper act as a stepping stone towards more complex questions of interest to the community such as perception, task performance, predictive coding and dendritic processing. This was similar for Markram et al. (2015) where the initial paper was followed by more detailed studies. Unlike the Markram et al. (2015) model, the new model can also be exploited by the community and has already been used in a number of follow up papers (Ecker et al., 2024a,b; Bolaños-Puchet and Reimann, 2024; Pokorny et al., 2024; Egas Santander et al., 2024; Tharayil et al., 2024; Laquitaine et al., 2024). We believe that the number of use cases for such a general model is vast, and is made larger by the increased size of the model.”
(4) Most comparisons with in vivo-like activity are done using experimental data for whisker deflection (plus some from the visual stimulation in V1). But this model is for the non-barrel somatosensory cortex, so exactly the part of the cortex that has less to do with whiskers (or vision). Is it not possible to find any in vivo neural activity data from the non-barrel cortex?
We agree with the reviewer that this is a weakness. We have expanded our discussion of the need to mix data sources to also consider our view for network level activity:
“This paper and its companion paper serve to present a methodology for modeling micro- and mesoscale anatomy and physiology, which can be applied for other cortical regions and species. With the rapid increase in openly available data, efforts are already in progress to build models of mouse brain regions with reduced reliance on data mixing thanks to much larger quantities of available atlas-based data. This also includes data for the validation of emergent network level activity. Here we chose to compare network-level activity to data mostly from the barrel cortex, as well as a single study from primary visual cortex. Whilst a lot of the data used to build the model was from the barrel cortex, the barrel cortex also represents a very well characterized model of cortical processing for simple and controlled sensory stimuli. The initial comparison of population-wise responses in response to accurate thalamic input for single whisker deflections was essential to demonstrating that the model was closer to in vivo, and we were unaware of similar data for non-barrel somatosensory regions. Moreover, our optogenetic & lesion study demonstrated the capacity to compare and extend studies of canonical cortical processing in the whisker system.”
(5) The authors almost do not show raw spike rasters or firing rates. I am sure most readers would want to decide for themselves whether the model makes sense, and for that, the first thing to do is to look at raster plots and distributions of firing rates. Instead, the authors show comparisons with in vivo data using highly processed, normalized metrics.
We thank the reviewer for this comment and agree that better visualizations of the network activity under different conditions are essential for helping the reader assess the work. In addition to the raster plots in Video 1, Video 3, Fig 6, Fig 5C, Fig S9a and S16a, we have:
a) Changed the histograms of spontaneous activity in Fig 4G on page 13 to raster plots for the seven column subvolume for two contrasting meta-parameter regimes.
b) Added 4 new videos (Video 6a,b and 8a,b) showing all spontaneous and evoked meta-parameter combinations in hex0 and hex39 of the nbS1:
We have added improved plots showing the distributions of firing rates in the seven column subvolume on page 74:
With more detailed consideration in the Results on page 15:
“Long-tailed population firing rate distributions with means ∼ 1Hz
To study the firing rate distributions of different subpopulations and m-types, we ran 50s simulations for the meta-parameter combinations: [Ca<sup>2+</sup>]<sub>o</sub>: 1.05mM, R<sub>OU</sub>: 0.4, P<sub>FR</sub>: 0.3, 0.7 (Figure S4). Different subpopulations showed different sparsity levels (proportion of neurons spiking at least once) ranging from 6.6 to 42.5%. Wohrer et al. (2013) considered in detail the biases and challenges in obtaining ground truth firing rate distributions in vivo, and discuss the wide heterogeneity of reports in different modalities using different recording techniques. They conclude that most evidence points towards long-tailed distributions with peaks just below 1Hz. We confirmed that spontaneous firing rate distributions were long-tailed (approximately lognormally distributed) with means on the order of 1Hz for most subpopulations. Importantly the layer-wise means were just below 1Hz in all layers for the P<sub>FR</sub> = 0.3 meta-parameter combination. Moreover, our recent work applying spike sorting to extracellular activity using this meta-parameter combination found spike sorted firing rate distributions to be lognormally distributed and very similar to in vivo distributions obtained using the same probe geometry and spike sorter (Laquitaine et al., 2024).”
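The two summary quantities used in the passage above, sparsity (the proportion of neurons spiking at least once) and the firing rate distribution, can be computed from per-neuron spike counts roughly as follows. This is a minimal sketch: the function name, the returned keys and the example counts are hypothetical, and the "mean above median" flag is only a crude indicator of a long right tail, not a test for lognormality.

```python
import statistics

def summarize_rates(spike_counts, duration_s):
    """Summarize per-neuron spike counts from one simulation window.

    spike_counts : list of spike counts, one entry per neuron
    duration_s   : length of the window in seconds (e.g. 50 for a 50 s run)
    """
    rates_hz = [c / duration_s for c in spike_counts]
    n_active = sum(1 for c in spike_counts if c > 0)
    mean_hz = statistics.mean(rates_hz)
    median_hz = statistics.median(rates_hz)
    return {
        "sparsity": n_active / len(spike_counts),  # fraction spiking at least once
        "mean_hz": mean_hz,
        "median_hz": median_hz,
        # A mean well above the median is a rough sign of a long right tail.
        "mean_above_median": mean_hz > median_hz,
    }

# Hypothetical counts for four neurons over a 50 s window.
print(summarize_rates([0, 0, 5, 50], duration_s=50.0))
```

Because silent neurons enter the mean with a rate of zero, the reported mean depends on whether it is taken over all neurons or only over active ones; the sketch uses all neurons.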
(6) While the authors claim that their model with one set of parameters reproduces many experimentally established metrics, that is not entirely what one finds. Instead, they provide different levels of overall stimulation to their model (adjusting the target "P_FR" parameter, with values from 0 to 1, and other parameters), and that influences results. If I get this right (the figures could really be improved with better organization and labeling), simulations with P<sub>FR</sub> closer to 1 provide more realistic firing rate levels for a few different cases, however, P<sub>FR</sub> of 0.3 and possibly above tends to cause highly synchronized activity - what the authors call bursting, but which also could be called epileptic-like activity in the network.
We thank the reviewer for this comment. We can now see that the motivation for the P<sub>FR</sub> parameter was introduced very briefly in the Results and that the results of the calibration and analysis of the spontaneous activity regime were not interpreted in relation to this parameter.
To address this, we have given more detail where it is first introduced in the Results on page 12:
“to account for uncertainty in the firing rate bias during spontaneous activity from extracellular spike sorted recordings…”
We then reconsider that it represents an unknown bias when interpreting the calibration and spontaneous activity results on page 15:
“We reemphasize that the [Ca<sup>2+</sup>]<sub>o</sub>, R<sub>OU</sub> and P<sub>FR</sub> meta-parameters account for uncertainty of in vivo extracellular calcium concentration, the nature of inputs from other brain regions and the bias of extracellularly recorded firing rates. Whilst estimates for [Ca<sup>2+</sup>]<sub>o</sub> are between 1.0 - 1.1mM (Jones and Keep, 1988; Massimini and Amzica, 2001; Amzica et al., 2002; Gonzalez et al., 2022) and estimates for P<sub>FR</sub> are in the range of 0.1 - 0.3 (Olshausen and Field, 2006), combinations of these parameters supporting in vivo-like stimulus responses in later sections will offer a prediction for the true values of these parameters. Both these later results and our recent analysis of spike sorting bias using this model (Laquitaine et al., 2024) predict a spike sorting bias corresponding to P<sub>FR</sub> ∼ 0.3, confirming the prediction of Olshausen and Field (2006).”
And in relation to the stimulus evoked responses on page 17:
“Specifically, simulations with P<sub>FR</sub> from 0.1 to 0.5 robustly support realistic stimulus responses, with the middle of this range (0.3) corresponding with estimates of in vivo recording bias; both the previous estimates of Olshausen and Field (2006) and from a spike sorting study using this model (Laquitaine et al., 2024).”
Following these considerations, the remainder of the experiments using the seven column subvolume only use a single meta-parameter on page 19.
For the full nbS1 we further discuss the importance of a P<sub>FR</sub> value between 0.1 and 0.3 in the Results on page 26:
“Stable spontaneous activity only emerges in nbS1 at predicted in vivo firing rates
After calibrating the model of extrinsic synaptic input for the seven column subvolume, we tested to what degree the calibration generalizes to the entire nbS1. Notably, this included the addition of mid-range connectivity (Reimann et al., 2024). The total number of local and mid-range synapses in the model was 9.138 billion and 4.075 billion, i.e., on average full model simulations increased the number of intrinsic synapses onto a neuron by 45%. Particularly, we ran simulations for P<sub>FR</sub> ∈ [0.1, 0.15, ..., 0.3] using the OU parameters calibrated for the seven column subvolume for [Ca<sup>2+</sup>]<sub>o</sub> = 1.05mM and R<sub>OU</sub> = 0.4. Each of these full nbS1 simulations produced stable non-bursting activity (Figure 8A), except for the simulation for P<sub>FR</sub> = 0.3, which produced network-wide bursting activity (Video 6). Activity levels in the simulations of spontaneous activity were heterogeneous (Figure 8B, Video 7). In some areas, firing rates were equal to the target P<sub>FR</sub>, whilst in others they increased above the target (Figure 8C). In the more active regions, mean firing rates (averaged over layers) were on the order of 30-35% of the in vivo references for the maximum non-bursting P<sub>FR</sub> simulation (target P<sub>FR</sub> : 0.25). This range of firing rates again fits with the estimate of firing rate bias from our paper studying spike sorting bias (Laquitaine et al., 2024) and the meta-parameter range supporting realistic stimulus responses in the seven column subvolume. This also predicts that the nbS1 cannot sustain higher firing rates without entering a bursting regime.”
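The synapse figures in the passage above can be cross-checked with a few lines of arithmetic. The sketch assumes the local and mid-range counts are read as 9.138 and 4.075 billion, which is the reading consistent with the 13.2 billion total quoted by Reviewer #1; the variable names are illustrative.

```python
# Assumed reading of the quoted counts (in synapses).
local_synapses = 9.138e9
mid_range_synapses = 4.075e9

total_synapses = local_synapses + mid_range_synapses
relative_increase = mid_range_synapses / local_synapses

print(f"total: {total_synapses / 1e9:.1f} billion")
print(f"increase from mid-range connectivity: {100 * relative_increase:.0f}%")
```

Under this reading the total and the relative increase agree with the 13.2 billion synapses and the 45% increase stated in the text.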
Finally, we also added to our discussion of biases in extracellular firing rates in the Discussion on page 32:
“This is also in line with our recent work using the model, which estimated a spike sorting bias corresponding to P<sub>FR</sub> = 0.3 using virtual extracellular electrodes (Laquitaine et al., 2024).”
We also thank the reviewer for pointing out that we did not define the term “bursting” in the main text. We have added the following definition and discussion in the Results on page 15:
“Note that the most correlated meta-parameter combination [Ca<sup>2+</sup>]<sub>o</sub>: 1.1mM, R<sub>OU</sub>: 0.2, P<sub>FR</sub>: 1.0 produced network-wide “bursting” activity, which we define as highly synchronous all or nothing events (Video 1). Such activity, which may be characteristic of epileptic activity, can be studied with the model but is not the focus of this study.”
(7) The authors mention that the model is available online, but the "Resource availability" section does not describe that in substantial detail. As they mention in the Abstract, it is only a subvolume that is available. That might be fine, but more detail in appropriate parts of the paper would be useful.
Firstly, we are pleased to say that the full nbS1 model is now available to download, in addition to the seven hexagon subvolume. In the manuscript, we have:
a) Added to the Introduction at the bottom of page 4:
“To provide a framework for further studies and integration of experimental data, the full model is made available with simulation tools, as well as a smaller subvolume with the optional new connectome capturing inhibitory targeting rules from electron microscopy”.
b) Updated the open source panel of Figure 1:
Secondly, we thank the reviewer for noticing that the available model is not well described in the “Resource availability” statement and have addressed this by:
a) Adding the following to the “Resource availability” statement on page 36:
“Both the full nbS1 model and smaller seven hexagon subvolume are available on Harvard Dataverse and Zenodo respectively in SONATA format (Dai et al., 2020) with simulation code. DOIs are listed under the heading “Final simulatable models” in the Key resources table. An additional link is provided to the SM-Connectome with instructions on how to use it with the seven hexagon subvolume model.”
b) Creating a new subheading in the “Key resources table” titled: “Final simulatable models” to make it clearer which links refer to the final models.
Reviewer #2 (Public review):
Summary:
This paper is a companion to Reimann et al. (2022), presenting a large-scale, data-driven, biophysically detailed model of the non-barrel primary somatosensory cortex (nbS1). To achieve this unprecedented scale of a bottom-up model, approximately 140 times larger than the previous model (Markram et al., 2015), they developed new methods to account for inputs from missing brain areas, among other improvements. Isbister et al. focus on detailing these methodological advancements and describing the model's ability to reproduce in vivo-like spontaneous, stimulus-evoked, and optogenetically modified activity.
Strengths:
The model generated a series of predictions that are currently impossible in vivo, as summarized in Table S1. Additionally, the tools used in this study are made available online, fostering community-based exploration. Together with the companion paper, this study makes significant contributions by detailing the model's constraints, validations, and potential caveats, which are likely to serve as a basis for advancing further research in this area.
We thank the reviewer for these comments, and are pleased they appreciate these aspects of the work.
Weaknesses:
That said, I have several suggestions to improve clarity and strengthen the validation of the model's in vivo relevance.
Major:
(1) For the stimulus-response simulations, the authors should also reference, analyze, and compare data from O'Connor et al. (2010; https://pubmed.ncbi.nlm.nih.gov/20869600/) and Yu et al. (2016; https://pubmed.ncbi.nlm.nih.gov/27749825/) in addition to Yu et al. 2019, which is the only data source the authors consider for an awake response. The authors mentioned bias in spike rate measurements, but O'Connor et al. used cell-attached recordings, which do not suffer from activity-based selection bias (in addition, they also performed Ca2+ imaging of L2/3). This was done in the exact same task as Yu et al., 2019, and they recorded from over 100 neurons across layers. Combining this data with Yu et al., 2019 would provide a comprehensive view of activity across layers and inhibitory cell types. Additionally, Yu et al. (2016) recorded VPM neurons in the same task, alongside whole-cell recordings in L4, showing that L4 PV neurons filter movement-related signals encoded in thalamocortical inputs during active touch. This dataset is more suitable for extracting VPM activity, as it was collected under the same behavior and from the same species (unlike Diamond et al., 1992, which used anesthetized rats). Furthermore, this filtering is an interesting computation performed by the network the authors modeled. The validation would be significantly strengthened and more biologically interesting if the authors could also reproduce the filtering properties, membrane potential dynamics, and variability in the encoding of touch across neurons, not just the latency (which is likely largely determined by the distance and number of synapses).
We thank the reviewer for pointing out these very useful studies. We have taken on board this suggestion for a future model of the mouse barrel cortex.
(2) The authors mention that in the model, the response of the main activated downstream area was confined to L6. Is this consistent with in vivo observations? Additionally, is there any in vivo characterization of the distance dependence of spiking correlation to validate Figure 8I?
We are not aware of data confirming the propagation of activity to downstream areas being confined to layer 6 but have considered the connectivity further between these two regions on page 27, as well as studying this further in follow up work:
“Stable propagation of evoked activity through mid-range connectivity only emerges in nbS1 at predicted in vivo firing rates
We repeated the previous single whisker deflection evoked activity experiment in the full model, providing a synchronous thalamic input into the forelimb sub-region (S1FL; Figure 8E; Video 8 & 9). Responses in S1FL were remarkably similar to the ones in the seven column subvolume, including the delays and decays of activity (Figure 8F). However, in addition to a localized primary response in S1FL within 350μm of the stimulus, we found several secondary responses at distal locations (Figure 8E; Video 9), which was suggestive of selective propagation of the stimulus-evoked signal to downstream areas efferently connected by mid-range connectivity. The response of the main activated downstream area (visible in Figure 8E) was confined to L6 (Figure 8G). In a follow up study using the model to explore the propagation of activity between cortical regions (Bolaños-Puchet and Reimann, 2024), it is described how the model contains both a feedforward projection pattern, which projects principally to synapses in L1 & L23, and a feedback type pattern, which principally projects to synapses in L1 & L6. On visualizing the innervation profile from the stimulated hexagon to the downstream hexagon we can see that we have stimulated a feedback pathway (Figure S16).”
With referenced Figure S16 on page 85:
We did find in vivo evidence of similar layer-wise and distance dependence of correlations in the somatosensory cortex discussed on page 27 of the Results:
“The distance dependence of correlations followed a similar profile to that observed in a dataset characterizing spontaneous activity in the somatosensory cortex (Reyes-Puerta et al., 2015a) (compare red line in Figure 8I with Figure S16). In the in vivo dataset spiking correlation was also low but highest in lower layers, with short “up-states” in spiking activity constrained to L5 & 6 (see Figure 1E,F in (Reyes-Puerta et al., 2015a)). In the model, they are constrained to L6.”
With Figure S16a on page 85 showing the distance dependence of correlations in the anaesthetized barrel cortex during spontaneous activity (digitization from the reference paper):
(3) Across the figures, activity is averaged across neurons within layers and E or I cell types, with a limited description of single-cell type and single-cell responses. Were there any predictions regarding the responses of particular cell types that significantly differ from others in the same layer? Such predictions could be valuable for future investigations and could showcase the advantages of a data-driven, biophysically detailed model.
We thank the reviewer for this comment. In addition to new analyses at higher granularity addressed in other comments, we have added the following comparison of stimulus-evoked membrane potential dynamics in different subpopulations for the original connectome and SM-connectome in Figure 7 on page 24.
This gave interesting results discussed in a new subsection on page 26:
“EM targeting trends hyperpolarize Sst+ and 5HT3aR+ late response, and disinhibit L5/6 E
Studying somatic membrane potentials for different subpopulations in response to whisker deflections shows that PV+, L23E and L4E subpopulations are largely unaffected in the SM-connectome (Figure 7E). Interestingly, Sst+ and 5HT3aR+ subpopulations show a strong hyperpolarization in the late response that isn’t present in the original connectome. This corresponds with a stronger late response in L5/6 E populations, which could be caused by disinhibition due to the Sst+ and 5HT3aR+ hyperpolarization. This could be explored further in follow-up studies using our connectome manipulator tool (Pokorny et al., 2024).”
(4) 2.4: Are there caveats to assuming the OU process as a model for missing inputs? Inputs to the cortex are usually correlated and low-dimensional (i.e., communication subspace between cortical regions), but the OU process assumes independent conductance injection. Can (weakly) correlated inputs give rise to different activity regimes in the model? Can you add a discussion on this?
We agree with the reviewer that there are caveats to assuming an OU process for the model of missing inputs and have added the following to the Discussion on page 31:
“The calibration framework could optimize per population parameters for other compensation methods, whilst still offering an interpretable spectrum of firing rate regimes at different levels of P<sub>FR</sub>. For example, more realistic compensation schemes could be explored which introduce a) correlations between the inputs received by different neurons and b) compensation distributed across dendrites, as well as at the soma. We predict that such changes would make spontaneous activity more correlated at the lower spontaneous firing rates which supported in vivo like responses (P<sub>FR</sub> : 0.1 − 0.5), which would in turn make stimulus-responses more noise correlated.”
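The contrast between independent and weakly correlated compensation can be illustrated with a minimal sketch. It is not the model's implementation: the function name `simulate_ou_inputs`, the mixing parameter `shared_fraction`, and all numeric values are hypothetical, and correlation is introduced here by the simple assumption that each neuron's noise mixes a private source with one source common to all neurons.

```python
import math
import random


def simulate_ou_inputs(n_neurons, n_steps, dt=0.1, tau=2.0, mu=0.0,
                       sigma=1.0, shared_fraction=0.0, seed=0):
    """Euler-Maruyama simulation of one OU process per neuron.

    shared_fraction (hypothetical parameter) is the fraction of the noise
    variance taken from a source common to all neurons; 0.0 reproduces
    fully independent inputs.
    """
    rng = random.Random(seed)
    # Noise scale chosen so the stationary standard deviation is about sigma.
    scale = sigma * math.sqrt(2.0 * dt / tau)
    w_shared = math.sqrt(shared_fraction)
    w_private = math.sqrt(1.0 - shared_fraction)
    state = [mu] * n_neurons
    traces = [[] for _ in range(n_neurons)]
    for _ in range(n_steps):
        common = rng.gauss(0.0, 1.0)
        for i in range(n_neurons):
            noise = w_shared * common + w_private * rng.gauss(0.0, 1.0)
            state[i] += (mu - state[i]) * dt / tau + scale * noise
            traces[i].append(state[i])
    return traces


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


independent = simulate_ou_inputs(2, 20000, shared_fraction=0.0)
correlated = simulate_ou_inputs(2, 20000, shared_fraction=0.5)
print("independent inputs, correlation:", round(pearson(*independent), 2))
print("shared noise source, correlation:", round(pearson(*correlated), 2))
```

In this toy setting the pairwise input correlation tracks `shared_fraction`, which is one simple way a correlated compensation scheme could be parameterized before asking how it changes the network's activity regimes.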
(5) 2.6: The network structure is well characterized in the companion paper, where the authors report that correlations in higher dimensions were driven by a small number of neurons with high participation ratios. It would be interesting to identify which cell types exhibit high node participation in high-dimensional simplices and examine the spiking activity of cells within these motifs. This could generate testable predictions and inform theoretical cell-type-specific point neuron models for excitatory/inhibitory balanced networks and cortical processing.
We thank the reviewer for this suggestion. We have added two supplementary figures to address this suggestion, which are discussed in the Results on Page 16:
“Additionally, we studied the structural effect on the firing rate (here measured as the inverse of the inter-spike interval, ISI, which can be thought of as a proxy of non-zero firing rate). We found that for the connected circuit, the firing rate increases with simplex dimension; in contrast with the disconnected circuit, where this relationship remains flat (see Figure S6 red vs. blue curves and Methods).
This also demonstrates high variability between neurons, in line with biology, both structurally (Towlson et al., 2013; Nigam et al., 2016) and functionally (Wohrer et al., 2013; Buzsáki and Mizuseki, 2014). We next identified the cell types that are overexpressed in the group of neurons that have the 5% highest values of node participation across dimensions (Figure S7). This could inform theoretical point neuron models with cell-type specificity, for example. We found that while in dimension one (i.e., node degree) this consists mostly of inhibitory cells, in higher dimensions the cell types concentrate in layers 4, 5 and 6, especially for TPC neurons. This is in line with our structural layer-wise findings in Figure 8B in Reimann et al. (2024).”
Which reference new Figures S6 and S7:
With the methodology for S6 described on page 49 of the Methods:
“For any numeric property of neurons, e.g., firing rate, we evaluate the effect of dimension on it by taking weighted averages across dimensions. That is, for each dimension k, we take the weighted average of the property across neurons, where the weights are given by node participation on dimension k. More precisely, let N be the number of neurons and $\vec{V} \in \mathbb{R}^N$ be a vector of a property on all the neurons, e.g., the vector of firing rates. Then in each dimension k we compute
$$\frac{\vec{V} \cdot \vec{Par}_k}{\sum_{i=1}^{N} (\vec{Par}_k)_i},$$
where $\vec{Par}_k$ is the vector of node participation on dimension k for all neurons and $\cdot$ is the dot product.
To measure the over- and underexpression of the different m-types among those with the highest 5% of values of node participation, we used the hypergeometric distribution to determine the expected distribution of m-types in a random sample of the same size. More precisely, for each dimension k and m-type m, let N<sub>total</sub> be the total number of neurons in the circuit, N<sub>m</sub> be the number of neurons of m-type m in the circuit, C<sub>top</sub> be the number of neurons with the highest 5% values of node participation in dimension k, C<sub>m</sub> the number of neurons of m-type m among these, and let P = hypergeom(N<sub>total</sub>, N<sub>m</sub>, C<sub>top</sub>) be the hypergeometric distribution.
By definition, P(x) describes the probability of sampling x neurons of m-type m in a random sample of size C<sub>top</sub>. Therefore, using the cumulative distribution F(x) = P(Counts ≤ x), we can compute the p-values as follows:
Small values indicate under and over representation respectively….”
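The two computations described in the quoted Methods passage can be sketched as follows. This is an illustrative reading, not the authors' code: the function names are hypothetical, and the one-sided tail conventions (P(Counts ≤ C_m) for under-representation and P(Counts ≥ C_m) for over-representation) are an assumption consistent with the cumulative distribution F defined in the quoted passage.

```python
from math import comb


def weighted_average(values, participation):
    """Average of a per-neuron property, weighted by node participation."""
    total_weight = sum(participation)
    if total_weight == 0:
        raise ValueError("no neuron participates in this dimension")
    return sum(v * w for v, w in zip(values, participation)) / total_weight


def hypergeom_pmf(x, n_total, n_m, c_top):
    """Probability of drawing x neurons of the m-type in a sample of size c_top."""
    if x < 0 or x > min(n_m, c_top):
        return 0.0
    return comb(n_m, x) * comb(n_total - n_m, c_top - x) / comb(n_total, c_top)


def hypergeom_cdf(x, n_total, n_m, c_top):
    """F(x) = P(Counts <= x)."""
    return sum(hypergeom_pmf(i, n_total, n_m, c_top) for i in range(x + 1))


def representation_p_values(c_m, n_total, n_m, c_top):
    """Return (p_under, p_over) for observing c_m neurons of the m-type."""
    p_under = hypergeom_cdf(c_m, n_total, n_m, c_top)           # P(Counts <= c_m)
    p_over = 1.0 - hypergeom_cdf(c_m - 1, n_total, n_m, c_top)  # P(Counts >= c_m)
    return p_under, p_over


# Toy numbers (hypothetical): 10 neurons, 5 of the m-type, top group of 2,
# both of which turn out to be of the m-type.
print(representation_p_values(c_m=2, n_total=10, n_m=5, c_top=2))
# Firing rates of two neurons weighted by their node participation.
print(weighted_average([1.0, 3.0], [1, 3]))
```

For circuit-scale counts, a library implementation such as `scipy.stats.hypergeom` would be the more practical choice; the pure-Python version above only keeps the sketch self-contained.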
Minor:
(1) Since the previous model was published in 2015, the neuroscience field has seen significant advancements in single-cell and single-nucleus sequencing, leading to the clustering of transcriptomic cell types in the entire mouse brain. For instance, the Allen Institute has identified ~10 distinct glutamatergic cell types in layer 5, which exceeds the number incorporated into the current model. Could you discuss 1) the relationship between the modeled me-types and these transcriptomic cell types, and 2) how future models will evolve to integrate this new information? If there are gaps in knowledge in order to incorporate some transcriptome cell types into your model, it would be helpful to highlight them so that efforts can be directed toward addressing these areas.
We thank the reviewer for this suggestion, particularly the idea to describe what types of data would be valuable towards improving the model in future. We have added the following to the Discussion on page 33:
“In our previous work (Roussel et al., 2023) we linked mouse inhibitory me-models to transcriptomic types (t-types) in a whole mouse cortex transcriptomic dataset (Gouwens et al., 2019). This can provide a direct correspondence in future large-scale mouse models. As we model only a single electrical type for pyramidal cells there is no one-to-one correspondence between our me-models and the 10 different pyramidal cell types identified there. We are not currently aware of any method which can recreate the electrical features of different types of pyramidal cells using only generic ion channel models. To achieve the firing pattern behavior of more specific electrical types, usually ion channel kinetics are tweaked, and this would violate the compartmentalization of parameters. In future we hope to build morpho-electric-transcriptomic type (met-type) models by selecting gene-specific ion channel models (Ranjan et al., 2019, 2024) based on the met-type’s gene expression. Data specific to different neuron sections (i.e. soma, AIS, apical/basal dendrites) of different met-types, such as gene expression, distribution of ion channels, and voltage recordings under standard single cell protocols would be particularly useful.”
(2) For the optogenetic manipulation, it would be interesting if the model could reproduce the paradoxical effects (for example, Mahrach et al. reported paradoxical effects caused by PV manipulation in S1; https://pubmed.ncbi.nlm.nih.gov/31951197/). This seems a more relevant and non-trivial network phenomenon than the V1 manipulation the authors attempted to replicate.
We thank the reviewer for this valuable idea. Indeed, our model is able to reproduce paradoxical effects under certain conditions. We added the following new supplementary Figure S12 demonstrating this finding (black arrows).
Which we discuss in the Results on page 22:
“However, at high contrasts, we observed a paradoxical effect of the optogenetic stimulation on L6 PV+ neurons, reducing their activity with increasing stimulation strength (Figure S12B; cf. Mahrach et al. (2020)). This effect did not occur under grey screen conditions (i.e., at contrast 0.0) with a constant background firing rate of 0.2 Hz or 5 Hz respectively (not shown). The individual…”
and added to the Discussion on page 32:
“Also, we predicted a paradoxical effect of optogenetic stimulation on L6 PV+ interneurons, namely a decrease in firing with increased stimulus strength. This is reminiscent of the paradoxical responses found by Mahrach et al. (2020) in the mouse anterior lateral motor cortex (in L5, but not in L2/3) and barrel cortex (no layer distinction) respectively. While Mahrach et al. (2020) conducted their recordings in awake mice not engaged in any behavior, we observed this effect only when drifting grating patterns with high contrast were presented. Nevertheless, consistent with their findings, we found the effect only in deep but not in superficial layers, and only for PV+ interneurons but not for PCs. Our model could therefore be used to improve the understanding of this paradoxical effect in follow up studies. These examples demonstrate that the approach of modeling entire brain regions can be used to further probe the topics of the original articles and cortical processing.”
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
My specific comments are in the Public Review. The summarizing point is that this is a sprawling paper, and it is easy for readers to get confused. Focusing on specific connections between known functional properties and findings in this model, especially for the full-scale model, will be helpful.
We thank the reviewer for this comment and for their related recommendation (4) below, and have added subheadings throughout the Results.
Reviewer #2 (Recommendations for the authors):
(1) P4. What are the 10 free parameters?
We thank the reviewer for pointing out that it would be useful to summarize the 10 parameters at this stage of the text, and have adjusted the sentence to:
“As a result, the emerging in-vivo like activity is the consequence of only 10 free parameters representing the strength of extrinsic input from other brain regions into 9 layer-specific excitatory and inhibitory populations, and a parameter controlling the noise structure of this extrinsic input.”
(2) Table 1 and S1 are extremely useful. Could you provide a table summarizing the major assumptions or gaps in the model, their potential influence on the results, and possible ways to collect data that could support or challenge these assumptions? Currently, this information is scattered throughout the manuscript.
We thank the reviewer for this very useful suggestion and have added Table S8 on page 68:
(3) Figure 4F is important, but the legend is unclear. What is the unit on the x-axis? The values seem too large to represent per-neuron measurements.
Thank you to the reviewer for raising this. Indeed, the values are estimated mean numbers of missing synapses per neuron, by population. Such numbers are difficult to estimate, but we have further discussed our rationale, justification and consideration of whether these numbers are accurate in the Results, as follows:
“Heterogeneity in synaptic density within and across neuron classes and sections makes estimating the number of missing synapses challenging (DeFelipe and Fariñas, 1992). Changing the assumed synaptic density value of 1.1 synapses/μm would only change the slope of the relationship, however. Estimates of mean number of existing and missing synapses per population were within reasonable ranges; even the larger estimate for L5 E (due to higher dendritic length; Figure S3) was within biological estimates of 13,000 ± 3,500 total afferent synapses (DeFelipe and Fariñas, 1992).”
This text references the new supplementary Figure S3:
Moreover, these numbers represent the number of synapses, rather than the number of connections. The number of connections is usually used for quantifications such as indegree, and is usually much lower.
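The role of the assumed synaptic density can be made concrete with a short arithmetic sketch. Only the density of 1.1 synapses/μm comes from the quoted text; the function name, the dendritic length, and the modeled synapse count are hypothetical, and the formula (expected total minus synapses already present in the model) is an assumed reading of how a missing-synapse estimate could be formed.

```python
SYNAPSES_PER_UM = 1.1  # assumed density quoted in the response


def estimate_missing_synapses(dendritic_length_um, modeled_synapses,
                              density=SYNAPSES_PER_UM):
    """Expected synapses from dendritic length, minus those already modeled."""
    expected_total = density * dendritic_length_um
    return expected_total - modeled_synapses


# Hypothetical neuron: 10,000 um of dendrite, 4,000 synapses in the model.
print(estimate_missing_synapses(10_000, 4_000))
# A different density rescales the expected total linearly with length,
# which is why only the slope of the relationship changes.
print(estimate_missing_synapses(10_000, 4_000, density=1.5))
```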
We have also updated the caption and axis labels of the original figure:
(4) Including additional subsections or improving the indexing in the Results section could be beneficial. In its current format, it's difficult to distinguish where the model description ends and where the validation begins. Some readers may want to focus more on the validation than other parts, so clearer segmentation would improve readability.
We have addressed this comment together with the recommendation of Reviewer #1 above, by adding subheadings throughout the Results.
(5) P4. 2nd paragraph. Original vs rewired connectome. The term "rewired connectome" may give the impression that it refers to an artificial manipulation rather than a modification based on the latest data. It might be helpful to use a different term (e.g., SM-connectome as described later in the paper?).
We have adjusted the text in the introduction:
“Additionally, we generated a new connectome which captured recently characterized spatially-specific targeting rules for different inhibitory neuron types (Schneider-Mizell et al., 2023) in the MICrONS electron microscopy dataset (MICrONS-Consortium et al., 2021), such as increased perisomatic targeting by PV+ neurons, and increased targeting of inhibitory populations by VIP+ neurons. Comparing activity to the original connectome gave predictions about the role of these additional targeting rules.”
(6) Figures 7 B, C, D: what is v1/v2? Original vs SM-Connectome?
We thank the reviewer for noticing this and have corrected the figure to use “Orig” and “SM” consistent with the rest of the figure.
(7) Page 23, 2.10: what is phi?
We thank the reviewer for noticing this inconsistency with the earlier text, and have updated the text to read: “Particularly, we ran simulations for P<sub>FR</sub> ∈ [0.1, 0.15, ..., 0.3] using the OU parameters calibrated for the seven column subvolume for [Ca<sup>2+</sup>] = 1.05 mM and R<sub>OU</sub> = 0.4.”
-
-
www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
Giménez-Orenga et al. investigate the origin and pathophysiology of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) and fibromyalgia (FM). Using RNA microarrays, the authors compare the expression profiles and evaluate the biomarker potential of human endogenous retroviruses (HERV) in these two conditions. Altogether, the authors show that HERV expression is distinct between ME/CFS and FM patients, and HERV dysregulation is associated with higher symptom intensity in ME/CFS. HERV expression in ME/CFS patients is associated with impaired immune function and higher estimated levels of plasma cells and resting CD4 memory T cells. This work provides interesting insights into the pathophysiology of ME/CFS and FM, creating opportunities for several follow-up studies.
Strengths:
(1) Overall, the data is convincing and supports the authors' claims. The manuscript is clear and easy to understand, and the methods are generally well-detailed. It was quite enjoyable to read.
(2) The authors combined several unbiased approaches to analyse HERV expression in ME/CFS and FM. The tools, thresholds, and statistical models used all seem appropriate to answer their biological questions.
(3) The authors propose an interesting alternative to diagnosing these two conditions. Transcriptomic analysis of blood samples using an RNA microarray could allow a minimally invasive and reproducible way of diagnosing ME/CFS and FM.
Weaknesses:
(1) The cohort analysed in this study was phenotyped by a single clinician. As ME/CFS and FM are diagnosed based on unspecific symptoms and are frequently misdiagnosed, this raises the question of whether the results can be generalised to external cohorts.
Thank you for your comment. Studies of larger cohorts will certainly be needed to determine the external validity of these results in a clinical scenario. However, this pilot study, the first of its kind, was designed to maximize homogeneity across participants, which we sought to ensure primarily by including only female participants, all diagnosed by a single experienced observer.
(2) The analyses performed to unravel the causes and effects of HERV expression in ME/CFS and FM are solely based on sequencing data. Experimental approaches could be used to validate some of the transcriptomic observations.
Certainly, experimental approaches may add robustness to our findings. We are in fact considering this avenue to explore the observations presented here in greater depth. However, the limited knowledge of HERV-mediated physiological functions may hinder the task of revealing causes and effects of HERV expression in ME/CFS and FM in the short term.
Reviewer #2 (Public review):
Summary:
Giménez-Orenga et al. carried out this study to assess whether human endogenous retroviruses (HERVs) could be used to improve the diagnosis of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) and Fibromyalgia (FM). To this end, they used the HERV-V3 array developed previously, to characterize the genome-wide changes in the expression of HERVs in patients suffering from ME/CFS, FM, or both, compared to controls. In turn, they present a useful repertoire of HERVs that might characterize ME/CFS and FM. For the most part, the paper is written in a manner that allows a natural understanding of the workflow and analyses carried out, making it compelling. The figures and additional tables present solid support for the findings. However, some statements made by the authors seem incomplete and would benefit from a more thorough literature review. Overall, this work will be of interest to the medical community seeking a better understanding of the co-occurrence of these pathologies, hinting at a novel angle by integrating HERVs, which are often overlooked, into their assessment.
Strengths:
(1) The work is well-presented, allowing the reader to understand the overall workflow and how the specific aims contribute to filling the knowledge gap in the field.
(2) The analyses carried out to understand the potential impact on gene expression mediated by HERVs are in line with previous works, making it solid and robust in the context of this study.
Weaknesses:
(1) The authors claim to obtain genome-wide HERV expression profiles. However, the array used was developed using hg19, while the genomic analyses in this work are carried out using a liftover to hg38. It would improve the statement and findings to include a comparison of the differences in HERVs available in hg38, and how this could impact the "genome-wide" findings.
This is an important point. However, the number of probes excluded from our analysis for lack of correspondence with hg38 was low, fewer than 100 among the 1,290,800 probesets, and we interpreted it as insignificant for "genome-wide" claims. We will detail this point in the revised version of the manuscript.
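As a quick check of the scale involved, the excluded fraction can be bounded from the two figures quoted above. This is only an arithmetic illustration: the variable names are hypothetical, and the upper bound of 100 is used as a stand-in because the exact count is not given.

```python
TOTAL_PROBESETS = 1_290_800  # array size quoted in the response
MAX_EXCLUDED = 100           # quoted upper bound; the exact count is not stated

excluded_fraction = MAX_EXCLUDED / TOTAL_PROBESETS
print(f"at most {excluded_fraction:.4%} of probesets excluded")
```

Under these figures the excluded share stays below 0.01% of the array.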
(2) The authors in some points are not thorough with the cited literature. Two examples are:
a) Lines 396-397 the authors say "the MLT1, usually found enriched near DE genes (Bogdan et al., 2020)". I checked the work by Bogdan, and they studied bacterial infection. A single work in a specific topic is not sufficient to support the statement that MLT1 is "usually" in close vicinity to differentially expressed genes. More works are needed to support this.
b) After the previous statement, the authors go on to mention "contributing to the coding of conserved lncRNAs (Ramsay et al., 2017)". First, lnc = long non-coding, so this doesn't make sense. Second, in the work by Ramsay they mention "that contributed a significant amount of sequence to primate lncRNAs whose expression was conserved", which is different from what the authors in this study are trying to convey. Again, additional work and a rephrasing might help to support this idea.
Certainly, these two sentences need rephrasing to better adjust statements to current evidence and will be replaced in the revised version of this manuscript.
(3) When presenting the clusters, the authors overlook the fact that cluster 4 is clearly control-specific, and fail to discuss what this means. Could this subset of HERV be used as bona fide markers of healthy individuals in the context of these diseases? Are they associated with DE genes? What could be the impact of such associations?
Using control DE HERVs as bona fide markers of healthy individuals seems like an interesting possibility worth exploring. Control DE HERVs (cluster 4) are indeed associated with DE genes involved in apoptosis, T cell activation and cell-cell adhesion (modules 1 and 6) (Figure 3A). The impact of these associations deserves further study.
Appraisals on aims:
The authors set specific questions and presented the results to successfully answer them. The evidence is solid, with some weaknesses discussed above that will methodologically strengthen the work.
Likely impact of work on the field:
This work will be of interest to the medical community looking for novel ways to improve clinical diagnosis. Although future works with a greater population size, and more robust techniques such as RNA-Seq, are needed, this is the first step in presenting a novel way to distinguish these pathologies.
It would be of great benefit to the community to provide a table/spreadsheet indicating the specific genomic locations of the HERVs specific to each condition. This will allow proper provenance for future researchers interested in expanding on this knowledge, as these genomic coordinates will be independent of the technique used (as was the array used here).
We agree with the reviewer that sharing genomic locations of DE HERVs in these pathologies would contribute to further development of our findings. Unfortunately, we do not hold the rights to share probe coordinates from this custom HERV-V3 microarray, which we used under an MTA with its developer.
Reviewer #3 (Public review):
The authors find that HERV expression patterns can be used as new criteria for differential diagnosis of FM and ME/CFS and patient subtyping. The data are based on transcriptome analysis by microarray for HERVs using patient blood samples, followed by differential expression of ERVs and bioinformatic analyses. This is a standard and solid data processing pipeline, and the results are well presented and support the authors' claim.
-
-
www.medrxiv.org
-
Author response:
Thank you to the reviewers and editors for their positive and constructive comments. Based on this feedback, we can see that we need to clarify that the primary goal of this paper is a test of potential changes in public health policy rather than a test of technical improvements to forecasting models. We briefly summarize the primary goal below to address these public reviews and list our proposed revisions to the manuscript based on reviewer feedback.
All real-time forecasting models contend with 2 major constraints:
(1) How far into the future they have to predict
(2) How rapidly the data used for predictions become available in real time
In the case of evolutionary influenza forecasts, the current values of these constraints are 1) 12 months into the future and 2) an average lag of ~3 months for hemagglutinin (HA) sequences to become available after sample collection. Regardless of the predictors we use in these models (genetic or phenotypic), our units of prediction always depend on HA: the HA protein is the primary target of our immunity, HA is the only gene whose composition is determined by the vaccine selection process, and influenza diversity is historically defined by clades in HA phylogenies.
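The two constraints can be sketched in a few lines. This is a deliberately simplified illustration, not the forecasting workflow itself: the function names and dates are hypothetical, the submission lag is treated as a fixed number of days rather than a distribution around an average, and a month is approximated as 30 days.

```python
from datetime import date, timedelta


def sequences_available(collection_dates, forecast_date, lag_days):
    """Samples whose HA sequences would be public at forecast time,
    assuming every sample is submitted exactly lag_days after collection."""
    cutoff = forecast_date - timedelta(days=lag_days)
    return [d for d in collection_dates if d <= cutoff]


def forecast_target(forecast_date, horizon_months):
    """Date of the future population being predicted (30-day months)."""
    return forecast_date + timedelta(days=30 * horizon_months)


# Hypothetical monthly samples collected from April to October 2019.
samples = [date(2019, month, 1) for month in range(4, 11)]
today = date(2019, 10, 1)

for lag in (90, 30):
    usable = sequences_available(samples, today, lag)
    print(f"lag {lag} days: {len(usable)} of {len(samples)} samples usable")

for horizon in (12, 6):
    print(f"horizon {horizon} months: predicting {forecast_target(today, horizon)}")
```

A shorter lag moves the most recent usable sample closer to the forecast date, while a shorter horizon moves the predicted population closer to it; the study compares the effect sizes of these two changes.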
Our primary goal of this study was to understand the relative effect sizes of these two common constraints on forecasting while holding all other variables as constant as possible. With this understanding, we hoped to better inform public health priorities and set realistic expectations for current and future forecasting efforts regardless of the technical specifications of each forecasting model. In other words, the goal of this study was not to optimize prediction methods but to estimate the effects of potential policy changes on forecast accuracy.
We found that reducing how far into the future we need to predict consistently reduced our forecasting error in simulated populations (where we knew the true fitness of each virus) and in natural populations (where we either estimated fitness from genetic predictors or we knew the true fitness of each virus based on its future success). Figure 6 and its first supplemental figure show these effect sizes for natural and simulated populations, respectively, when the future fitness of each virus is known at the time of prediction. By definition, we cannot hope to improve our estimates of viral fitness for these forecasts by using other genetic or phenotypic information.
Figure 6 shows that reducing how far into the future we need to predict from 12 to 6 months improves our forecasting accuracy 3 times as much as reducing the lag between sample collection and HA sequence submission to public databases. The impact of this finding is the confirmation that a faster vaccine development process would improve our forecast accuracy substantially more than faster turnaround between sample collection and sequence submission. If our public health goal is to make better predictions of future influenza populations, then this result indicates that our main priority is to speed up the vaccine development process.
If our public health goal is to better understand the composition of currently circulating influenza populations (the units of our forecasts), then Figure 3 shows that reducing the lag between sample collection and HA sequence submission from ~3 months on average to 1 month on average reduces our uncertainty in current clade frequency estimates by half. This impact is also independent of the predictors we use in our forecasting models and is not lessened by the lack of other genetic or phenotypic information in our analyses.
We realize that neither a 6-month vaccine development process nor a 1-month average sequence submission lag exists yet, but we believe that these are realistic and achievable goals for scientific and public health communities. We also realize that these public health goals are not mutually exclusive. By measuring the effects of these realistic changes to current policy through our forecasting experiments, we hope to inspire and motivate researchers and decision-makers who are empowered to make both of these goals a reality.
Finally, we want to emphasize that the use of phenotypic data in forecasts introduces additional delays caused by the lag between when genetic sequences become available and when serological experiments can be performed. Most WHO influenza collaborating centers use a "sequence-first" approach where they characterize the genetic sequence and use available sequences to prioritize phenotypic experiments with serology. This additional lag in availability of phenotypic data means that a forecasting model based on genetic and phenotypic data will necessarily have a greater lag in data availability than a model based on genetic data only. This lag is important for practical forecasts, too, but because the lag reflects specific characteristics of each collaborating center and not a global policy change, we believe this topic falls outside of the scope of this study.
Based on these public reviews and the private recommendations from reviewers, we plan to make the following revisions to this manuscript.
● Clarify the introduction, discussion, and abstract to emphasize the primary goal for this study to test effects of realistic changes to public health policy and note that this study does not cover improvements to forecasting models. As part of these changes, we will include a rationale for our choice of a genetic-information-only approach rather than a model that integrates phenotypic data. We will also refine Figure 1 to more clearly communicate the two factors we tested in this study.
● Provide a clearer explanation for the subsampling approach we use, include supplemental materials to communicate the geographic and temporal biases that exist in available HA sequence data, and discuss potential effects of different subsampling strategies.
● Evaluate the robustness of our results to different randomly subsampled data. We will perform additional technical replicates of our analysis workflow for natural populations, and summarize the effects of realistic interventions across replicates in a supplemental figure and the main text of the results.
● Investigate time-dependent effects of forecast horizons and submission lags on model accuracy to identify any potential biases in accuracy during specific historical epochs or any seasonal trends in accuracy associated with predicting future populations for the Northern or Southern Hemispheres.
● In the discussion, clarify how reducing submission lags would practically improve the WHO's ability to select vaccine candidate viruses and minimize jargon that currently makes the discussion less accessible to the average reader.
● Investigate how changes in forecast horizons and submission lags change the distance between predicted and observed future populations at antigenic positions (i.e., "epitope" positions) to understand whether we see the same effects with that subset of positions as we see across all HA positions.
-
-
www.biorxiv.org
-
Author Response:
We greatly appreciate the feedback provided by reviewers on this manuscript. One of our key objectives was to provide a comprehensive, detailed resource for researchers using single-cell transcriptomics to study arthritis, especially immune cells like macrophages. We strived to perform thorough, wide-ranging analyses that are both accessible and useful to other scientists in the field, and that we hope will serve as the basis for many future avenues of study. As such, we acknowledge that this work is a “first step”, providing a strong descriptive foundation with some mechanistic insight that we and others will continue pursuing. Preliminary studies in our laboratory seeking to dissect signaling mechanisms associated with the M-CSF pathway have illuminated how complex and context-dependent this signaling is, which is an important consideration for future in vivo investigations. Further, it is indeed true that attempting to harmonize transcriptomic data across studies, models, laboratories, and dissection/processing methods is fraught with difficulty and prone to misinterpretation – and we made an effort to highlight this in our manuscript, particularly with respect to where synovial immune cells were recovered from, and how. We encourage healthy discussion within the field for developing shared, unified protocols for harvests and processing upstream of transcriptomic experiments.
-
-
www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review)
Summary:
The authors wanted to use AlphaFold-multimer (AFm) predictions to reduce the challenge of physics-based protein-protein docking.
Strengths:
They found that two features of AFm predictions are very useful. 1) pLDDT is predictive of flexible residues, which they could target for conformational sampling during docking; 2) the interface-pLDDT score is predictive of the quality of AFm predictions, which allows the authors to decide whether to do local or global docking.
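The two uses of pLDDT summarized above can be sketched as a simple decision rule. This is an illustrative sketch only: the function names, the way interface residues are supplied, and both cutoff values are hypothetical placeholders rather than the authors' settings, and the direction of the rule (high interface confidence leads to local docking) is an assumption.

```python
def interface_plddt(plddt, interface_residues):
    """Mean pLDDT over the residues assumed to form the interface."""
    if not interface_residues:
        raise ValueError("no interface residues given")
    return sum(plddt[i] for i in interface_residues) / len(interface_residues)


def choose_docking_mode(plddt, interface_residues, threshold=85.0):
    """Local refinement if the interface looks confident, else global docking.

    threshold is a placeholder value, not taken from the manuscript.
    """
    score = interface_plddt(plddt, interface_residues)
    return "local" if score >= threshold else "global"


def flexible_residues(plddt, cutoff=80.0):
    """Indices of low-confidence residues to target for conformational sampling.

    cutoff is a placeholder value, not taken from the manuscript.
    """
    return [i for i, score in enumerate(plddt) if score < cutoff]


# Hypothetical per-residue pLDDT values for a six-residue toy chain.
scores = [90.0, 95.0, 60.0, 88.0, 92.0, 70.0]
print(choose_docking_mode(scores, interface_residues=[0, 1, 3]))
print(choose_docking_mode(scores, interface_residues=[2, 5]))
print(flexible_residues(scores))
```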
Weaknesses:
(1) As admitted by the authors, the AFm predictions for the main dataset are undoubtedly biased because these structures were used for AFm training. Could the authors find a way to assess the extent of this bias?
Indeed, AFm training included most of the structures in the DB5 benchmark, as many structures (either unbound or bound) were deposited before the training cut-off date. One challenge in estimating this bias is the limited availability of new structures, both bound and unbound, deposited after the training cut-off. Estimating the extent of training bias is therefore conditional on these factors and difficult. A few studies have attempted to address this bias (Yin et al, 2022, https://doi.org/10.1002/pro.4379).
In our study, we assess this bias by comparing the AFm structures to the bound and unbound forms and calculating their Cα RMSDs and TM-scores (new addition). We now elaborate on this in the Results: Dataset curation section, and we have added a figure comparing the TM-scores in the supplement.
We added clarifying text and a note about the TM-score calculation in the manuscript as follows:
“Since most of the benchmark targets in DB5.5 were included in AlphaFold training, there would be training bias associated with their predictions (i.e. our measured success rates are an upper bound).”
“We also calculated the TM-scores of the AFm predicted complex structures with respect to the bound and the unbound crystal structures (Supplementary Figure S2). As TM-scores reflect a global comparison between structures and are less sensitive to local structural deviations, no strong conclusions could be derived. This is in agreement with our intuition that since both unbound and bound states of proteins will share a similar fold, and AlphaFold can predict structures with high TM-scores in most cases, gauging the conformational deviations with TM-scores would be inconclusive.”
(2) For the CASP15 targets where this bias is absent, the presentation was very brief. In particular, it would be interesting to see how AFm helped with the docking. The authors may even want to do a direct comparison with docking results without the help of AFm.
Unfortunately, since this was a CASP-CAPRI round, the structures of the unbound antigen and the nanobodies were unavailable. Thus, we cannot perform a comparison without using AF2 at all, since we need a structure prediction tool to produce the unbound nanobody and the nanobody-antigen complex template structure to dock. We have clarified this in the main text for readers.
“Since the nanobody-antigen complexes were CASP targets, we did not have unbound structures, rather only the sequences of individual chains. Therefore, for each target, we employed the AlphaRED strategy as described in Fig 7.”
Reviewer #1 (Recommendations For The Authors):
For suggestions for major improvements, see comments under weaknesses. One additional suggestion: the authors found that pLDDT is predictive of flexible residues. Can they try to find AFm features that are predictive of the interface site? Such information may guide their docking to a local site.
This is a great idea that we and others have been thinking about considerably. Prior work by Burke et al. (Towards a structurally resolved human protein interaction network) examines AlphaFold’s ability to predict PPIs. For high-confidence predicted models of interacting protein complexes, the authors showed that pDockQ correlated reasonably well with correct protein interactions.
That being said, binding site identification, particularly in a partner-agnostic fashion, i.e. determining binding patches on a given protein, is an area of ongoing research. We hope a future study examines AlphaFold3 or ESM3 specifically for this task.
“Further, we tested multiple thresholds to estimate the optimum cut-off for distinguishing near-native structures (defined as an interface-RMSD < 4 Å) from the predictions. Figure 3.B summarizes the performance with a confusion matrix for the chosen interface-pLDDT cutoff of 85. 79 % of the targets are classified accurately with a precision of 75%, thereby validating the utility of interface-pLDDT as a discriminating metric to rank the docking quality of the AFm complex structure predictions. With AlphaFold3 and ESM3 being released, investigating features that could predict flexible residues or interface site would be valuable, as this information may guide local docking.”
Minor:
Page 3, lines 73-77, state how many targets were curated from DB5.5.
We have now clarified this in the manuscript. All 254 targets were curated from DB5.5 at the time of this benchmark study.
“For each protein target, we extracted the amino acid sequences from the bound structure and predicted a corresponding three-dimensional complex structure with the ColabFold implementation of the AlphaFold multimer v2.3.0 (released in March 2023) for the 254 benchmark targets from DB5.5.”
In Figure 1, the color used for medium is too difficult to distinguish from the grey color used for rigid.
We thank you for this suggestion. We have updated the color to olive. Further, based on Reviewer 2’s suggestions, we have moved this plot to the Supplementary.
Reviewer #2 (Public Review):
Summary:
In short, this paper uses a previously published method, ReplicaDock, to improve predictions from AlphaFold-multimer. The method generated about 25% more acceptable predictions than AFm, but more important is improving an Antibody-antigen set, where more than 50% of the models become improved.
When looking at the results in more detail, it is clear that for the models where the AFm models are good, the improvement is modest (or not at all). See, for instance, the blue dots in Figure 6. However, in the cases where AFm fails, the improvement is substantial (red dots in Figure 6), but no models reach a very high accuracy (Fnat ~0.5 compared to 0.8 for the good AFm models). So the paper could be summarized by claiming, "We apply ReplicaDock when AFm fails", instead of trying to sell the paper as an utterly novel pipeline. I must also say that I am surprised by the excellent performance of ReplicaDock - it seems to be a significant step ahead of other (not AlphaFold) docking methods, and from reading the original paper, that was unclear. Having a better benchmark of it alone (without AFm) would be very interesting.
We thank the reviewer for highlighting the performance of ReplicaDock. ReplicaDock alone is benchmarked in the original paper (10.1371/journal.pcbi.1010124), with full details on the 2022 version of DB5.5 in the supplement. Indeed, ReplicaDock2 achieves the highest success rates on flexible docking targets reported in the literature (until this AlphaRED paper!).
Regarding the statement that “the paper could be summarized…”, it might be helpful to give more context. ReplicaDock is a replica exchange Monte Carlo sampling approach for protein docking that incorporates flexibility in an induced-fit fashion. However, the choice of which backbone residues to move is solely dependent on contacts made during each docking trajectory. In the last section of the ReplicaDock paper, we introduced “Directed Induced-fit”, where we biased the backbone sampling only towards those residues where we knew the backbone is flexible (this information was available because, for the benchmark set, we had both unbound and bound structures and hence could cherry-pick the specific residues that are mobile). We agree with the reviewer that AlphaRED is essentially a derivative of ReplicaDock. However, the two major claims that we make in this paper are:
(1) AlphaFold pLDDT is an effective predictor of backbone flexibility for practical use in docking.
(2) We can automate the Directed Induced-fit approach within ReplicaDock by utilizing this pLDDT information per residue for conformational sampling in protein docking; and in doing so, create a pipeline that would allow us to go from sequence-to-structure-to-complex, specifically capturing conformational changes.
To frame these claims, we pose the following questions in the Introduction:
“(1) Do the residue-specific estimates from AF/AFm relate to potential metrics demonstrating conformational flexibility?
(2) Can AF/AFm metrics deduce information about docking accuracy?
(3) Can we create a docking pipeline for in-silico complex structure prediction incorporating AFm to convert sequence-to-structure-to-docked complexes?”
This work requires a pipeline, the center of which lies in ReplicaDock as a docking method, but has functionalities that were absent in prior work. The goal is also to develop a one-stop shop without manual intervention (a prerequisite for biasing backbone sampling in ReplicaDock) that could be utilized by structural biologists efficiently.
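To make the idea of automating the residue selection concrete, the following is a minimal sketch, not the code used in the manuscript. It assumes per-residue pLDDT values are available as a plain list, and it treats the flexibility cut-off (`flex_cutoff`) as a hypothetical user-supplied parameter, since no per-residue threshold is stated in this response. The function names are illustrative only.

```python
from typing import List, Sequence, Tuple


def flexible_residues(per_residue_plddt: Sequence[float], flex_cutoff: float) -> List[int]:
    """Return 1-based indices of residues whose pLDDT is below flex_cutoff.

    flex_cutoff is a hypothetical parameter; choose it for your own data.
    """
    return [i for i, p in enumerate(per_residue_plddt, start=1) if p < flex_cutoff]


def contiguous_segments(indices: Sequence[int]) -> List[Tuple[int, int]]:
    """Group residue indices into inclusive (start, end) runs of consecutive residues."""
    segments: List[Tuple[int, int]] = []
    for i in sorted(indices):
        if segments and i == segments[-1][1] + 1:
            # Extend the current run by one residue.
            segments[-1] = (segments[-1][0], i)
        else:
            # Start a new run.
            segments.append((i, i))
    return segments


if __name__ == "__main__":
    plddt = [95.0, 60.0, 55.0, 88.0, 72.0]  # toy values, not real data
    mobile = flexible_residues(plddt, flex_cutoff=80.0)
    print(mobile, contiguous_segments(mobile))
```

Grouping low-confidence residues into segments is one possible way to hand them to a backbone sampler; whether segments or individual residues are the right unit depends on the docking protocol.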
We clarify these points in the abstract and main text as follows:
Abstract: “In this work, we combine AlphaFold as a structural template generator with a physics-based replica exchange docking algorithm to better sample conformational changes.”
Introduction:
“The overarching goal is to create a one-stop, fully-automated pipeline for simple, reproducible, and accurate modeling of protein complexes. We investigate the aforementioned questions and create a protocol to resolve AFm failures and capture binding-induced conformational changes. We first assess the utility of AFm confidence metrics to detect conformational flexibility and binding site confidence.”
These results also highlight several questions I try to describe in the weakness section below. In short, they boil down to the fact that the authors must show how good/bad ReplicaDock is at all targets (not only the ones where AFm fails). In addition, I have several more technical comments.
Strengths:
Impressive increase in performance on AB-AG set (although a small set and no proteins).
We thank the reviewer for their comments.
Weaknesses:
The presentation is a bit hard to follow. The authors mix several measures (Fnat, iRMS, RMSDbound, etc). In addition, it is not always clear what is shown. For instance, in Figure 1, is the RMSD calculated for a single chain or the entire protein? I would suggest that the author replace all these measures with two: TM-score when evaluating the quality of a single chain and DockQ when evaluating the results for docking. This would provide a clearer picture of the performance. This applies to most figures and tables.
We apologize for the lack of clarity owing to different metrics. Irms and fnat are standard performance metrics in the docking field, but we agree that DockQ would be simpler when the details of the other metrics are not required. We have updated Figure 5 and Figure 8 to also show DockQ comparisons.
Regarding Figure 1, as highlighted in Line 90 of the main text, “Figure 1 shows the Cα-RMSD of all protein partners of the AFm predicted complex structures with respect to the bound and the unbound.” As suggested by the reviewer in their further comments, we have moved this figure to the Supplementary. We have also included a TM-score comparison in the Supplementary (Supplementary Figure S2) and included clarifying statements in the main text:
“We also tested TM-scores to measure the structural deviations of the AFm predicted complex structures with respect to the bound and unbound structures (Supplementary Figure S2). However, this metric is not sensitive enough to detect the subtle, local conformational changes upon binding.”
For instance, Figure 9 could be shown as a distribution of DockQ scores.
We have now updated Figure 5 to include DockQ scores in Panel D. Since DockQ is a function of iRMSD, fnat and L-RMSD, it shows cumulative improvement in performance. However, some nuanced details are not visible in DockQ alone, such as cases where the protocol improves iRMSD considerably but the fnat improvement is lacking; these details can highlight whether the challenge is backbone sampling or side-chain refinement. Therefore, we need to retain the iRMSD and fnat metrics in panels A-C. We have incorporated the DockQ results in the main text as follows:
“Finally, to evaluate docking success rates, we calculate DockQ for top predictions from AFm and AlphaRED respectively (Figure 5D). AlphaRED demonstrates a success rate (DockQ>0.23) for 63% of the benchmark targets. Particularly for Ab-Ag complexes, AFm predicted acceptable or better quality docked structures in only 20% of the 67 targets. In contrast, the AlphaRED pipeline succeeds in 43% of the targets, a significant improvement.”
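For readers less familiar with DockQ, the sketch below shows how the score combines fnat, iRMSD and L-RMSD, and how a success rate at the DockQ > 0.23 threshold quoted above could be computed. It is an illustration rather than the analysis code behind Figure 5D: the scaling constants (1.5 Å for iRMSD, 8.5 Å for L-RMSD) are taken from the published DockQ definition and are not stated in this response, and the function names and toy inputs are assumptions.

```python
from typing import Iterable


def dockq(fnat: float, irmsd: float, lrmsd: float) -> float:
    """DockQ as the mean of fnat and two scaled RMSD terms.

    The scaling distances 1.5 (iRMSD) and 8.5 (L-RMSD) follow the published
    DockQ definition; they are not given in the text above.
    """
    def scaled(rmsd: float, d: float) -> float:
        return 1.0 / (1.0 + (rmsd / d) ** 2)

    return (fnat + scaled(irmsd, 1.5) + scaled(lrmsd, 8.5)) / 3.0


def success_rate(scores: Iterable[float], threshold: float = 0.23) -> float:
    """Fraction of targets whose DockQ exceeds the acceptable-quality threshold."""
    scores = list(scores)
    if not scores:
        raise ValueError("no scores given")
    return sum(s > threshold for s in scores) / len(scores)


if __name__ == "__main__":
    toy_scores = [0.10, 0.30, 0.50, 0.20]  # toy values, not real data
    print(dockq(fnat=0.5, irmsd=2.0, lrmsd=6.0), success_rate(toy_scores))
```

Because DockQ averages the three terms, a model can cross the 0.23 threshold through an iRMSD gain alone, which is one reason the separate iRMSD and fnat panels remain informative.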
Further, we have reevaluated success rates in Figure 8 (previously Figure 9) and have updated the manuscript to report these updated success rates.
“By utilizing the AlphaRED strategy, we show that failure cases in AFm predicted models are improved for all targets (lower Irms for 97 of 254 failed targets) with CAPRI acceptable-quality or better models generated for 62% of targets overall (Fig 8)”.
The improvements on the models where AFm is good are minimal (if at all), and it is unclear how global docking would perform on these targets, nor exactly why the plDDT<0.85 cutoff was chosen.
We agree with the reviewers that the improvement on the models with good AFm predictions is minimal. We acknowledge this in the text now as follows:
“Most of the improvements in the success rates are for cases where AFm predictions are worse. For targets with good AFm predictions, AlphaRED refinement results in minimal improvements in docking accuracy.”
The choice of pLDDT cutoff = 85 is elaborated in the “Interface-pLDDT correlates with DockQ and discriminates poorly docked structures” section, paragraph 3. Briefly, we tested multiple metrics and the interface pLDDT had the highest AUC, indicating that it is the best metric for this task. For interface-pLDDT we tested multiple thresholds, and the cutoff of 85 resulted in the highest percentage of true-positive and true-negative rates. This is illustrated with the confusion matrix in Figure 3.B with the precision scores. We now clarify this in the text as follows:
“With interface-pLDDT as a discriminating metric, we tested multiple thresholds to estimate the optimum cut-off for distinguishing near-native structures (defined as an interface-RMSD < 4 Å) from the predictions. Figure 3B summarizes the performance with a confusion matrix for the chosen interface-pLDDT cutoff of 85. 79% of the targets are classified accurately with a precision of 75%, thereby validating the utility of interface-pLDDT as a discriminating metric to rank the docking quality of the AFm complex structure predictions.”
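The threshold scan described above can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' script: it assumes a prediction is called near-native when interface-pLDDT is at or above the cutoff (the direction of the inequality at exactly the cutoff is not specified in the text), it uses overall accuracy as the selection criterion, and all names and toy values are hypothetical.

```python
from typing import Sequence, Tuple


def confusion(plddt: Sequence[float], irmsd: Sequence[float],
              cutoff: float, near_native: float = 4.0) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for classifying near-native models by interface-pLDDT.

    A model is truly near-native when its interface-RMSD is below near_native (in Å)
    and is predicted near-native when its interface-pLDDT is >= cutoff (assumption).
    """
    tp = fp = tn = fn = 0
    for p, r in zip(plddt, irmsd):
        predicted, actual = p >= cutoff, r < near_native
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and not actual:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy_and_precision(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Accuracy over all targets and precision over predicted positives."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, precision


def best_cutoff(plddt: Sequence[float], irmsd: Sequence[float],
                candidates: Sequence[float]) -> float:
    """Pick the candidate cutoff with the highest accuracy (first one wins ties)."""
    return max(candidates,
               key=lambda c: accuracy_and_precision(*confusion(plddt, irmsd, c))[0])


if __name__ == "__main__":
    plddt = [92, 88, 86, 80, 70, 90]        # toy interface-pLDDT values
    irmsd = [1.2, 2.5, 5.0, 3.0, 9.0, 6.5]  # toy interface-RMSD values (Å)
    print(confusion(plddt, irmsd, 85), best_cutoff(plddt, irmsd, [80, 85, 90]))
```

On real benchmark data the selection criterion matters: optimizing accuracy, precision, or a balanced true-positive and true-negative rate can each favor a different cutoff.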
To better understand the performance of ReplicaDock, the authors should therefore (i) run global and local docking on all targets and report the results, (ii) report the results if AlphaFold (not multimer) models of the chains were used as input to ReplicaDock (I would assume it is similar). These models can be downloaded from AlphaFoldDB.
The performance of ReplicaDock on DB5.5 is tabulated in our prior work (https://doi.org/10.1371/journal.pcbi.1010124) and we direct the reviewers there for the detailed performance and results. In our opinion, the benchmark suggested by the reviewer would be redundant and not worth the computational expense.
The scope of this paper is to highlight a structure prediction + physics-based modeling pipeline for docking to adapt to the accuracy of up-and-coming structure prediction tools.
Using AlphaFold monomer chains as input and benchmarking on that, albeit interesting scientifically, will not be useful for either the pipeline or biologists who would want a complex structure prediction. We thank the reviewer for their comments but want to reemphasize that the end goal of this work is to increase the accuracy of complex structure predictions and PPIs obtained from computational tools.
Further, it would be interesting to see if ReplicaDock could be combined with AFsample (or any other model to generate structural diversity) to improve performance further.
We would like to highlight that ReplicaDock is a stand-alone tool for protein docking, and here we demonstrate that it can be adapted with metrics derived from AlphaFold or other structure prediction tools (say ESMFold), such as pLDDT, for conformational sampling and improving docking accuracy. We definitely agree that adapting it for use with tools such as AFSample will be interesting, but it is out of scope of this work.
The estimates of computing costs for the AFsample are incorrect (check what is presented in their paper). What are the computational costs for ReplicaDock global docking?
The authors of the AFSample paper report that “AFsample requires more computational time than AF2, as it generates 240 models, and including the extra recycles, the overall timing is 1000× more costly than the baseline.” We have reported these exact numbers in our manuscript.
The computational costs of ReplicaDock are 8-72 CPU hours on a single node with 24 processors as reported in our prior work.
For AlphaRED, the costs are slightly higher owing to the structure prediction module in the beginning and are up to 100 CPU hrs for our largest (max Nres) target.
It is unclear strictly what sequences were used as input to the modelling. The authors should use full-length UniProt sequences if they were not done.
We report this in the methods section of the manuscript as well as in Figure 5. Full length complex sequences were used for the models that we extracted from DB5.5.
“As illustrated in Fig. 5, given a sequence of a protein complex, we use the ColabFold implementation of AF2-multimer to obtain a predictive template.”
We clarify this in the methods section as:
“For each target in the DB5.5 dataset, we first extracted the corresponding FASTA sequence for the bound complex and then obtained AlphaFold predicted models with the ColabFold v1.5.2 implementation of AlphaFold and AlphaFold-multimer (v.2.3.0).”
The antibody-antigen dataset is small. It could easily be expanded to thousands of proteins. It would be interesting to know the performance of ReplicaDock on a more extensive set of Antibodies and nanobodies.
This work demonstrates the performance on the docking benchmark, i.e., given unbound structures, can one predict the bound complexes? In this regard, our analysis has focused on targets where both the unbound and bound structures are available, so that we could evaluate the ability of AlphaRED to model protein flexibility and its docking accuracy. For antibody-antigen complexes, there are only 67 structures with both unbound and bound complexes available, and they constituted our dataset. Benchmarking AlphaRED on all antibody-antigen targets can give biased results, as most Ab-Ag complexes are in the AlphaFold training set. Further, our work is aimed more at predicting conformational flexibility in docking than at rigid-body docked complexes, so benchmarking on existing bound Ab-Ag structures is out of scope for this work.
Using pLDDT on the interface region to identify good/bad models is likely suboptimal. It was acceptable (as a part of the score) for AlphaFold-2.0 (monomer), but AFm behaves differently. Here, AFm provides a direct score to evaluate the quality of the interaction (ipTM or Ranking Confidence). The authors should use these to separate good/bad models (for global/local docking), or at least show that these scores are less good than the one they used.
We thank the reviewers for this suggestion.
Reviewer #2 (Recommendations For The Authors):
Some Figures could be skipped/improved
Fig 1: Use TM-score instead, a much better measure (and the figure is not necessary).
Figure 1 compares the bias of AlphaFold towards unbound or bound forms of the proteins. We believe that this figure highlights the slight inherent bias of AlphaFold towards bound structures over unbound.
As the reviewers have suggested we have included a plot comparing the TM-scores for the structures. Further, we have moved this figure to the Supplementary.
Fig 2. Skip B (why compare RMSD with pLDDT?). Add a figure to see how this correlates over all targets not just two.
RMSD and LDDT are both metrics for evaluating conformational variability between two structures, such as the bound and unbound forms of the same protein. Whereas RMSD measures the overall deviation of residues, LDDT allows the estimation of relative domain orientations and concerted proteins. We have elaborated on this in Methods as well as in the Results section titled “AlphaFold pLDDT provides a predictive confidence measure for backbone flexibility”.
The data for the benchmark targets is now included in the Supplementary (Supplementary Figures S3-S4).
Fig 3. Color the different chains of a protein differently. Thereby the Receptor/Ligand/Bound labels can be omitted.
We thank the reviewers for this suggestion. However, the color scheme is chosen to highlight (1) the relative orientation of protein partners relative to each other. We have ensured that the alignment is over one partner (Receptor) so that you could see the relative orientation of the other partner (Ligand) in the modeled protein over the bound structure (in one color). (2) The coloring of the receptor and ligand chain is by pLDDT (from red to blue) to highlight that for decoys with incorrectly predicted interfaces, the pLDDT scores of the interface residues are indeed lower and can be a discriminating metric. We elaborate this in the caption of Figure 3 as well as in the section “Interface-pLDDT correlates with DockQ and discriminates poorly docked structures”. Coloring the chains of a protein differently will obfuscate the point that we are aiming to make and will be inconclusive for the readers as they would need to rely only on quantitative metrics (Irms and DockQ) reported but won’t be able to visualize the interface pLDDT of the incorrectly bound structures. We hope that this justifies the choice of our color scheme.
Fig 4. Include RankConf, ipTM, pDockQ, and other measures in the plots (they are likely better). Include DockQ for the top targets. It is difficult to estimate for multi chain complexes.
We thank the reviewer for this suggestion. We have now included the DockQ performances for all targets in Figure 5 (previously Figure 6) as well as re-evaluated our final success rates based on the DockQ calculations in Figure 8 (previously Figure 9).
Fig 5. use a better measure to split (see above).
We have elaborated on the choice of the split for the comments above and the interface pLDDT threshold of 85 is a decision made post observation on the docking benchmark. We do want to highlight that the cut-off is arbitrary and in our online server (ROSIE) as well as in custom scripts, this cut-off can be tuned by the user as required. We would suggest a cut-off of 85 based on our observations but the users are welcome to tune this as per their needs.
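A tunable cut-off of this kind can be expressed very compactly. The sketch below is illustrative only and is not the ROSIE server or the authors' scripts: the function name and return values are hypothetical, and it assumes that predictions at or above the cut-off go to local docking while those below go to global docking.

```python
def choose_docking_mode(interface_plddt: float, cutoff: float = 85.0) -> str:
    """Pick a docking mode from the interface-pLDDT of an AFm prediction.

    Assumption: a confident interface (>= cutoff) is refined locally, while a
    low-confidence interface triggers a global search. The default of 85 is the
    value suggested in the text; users may tune it.
    """
    if not 0.0 <= interface_plddt <= 100.0:
        raise ValueError("interface-pLDDT is expected on a 0-100 scale")
    return "local" if interface_plddt >= cutoff else "global"


if __name__ == "__main__":
    for score in (91.3, 62.0):  # toy values, not real data
        print(score, choose_docking_mode(score))
    # A stricter, user-chosen cut-off sends more targets to global docking.
    print(choose_docking_mode(88.0, cutoff=90.0))
```

Exposing the cut-off as a keyword argument keeps the suggested default visible while leaving the trade-off between compute cost (global docking) and trust in the AFm interface to the user.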
Fig 6. Replace lrms/fnat with DockQ.
We have now included DockQ scores in our manuscript.
Fig 7. Color the different chains of a protein differently.
We have colored the protein chains differently. AlphaFold models are in Orange, Bound complexes are in Gray, and predicted proteins from AlphaRED are in Blue-Green indicating the two partners. All models are aligned over the receptor so relative orientations of the ligand protein can be observed.
Fig 8 Color the different chains of a protein differently.
The chains are colored differently. We would like the reviewer to elaborate more on what they would like to observe as we believe our color scheme makes intuitive sense for readers.
Fig 9. Use DockQ instead of CAPRI criteria.
The figure has been updated based on DockQ. To clarify, the CAPRI criteria are set based on DockQ scores, as described in the figure caption.
Author response:
eLife Assessment
This manuscript reports important findings that the methyltransferase METTL3 is involved in the repair of abasic sites and uracil in DNA, mediating resistance to floxuridine-driven cytotoxicity. The presented evidence for the involvement of m6A in DNA is incomplete and requires further validation with orthogonal approaches to conclusively show the presence of 6mA in the DNA and exclude that the source is RNA or bacterial contamination.
We thank the editors for recognizing the importance of our work and the relevance of METTL3 in DNA repair. However, we wholly disagree with the second sentence in the eLife assessment, and we want to clarify why our evidence for the involvement of 6mA in DNA is complete.
The identification of 6mA in DNA, upon DNA damage, is based first on immunofluorescence observations using an anti-m6A antibody. In this setting, removal of RNA with RNase treatment fails to reduce the 6mA signal, excluding the possibility that the source of signal is RNA. In contrast, removal of DNA with DNase treatment removes all 6mA signal, strongly suggesting that the species carrying the N6-methyladenosine modification is DNA (Figure 3D, E). Importantly, in Figure 3F, we provide orthogonal, quantitative mass spectrometry data that independently confirm this finding. Mass spectrometry-liquid chromatography of DNA analytes conclusively shows the presence of 6mA in DNA upon treatment with DNA damaging agents and, based on exact mass, excludes RNA as the source. Reviewer #2 recognized the strengths of this approach to generate solid evidence for 6mA in DNA.
Cells only show the 6mA signal when treated with DNA damaging agents, and the 6mA is absent from untreated cells (Figure 3D, E, F). This provides strong evidence that the 6mA signal is not a result of bacterial contamination in our cell lines. Moreover, our cell lines are routinely tested for mycoplasma contamination. It could be possible that stock solutions of DNA damaging agents may be contaminated, but this would need to be true for all individual drugs and stocks tested. The data showing 6mA signal is not significantly different from untreated cells when a DNA damaging agent is combined with a METTL3 inhibitor (Figure 3G, H) provides strong evidence against bacterial contamination in our stocks.
In summary, we provide conclusive evidence, based on orthogonal methods, that the METTL3-dependent N6-methyladenosine modification is deposited in DNA, not RNA, in response to DNA damage.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The authors sought to identify unknown factors involved in the repair of uracil in DNA through a CRISPER knockout screen.
Typo above: “CRISPER” should be “CRISPR”.
Strengths:
The screen identified both known and unknown proteins involved in DNA repair resulting from uracil or modified uracil base incorporation into DNA. The conclusion is that the protein activity of METTL3, which converts A nucleotides to 5mA nucleotides, plays a role in the DNA damage/repair response. The importance of METTL3 in DNA repair, and its colocalization with a known DNA repair enzyme, UNG2, is well characterized.
Typo above: “5mA” should be “6mA”.
Weaknesses:
This reviewer identified no major weaknesses in this study. The manuscript could be improved by tightening the text throughout, and more accurate and consistent word choice around the origin of U and 6mA in DNA. The dUTP nucleotide is misincorporated into DNA, and 6mA is formed by methylation of the A base present in DNA. Using words like 6mA "deposition in DNA" seems to imply it results from incorporation of a methylated dATP nucleotide during DNA synthesis.
The increased presence of 6mA during DNA damage could result from methylation at the A base itself (within DNA) or from incorporation of pre-modified 6mA during DNA synthesis. Our data do not directly discriminate between these two mechanisms, and we will clarify this point in the discussion.
Reviewer #2 (Public review):
Summary:
In this work, the authors performed a CRISPR knockout screen in the presence of floxuridine, a chemotherapeutic agent that incorporates uracil and fluoro-uracil into DNA, and identified unexpected factors, such as the RNA m6A methyltransferase METTL3, as required to overcome floxuridine-driven cytotoxicity in mammalian cells. Interestingly, the observed N6-methyladenosine was embedded in DNA, which has been reported as DNA 6mA in mammalian genomes and is currently confirmed with mass spectrometry in this model. Therefore, this work consolidated the functional role of mammalian genomic DNA 6mA, and supported with solid evidence to uncover the METTL3-6mA-UNG2 axis in response to DNA base damage.
Strengths:
In this work, the authors took an unbiased, genome-wide CRISPR approach to identify novel factors involved in uracil repair with potential clinical interest.
The authors designed elegant experiments to confirm the METTL3 works through genomic DNA, adding the methylation into DNA (6mA) but not the RNA (m6A), in this base damage repair context. The authors employ different enzymes, such as RNase A, RNase H, DNase, and liquid chromatography coupled to tandem mass spectrometry to validate that METTL3 deposits 6mA in DNA in response to agents that increase genomic uracil.
They also have the Mettl3-KO and the METTL3 inhibition results to support their conclusion.
Weaknesses:
Although this study demonstrates that METTL3-dependent 6mA deposition in DNA is functionally relevant to DNA damage repair in mammalian cells, there are still several concerns and issues that need to be improved to strengthen this research.
First, in the whole paper, the authors never claim or mention the mammalian cell lines contamination testing result, which is the fundamental assay that has to be done for the mammalian cell lines DNA 6mA study.
Our cell lines are routinely tested for bacterial contamination, specifically mycoplasma, and we plan to state this information in a revised version of the manuscript.
Importantly, we do not observe 6mA in untreated cells, strongly suggesting that the 6mA signal observed is dependent on the presence of DNA damage and not caused by contamination in the cell lines (Figure 3D, E, F). While it could be possible that stock solutions of DNA damaging agents may be contaminated, this would need to be the case for all individual drugs and stocks tested that induce 6mA, which seems very unlikely. Finally, the data showing that the 6mA signal is not significantly different from untreated cells when a DNA damaging agent is combined with a METTL3 inhibitor (Figure 3G, H) provide strong evidence against bacterial contamination in our drug stocks.
Second, in the whole work, the authors have not supplied any genomic sequencing data to support their conclusions. Although the sequencing of DNA 6mA in mammalian models is challenging, recent breakthroughs in sequencing techniques, such as DR-Seq or NT/NAME-seq, have lowered the bar and improved a lot in the 6mA sequencing assay. Therefore, the authors should consider employing the sequencing methods to further confirm the functional role of 6mA in base repair.
While we agree that it could be important to understand the precise genomic location of 6mA in relation to DNA damage, this is outside the scope of the current study. Moreover, this exercise may prove unproductive. If 6mA is enriched in DNA at damage sites or as DNA is replicated, the genomic mapping of 6mA is likely to be stochastic. If stochastic, it would be impossible to obtain the read depth necessary to map 6mA accurately.
Third, the authors used the METTL3 inhibitor and Mettl3-KO to validate the METTL3-6mA-UNG2 functional roles. However, the catalytic mutant and rescue of Mettl3 may be the further experiments to confirm the conclusion.
We believe this to be an excellent suggestion from Reviewer #2 but we are unable to perform the proposed experiment at this time. We encourage future studies to explore the rescue experiment.
Reviewer #3 (Public review):
Summary:
The authors are showing evidence that they claim establishes the controversial epigenetic mark, DNA 6mA, as promoting genome stability.
Strengths:
The identification of a poorly understood protein, METTL3, and its subsequent characterization in DDR is of high quality and interesting.
Weaknesses:
(1) The very presence of 6mA (DNA) in mammalian DNA is still highly controversial and numerous studies have been conclusively shown to have reported the presence of 6mA due to technical artifacts and bacterial contamination. Thus, to my knowledge there is no clear evidence for 6mA as an epigenetic mark in mammals, and consequently, no evidence of writers and readers of 6mA. None of this is mentioned in the introduction. Much of the introduction can be reduced, but a paragraph clearly stating the controversy and lack of evidence for 6mA in mammals needs to be added, otherwise, the reader is given an entirely distorted view of the field.
These concerns must also be clearly in the limitations section and even in the results section which fails to nuance the authors' findings.
We agree with the reviewer that the presence and potential function of 6mA in mammalian DNA has been debated. Importantly, the debate regarding the presence and quantity of 6mA in DNA has previously been restricted to undamaged, baseline conditions. In complete agreement with this notion, we do not detect appreciable levels of 6mA in untreated cells. We will revise the introduction to introduce the debate about 6mA in DNA. We do, however, want to highlight that our study provides, for the first time, convincing evidence (based on orthogonal methods) that 6mA is present in DNA in response to a stimulus, namely DNA damage.
(2) What is the motivation for using HT-29 cells? Moreover, the materials and methods do not state how the authors controlled for bacterial contamination, which has been the most common cause of erroneous 6mA signals to date. Did the authors routinely check for mycoplasma?
HT-29 is a cell line of colorectal origin, and chemotherapeutic agents that introduce uracil and uracil derivatives in DNA, such as those used in this study, are relevant for the treatment of colorectal cancer. As indicated above, we do not observe 6mA in untreated cells, strongly suggesting that the 6mA signal observed is dependent on DNA damage and not caused by a potential bacterial contamination (Figure 3D, E, F). Additionally, our cell lines are routinely tested for bacterial contamination, specifically mycoplasma.
(3) The single-cell imaging of 6mA in various cells is nice but must be confirmed by orthogonal approaches. PacBio would provide an alternative and quantitative approach to assessing 6mA levels. Similarly, it is unclear why the authors have not performed dot-blots of 6mA for genomic DNA from the given cell lines.
We are confused by this point since an orthogonal approach to detect 6mA, mass spectrometry-liquid chromatography, was employed. This method does not use an antibody and confirms the increase of 6mA in DNA when cells were treated with DNA damaging agents. This data is presented in Figure 3F.
It is sensible to hypothesize that the localization of 6mA is consistent with DNA replication (like uracil deposition). In this event, the genomic mapping of 6mA is likely to be stochastic. This would make quantification with PacBio sequencing difficult because it would be very challenging to achieve the appropriate read depth to call a modified base.
Dot blots rely on an antibody and thus are not truly orthogonal to our immunofluorescence-based measurements. We therefore relied on mass spectrometry-liquid chromatography, which does not use an antibody, as a truly orthogonal approach.
(4) The results of Figure 3 need further investigation and validation. If the results are correct the authors are suggesting that the majority of 6mA in their cell lines is present in the DNA, and not the RNA, which is completely contrary to every other study of 6mA in mammalian cells that I am aware of. This could suggest that the antibody is not, in fact, binding to 6mA, but to unmodified adenine, which would explain why the signal disappears after DNAse treatment. Indeed, binding of 6mA to unmethylated DNA is a commonly known problem with most 6mA antibodies and is well described elsewhere.
Based on this and the following comment, we are convinced that Reviewer #3 has overlooked two critical elements of our study:
First, the immunofluorescence work presented in Figure 3, showing 6mA signal in response to DNA damage, uses cells that were pre-extracted to remove excess cytoplasmic RNA. This method is often used in immunofluorescence experiments of this kind. The pre-extraction method removes most of the cytoplasmic content, and the majority of the cytoplasmic m6A RNA signal. Supplementary Figure 3D shows cells that have not been pre-extracted prior to staining. These images show the cytoplasmic m6A signal is abundant if we do not perform the pre-extraction step.
If the antibody used to label 6mA significantly reacted with unmodified adenine, we would expect a large signal in untreated or untreated and denatured conditions. In contrast, an increase in 6mA is not observed in either case.
Second, the orthogonal approach we employed, mass spectrometry coupled with liquid chromatography, measures 6mA DNA analytes specifically by exact mass. This approach does not depend on an antibody and yields results consistent with those from the immunofluorescence experiments.
(5) Given the lack of orthogonal validation of the observed DNA 6mA and the lack of evidence supporting the presence of 6mA in mammalian DNA and consequently any functional role for 6mA in mammalian biology, the manuscript's conclusions need to be toned down significantly, and the inherent difficulty in assessing 6mA accurately in mammals acknowledged throughout.
As discussed in response to prior comments, Figure 3 does provide two independent and orthogonal methods that demonstrate 6mA presence in DNA specifically, and not RNA, in response to DNA damage. Complementary and orthogonal datasets are presented using either immunofluorescence microscopy or mass spectrometry-liquid chromatography of extracted DNA. The latter method does not rely on an antibody and can discriminate 6mA DNA versus RNA based on exact mass. We will revise the text to clarify that Figure 3F is a completely orthogonal approach.
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
The authors of the study investigated the generalization capabilities of a deep learning brain age model across different age groups within the Singaporean population, encompassing both elderly individuals aged 55 to 88 years and children aged 4 to 11 years. The model, originally trained on a dataset primarily consisting of Caucasian adults, demonstrated a varying degree of adaptability across these age groups. For the elderly, the authors observed that the model could be applied with minimal modifications, whereas for children, significant fine-tuning was necessary to achieve accurate predictions. Through their analysis, the authors established a correlation between changes in the brain age gap and future executive function performance across both demographics. Additionally, they identified distinct neuroanatomical predictors for brain age in each group: lateral ventricles and frontal areas were key in elderly participants, while white matter and posterior brain regions played a crucial role in children. These findings underscore the authors' conclusion that brain age models hold the potential for generalization across diverse populations, further emphasizing the significance of brain age progression as an indicator of cognitive development and aging processes.
Strengths:
(1) The study tackles a crucial research gap by exploring the adaptability of a brain age model across Asian demographics (Chinese, Malay, and Indian Singaporeans), enriching our knowledge of brain aging beyond Western populations.
(2) It uncovers distinct anatomical predictors of brain aging between elderly and younger individuals, highlighting a significant finding in the understanding of age-related changes and ethnic differences.
Weaknesses:
(1) Clarity in describing the fine-tuning process is essential for improved comprehension.
(2) The analysis often limits its findings to p-values, omitting the effect sizes crucial for understanding the relationship with cognition.
(3) Employing a predictive framework for cognition using brain age could offer more insight than mere statistical correlations.
(4) Expanding the study's scope to evaluate the model's generalisability to unseen Caucasian samples is vital for establishing a comparative baseline.
In summary, this paper underscores the critical need to include diverse ethnicities in model testing and estimation.
Reviewer #1 (Recommendations for the authors):
Comment #1 - Fine-Tuning Process Clarity: Enhanced clarity in the fine-tuning process documentation is crucial for understanding how models are adapted to new datasets. This involves explaining parameter adjustments and choices, which facilitates replication and application in further research.
We thank Reviewer #1 for this pertinent point. As advised, we have added a Supplementary Methods section with more details on the finetuning process. This includes the addition of Supplementary Figure S6, which shows examples of learning curves that helped inform our parameter adjustments and choices. We have added a reference to this section in Section 5.2 of the Methods.
Comment #2 - Effect Sizes Reporting: The emphasis on reporting effect sizes alongside p-values addresses the need to quantify the strength of observed effects, particularly the relationship between brain age and cognition. Effect sizes provide insights into the practical significance of findings, crucial for clinical and practical applications.
We thank Reviewer #1 for raising this important comment. As suggested, we have added standardized regression coefficients (as measures of effect size) alongside p-values in Figures 3 – 4, Supplementary Figures S2 – S4, Supplementary Tables S4 – S15, and the text of Sections 2.2 – 2.3 of the Results. We have additionally added 95% confidence intervals to Supplementary Tables S4 – S15.
Comment #3 - Predictive Framework for Cognition: Adopting a predictive framework for cognition using brain age moves the research from mere correlation to actionable prediction, offering potentials based on predictive analytics.
We thank Reviewer #1 for this insightful suggestion. Adopting a predictive framework would certainly be a useful and exciting avenue for the application of brain age. However, we note that the current study was primarily interested in the generalizability and interpretability of brain age in Asian children and older adults, as well as the added value of longitudinal measures of brain age. Thus, we believe our correlation-based analysis effectively demonstrated that deviations of brain age from chronological age were not merely random errors, but were informative of cognition. Furthermore, ongoing changes to these deviations were informative of future cognition. This helps to establish the brain age gap as a biomarker for aging, independent of chronological age. Additionally, we expect that the accurate prediction of future cognition would require a multitude of factors, in addition to T1-based brain age, as well as a large sample size to train and test. We believe such a dataset would be a promising avenue for future work, but it is outside the scope of the current study.
Nonetheless, we were able to conduct a preliminary analysis using the current longitudinal data from SLABS and GUSTO. We extracted the same variables used in the original analyses of future cognition, corresponding to Figures 3D and 4B in the main text. To implement a predictive framework, we split the data into 10 stratified cross-validation folds. We also used kernel ridge regression (KRR) as the predictive model, as it has previously shown promising performance in behavioral and cognitive prediction [1]. We used a cosine kernel and nested 5-fold cross-validation to pick the optimal regularization strength (alpha).
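As an illustration of the model class described above, the following is a minimal sketch of kernel ridge regression with a cosine kernel, written in plain Python so that each step is visible. The function names (`cosine_kernel`, `solve`, `krr_fit`, `krr_predict`) are hypothetical and are not taken from the analysis code; in practice a library implementation would normally be used, and the regularization strength `alpha` would be chosen by the nested cross-validation mentioned above, which is not shown here.

```python
import math

def cosine_kernel(a, b):
    """Cosine similarity between two feature vectors (assumes non-zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def krr_fit(X, y, alpha):
    """Return dual coefficients solving (K + alpha * I) d = y."""
    n = len(X)
    K = [[cosine_kernel(X[i], X[j]) + (alpha if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, y)

def krr_predict(X_train, dual, X_new):
    """Predict as a kernel-weighted sum over the training samples."""
    return [sum(d * cosine_kernel(x, xt) for d, xt in zip(dual, X_train))
            for x in X_new]
```

Here `X` would hold one row of predictors per participant (for example demographics, baseline BAG, and change in BAG) and `y` the future cognitive outcome; both are placeholders for illustration.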
To investigate the added value of BAG and longitudinal changes in BAG, we compared 3 predictive models for each cognitive domain. The baseline model consisted of the demographic covariates used in the original analyses (i.e. chronological age, sex, and years of education for older adults). A second model combined demographics with baseline BAG, and the third model incorporated demographics, baseline BAG, and the (early) annual rate of change in BAG. Predictions were extracted from each test fold, and performance was measured by the correlation between test predictions and actual values of future cognition (or change in cognition). Models were statistically compared using the corrected resampled t-test for machine learning models [1], [2], [3]. The Benjamini-Hochberg procedure was used to correct for multiple comparisons.
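The two statistical steps named above can be sketched as follows. This is an illustrative sketch rather than the analysis code: `corrected_resampled_t` follows the variance correction of Nadeau and Bengio [2], and `benjamini_hochberg` is the standard step-up procedure. The function and argument names are assumptions, and converting the statistic to a p-value (a t distribution with k - 1 degrees of freedom) is omitted because it needs a statistics library.

```python
import math
from statistics import mean, variance

def corrected_resampled_t(diffs, n_train, n_test):
    """Corrected resampled t statistic for per-fold performance differences.

    diffs: difference in performance between two models, one value per fold.
    n_train, n_test: number of samples in each training and test split.
    """
    k = len(diffs)
    var = variance(diffs)  # sample variance (k - 1 in the denominator)
    # The n_test / n_train term inflates the variance to account for the
    # overlap between training sets across folds.
    return mean(diffs) / math.sqrt((1.0 / k + n_test / n_train) * var)

def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the null is rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank r with p_(r) <= r * q / m.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff = rank
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected
```

Because the procedure is step-up, a p-value that fails its own rank threshold can still be rejected when a larger p-value passes its threshold.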
Author response image 1 shows the prediction results for SLABS and GUSTO. Notably, adding the early change in BAG significantly improves the prediction of future change in executive function in SLABS. There is also an improvement in predicting the future inhibition score in GUSTO, but this is not significant after multiple comparison correction. Encouragingly, these are the same domains that showed significant associations with the change in BAG in the original analyses. This suggests that longitudinal brain age continues to contribute information, independent of baseline factors, in a predictive framework. We hope that future work can expand on this analysis with, for instance, larger sample sizes, more varied and informative predictors, and state-of-the-art prediction methods, in order to establish actionable predictions of future cognition.
Author response image 1.
Predictive framework for cognition similarly suggests value of longitudinal change in BAG. Prediction performance (Pearson's correlation) of KRR across future cognitive outcomes. Each boxplot shows the distribution of performance over cross-validation folds. Model performances are statistically compared for each outcome. Significant outcomes from the original analyses are bolded. (A) Results for SLABS using the early change in BAG and future change in cognitive scores (non-overlapping). Early change in BAG again shows benefit for predicting future change in executive function. (B) Results for GUSTO using the early change in BAG (from 4.5-7.5 years old) and future cognitive score (at 8.5 years old). Early change in BAG again shows benefit for predicting future inhibition, but it is not significant after multiple comparison correction. Key - **: p < 0.01; * (ns): p < 0.05 but p_corr > 0.05 after multiple comparison correction; ns: p > 0.05
Comment #4 - Generalizability to Unseen Caucasian Samples: Evaluating the model's performance on unseen (longitudinal) Caucasian samples is important for benchmarking.
We thank Reviewer #1 for this important comment. We agree that generalizability should be benchmarked against performance on unseen Caucasian samples. In the SFCN model paper [4], the authors conducted an out-of-sample test on unseen Caucasian samples from ages 13 to 95. In this age range, they reported a high correlation (r = 0.975) and low MAE (MAE = 3.90). This favorable generalization performance was verified in adults by independent evaluations [5], [6]. This is also in line with what we observed in Asian older adults, taking into account the different age ranges and sample sizes involved [7].
However, this also highlights the difficulty in evaluating on younger ages in the range of GUSTO (4.5 – 10.5 years old). Most accessible developmental datasets (e.g. HBN, PING) were already included in model training, preventing an unbiased evaluation on these samples. Datasets such as PNC and ABCD were not included in training, but they primarily consist of an older age range than GUSTO. Holm et al. [8] previously tested the SFCN model in ABCD and reported satisfactory performance (low MAE) from 9 – 13 years old. However, to the best of our knowledge, there are no reported generalization results (for any ethnicity) from 4.5 – 7.5 years old, which is where we found the most performance degradation in GUSTO. We are also not aware of any datasets in this age range we could access to test this, unfortunately, but it would be an important area for future work.
While benchmarking in Caucasian children is difficult, we were able to conduct a preliminary analysis with older adults using the ADNI dataset (which was not included in the model training [4]). We selected a longitudinal subset with cognitive data available and no dementia at baseline (N = 137). We used composite cognitive scores covering memory, executive function, language, and visuospatial function [9], [10], [11]. We followed the same methodology (e.g. preprocessing, finetuning, statistical analysis) as the main analyses on EDIS, SLABS, and GUSTO. To maximize the data available, we tested associations with future cognition (taken at the last available time point), similar to GUSTO. We again included chronological age, sex, and years of education as demographic covariates.
Author response image 2 shows the brain age predictions for the pretrained and finetuned models on ADNI. Similar to Singaporean older adults, the pretrained model performs well, producing a high correlation (r = 0.8053; compared to r = 0.7389 for EDIS and r = 0.8136 for SLABS) and somewhat low MAE (MAE = 4.9735; compared to MAE = 3.9895 for EDIS and MAE = 3.4668 for SLABS). After finetuning, the MAE improves (MAE = 3.6837; compared to MAE = 3.3232 for EDIS and MAE = 3.2653 for SLABS) with a similar correlation (r = 0.7854; compared to r = 0.7445 for EDIS and r = 0.8138 for SLABS). This suggests that generalization to unseen Singaporean older adults is in line with the generalization to unseen Caucasian older adults.
Author response image 2.
Brain age predictions on unseen Caucasian sample of older adults. Predictions from the A) pretrained and B) finetuned brain age models on ADNI participants. Compare to Figure 2 of the main text.
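For readers less familiar with the metrics quoted in this comparison, the sketch below shows how the mean absolute error (MAE), Pearson correlation (r), and brain age gap (BAG) can be computed from predicted and chronological ages. The function names are illustrative assumptions, BAG is taken here as predicted age minus chronological age, and the numbers in the usage note are made-up toy values, not study data.

```python
import math

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and chronological age."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def brain_age_gap(predicted, actual):
    """Per-participant gap: predicted brain age minus chronological age."""
    return [p - a for p, a in zip(predicted, actual)]
```

With toy chronological ages `[60, 70, 80]` and predictions `[62, 69, 83]`, the MAE is 2.0 years and the gaps are `[2, -1, 3]`.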
For the associations with future cognition, we again find that baseline BAG does not associate with future cognition (Author response tables 1 and 2). However, encouragingly, we find that the early annual rate of change in BAG does associate with future memory, which is significant after multiple comparison correction for the finetuned model (Author response tables 3 and 4). This suggests a degree of replicability to the original results, but interestingly, in a different domain (memory vs. executive function). In contrast to SLABS, which consists of healthy older adults recruited from the community, ADNI consists of participants at risk of AD recruited from memory clinics. Thus, this difference in domain could be due to factors such as a stronger signal for memory in the testing battery or greater variations in memory function and decline. However, it could also reflect other population differences between ADNI and SLABS. This is an intriguing area for future study, ideally with larger sample sizes and more diverse populations included.
Author response table 1.
Linear relationship between pretrained baseline BAG and future cognitive score in ADNI. Compare to Supplementary Tables S4 – S15 of the original text.
Author response table 2.
Linear relationship between finetuned baseline BAG and future cognitive score in ADNI. Compare to Supplementary Tables S4 – S15 of the original text.
Author response table 3.
Linear relationship between pretrained change in BAG and future cognitive score in ADNI. Compare to Supplementary Tables S4 – S15 of the original text.
Author response table 4.
Linear relationship between finetuned change in BAG and future cognitive score in ADNI. Compare to Supplementary Tables S4 – S15 of the original text.
References
(1) L. Q. R. Ooi et al., “Comparison of individualized behavioral predictions across anatomical, diffusion and functional connectivity MRI,” NeuroImage, vol. 263, p. 119636, Nov. 2022, doi: 10.1016/j.neuroimage.2022.119636.
(2) C. Nadeau and Y. Bengio, “Inference for the Generalization Error,” Mach. Learn., vol. 52, no. 3, pp. 239–281, Sep. 2003, doi: 10.1023/A:1024068626366.
(3) R. R. Bouckaert and E. Frank, “Evaluating the Replicability of Significance Tests for Comparing Learning Algorithms,” in Advances in Knowledge Discovery and Data Mining, H. Dai, R. Srikant, and C. Zhang, Eds., Berlin, Heidelberg: Springer, 2004, pp. 3–12. doi: 10.1007/978-3-540-24775-3_3.
(4) E. H. Leonardsen et al., “Deep neural networks learn general and clinically relevant representations of the ageing brain,” NeuroImage, vol. 256, p. 119210, Aug. 2022, doi: 10.1016/j.neuroimage.2022.119210.
(5) R. P. Dörfel et al., “Prediction of brain age using structural magnetic resonance imaging: A comparison of accuracy and test-retest reliability of publicly available software packages,” Neuroscience, preprint, Jan. 2023. doi: 10.1101/2023.01.26.525514.
(6) J. L. Hanson, D. J. Adkins, E. Bacas, and P. Zhou, “Examining the reliability of brain age algorithms under varying degrees of participant motion,” Brain Inform., vol. 11, no. 1, p. 9, Apr. 2024, doi: 10.1186/s40708-024-00223-0.
(7) A.-M. G. de Lange et al., “Mind the gap: Performance metric evaluation in brain-age prediction,” Hum. Brain Mapp., vol. 43, no. 10, pp. 3113–3129, Jul. 2022, doi: 10.1002/hbm.25837.
(8) M. C. Holm et al., “Linking brain maturation and puberty during early adolescence using longitudinal brain age prediction in the ABCD cohort,” Dev. Cogn. Neurosci., vol. 60, p. 101220, Feb. 2023, doi: 10.1016/j.dcn.2023.101220.
(9) P. K. Crane et al., “Development and assessment of a composite score for memory in the Alzheimer’s Disease Neuroimaging Initiative (ADNI),” Brain Imaging Behav., vol. 6, no. 4, pp. 502–516, Dec. 2012, doi: 10.1007/s11682-012-9186-z.
(10) L. E. Gibbons et al., “A composite score for executive functioning, validated in Alzheimer’s Disease Neuroimaging Initiative (ADNI) participants with baseline mild cognitive impairment,” Brain Imaging Behav., vol. 6, no. 4, pp. 517–527, Dec. 2012, doi: 10.1007/s11682-012-9176-1.
(11) S.-E. Choi et al., “Development and validation of language and visuospatial composite scores in ADNI,” Alzheimers Dement. Transl. Res. Clin. Interv., vol. 6, no. 1, p. e12072, 2020, doi: 10.1002/trc2.12072.
Author response:
The following is the authors’ response to the original reviews.
We thank the reviewers for their time and thoughtful comments. We believe that the further analyses suggested have made the results clearer and more robust. Below, we briefly highlight the key points addressed in the revision and the new evidence supporting them. Then, we address each reviewer’s critiques point-by-point.
- Changes in variability with respect to time/experience
Both reviewers #1 and #3 asked whether the variability in grid properties observed was dependent on time or experience. This is an important point, given that such a dependence on time could lead to interesting hypotheses about the underlying dynamics of the grid code. However, in the new analyses we performed, we do not observe changes in grid variability within a session (Fig S5 of the revised manuscript), suggesting that the grid variability seen is constant within the timescale of the data set.
- The assumption of constant grid parameters in the literature
Reviewer #2 pointed out that it had been appreciated by experimentalists that grid properties are variable within a module. We agree that we may have overstated the universality of this assumption in the original manuscript, and we have toned down the language in the revision. However, we note that many previous theoretical studies assumed these properties to be constant within a given module. We provide some examples below, and have added evidence of this assertion, with citations to the theoretical literature, to the revised manuscript.
- Additional sources of variability
Reviewer #3 pointed out additional sources that might explain the variability observed in the paper (beyond time and experience). These sources include field width, border location, and the impact of conjunctive cells. We have run additional analyses and have found no significant impact on the observed variability from any of these factors. We believe that these are important controls, and have added them to the manuscript (Fig S4-S7 of the revised manuscript).
- Analysis of computational models
Reviewer #3 noted that our results could be strengthened by performing similar analyses on the output of computational models of grid cells. This is a good idea. We have now measured the variability of grid properties in a recent normative recurrent neural network (RNN) model that develops grid cells when trained to perform path integration (Sorscher et al., 2019). This model has been shown to develop signatures of a 2D toroidal attractor (Sorscher et al., 2023) and achieves a high accuracy on a simple path integration task. Interestingly, the units with the greatest grid scores also exhibit a range of grid spacings and grid orientations (Fig S8 of the revised manuscript). Furthermore, by decreasing the amount of sparsity (through decreasing the weight decay regularization), we found an increase in the variability of the grid properties. This analysis demonstrates a heretofore unknown similarity between the RNN models trained to perform path integration and recorded grid cells from MEC. It additionally provides a framework for computational analysis of the emergence of grid property variability.
Reviewer #1:
(1) Is the variability in grid spacing and orientation that the authors found intrinsically organized or is it shaped by experience? Previous research has shown that grid representations can be modified through experience (e.g., Boccara et al., Science 2019). To understand the dynamics of the network, it would be important to investigate whether robust variability exists from the beginning of the task period (recording period) or whether variability emerges in an experience-dependent manner within a session.
This is an interesting question that was not addressed in the paper. To test this, we performed an additional analysis to determine whether the variability changes across a session.
Using a sliding window, we have measured changes in variability with respect to recording time (Fig S5A). To this end, we compute grid orientation and spacing over a time-window whose length is half the total length of the recording. From the population distribution of orientation and spacing values, we compute the standard deviation as a measure of variability. We repeat the same procedure, sliding the window forward until the variability for the second half of the recording is computed.
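The sliding-window procedure described above can be sketched as follows. This is a minimal illustration, not the analysis code: `estimate_property` is a hypothetical callable standing in for the per-cell estimate of grid spacing or orientation within a time window, the step size is a free parameter, and the use of the population standard deviation (`pstdev`) rather than the sample standard deviation is an assumption.

```python
from statistics import pstdev

def sliding_window_variability(duration, estimate_property, n_cells, step):
    """Variability of a grid property across cells in half-session windows.

    duration: total recording length (e.g. in seconds).
    estimate_property(cell, t_start, t_end): hypothetical callable returning
        one cell's grid spacing or orientation estimated from that window.
    step: how far the window slides between evaluations.
    Returns a list of (window_start, std_across_cells) tuples.
    """
    window = duration / 2.0  # window length is half the recording
    results = []
    start = 0.0
    # Slide until the window covers the second half of the recording.
    while start + window <= duration + 1e-9:
        values = [estimate_property(c, start, start + window)
                  for c in range(n_cells)]
        results.append((start, pstdev(values)))
        start += step
    return results
```

A flat sequence of standard deviations across window starts would correspond to no change in variability with recording time.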
We applied this approach to recording ID R12 (the same as in Figs 2-4) given that this recording session was significantly longer than the rest (nearly two hours). Results are shown in Fig S5B-C. For both orientation and spacing, no changes of variability with respect to time can be observed. Similar results were found for other modules (see caption of Fig S5 for statistics).
We also note that the rats were already familiarized with the environment for 10-20 sessions prior to the recordings, so there may not be further learning during the period of the grid cell recordings. No changes in variability can be seen in Rat R across days (e.g., in Fig 5B R12 and R22 have similar distributions of variability). However, we note that it may be possible that there are changes in grid properties at time-scales greater than the recordings.
(2) It is important to consider the optimal variability size. The larger the variability, the better it is for decoding. On the other hand, as the authors state in the Discussion, it is assumed that variability does not exist in the continuous attractor model. Although this study describes that it does not address how such variability fits the attractor theory, it would be better if more detailed ideas and suggestions were provided as to what direction the study could take to clarify the optimal size of variability.
We appreciate this suggestion and agree that more discussion is warranted on how our results can be reconciled with previously observed attractor dynamics. To explore this, we studied the recurrent neural network (RNN) model from Sorscher et al. (2019), which develops grid responses when trained on path integration. This network has previously been found to develop signatures of toroidal topology (Sorscher et al., 2023), yet we find its grid responses also contain heterogeneity in grid properties (Fig S8). By decreasing the strength of the weight decay regularization (which leads to denser connectivity in the recurrent layer), we find an increase in the grid property variability. Interestingly, decreasing the weight decay regularization has previously been found to lead to weaker grid responses and a worse ability of the RNN to perform path integration in environments larger than the one it was trained on. This approach not only provides preliminary evidence for our claim that too much variability can lead to weaker continuous attractor structure, but also provides a modeling framework with which future work can explore this question in more detail. We have added discussion of this issue to the manuscript text (Discussion).
Reviewer #2:
(1) Even though theoreticians might have gotten the mistaken impression that grid cells are highly regular, this might be due to an overemphasis on regularity in a subset of papers. Most experimentalists working with grid cells know that many if not most grid cells show high variability of firing fields within a single neuron, though this analysis focuses on between neurons. In response to this comment, the reviewers should tone down and modify their statements about what are the current assumptions of the field (and if possible provide a short supplemental section with direct quotes from various papers that have made these assumptions).
We agree that some experimentalists are aware of variability in the recorded grid response patterns and that this work may not come as a complete surprise to them. We have toned down our language in the Introduction, changing “our results challenge a long-held assumption” to “our results challenge a frequently made assumption in the theoretical literature”. Additionally, we have added a caveat that “experimentalists have been aware” of the observed variability in grid properties.
We would like to emphasize that the lack of work carefully examining the robustness of this variability has prevented a firm understanding of whether this is an inherent property of grid cells or due to measurement noise. The impact of this can be seen in theoretical neuroscience work where a considerable number of articles (including recent publications) start with the assumption that all grid cells within a module have identical properties, with the exception of phase shift and noise. We have now cited a number of these papers in the Introduction, to provide specific references. To further illustrate the pervasiveness of this assumption being explicitly made in theoretical neuroscience, below we provide quotes from a few important papers:
“Cells with a common spatial period also share a common grid orientation; their responses differ only by spatial translations, or different preferred firing phases, with respect to their common response period” (Sreenivasan and Fiete, 2011)
“Grid cells are organized into discrete modules; within each module, the spatial scale and orientation of the grid lattice are the same, but the lattice for different cells is shifted in space.” (Stemmler et al., 2015)
“Recently, it was shown that grid cells are organized in discrete modules within which cells share the same orientation and periodicity but vary randomly in phase” (Wei et al., 2015)
“...cells within one module have receptive fields that are translated versions of one another, and different modules have firing lattices of different scales and orientations” (Dorrell et al., 2023)
In these works, this assumption is used to derive properties relating to the computational properties of grid cells (e.g., error correction, optimal scaling between grid spacings in different modules).
In addition, since grid cells are assumed to be identical in the computational neuroscience community, there has been little work on quantifying how much variability a given model produces. This makes it challenging to understand how consistent different models are with our observations. This is illustrated in our analysis of a recent recurrent neural network (RNN) model of grid cells (Fig S8), which does exhibit variability.
(2) The authors state that "no characterization of the degree and robustness of variability in grid properties within individual modules has been performed." It is always dangerous to speak in absolute terms about what has been done in scientific studies. It is true that few studies have had the number of grid cells necessary to make comparisons within and between modules, but many studies have clearly shown the distribution of spacing in neuronal data (e.g. Hafting et al., 2005; Barry et al., 2007; Stensola et al., 2012; Hardcastle et al., 2015) so the variability has been visible in the data presentations. Also, most researchers in the field are well aware that highly consistent grid cells are much rarer than messy grid cells that have unevenly spaced firing fields. This doesn't hurt the importance of the paper, but they need to tone down their statements about the lack of previous awareness of variability (specific locations are noted in the specific comments).
We have toned down our language in the Introduction. However, we note that our point that no detailed analysis had been done on measuring the robustness of this variability stands. Thus, for the general community, it has not been clear whether this previously observed variability is noise or a real feature of the grid code.
(3) The methods section needs to have a separate subheading entitled "How grid cells were assigned to modules" that clearly describes how the grid cells were assigned to a module (i.e., was this done by Gardner et al., or done as part of this paper's post-processing?).
We thank the reviewer for pointing out this missing information. We have added a new subsection in the Materials and Methods section, entitled “Grid module classification” to clarify how the grid cells are assigned to modules. In short, this was done by Gardner et al. (2022) using an unsupervised clustering approach that was viewed as enabling a less biased identification of modules. We did not perform any additional processing steps on module identity.
Reviewer #3:
(1) One possible explanation of the dispersion in lambda (not in theta) could be variability in the typical width of the field. For a fixed spacing, wider fields might push the six fields around the center of the autocorrelogram toward the outside, depending on the details of how exactly the position of these fields is calculated. We recommend authors show that lambda does not correlate with field width, or at least that the variability explained by field width is smaller than the overall lambda variability.
We agree that this option had not been carefully ruled out by our previous analyses. To tackle this question, we compute the field width of a given cell using the value at the minima of its spatial autocorrelogram (Fig S4A-B). For all cells in recording ID R12, there is a non-significant negative linear correlation between grid field width and between-cell variability (Fig S4C). The width of the field explains only 4% of the variability, as indicated by the R<sup>2</sup> value of the linear fit. Similar results were found for all other modules (see caption of Fig S4C for statistics). Therefore, we do not think that grid field width explains spacing variability.
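As an illustration of how the share of variability explained by field width can be read off a linear fit, the following is a minimal sketch, not the authors' analysis code. The function name `r_squared` and the toy numbers are hypothetical. It relies on the fact that, for a simple least-squares line, R<sup>2</sup> equals the squared Pearson correlation.

```python
def r_squared(x, y):
    """R^2 of the least-squares line y = a + b*x.

    For a simple linear fit this equals the squared Pearson correlation
    between x and y.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary across cells")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical per-cell values (arbitrary units), for illustration only.
field_width = [10.0, 12.0, 11.0, 13.0, 9.0, 12.5]
spacing_deviation = [1.2, 0.4, 1.9, 0.8, 1.1, 1.5]
explained = r_squared(field_width, spacing_deviation)
```

A value of `explained` near 0.04 would correspond to the 4% reported in the response; the toy numbers above are not meant to reproduce that figure.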
(2) An alternative explanation could be related to what happens at the borders. The authors tackle this issue in Figure S2 but introduce a different way of measuring lambda based on three fields, which in our view is not optimal. We recommend showing that the dispersions in lambda and theta remain invariant as one removes the border-most part of the maps but estimating lambda through the autocorrelogram of the remaining part of the map. Of course, there is a limit to how much can be removed before measures of lambda and theta become very noisy.
We have performed additional analysis to explore the role of borders in grid property variability. To do so, we have followed the suggestion by the reviewer and have re-analyzed grid properties from the autocorrelogram when the border-most part of the maps is removed (Fig S6A-B). For all modules, we do not see any changes in variability (computed as the standard deviation of the population distribution) for either orientation or spacing. As predicted by the reviewer, after removing about 25% of the border-most part of the environment we start seeing changes in variability, as measures of theta and lambda become noisy and are computed over a smaller spatial range. This result holds for all other modules (Fig S6C-D).
(3) A third possibility is slightly more tricky. Some works (for example Kropff et al, 2015) have shown that fields anticipate the rat position, so every time the rat traverses them they appear slightly displaced opposite to the direction of movement. The amount of displacement depends on the velocity. Maps that we construct out of a whole session should be deformed in a perfectly symmetric way if rats traverse fields in all directions and speeds. However, if the cell is conjunctive, we would expect a deformation mainly along the cell's preferred head direction. Since conjunctive cells have all possible preferred directions, and many grid cells are not conjunctive at all, this phenomenon could create variability in theta and lambda that is not a legitimate one but rather associated with the way we pool data to construct maps. To rule away this possibility, we recommend the authors study the variability in theta and lambda of conjunctive vs non-conjunctive grid cells. If the authors suspect that this phenomenon could explain part of their results, they should also take into account the findings of Gerlei and colleagues (2020) from the Nolan lab, that add complexity to this issue.
We appreciate the reviewer pointing out the possible role conjunctive cells may play. To investigate how conjunctive cells may affect the observed grid property variability, we have performed additional analyses taking into account if the grid cells included in the study are conjunctive. Comparing within- and between-cell variability of conjunctive vs. non-conjunctive cells in recording R12, we do not see any qualitative differences for either orientation or spacing (Fig S7A-B). When excluding conjunctive cells from the between-variability comparison, we do not see any significant difference compared to when these cells are included (Fig S7C-D). As such, it does not appear that conjunctive cells are the source of variability in the population.
We further note that the number of putative conjunctive cells varied across modules and recordings. For instance, in recording Q1 and Q2, Gardner et al. (2022) reported 3 (out of 97) and 1 (out of 66) conjunctive cells, respectively. Given that we see variability robustly across recordings (Fig 5), we do not believe that conjunctive cells can explain the presence of variability we observe.
(4) The results in Figure 6 are correct, but we are not convinced by the argument. The fact that grid cells fire in the same way in different parts of the environment and in different environments is what gives them their appeal as a platform for path integration since displacement can be calculated independently of the location of the animal. Losing this universal platform is, in our view, too much of a price to pay when the only gain is the possibility of decoding position from a single module (or non-adjacent modules) which, as the authors discuss, is probably never the case. Besides, similar disambiguation of positions within the environment would come for free by adding to the decoding algorithm spatial cells (non-hexagonal but spatially stable), which are ubiquitous across the entorhinal cortex. Thus, it seems to us that - at least along this line of argumentation - with variability the network is losing a lot but not gaining much.
We agree that losing the continuous attractor network (CAN) structure and the ability to path integrate would be a very large loss. However, we do not believe that the variability we observe necessarily destroys either the CAN or path integration. We argue this for two reasons. First, the data we analyzed [from Gardner et al. (2022)] is exactly the data set that was found to have toroidal topology and was therefore viewed as consistent with a major prediction of CANs. Thus, the amount of variability in grid properties does not rule out the underlying presence of a continuous attractor. Second, path integration may still be possible with grid cells that have variable properties. To illustrate this, we analyzed data from the recurrent neural network (RNN) model of Sorscher et al. (2019), which was trained explicitly on path integration, and found that the grid representations that emerged had variability in spacing and orientation (see point #6 below).
(5) In Figure 4 one axis has markedly lower variability. Is this always the same axis? Can the authors comment more on this finding?
We agree that in Fig 4 the first axis has lower variability. We believe that this is specific to the module R12 and does not reflect any differences in axis or bias in the methods used to compute the axis metrics. To test this, we have performed the same analyses for other modules, finding that other recordings do not exhibit the same bias. Results for the modules with the most cells are shown below (Author response image 1).
Author response image 1.
Grid properties along Axis 1 are not less variable for many recorded grid modules. Same as Fig. 4C-D, but for four other recorded modules. Note that the variability along each axis is similar.
(6) The paper would gain in depth if maps coming out of different computational models could be analyzed in the same way.
We agree with the reviewer that examining computational models using the same approach would strengthen our results and we appreciate the suggestion. To address this, we have analyzed the results from a previous normative model for grid cells [Sorscher et al. (2019)] that trained a recurrent neural network (RNN) model to perform path integration and found that units developed grid-cell-like responses. These models have been found to exhibit signatures of toroidal attractor dynamics [Sorscher et al. (2023)] and exhibit a diversity of responses beyond pure grid cells, making them a good starting point for understanding whether models of MEC may contain uncharacterized variability in grid properties.
We find that RNN units in these normative models exhibit similar amounts of variability in grid spacing and orientation as observed in the real grid cell recordings (Fig S8A-D). This provides additional evidence that this variability may be expected from a normative framework, and that the variability does not destroy the ability to path integrate (which the RNN is explicitly trained to perform).
The RNN model offers possibilities to assess what might cause this variability. While we leave a detailed investigation of this to future work, we varied the weight decay regularization hyper-parameter. This value controls how sparse the weights in the hidden recurrent layer are. Large weight decay regularization strength encourages sparser connectivity, while small weight decay regularization strength allows for denser connectivity. We find that increasing this penalty (and enforcing sparser connectivity) decreases the variability of grid properties (Fig S8E-F). This suggests that the observed variability in the Gardner et al. (2022) data set could be due to the fact that grid cells are synaptically connected to other, non-grid cells in MEC.
(7) Similarly, it would be very interesting to expand the study with some other data to understand if between-cell delta_theta and delta_lambda are invariant across environments. In a related matter, is there a correlation between delta_theta (delta_lambda) for the first vs for the second half of the session? We expect there should be a significant correlation, it would be nice to show it.
We agree this would be interesting to examine. For this analysis, it is essential to have a large number of grid cells, and we are not aware of other published data sets with comparable cell numbers using different environments.
Using a sliding window analysis, we have characterized changes in variability with respect to the recording time (Figure S5A). To do so, we compute grid orientation and spacing over a time-window whose length is half of the total length of the recording. From the population distribution of orientation and spacing values, we compute the standard deviation as a measure of between-cell variability. We repeat the same procedure, sliding the window forward until the variability for the second half of the recording is computed.
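The sliding-window procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: `estimate` is a hypothetical callback that returns one grid property (orientation or spacing) for a cell from the data between two time points, `step` is an assumed slide increment, and the sample standard deviation is assumed as the measure of between-cell variability.

```python
import statistics


def sliding_variability(cells, total_duration, step, estimate):
    """Between-cell variability in half-recording windows.

    cells          -- iterable of cell identifiers (or per-cell data)
    total_duration -- length of the recording, in seconds
    step           -- how far the window slides each time (assumed parameter)
    estimate       -- hypothetical callback: estimate(cell, start, end) -> float
    Returns a list of (window_start, std_across_cells) tuples.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    window = total_duration / 2.0  # window length is half the recording
    n_steps = int(round((total_duration - window) / step))
    results = []
    for k in range(n_steps + 1):
        start = k * step
        end = start + window
        values = [estimate(cell, start, end) for cell in cells]
        # Standard deviation of the population distribution in this window.
        results.append((start, statistics.stdev(values)))
    return results
```

The first tuple covers the first half of the recording and the last tuple covers the second half, so comparing them addresses the reviewer's first-half versus second-half question.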
We applied this approach to recording ID R12 (the same as in Figs 2-4) given that this recording session was significantly longer than the rest (almost two hours). Results are shown in Fig S5 B-C. For both orientation and spacing, no systematic changes of variability with respect to time were observed. Similar results were found for other modules (see caption of Fig S5 for statistics).
We also note that the rats were already familiarized with the environment for 10-20 sessions prior to the recordings, so there may not be further learning during the period of the grid cell recordings. No changes in variability can be seen in Rat R across days (e.g., in Fig 5B R12 and R22 have similar distributions of variability). However, we note that it may be possible that there are changes in grid properties at time-scales greater than the recordings.
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Hotinger et al. explore the population dynamics of Salmonella enterica serovar Typhimurium in mice using genetically tagged bacteria. In addition to physiological observations, pathology assessments, and CFU measurements, the study emphasizes quantifying host bottleneck sizes that limit Salmonella colonization and dissemination. The authors also investigate the genetic distances between bacterial populations at various infection sites within the host.
Initially, the study confirms that pretreatment with the antibiotic streptomycin before inoculation via orogastric gavage increases the bacterial burden in the gastrointestinal (GI) tract, leading to more severe symptoms and heightened fecal shedding of bacteria. This pretreatment also significantly reduces between-animal variation in bacterial burden and fecal shedding. The authors then calculate founding population sizes across different organs, discovering a severe bottleneck in the intestine, with founding populations reduced by approximately 10^6-fold compared to the inoculum size. Streptomycin pretreatment increases the founding population size and bacterial replication in the GI tract. Moreover, by calculating genetic distances between populations, the authors demonstrate that, in untreated mice, Salmonella populations within the GI tract are genetically dissimilar, suggesting limited exchange between colonization sites. In contrast, streptomycin pretreatment reduces genetic distances, indicating increased exchange.
In extraintestinal organs, the bacterial burden is generally not substantially increased by streptomycin pretreatment, with significant differences observed only in the mesenteric lymph nodes and bile. However, the founding population sizes in these organs are increased. By comparing genetic distances between organs, the authors provide evidence that subpopulations colonizing extraintestinal organs diverge early after infection from those in the GI tract. This hypothesis is further tested by measuring bacterial burden and founding population sizes in the liver and GI tract at 5 and 120 hours post-infection. Additionally, they compare orogastric gavage infection with the less injurious method of infection via drinking, finding similar results for CFUs, founding populations, and genetic distances. These results argue against injuries during gavage as a route of direct infection.
To bypass bottlenecks associated with the GI tract, the authors compare intravenous (IV) and intraperitoneal (IP) routes of infection. They find approximately a 10-fold increase in bacterial burden and founding population size in immune-rich organs with IV/IP routes compared to orogastric gavage in streptomycin-pretreated animals. This difference is interpreted as a result of "extra steps required to reach systemic organs."
While IP and IV routes yield similar results in immune-rich organs, IP infections lead to higher bacterial burdens in nearby sites, such as the pancreas, adipose tissue, and intraperitoneal wash, as well as somewhat increased founding population sizes. The authors correlate these findings with the presence of white lesions in adipose tissue. Genetic distance comparisons reveal that, apart from the spleen and liver, IP infections lead to genetically distinct populations in infected organs, whereas IV infections generally result in higher genetic similarity.
Finally, the authors investigate GI tract reseeding, identifying two distinct routes. They observe that the GI tracts of IP/IV-infected mice are colonized either by a clonal or a diversely tagged bacterial population. In clonally reseeded animals, the genetic distance within the GI tract is very low (often zero) compared to the bile population, which is predominantly clonal or pauciclonal. These animals also display pathological signs, such as cloudy/hardened bile and increased bacterial burden, leading the authors to conclude that the GI tract was reseeded by bacteria from the gallbladder bile. In contrast, animals reseeded by more complex bacterial populations show that bile contributes only a minor fraction of the tags. Given the large founding population size in these animals' GI tracts, which is larger than in orogastrically infected animals, the authors suggest a highly permissive second reseeding route, largely independent of bile. They speculate that this route may involve a reversal of known mechanisms that the pathogen uses to escape from the intestine.
The manuscript presents a substantial body of work that offers a meticulously detailed understanding of the population dynamics of S. Typhimurium in mice. It quantifies the processes shaping the within-host dynamics of this pathogen and provides new insights into its spread, including previously unrecognized dissemination routes. The methodology is appropriate and carefully executed, and the manuscript is well-written, clearly presented, and concise. The authors' conclusions are well-supported by experimental results and thoroughly discussed. This work underscores the power of using highly diverse barcoded pathogens to uncover the within-host population dynamics of infections and will likely inspire further investigations into the molecular mechanisms underlying the bottlenecks and dissemination routes described here.
Major point:
Substantial conclusions in the manuscript rely on genetic distance measurements using the Cavalli-Sforza chord distance. However, it is unclear whether these genetic distance measurements are independent of the founding population size. I would anticipate that in populations with larger founding population sizes, where the relative tag frequencies are closer to those in the inoculum, the genetic distances would appear smaller compared to populations with smaller founding sizes independent of their actual relatedness. This potential dependency could have implications for the interpretation of findings, such as those in Figures 2B and 2D, where antibiotic-pretreated animals consistently exhibit higher founding population sizes and smaller genetic distances compared to untreated animals.
Thank you for raising this important point regarding reliance on chord distances for gauging genetic distance in barcoded populations. The reviewer is correct that samples with more founders will be more similar to the inoculum and thus inherently more similar to other samples that also have more founders. However, creation of libraries containing very large numbers of unique barcodes can often circumvent this issue. In this case, the effect size of chance-based similarity is not large enough to change the interpretation of the data in Figures 2B and 2D. In our case, the library has ~6x10<sup>4</sup> barcodes, and the founding populations in Figure 2B are ~10<sup>3</sup>. Randomly resampling to create two populations of 10<sup>3</sup> cells from an initial population with 6x10<sup>4</sup> barcodes is expected to yield largely distinct populations with very little similarity. Thus, the similarity between streptomycin-treated populations in Figure 2D is likely the result of biology rather than chance.
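The resampling argument can be checked with a small simulation. The sketch below is illustrative only and is not the STAMPR pipeline: it assumes every barcode is equally abundant in the inoculum, ignores growth after the bottleneck, and uses the standard Cavalli-Sforza and Edwards chord distance, (2/π)·sqrt(2·(1 − Σ sqrt(p<sub>i</sub>·q<sub>i</sub>))). The function names and the random seed are hypothetical choices.

```python
import math
import random
from collections import Counter


def chord_distance(counts_a, counts_b):
    """Cavalli-Sforza and Edwards chord distance between two barcode counts."""
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    shared = set(counts_a) & set(counts_b)
    # Only barcodes present in both samples contribute to the sum.
    overlap = sum(
        math.sqrt((counts_a[tag] / total_a) * (counts_b[tag] / total_b))
        for tag in shared
    )
    # Clamp to guard against tiny negative values from rounding.
    return (2.0 / math.pi) * math.sqrt(2.0 * max(0.0, 1.0 - overlap))


def sample_founders(n_barcodes, n_founders, rng):
    """Draw founders from a uniform library (assumption: equal abundance)."""
    return Counter(rng.randrange(n_barcodes) for _ in range(n_founders))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
pop_a = sample_founders(60_000, 1_000, rng)
pop_b = sample_founders(60_000, 1_000, rng)
shared_barcodes = len(set(pop_a) & set(pop_b))
distance = chord_distance(pop_a, pop_b)
```

Under these assumptions only a small number of barcodes are shared by chance, and `distance` stays close to the maximum of this metric (2√2/π, roughly 0.90), in line with the statement that chance alone should yield largely distinct populations.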
Reviewer #2 (Public review):
In this paper, Hotinger et al. propose an improved barcoded library system, called STAMPR, to study Salmonella population dynamics during infection. Using this system, the authors demonstrate significant diversity in the colonization of different Salmonella clones (defined by the presence of different barcodes) not only across different organs (liver, spleen, adipose tissues, pancreas, and gall bladder) but also within different compartments of the same gastrointestinal tissue. Additionally, this system revealed that microbiota competition is the major bottleneck in Salmonella intestinal colonization, which can be mitigated by streptomycin treatment. However, this has been demonstrated previously in numerous publications. They also show that there was minimal sharing between populations found in the intestine and those in the other organs. Upon IV and IP infection to bypass the intestinal bottleneck, they were able to demonstrate, using this library, that Salmonella can re-enter the intestine through two possible routes. One route is essentially the reverse of the path used to escape the gut, leading to a diverse intestinal population, while the other, through the bile, typically results in a clonal population. Although the authors showed that the STAMPR pipeline improved the ability to identify founder populations and their diversity within the same animal during infections, some of the conclusions appear speculative and not fully supported.
(1) It's particularly interesting how the authors, using this system, demonstrate the dominant role of the microbiota bottleneck in Salmonella colonization and how it is widened by antibiotic treatment (Figure 1). Additionally, the ability to track Salmonella reseeding of the gut from other organs starting with IV and IP injections of the pathogen provides a new tool to study population dynamics (Figure 5). However, I don't think it is possible to argue that the proximal and distal small intestine, Peyer's patches (PPs), cecum, colon, and feces have different founder populations for reasons other than stochastic variations. All the barcoded Salmonella clones have the same fitness, and the fact that some are found or expanded in one region of the gastrointestinal tract rather than another likely results from random chance (such as being forced into a specific region of the gut for physical or spatial reasons) and subsequent expansion, rather than any inherent biological cause. For example, some bacteria may randomly adhere to the mucus, some may swim toward the epithelial layer, while others remain in the lumen; all will proliferate in those respective sites. In this way, different founder populations arise based on random localization during movement through the gastrointestinal tract, which is an observation, but it doesn't significantly contribute to understanding pathogen colonization dynamics or pathogenesis. Therefore, I would suggest placing less emphasis on describing these differences or better discussing this aspect, especially in the context of the gastrointestinal tract.
Thank you for helping us identify this area for further clarification. We agree with the reviewer’s interpretation that seeding of proximal and distal small intestine, Peyer's patches (PPs), cecum, colon, and feces with different founder populations is likely caused by stochastic variations, consistent with separate stochastic bottlenecks to establishing these separate niches. To clarify this point we have modified the text in the results section, “Streptomycin treatment decreases compartmentalization of S. Typhimurium populations within the intestine”.
Change to text:
“Except for the cecum and colon, in untreated animals the S. Typhimurium populations in different regions of the intestine were dissimilar (Avg. GD ranged from 0.369 to 0.729, 2D left); i.e., there is little sharing between populations in the intestine. These data suggest that there are separate bottlenecks in different regions of the intestine that cause stochastic differences in the identity of the founders. Interestingly, when these founders replicate, they do not mix, remaining compartmentalized with little sharing between populations throughout the intestinal tract (i.e., barcodes found in one region are not in other regions, Figure S3). This was surprising as the luminal contents, an environment presumably conducive to bacterial movement, were not removed from these samples.”
In this section we are interested in the underlying biology that occurs after the initial bottleneck to preserve this compartmentalization during outgrowth of the intestinal population. In other words, what prevents these separate populations from merging (e.g., what prevents the bacteria replicating in the proximal small intestine from traveling through the intestine and establishing a niche in the distal small intestine)? While we do not explore the mechanisms of compartmentalization, we observe that it is disrupted by streptomycin pretreatment, suggesting a microbiota-dependent biological cause.
(2) I do think that STAMPR is useful for studying the dynamics of pathogen spread to organs where Salmonella likely resides intracellularly (Figure 3). The observation that the liver is colonized by an early intestinal population, which continues to proliferate at a steady rate throughout the infection, is very interesting and may be due to the unique nature of the organ compared to the mucosal environment. What is the biological relevance during infection? Do the authors observe the same pattern (Figures 3C and G) when normalizing the population data for the spleen and mesenteric lymph nodes (mLN)? If not, what do the authors think is driving this different distribution?
Thank you for raising this interesting point. These data indicate that the liver is seeded from the intestine early during infection. The timing and source of dissemination have relevance for understanding how host and pathogen variables control the spread of bacteria to systemic sites. For example, our conclusion (early dissemination) indicates that the immune state of a host at the time of exposure to a pathogen, and for a short period thereafter, is what primarily influences the process of dissemination, not the later response to an active infection.
We observe that the liver and mucosal environments within the intestine have similar colonization behaviors. Both niches are seeded early during infection, followed by steady pathogen proliferation and compartmentalization that apparently inhibits further seeding. This results in the identity of barcodes in the liver population remaining distinct from the intestinal populations, and the intestinal populations remaining distinct from each other.
We observe a similar pattern to the liver in the spleen and MLN (the barcodes in the spleen and MLN are dissimilar to the population in the intestine). To clarify this point, we have modified the text (below) and added this analysis as a supplemental figure (S4).
Change to text:
“Genetic distance comparison of liver samples to other sites revealed that, regardless of streptomycin treatment, there was very little sharing of barcodes between the intestine and extraintestinal sites (Avg. GD >0.75, Figure 3C). Furthermore, the MLN and spleen populations also lacked similarity with the intestine (Figure S4). These analyses strongly support the idea that S. Typhimurium disseminates to extraintestinal organs relatively early following inoculation, before it establishes a replicative niche in the intestine.”
(3) Figure 6: Could the bile pathology be due to increased general bacterial translocation rather than Salmonella colonization specifically? Did the authors check for the presence of other bacteria (potentially also proliferating) in the bile? Do the authors know whether Salmonella's metabolic activity in the bile could be responsible for gallbladder pathology?
The reviewer raises interesting points for future work. We did not check whether other bacterial species are translocating during S. Typhimurium infection. The relevance of Salmonella’s metabolic activity is also very interesting, and we hope these questions will be answered by future studies.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Minor points:
(1) P. 9/10 "... the marked delay in shedding after IP and IV relative to orogastric inoculation suggest that the S. Typhimurium population encounters substantial bottleneck(s) on the route(s) from extraintestinal sites back to the intestine.": Can you conclude that from the data? It could also be possible that there is a biological mechanism (other than chance events) that delays the re-entry to the intestine.
We propose that the delay in shedding indicates additional obstacles that bacteria face when re-entering the intestine, and that there are likely biological mechanisms that cause this delay. However, these unknown mechanisms effectively act as additional bottlenecks by causing a stochastic loss of population diversity.
(2) P. 11 "...both organs would likely contain all 10 barcodes. In contrast, a library with 10,000 barcodes can be used to distinguish between a bottleneck resulting in Ns = 1,000 and Ns = 10,000, since these bottlenecks result in a different number of barcodes in output samples. Furthermore, high diversity libraries reduce the likelihood that two tissue samples share the same barcode(s) due to random chance, enabling more accurate quantification of bacterial dissemination.": I agree with the general analysis, but I find it misleading to talk about the presence of barcodes when the analyses in this manuscript are based on the much more powerful comparison of relative abundance of individual tags instead of their presence or absence.
The reviewer raises an excellent point, and the distinction between relative abundance versus presence/absence is discussed extensively in the original STAMPR manuscript. Although relative abundance is powerful, the primary metric used in this study (Ns) is calculated principally from the number of barcodes, corrected (via simulations) for the probability of observing the same barcode across distinct founders. Although this correction procedure does rely on barcode abundance, the primary driver of founding population quantification is the number of barcodes.
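To make the barcode-count argument concrete, the following sketch computes how many distinct barcodes are expected after a bottleneck. It is a simplified illustration, not the STAMPR correction procedure: it assumes a library of equally abundant barcodes and founders drawn independently, whereas STAMPR corrects via simulations. The function name `expected_distinct_barcodes` is hypothetical. Under these assumptions the expected count is B·(1 − (1 − 1/B)^Ns) for B barcodes and Ns founders.

```python
def expected_distinct_barcodes(n_barcodes, n_founders):
    """Expected number of distinct barcodes among n_founders founders.

    Assumes founders are drawn independently from n_barcodes equally
    abundant barcodes (a simplification of a real inoculum).
    """
    if n_barcodes <= 0 or n_founders < 0:
        raise ValueError("n_barcodes must be positive and n_founders non-negative")
    prob_barcode_missed = (1.0 - 1.0 / n_barcodes) ** n_founders
    return n_barcodes * (1.0 - prob_barcode_missed)


# A 10-barcode library saturates: both bottlenecks leave ~10 barcodes.
small_1k = expected_distinct_barcodes(10, 1_000)
small_10k = expected_distinct_barcodes(10, 10_000)

# A 10,000-barcode library separates the two bottlenecks clearly.
large_1k = expected_distinct_barcodes(10_000, 1_000)
large_10k = expected_distinct_barcodes(10_000, 10_000)
```

With the small library the two bottlenecks are indistinguishable by barcode count, while with the large library the expected counts differ several-fold (roughly 950 versus roughly 6,300 under these assumptions), which is why the number of barcodes can drive the Ns estimate.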
(3) P.14 "the library in LB supplemented with SM was not significantly different than the parent strain" and Figure 2C: How was significance tested? How many times were the growth curves recorded? On my print-out, the red color has different shades for different growth curves.
Significance was tested with a Mann-Whitney and growth curves were performed 5 times. Growth curves are displayed with 50% opacity, and as a result multiple curves directly on top of each other appear darker. The legend to S2 has been modified accordingly.
(4) P.16: close bracket in the equation for FRD calculation.
Done
(5) Figure 2C "Average CFU per founder": I found the wording confusing at first as I thought you divided the average bacterial burden per organ by Ns, instead of averaging the CFU/Ns calculated for each mouse.
The wording has been clarified.
(6) Figure 3B: It would be helpful to include expected genetic distances in the schematic as it is difficult to infer the genetic distance when only two of three, respectively, different "barcode colors" are used. While I find the explanation in the main text intuitive, a graphical representation would have helped me.
Thank you for the suggestion. Unfortunately, using colors to represent barcodes is imperfect and limits the diversity that can be depicted. We have modified Figure 3B to further clarify.
(7) Figure 3C: Why do you compare the genetic distance to the liver, when you discuss the genetic distance of the intestinal population? Is it not possible that the intestinal populations are similar to the extraintestinal organs except the liver?
For clarity, we chose to highlight exclusively the liver. However, we observed a similar pattern to the liver in other extraintestinal organs. To clarify the generalizability of this point we have added a supplemental figure with comparisons to MLN and Spleen (Supplemental figure S4) as well as further text.
(8) Figure 3C & S5A: I found "+SM" and "+SM, Drinking" confusing and would have preferred "+SM, Gavage" and "+SM, Drinking" for clarity.
Done, thank you for the suggestion.
(9) Figure 3G&H: I find it worthy of discussion that the bacterial burden increases over time, while the founding population decreases. Does that not indicate that replication only occurs at specific sites leading to the amplification of only a few barcodes and thereby a larger change of the relative barcode abundance compared to the inoculum?
From 5h to 120h the size of the founding population decreases in multiple intestinal sites. This likely indicates that the impact of the initial bottleneck is still ongoing at 5h, although further temporal analysis would be required to define the exact timing of the bottleneck. Notably, the passage time through the mouse intestine is ~5h. Many of the founders observed at 5h could belong to a population that will never establish a replicative niche and, failing to colonize, will be shed in the feces, bottlenecking the population between 5h and 120h. To clarify this point we have added the following text:
Section “S. Typhimurium disseminates out of the intestine before establishing an intestinal replicative niche”.
“In contrast to the liver, there were more founders present in samples from the intestine (particularly in the colon) at 5 hours versus 120 hours (Figure 3H). These data likely indicate that many of the founders observed in the intestine at 5 hours are shed in the feces prior to establishing a replicative niche, and demonstrates that the forces restricting the S. Typhimurium population in the intestine act over a period of > 5 hours.”
(10) Figure S2A: I do not understand this figure. Why are there more than 70.000 tags listed? I was under the impression the barcode library in S. Typhimurium had 55.000 tags while only the plasmid pSM1 had more than 70.000 (but the plasmid should not be relevant here). Why are there distinct lines at approximately 10^-5 and a bit lower? I would have expected continuously distributed barcode frequencies.
During barcode analysis, each library is mapped to the total barcode list in the barcode donor pSM1, which contains ~70,000 barcodes. This enables consistent analysis across different bacterial libraries. The designation “barcode number” refers to the barcode number in pSM1, meaning many of the barcodes in the Salmonella library are at zero reads. This graph type was chosen to show there was no bias toward a particular barcode; however, there is significant overlap of the points, making individual barcode frequencies difficult to see. We have changed the x-axis to state “pSM1 Barcode Number” and clarified this in the figure legend.
Since the y-axes on these graphs are on a log10 scale, the lines represent barcodes with 1 read, 2 reads, 3 reads, etc. As the number of reads per barcode increases linearly, the space between them decreases on logarithmic axes.
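The banding described here can be checked numerically. In the sketch below, the total read count is a hypothetical value (the actual sequencing depth is not given in this response), chosen only so that a single read plots near 10^-5. The gap between consecutive bands on a log10 axis does not depend on that choice.

```python
import math


def log10_gap(reads: int) -> float:
    """Distance on a log10 axis between barcodes with `reads` and
    `reads + 1` reads. The sequencing depth cancels out of this ratio."""
    return math.log10((reads + 1) / reads)


total_reads = 100_000  # hypothetical depth, an assumption for illustration

for reads in range(1, 6):
    frequency = reads / total_reads
    print(f"{reads} read(s): frequency={frequency:.0e}, "
          f"gap to next band={log10_gap(reads):.3f} log10 units")
```

The gap shrinks as the read count grows, so low-count barcodes form visibly separate lines while higher-count barcodes merge into a continuous cloud.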
(11) There are a few typos in the figure legends of the supplementary material. For example Figure S2: S. Typhimurium not italicized, ~7x105 no superscript. Fig. S4&5 ", Open circles" is "O" is capitalized.
Typos have been corrected.
www.biorxiv.org
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
This is an interesting manuscript where the authors systematically measure rG4 levels in brain samples at different ages of patients affected by AD. To the best of my knowledge this is the first time that BG4 staining is used in this context and the authors provide compelling evidence to show an association with BG4 staining and age or AD progression, which interestingly indicates that such RNA structure might play a role in regulating protein homeostasis as previously speculated. The methods used and the results reported seem robust and reproducible. There were two main things that needed addressing:
(1) Usually in BG4 staining experiments to ensure that the signal detected is genuinely due to rG4 an RNase treatment experiment is performed. This does not have to be extended to all the samples presented but having a couple of controls where the authors observe loss of staining upon RNase treatment will be key to ensure with confidence that rG4s are detected under the experimental conditions. This is particularly relevant for these brain tissue samples where BG4 staining has never been performed before.
(2) The authors have an association between rG4-formation and age/disease progression. They also observe distribution dependency of this, which is great. However, this is still an association which does not allow the model to be supported. This is not something that can be fixed with an easy experiment and it is what it is, but my point is that the narrative of the manuscript should be more fair and reflect the fact that, although interesting, what the authors are observing is a simple correlation. They should still go ahead and propose a model for it, but they should be more balanced in the conclusion and not imply that this evidence is sufficient to demonstrate the proposed model. It is absolutely fine to refer to the literature and comment on the fact that similar observations have been reported and this is in line with those, but still this is not an ultimate demonstration.
Comments on current version:
The authors have now addressed my concerns.
We thank the reviewer for their support!
Reviewer #2 (Public review):
RNA guanine-rich G-quadruplexes (rG4s) are non-canonical higher order nucleic acid structures that can form under physiological conditions. Interestingly, cellular stress is positively correlated with rG4 induction.
In this study, the authors examined human hippocampal postmortem tissue for the formation of rG4s in aging and Alzheimer Disease (AD). rG4 immunostaining strongly increased in the hippocampus with both age and with AD severity. 21 cases were used in this study (age range 30-92).
This immunostaining co-localized with hyper-phosphorylated tau immunostaining in neurons. The BG4 staining levels were also impacted by APOE status. rG4 structure was previously found to drive tau aggregation. Based on these observations, the authors propose a model of neurodegeneration in which chronic rG4 formation drives proteostasis collapse.
This model is interesting, and would explain different observations (e.g., RNA is present in AD aggregates and rG4s can enhance protein oligomerization and tau aggregation).
Main issue from the previous round of review:
There is indeed a positive correlation between Braak stage severity and BG4 staining, but this correlation is relatively weak and borderline significant (R = 0.52, p value = 0.028). This is probably the main limitation of this study, which should be clearly acknowledged (together with a reminder that "correlation is not causality"). Related to this, there is no clear justification to exclude the four individuals in Fig 1d (without them R increases to 0.78). Please remove this statement. On the other hand, the difference based on APOE status is more striking.
Comments on current version:
The authors have made laudable efforts to address the criticisms I made in my evaluation of the original manuscript.
We thank the reviewer for their support!
Recommendations for the authors:
Reviewing Editor:
I would suggest two minor edits:
- The findings are correlative and descriptive, but the title implies functionality (A New Role for RNA G-quadruplexes in Aging and Alzheimer's Disease). I would suggest toning down this title.
- While I understand the limitations in performing additional biochemical experiments to validate the immunofluorescence study, I think this is worth mentioning as a limitation in the text.
We have made these two changes as requested, altering the title to remove the word Role that may imply more meaning than intended, and adding a line to the discussion on the need for future additional biochemical experiments.
Reviewer #1 (Recommendations for the authors):
Thanks for addressing the concerns raised.
We thank the reviewer for their support!
Reviewer #2 (Recommendations for the authors):
Minor point:
Related to the "correlation is not causality" remark I made in my evaluation of the original manuscript: the authors' answer is reasonable. Still, I would suggest to modify the abstract: "we propose a model of neurodegeneration in which chronic rG4 formation drives proteostasis collapse" => "we propose a model of neurodegeneration in which chronic rG4 formation is linked to proteostasis collapse"
All other remarks I made have been answered properly.
We thank the reviewer for their support! We have made the change exactly as requested by the reviewer.
www.biorxiv.org
Author response:
Reviewer #1 (Public review):
Summary:
The manuscript investigates lipid scrambling mechanisms across TMEM16 family members using coarse-grained molecular dynamics (MD) simulations. While the study presents a statistically rigorous analysis of lipid scrambling events across multiple structures and conformations, several critical issues undermine its novelty, impact, and alignment with experimental observations.
Critical issues:
(1) Lack of Novelty:
The phenomenon of lipid scrambling via an open hydrophilic groove is already well-established in the literature, including through atomistic MD simulations. The authors themselves acknowledge this fact in their introduction and discussion. By employing coarse-grained simulations, the study essentially reiterates previously known findings with limited additional mechanistic insight. The repeated observation of scrambling occurring predominantly via the groove does not offer significant advancement beyond prior work.
We agree with the reviewer’s statement regarding the lack of novelty when it comes to our observations of scrambling in the groove of open Ca²⁺-bound TMEM16 structures. However, we feel that the inclusion of closed structures in this study, which attempts to address the yet unanswered question of how scrambling by TMEM16s occurs in the absence of Ca²⁺, offers new observations for the field. In our study we specifically address to what extent the induced membrane deformation, which has been theorized to help lipids cross the bilayer especially in the absence of Ca²⁺, contributes to the rate of scrambling (see references 36, 59, and 66). There are also several TMEM16F structures solved under activating conditions (bound to Ca²⁺ and in the presence of PIP2) which feature structural rearrangements to TM6 that may be indicative of an open state (PDB 6P48) and had not been tested in simulations. We show that these structures do not scramble and thereby present evidence against an out-of-the-groove scrambling mechanism for these states. Although we find a handful of examples of lipids being scrambled by Ca²⁺-free structures of TMEM16 scramblases, none of our simulations suggest that these events are related to the degree of deformation.
(2) Redundancy Across Systems:
The manuscript explores multiple TMEM16 family members in activating and non-activating conformations, but the conclusions remain largely confirmatory. The extensive dataset generated through coarse-grained MD simulations primarily reinforces established mechanistic models rather than uncovering fundamentally new insights. The effort, while statistically robust, feels excessive given the incremental nature of the findings.
Again, we agree with the reviewer’s statement that our results largely confirm those published by other groups and our own. We think there is, however, value in comparing the scrambling competence of these TMEM16 structures in a consistent manner in a single study, to reduce inconsistencies that may be introduced by the different simulation methods, parameters, and environmental variables (such as lipid composition) used in other published works on single family members. The consistency across our simulations and high number of observed scrambling events have allowed us to confirm that the mechanism of scrambling is shared by multiple family members and relies most obviously on groove dilation.
(3) Discrepancy with Experimental Observations:
The use of coarse-grained simulations introduces inherent limitations in accurately representing lipid scrambling dynamics at the atomistic level. Experimental studies have highlighted nuances in lipid permeation that are not fully captured by coarse-grained models. This discrepancy raises questions about the biological relevance of the reported scrambling events, especially those occurring outside the canonical groove.
We thank the reviewer for bringing up the possible inaccuracies introduced by coarse graining our simulations. This is also a concern for us, and we address this issue extensively in our discussion. As the reviewer pointed out above, our CG simulations have largely confirmed existing evidence in the field which we think speaks well to the transferability of observations from atomistic simulations to the coarse-grained level of detail. We have made both qualitative and quantitative comparisons between atomistic and coarse-grained simulations of nhTMEM16 and TMEM16F (Figure 1, Figure 4-figure supplement 1, Figure 4-figure supplement 5) showing the two methods give similar answers for where lipids interact with the protein, including outside of the canonical groove. We do not dispute the possible discrepancy between our simulations and experiment, but our goal is to share new nuanced ideas for the predicted TMEM16 scrambling mechanism that we hope will be tested by future experimental studies.
(4) Alternative Scrambling Sites:
The manuscript reports scrambling events at the dimer-dimer interface as a novel mechanism. While this observation is intriguing, it is not explored in sufficient detail to establish its functional significance. Furthermore, the low frequency of these events (relative to groove-mediated scrambling) suggests they may be artifacts of the simulation model rather than biologically meaningful pathways.
We agree with the reviewer that our observed number of scrambling events in the dimer interface is too low to present it as strong evidence for it being the alternative mechanism for Ca²⁺-independent scrambling. This will require additional experiments and computational studies which we plan to do in future research. However, we are less certain that these are artifacts of the coarse-grained simulation system as we observed a similar event in an atomistic simulation of TMEM16F.
Conclusion:
Overall, while the study is technically sound and presents a large dataset of lipid scrambling events across multiple TMEM16 structures, it falls short in terms of novelty and mechanistic advancement. The findings are largely confirmatory and do not bridge the gap between coarse-grained simulations and experimental observations. Future efforts should focus on resolving these limitations, possibly through atomistic simulations or experimental validation of the alternative scrambling pathways.
Reviewer #2 (Public review):
Summary:
Stephens et al. present a comprehensive study of TMEM16-members via coarse-grained MD simulations (CGMD). They particularly focus on the scramblase ability of these proteins and aim to characterize the "energetics of scrambling". Through their simulations, the authors interestingly relate protein conformational states to the membrane's thickness and link those to the scrambling ability of TMEM members, measured as the trespassing tendency of lipids across leaflets. They validate their simulation with a direct qualitative comparison with Cryo-EM maps.
Strengths:
The study demonstrates an efficient use of CGMD simulations to explore lipid scrambling across various TMEM16 family members. By leveraging this approach, the authors are able to bypass some of the sampling limitations inherent in all-atom simulations, providing a more comprehensive and high-throughput analysis of lipid scrambling. Their comparison of different protein conformations, including open and closed groove states, presents a detailed exploration of how structural features influence scrambling activity, adding significant value to the field. A key contribution of this study is the finding that groove dilation plays a central role in lipid scrambling. The authors observe that for scrambling-competent TMEM16 structures, there is substantial membrane thinning and groove widening. The open Ca²⁺-bound nhTMEM16 structure (PDB ID 4WIS) was identified as the fastest scrambler in their simulations, with scrambling rates as high as 24.4 ± 5.2 events per μs. This structure also shows significant membrane thinning (up to 18 Å), which supports the hypothesis that groove dilation lowers the energetic barrier for lipid translocation, facilitating scrambling.
The study also establishes a correlation between structural features and scrambling competence, though analyses often lack statistical robustness and quantitative comparisons. The simulations differentiate between open and closed conformations of TMEM16 structures, with open-groove structures exhibiting increased scrambling activity, while closed-groove structures do not. This finding aligns with previous research suggesting that the structural dynamics of the groove are critical for scrambling. Furthermore, the authors explore how the physical dimensions of the groove qualitatively correlate with observed scrambling rates. For example, TMEM16K induces increased membrane thinning in its open form, suggesting that membrane properties, along with structural features, play a role in modulating scrambling activity.
Another significant finding is the concept of "out-of-the-groove" scrambling, where lipid translocation occurs outside the protein's groove. This observation introduces the possibility of alternate scrambling mechanisms that do not follow the traditional "credit-card model" of groove-mediated lipid scrambling. In their simulations, the authors note that these out-of-the-groove events predominantly occur at the dimer interface between TM3 and TM10, especially in mammalian TMEM16 structures. While these events were not observed in fungal TMEM16s, they may provide insight into Ca²⁺-independent scrambling mechanisms, as they do not require groove opening.
Weaknesses:
A significant challenge of the study is the discrepancy between the scrambling rates observed in CGMD simulations and those reported experimentally. Despite the authors' claim that the rates are in line with experiment, the observed differences can mean large energetic discrepancies in describing scrambling (larger than a 1 kT barrier in reality). For instance, the authors report scrambling rates of 10.7 events per μs for TMEM16F and 24.4 events per μs for nhTMEM16, which are several orders of magnitude faster than experimental rates. While the authors suggest that this discrepancy could be due to the Martini 3 force field's faster diffusion dynamics, this explanation does not fully account for the large difference in rates. A more thorough discussion of how the choice of force field and simulation parameters influences the results, and how these discrepancies can be reconciled with experimental data, would strengthen the conclusions. Likewise, rate calculations in the study are based on 10 μs simulations, while experimental scrambling occurs over seconds. This timescale discrepancy limits the study's accuracy, as the simulations may not capture rare or slow scrambling events that are observed experimentally and therefore might underestimate the kinetics of scrambling. It is, however, important to recognize that it is hard (borderline unachievable) to pinpoint reasonable kinetics for systems like this using the currently available computational power and force field accuracy. The faster diffusion in simulations may lead to overestimated scrambling rates, making the simulation results less comparable to real-world observations. I would therefore read the findings qualitatively rather than quantitatively. An interesting observation is the asymmetry observed in the scrambling rates of the two monomers.
Since MARTINI is known to be limited in correctly sampling protein dynamics, the authors, in order to preserve the fold, have applied a strong (500 kJ mol⁻¹ nm⁻²) elastic network. However, I am wondering how the ENM applies across the dimer and if any asymmetry can be noticed in the application of restraints for each monomer and at the dimer interface. How can this have potentially biased the asymmetry in the scrambling rates observed between the monomers? Is this artificially obtained from restraining the initial structure, or is the asymmetry somehow gatekeeping the scrambling mechanism to occur mainly across a single monomer? Answering this question would have far-reaching implications to better describe the mechanism of scrambling.
The main aim of our computational survey was to directly compare all relevant published TMEM16 structures in both open and closed states using the Martini 3 CGMD force field. Our standardized simulation and analysis protocol allowed us to quantitatively compare scrambling rates across the TMEM16 family, something that has never been done before. We do acknowledge that direct comparison between simulated and experimental scrambling rates is complicated and that the rates are best interpreted qualitatively. In line with other reports (e.g., Li et al, PNAS 2024), lipid scrambling in CGMD is 2-3 orders of magnitude faster than typical experimental findings. In the CG simulation field, these increased dynamics due to the smoother energy landscape are a well-known phenomenon. In our view, this is a valuable trade-off for being able to capture statistically robust scrambling dynamics and gain mechanistic understanding in the first place, since these are currently challenging to obtain otherwise. For example, with all-atom MD it would have been near-impossible to conclude that groove openness and high scrambling rates are closely related, simply because one would only measure a handful of scrambling events in (at most) a handful of structures.
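To give a sense of scale for the rate discrepancy discussed here, a speed-up factor can be translated into an apparent barrier difference. The sketch below is a back-of-the-envelope illustration, not an analysis from the manuscript. It assumes simple Arrhenius-like kinetics with identical prefactors, so that a rate ratio r corresponds to a barrier shift of ln(r) in units of kT. In practice, part of the coarse-grained speed-up also comes from faster diffusion, which this simplification ignores. The function name is hypothetical.

```python
import math


def barrier_shift_in_kT(rate_ratio: float) -> float:
    """Apparent barrier difference, in units of kT, implied by a rate ratio.

    Assumption (illustrative only): Arrhenius-like kinetics,
    rate ~ A * exp(-barrier / kT), with the same prefactor A for both rates.
    """
    return math.log(rate_ratio)


# A speed-up of 2 or 3 orders of magnitude, as quoted for CGMD vs experiment.
for orders_of_magnitude in (2, 3):
    shift = barrier_shift_in_kT(10.0 ** orders_of_magnitude)
    print(f"10^{orders_of_magnitude} faster -> about {shift:.1f} kT lower barrier")
```

Under these assumptions a speed-up of 2-3 orders of magnitude corresponds to several kT, which is consistent with treating the simulated rates as a relative ranking between structures rather than as absolute kinetics.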
Considering the elastic network: the reviewer is correct in that the elastic network restrains the overall structure to the experimental conformation. This is necessary because the Martini 3 force field does not accurately model changes in secondary (and tertiary) structure. In fact, by retaining the structural information from the experimental structures, we argue that the elastic network helped us arrive at the conclusion that groove openness is the major contributing factor in determining a protein’s scrambling rate. This is best exemplified by the asymmetric X-ray structure of TMEM16K (5OC9), in which the groove of one subunit is more dilated than the other. In our simulation, this information was stored in the elastic network, yielding a 4x higher rate in the open groove than in the closed groove, within the same trajectory.
Notably, the manuscript does not explore the impact of membrane composition on scrambling rates. While the authors use a specific lipid composition (DOPC) in their simulations, they acknowledge that membrane composition can influence scrambling activity. However, the study does not explore how different lipids or membrane environments or varying membrane curvature and tension, could alter scrambling behaviour. I appreciate that this might have been beyond the scope of this particular paper and the authors plan to further chase these questions, as this work sets a strong protocol for this study. Contextualizing scrambling in the context of membrane composition is particularly relevant since the authors note that TMEM16K's scrambling rate increases tenfold in thinner membranes, suggesting that lipid-specific or membrane-thickness-dependent effects could play a role.
Considering different membrane compositions: for this study, we chose to keep the membranes as simple as possible. We opted for pure DOPC membranes because DOPC (1) has negligible intrinsic curvature, (2) forms fluid membranes, and (3) was used previously by others (Li et al, PNAS 2024). As mentioned by the reviewer, we believe our current study defines a good standardized protocol and solid baseline for future efforts looking into the additional effects of membrane composition, tension, and curvature that could all affect TMEM16-mediated lipid scrambling.
Reviewer #3 (Public review):
Strengths:
The strength of this study emerges from a comparative analysis of multiple structural starting points and understanding global/local motions of the protein with respect to lipid movement. Although the protein is well-studied, both experimentally and computationally, the understanding of conformational events in different family members, especially membrane thickness less compared to fungal scramblases offers good insights.
We appreciate the reviewer recognizing the value of the comparative study. In addition to valuable insights from previous experimental and computational work, we hope to put forward a unifying framework that highlights various TMEM16 structural features and membrane properties that underlie scrambling function.
Weaknesses:
The weakness of the work is to fully reconcile with experimental evidence of Ca²⁺-independent scrambling rates observed in prior studies, but this part is also challenging using coarse-grain molecular simulations. Previous reports have identified lipid crossing, packing defects, and other associated events, so it is difficult to place this paper in that context. However, the absence of validation leaves certain claims, like alternative scrambling pathways, speculative.
It is generally difficult to quantitatively compare bulk measurements of scrambling phenomena with simulation results. The advantage of simulations is to directly observe the transient scrambling events at a spatial and temporal resolution that is currently unattainable for experiments. The current experimental evidence for the precise mechanism of Ca²⁺-independent scrambling is still under debate. We therefore hope to leverage the strength of MD and statistical rigor of coarse-grained simulations to generate testable hypotheses for further structural, biochemical, and computational studies.
www.biorxiv.org
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
This experiment sought to determine what effect congenital/early-onset hearing loss (and associated delay in language onset) has on the degree of inter-individual variability in functional connectivity to the auditory cortex. Looking at differences in variability rather than group differences in mean connectivity itself represents an interesting addition to the existing literature. The sample of deaf individuals was large, and quite homogeneous in terms of age of hearing loss onset, which are considerable strengths of the work. The experiment appears well conducted and the results are certainly of interest.
Thank you for your positive and thoughtful feedback.
Reviewer #3 (Public review):
Summary:
This study focuses on changes in brain organization associated with congenital deafness. The authors investigate differences in functional connectivity (FC) and differences in the variability of FC. By comparing congenitally deaf individuals to individuals with normal hearing, and by further separating congenitally deaf individuals into groups of early and late signers, the authors can distinguish between changes in FC due to auditory deprivation and changes in FC due to late language acquisition. They find larger FC variability in deaf than normal-hearing individuals in temporal, frontal, parietal, and midline brain structures, and that FC variability is largely driven by auditory deprivation. They suggest that the regions that show a greater FC difference between groups also show greater FC variability.
Strengths:
The manuscript is well-written, and the methods are clearly described and appropriate. Including the three different groups enables the critical contrasts distinguishing between different causes of FC variability changes. The results are interesting and novel.
Weaknesses:
Analyses were conducted for task-based data rather than resting-state data. The authors report behavioral differences between groups and include behavioral performance as a nuisance regressor in their analysis. This is a good approach to account for behavioral task differences, given the data. Nevertheless, additional work using resting-state functional connectivity could remove the potential confound fully.
The authors have addressed my concerns well.
Thank you for your thoughtful feedback. We appreciate your acknowledgment of the strengths of our study and the approaches taken to address potential confounds. As noted, we discuss the limitation of not including resting-state data in the manuscript, and we agree that this represents an important avenue for future research. We hope to address this question in future studies.
www.biorxiv.org
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Summary:
The paper proposes that the placement of criteria for determining whether a stimulus is 'seen' or 'unseen' can significantly impact the validity of neural measures of consciousness. The authors found that conservative criteria, which require stronger evidence to classify a stimulus as 'seen,' tend to inflate effect sizes in neural measures, making conscious processing appear more pronounced than it is. Conversely, liberal criteria, which require less evidence, reduce these effect sizes, potentially underestimating conscious processing. This variability in effect sizes due to criterion placement can lead to misleading conclusions about the nature of conscious and unconscious processing.
Furthermore, the study highlights that the Perceptual Awareness Scale (PAS), a commonly used tool in consciousness research, does not effectively mitigate these criterion-related confounds. This means that even with PAS, the validity of neural measures can still be compromised by how criteria are set. The authors emphasize the need for careful consideration and standardization of criterion placement in experimental designs to ensure that neural measures accurately reflect the underlying cognitive processes. By addressing this issue, the paper aims to improve the reliability and validity of findings in the field of consciousness research.
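The criterion effect described in this summary can be illustrated with a minimal sketch under an equal-variance Gaussian signal detection model. This is not the authors' simulation: the function name, the d' value and the two criterion values are assumptions chosen for illustration. The sketch computes the mean internal evidence on target-present trials after post hoc sorting into 'seen' and 'unseen', using the closed-form mean of a truncated normal distribution.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def mean_evidence_by_report(d_prime: float, criterion: float) -> tuple:
    """Mean internal evidence on target-present trials, split by report.

    Assumes evidence on target trials ~ Normal(d_prime, 1) and a 'seen'
    report whenever evidence exceeds `criterion` (hypothetical names).
    Returns (mean for 'seen' trials, mean for 'unseen' trials).
    """
    z = criterion - d_prime            # criterion relative to the target mean
    p_unseen = STD_NORMAL.cdf(z)       # miss rate
    p_seen = 1.0 - p_unseen            # hit rate
    density = STD_NORMAL.pdf(z)
    seen_mean = d_prime + density / p_seen      # truncated-normal mean above the criterion
    unseen_mean = d_prime - density / p_unseen  # truncated-normal mean below the criterion
    return seen_mean, unseen_mean


# Illustrative values only: same sensitivity, two criterion placements.
liberal = mean_evidence_by_report(d_prime=1.0, criterion=0.0)
conservative = mean_evidence_by_report(d_prime=1.0, criterion=1.5)
print("liberal      (seen, unseen):", liberal)
print("conservative (seen, unseen):", conservative)
```

With sensitivity held constant, both the 'seen' and the 'unseen' means are higher under the conservative criterion, while the target-absent distribution stays centered on zero. Mean evidence is only a stand-in here for a neural effect size; how it maps onto decoding accuracy is a separate question that this sketch does not address.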
Strengths:
(1) This research provides a fresh perspective on how criterion placement can significantly impact the validity of neural measures in consciousness research.
(2) The study employs robust simulations and EEG experiments to demonstrate the effects of criterion placement, ensuring that the findings are well-supported by empirical evidence.
(3) By highlighting the limitations of the PAS and the impact of criterion placement, the study offers practical recommendations for improving experimental designs in consciousness research.
Weaknesses:
The study focuses primarily on the PAS, a commonly used tool, but other measures of consciousness were not evaluated and might be subject to similar or different criterion limitations. A simulation could be applied to these metrics to show how generalizable the conclusions of the study are.
We would like to thank reviewer 1 for their positive words and for taking the time to evaluate our manuscript. We agree that it would be important to gauge generalization to other metrics of consciousness. Note, however, that the most commonly used alternative methods are postdecision wagering and confidence, both of which are known to behave quite similarly to the PAS (Sandberg, Timmermans, Overgaard & Cleeremans, 2010). Indeed, we have confirmed in other work that confidence is also sensitive to criterion shifts (see https://osf.io/preprints/psyarxiv/xa4fj). Although it has been claimed that confidence-derived aggregate metrics like meta-d’ or metacognitive efficiency may overcome criterion shifts, it would require empirical data rather than simulation to settle whether this is true or not (also see the discussion in https://osf.io/preprints/psyarxiv/xa4fj). Furthermore, out of these metrics, the PAS seems to be the preferred one amongst consciousness researchers (see figure 4 in Francken, Beerendonk, Molenaar, Fahrenfort, Kiverstein, Seth & van Gaal, 2022; as well as https://osf.io/preprints/psyarxiv/bkxzh). Thus, because other metrics are either expected to behave in similar ways or would require more empirical work to determine along which dimension(s) criterion shifts operate, we see no clear path to implementing the suggested simulations. We anticipate that doing so would require a considerable amount of additional work that we believe would better suit a future project. We would of course be open to doing this if the reviewer has more specific suggestions for how to go about the proposed simulations.
Reviewer #2 (Public review):
Summary:
The study investigates the potential influence of the response criterion on neural decoding accuracy for conscious and unconscious processing, using either simulated data or a reanalysis of experimental data with post hoc sorting.
Strengths:
When comparing the neural decoding performance of Target versus NonTarget with or without post-hoc sorting based on subject reports, it is evident that response criterion can influence the results. This was observed in simulated data as well as in two experiments that manipulated the subject response criterion to be either more liberal or more conservative. One experiment involved a two-level response (seen vs unseen), while the other included a more detailed four-level response (ranging from 0 for no experience to 3 for a clear experience). The findings consistently indicated that adopting a more conservative response criterion could enhance neural decoding performance, whether in conscious or unconscious states, depending on the sensitivity or overall response threshold.
Weaknesses:
(1) The response criterion plays a crucial role in influencing neural decoding because a subject's report may not always align with the actual stimulus presented. This discrepancy can occur in cases of false alarms, where a subject reports seeing a target that was not actually there, or in cases where a target is present but not reported. Some may argue that only using data from consistent trials (those with correct responses) would not be affected by the response criterion. However, the authors' analysis suggests that a conservative response criterion not only reduces false alarms but also impacts hit rates. It is important for the authors to further investigate how the response criterion affects neural decoding even when considering only correct trials.
We would like to thank reviewer 2 for taking the time to evaluate our manuscript. We appreciate the suggestion to investigate neural decoding on only correct trials. What we in fact did is consider target trials that are 'correct' (hits = seen target present trials) and 'incorrect' (misses = unseen target present trials) separately, see Figure 4A and Figure 4B. This shows that the response criterion also affects the neural measure of consciousness when only considering correct target present trials. Note, however, that one cannot decode 'unseen' (target present) trials if one only aims to decode 'correct' trials, because those are all incorrect by definition. We did not analyze false alarms (these would be the 'seen' trials on the noise distribution of Figure 1A), as there were not enough trials of those, especially in the conservative condition (see Figure 2C and 2D), making comparisons between conservative and liberal impossible. However, the predictions for false alarms are pretty straightforward, and follow directly from the framework in Figure 1.
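For readers less familiar with the trial categories named above, the following minimal sketch shows how hits, misses, false alarms and correct rejections map onto sensitivity (d') and criterion (c) in a standard equal-variance signal detection analysis. The function name and the trial counts are hypothetical and are not taken from the manuscript.

```python
from statistics import NormalDist


def sdt_measures(hits: int, misses: int, false_alarms: int, correct_rejections: int):
    """Return (d_prime, criterion) under an equal-variance Gaussian model.

    Assumes hit and false-alarm rates lie strictly between 0 and 1;
    extreme rates would need a correction that this sketch leaves out.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # positive = conservative
    return d_prime, criterion


# Hypothetical counts, chosen only for illustration.
liberal_d, liberal_c = sdt_measures(hits=80, misses=20, false_alarms=30, correct_rejections=70)
conservative_d, conservative_c = sdt_measures(hits=50, misses=50, false_alarms=5, correct_rejections=95)
print("liberal:      d' =", liberal_d, " c =", liberal_c)
print("conservative: d' =", conservative_d, " c =", conservative_c)
```

In this toy example the conservative observer has far fewer false alarms, which matches the point that too few such trials remain for a separate analysis.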
(2) The author has utilized decoding target vs. nontarget as the neural measures of unconscious and/or conscious processing. However, it is important to note that this is just one of the many neural measures used in the field. There are an increasing number of studies that focus on decoding the conscious content, such as target location or target category. If the author were to include results on decoding target orientation and how it may be influenced by response criterion, the field would greatly benefit from this paper.
We thank the reviewer for the suggestion to decode orientation of the target. In our experiments, the target itself does not have an orientation, but the texture of which it is composed does. We used four orientations, which were balanced out within and across conditions such that presence-absence decoding is never driven by orientation, but rather by texture-based figure-ground segregation (for similar logic, see for example Fahrenfort et al., 2007; 2008, etc.). There are a couple of things to consider when wanting to apply a decoding analysis on the orientation of these textures:
(1) Our behavioral task was only on the presence or absence of the target, not on the orientation of the textures. This makes it impossible to draw any conclusions about the visibility of the orientation of the textures. Put differently: based on behavior there is no way of identifying seen or unseen orientations, correctly or incorrectly identified orientations, etc. For example, it is easy to envision that an observer detects a target without knowing the orientation that defines it, or vice versa a situation in which an observer does not detect the target while still being aware of the orientation of a texture in the image (either of the figure, or of the background). The fact that we have no behavioral response to the orientation of the textures severely limits the usefulness of a hypothetical decoding effect on these orientations, as such results would be uninterpretable with respect to the relevant dimension in this experiment, which is visibility.
(2) This problem is further exacerbated by the fact that the orientation of the background is always orthogonal to the orientation of the target. Therefore, one would not only be decoding the orientation of the texture that constitutes the target itself, but also the texture that constitutes the background. Given that we also have no behavioral metric of how/whether the orientation of the background is perceived, it is similarly unclear how one would interpret any observed effect.
(3) Finally, it is important to note that, even though categorization/content is sometimes used as an auxiliary measure in consciousness research (often as a way to assay objective performance), consciousness is most commonly conceptualized on the presence-absence dimension. A clear illustration of this is the concept of blindsight. Blindsight is the ability of observers to discriminate stimuli (i.e. identify content) without being able to detect them. Blindsight is often considered the bedrock of the cognitive neuroscience of consciousness as it acts as proof that one can dissociate between unconscious processing (the categorization of a stimulus, i.e. the content) and conscious processing of that stimulus (i.e. the ability to detect it).
Given the above, we do not see how the suggested analysis could contribute to the conclusions that the manuscript already establishes. We hope that, given the above, the reviewer agrees with this assessment.
Reviewer #3 (Public review):
Summary:
Fahrenfort et al. investigate how liberal or conservative criterion placement in a detection task affects the construct validity of neural measures of unconscious cognition and conscious processing. Participants identified instances of "seen" or "unseen" in a detection task, a method known as post hoc sorting. Simulation data convincingly demonstrate that, counterintuitively, a conservative criterion inflates effect sizes of neural measures compared to a liberal criterion. While the impact of criterion shifts on effect size is suggested by signal detection theory, this study is the first to address this explicitly within the consciousness literature. Decoding analysis of data from two EEG experiments further shows that different criteria lead to differential effects on classifier performance in post hoc sorting. The findings underscore the pervasive influence of experimental design and participants' reports on neural measures of consciousness, revealing that criterion placement poses a critical challenge for researchers.
Strengths and Weaknesses:
One of the strengths of this study is the inclusion of the Perceptual Awareness Scale (PAS), which allows participants to provide more nuanced responses regarding their perceptual experiences. This approach ensures that responses at the lowest awareness level (selection 0) are made only when trials are genuinely unseen. This methodological choice is important as it helps prevent the overestimation of unconscious processing, enhancing the validity of the findings.
A potential area for improvement in this study is the use of single time-points from peak decoding accuracy to generate current source density topography maps. While we recognize that the decoding analysis employed here differs from traditional ERP approaches, the robustness of the findings could be enhanced by exploring current source density over relevant time windows. Event-related peaks, both in terms of timing and amplitude, can sometimes be influenced by noise or variability in trial-averaged EEG data, and a time-window analysis might provide a more comprehensive and stable representation of the underlying neural dynamics.
We thank reviewer 3 for their positive words and for taking the time to evaluate our manuscript. If we understand the reviewer correctly, he/she suggests that the signal-to-noise ratio could be improved by averaging over time windows rather than taking the values at singular peaks in time. Before addressing this suggestion, we would like to point out that we plotted the relevant effects across time in Supplementary Figure S1A and S1B. These show that the observed effects were not limited in time, i.e. occurring only around the peaks, but that they occurred consistently throughout the time course of the trial. In line with this observation, one might argue that the results could be improved further by averaging across windows of interest rather than taking the peak moments alone, as the reviewer suggests. Although this might be true, there are many analysis choices that one can make, each of which could have a positive (or negative) effect on the signal-to-noise ratio. For example, when taking a window of interest, one is faced with a new choice to make, this time regarding the number of consecutive samples to average across (i.e. the size of the window), etc. More generally, there is a long list of choices that may affect the precise outcome of analyses, either positively or negatively. Having analyzed the data in one way, the problem with adding new analysis approaches is that there is no objective criterion for deciding which analysis would be ‘best’, other than looking at the outcome of the statistical analyses themselves. Doing this would constitute an explorative double-dipping-like approach to analyzing the results, which, aside from potentially increasing the signal-to-noise ratio, is likely to also result in an increase of the type I error rate.
In the past, when the first author of this manuscript has attempted to minimize the number of statistical tests, he has lowered the number of EEG time points by simply taking the peaks (for example see https://doi.org/10.1073/pnas.1617268114), and that is the approach that was taken here as well. Given the above, we prefer not to further ‘try out’ additional analytical approaches on this dataset, simply to improve the results. We hope the reviewer sympathizes with our position that it is methodologically most sound to stick to the analyses we have already performed and reported, without further exploration.
It is helpful that the authors show the standard error of the mean for the classifier performance over time. A similar indication of a measure of variance in other figures could improve clarity and transparency.
That said, the paper appears solid regarding technical issues overall. The authors also do a commendable job in the discussion by addressing alternative paradigms, such as wagering paradigms, as a possible remedy to the criterion problem (Peters & Lau, 2015; Dienes & Seth, 2010). Their consideration of these alternatives provides a balanced view and strengthens the overall discussion.
We thank the reviewer for this suggestion. Note that we already have a measure of variance in the other figures too, namely the connected data points of individual participants. Showing individual data points as a visualization of variance is preferred by many journals (e.g., see https://www.nature.com/documents/cr-gta.pdf), and also shows the spread of relevant differences when paired points are connected. For example, in Figures 2, 3 and 4, the relevant difference is between the liberal and conservative condition. When wanting to show the spread of the differences between these conditions, one option would be to first subtract the two measures in a pairwise fashion (e.g., liberal-conservative), and then plot the spread of those differences using some metric (e.g. standard error/CI of the mean difference). However, this has the disadvantage of no longer separately showing the raw scores on the conditions that are being compared. Showing conditions separately provides clarity to the reader about what is being compared to what. The most common approach to visualizing the variance of the relevant difference in such cases is to plot the connected individual data points of all participants in the same plot. The uniformity of the slope of these lines in such a visualization provides direct insight into the spread of the relevant difference. Plotting the standard error of the mean on the raw scores of the conditions in these plots would not help, because this would not visualize the spread of the relevant difference (liberal-conservative). We therefore opted in the manuscript to show the mean scores on the conditions that we compare, while also showing the connected raw data points of individual participants in the same plot. One might argue that we should then use that same visualization in Figure 3A, but note that this figure is merely intended to identify the peaks, i.e. it does not compare liberal to conservative.
Furthermore, plotting the decoding time lines of individual participants would greatly diminish the clarity of this figure. Given our explanation, we hope the reviewer agrees with the approach that we chose, although we are of course open to modifying the figures if the reviewer has a suggestion for doing so while taking into account the points we raise here in our response.
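The argument about raw-score error bars versus the spread of the paired difference can be made concrete with a minimal sketch. The per-participant scores below are invented purely for illustration and do not come from the manuscript. The sketch compares the standard error of each condition's raw scores with the standard error of the pairwise difference.

```python
from math import sqrt
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean (sample standard deviation / sqrt(n))."""
    return stdev(values) / sqrt(len(values))


# Hypothetical per-participant scores, invented for illustration only.
liberal = [0.55, 0.60, 0.65, 0.70, 0.75]
conservative = [0.58, 0.64, 0.68, 0.74, 0.78]

# Pairwise subtraction removes the between-participant spread.
differences = [c - l for c, l in zip(conservative, liberal)]

print("SEM liberal:      ", sem(liberal))
print("SEM conservative: ", sem(conservative))
print("mean difference:  ", mean(differences))
print("SEM of difference:", sem(differences))
```

When participants differ widely in overall level but shift by a similar amount between conditions, the raw-score standard errors are large while the standard error of the difference is small. Connected individual data points convey the same information through the uniformity of the slopes.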
Impact of the Work:
This study effectively demonstrates a phenomenon that has been largely unexplored within the consciousness literature. Subjective measures may not reliably capture the construct they aim to measure due to criterion confounds. Future research on neural measures of consciousness should account for this issue, and no-report measures may be necessary until the criterion problem is resolved.
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
The authors could further elaborate on the results of the PAS to provide a clearer insight into the impact of response criteria, which is notably more complex than in the other experiments. Specifically, the results demonstrate that the conservative response criterion condition displays considerably higher sensitivity than the liberal response criterion condition. It would be interesting to explore whether this shift in sensitivity suggests a correlation between changes in response criteria and conscious experiences, and how the interaction between sensitivity and response criteria can affect the neural measure of consciousness.
We thank the reviewer for this suggestion. Note that the change in sensitivity that we observed is minor compared to the change we observed in response criterion (Hedges' g for criterion in exp 2 = 2.02, compared to Hedges' g for sensitivity/d’ in exp 2 = 0.42). However, we do investigate the effect of sensitivity (disregarding response criterion) on decoding accuracy. To this end we devised Figure 3C (for the full decoding time course see Supplementary Figure S1B). These figures show that the small behavioral sensitivity effects observed in both experiments (Hedges' g for sensitivity in exp 1 = 0.30, exp 2 = 0.42) did not translate into significant decoding differences between conservative and liberal in either experiment. This comes as no surprise given the small corresponding behavioral effects. Note that small sensitivity differences between liberal and conservative conditions are commonplace, plausibly driven by the fact that being liberal also involves being more noisy in one’s response tendencies (i.e. sometimes randomly indicating presence). Further, the reviewer suggests that we might correlate changes in response criteria to changes in conscious experience. The only relevant metric of conscious experience for which we have data in this manuscript is the Perceptual Awareness Scale (PAS), so we assume the reviewer asks for a correlation between experimentally induced changes in response criterion and the equivalent changes in d’. To this end we computed the difference in the PAS-based d’ metric between conservative and liberal, as well as the difference in the PAS-based criterion metric between conservative and liberal, and correlated these across subjects (N=26) using a Spearman rank correlation. The result shows that these metrics do not correlate (r(24)=0.04, p=0.85). Note, however, that small-N correlations like these are only somewhat reliable for large effect sizes.
With an N of 26 and a power of only 80%, an effect size of at least r=0.5 would be required for a correlation to be detectable, so even if a correlation were to exist we may not have had enough power to detect it. Due to these caveats we opted not to report this null correlation in the manuscript, but we are of course willing to do so if the reviewer and/or editor disagrees with this assessment.
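The power statement above can be checked approximately with a minimal sketch based on the Fisher z transformation. The function name and the two-sided alpha of 0.05 are assumptions, since the response does not state how the figure was derived. The approximation is designed for Pearson correlations and serves only as a rough guide for a Spearman rank correlation.

```python
from math import sqrt, tanh
from statistics import NormalDist


def min_detectable_r(n: int, alpha: float = 0.05, power: float = 0.80) -> float:
    """Smallest correlation detectable with the given power (two-sided test).

    Uses the Fisher z approximation, in which atanh(r) has standard
    error 1 / sqrt(n - 3). Alpha and the two-sided test are assumptions.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z(power)          # quantile corresponding to the desired power
    return tanh((z_alpha + z_power) / sqrt(n - 3))


print(min_detectable_r(26))
```

Under these assumptions the minimum detectable correlation for N=26 comes out slightly above 0.5, in line with the figure quoted in the response.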
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
Tubert C. et al. investigated the role of dopamine D5 receptors (D5R) and their downstream potassium channel, Kv1, in the striatal cholinergic neuron pause response induced by thalamic excitatory input. Using slice electrophysiological analysis combined with pharmacological approaches, the authors tested which receptors and channels contribute to the cholinergic interneuron pause response in both control and dyskinetic mice (in the L-DOPA off state). They found that activation of Kv1 was necessary for the pause response, while activation of D5R blocked the pause response in control mice. Furthermore, in the L-DOPA off state of dyskinetic mice, the absent pause response was restored by the application of clozapine. The authors claimed that (1) the D5R-Kv1 pathway contributes to the cholinergic interneuron pause response in a phasic dopamine concentration-dependent manner, and (2) clozapine inhibits D5R in the L-DOPA off state, which restores the pause response.
Strengths:
The electrophysiological and pharmacological approaches used in this study are powerful tools for testing channel properties and functions. The authors' group has firmly established these methodologies and analysis pipelines. Indeed, the data presented were robust and reliable.
Thank you for your comments.
Weaknesses:
Although the paper has strengths in its methodological approaches, there is a significant gap between the presented data and the authors' claims.
There was no direct demonstration that the D5R-Kv1 pathway is dominant when dopamine levels are high. The term 'high' is ambiguous, and it raises the question of whether the authors believe that dopamine levels do not reach the threshold required to activate D5R under physiological conditions.
We acknowledge that further work is necessary to clarify the role of the D5R in physiological conditions. While we haven’t found effects of the D1/D5 receptor antagonist SCH23390 on the pause response in control animals (Fig. 3), it is still possible that dopamine levels reach the threshold to stimulate D5R when burst firing of dopaminergic neurons contributes to dopamine release. We believe the pause response depends, among other factors, on the relative stimulation levels of SCIN D2 and D5 receptors, which is likely not an all-or-nothing phenomenon. To reduce ambiguity, we have eliminated the labels referring to dopamine levels in Figure 6F.
Furthermore, the data presented in Figure 6 are confusing. If clozapine inhibits active D5R and restores the pause response, the D5R antagonist SCH23390 should have the same effect. The data suggest that clozapine-induced restoration of the pause response might be mediated by other receptors, rather than D5R alone.
Thank you for letting us clarify this issue. Please note that the levels of endogenous dopamine 24 h after the last L-DOPA challenge in severe parkinsonian mice are expected to be very low. In the absence of an agonist, a pure D1/D5 antagonist would not exert an effect, as demonstrated with SCH23390 alone, which did not have an impact on the SCIN response to thalamic stimulation (Fig. 6). While clozapine can also act as a D1/D5 receptor antagonist, its D1/D5 effects in the absence of an agonist are attributed to its inverse agonist properties (PMID: 24931197). Notably, SCH23390 prevented the effect of clozapine, allowing us to conclude that ligand-independent D1/D5 receptor-mediated mechanisms are involved in suppressing the pause response in dyskinetic mice. We have now made this clearer in the third paragraph of the Discussion.
Reviewer #2 (Public review):
Summary:
This manuscript by Tubert et al. presents the role of the D5 receptor in modulating the striatal cholinergic interneuron (CIN) pause response through D5R-cAMP-Kv1 inhibitory signaling. Their model elucidates the on/off switch of the CIN pause, likely due to the different DA affinity between D2R and D5R. This machinery may be crucial in modulating synaptic plasticity in cortical-striatal circuits during motor learning and execution. Furthermore, the study bridges their previous finding of CIN hyperexcitability (Paz et al., Movement Disorders 2022) with the loss of pause response in LID mice.
Strengths:
The study had solid findings, and the writing was logically structured and easy to follow. The experiments are well-designed, and they properly combined electrophysiology recording, optogenetics, and pharmacological treatment to dissect/rule out most, if not all, possible mechanisms in their model.
Thank you for your comments.
Weaknesses:
The manuscript is overall satisfying with only some minor concerns that need to be addressed. Manipulation of intracellular cAMP (e.g. using pharmacological analogs or inhibitors) can add additional evidence to strengthen the conclusion.
Thank you for the suggestion. While we acknowledge that we are not providing direct evidence of the role of cAMP, we chose not to conduct these experiments because cAMP levels influence several intrinsic and synaptic currents beyond Kv1, significantly affecting membrane oscillations and spontaneous firing, as shown in Paz et al. 2021. However, we are modifying the fourth paragraph of the Discussion so there is no misinterpretation about our findings in the current work.
Reviewer #3 (Public review):
Summary:
Tubert et al. investigate the mechanisms underlying the pause response in striatal cholinergic interneurons (SCINs). The authors demonstrate that optogenetic activation of thalamic axons in the striatum induces burst activity in SCINs, followed by a brief pause in firing. They show that the duration of this pause correlates with the number of elicited action potentials, suggesting a burst-dependent pause mechanism. The authors demonstrate that this burst-dependent pause relies on Kv1 channels. The pause is blocked by SKF81297 and partially by sulpiride and mecamylamine, implicating D1/D5 receptor involvement. The study also shows that ZD7288 does not reduce the duration of the pause and that lesioning dopamine neurons abolishes this response, which can be restored by clozapine.
Weaknesses:
While this study presents an interesting mechanism for SCIN pausing after burst activity, there are several major concerns that should be addressed:
(1) Scope of the Mechanism:
It is important to clarify that the proposed mechanism may apply specifically to the pause in SCINs following burst activity. The manuscript does not provide clear evidence that this mechanism contributes to the pause response observed in behavioral animals. While the thalamus is crucial for SCIN pauses in behavioral contexts, the exact mechanism remains unclear. Activating thalamic input triggers burst activity in SCINs, leading to a subsequent pause, but this mechanism may not be generalizable across different scenarios. For instance, approximately half of TANs do not exhibit initial excitation but still pause during behavior, suggesting that the burst-dependent pause mechanism is unlikely to explain this phenomenon. Furthermore, in behavioral animals, the duration of the pause seems consistent, whereas the proposed mechanism suggests it depends on the prior burst, which is not aligned with in vivo observations. Additionally, many in vivo recordings show that the pause response is a reduction in firing rate, not complete silence, which the mechanism described here does not explain. Please address these in the manuscript.
Thank you for your valuable feedback. While the absence of an initial burst in some TANs in vivo may suggest the involvement of alternative or additional mechanisms, this does not exclude a participation of Kv1 currents. We have seen that subthreshold depolarizations induced by thalamic inputs are sufficient to produce an afterhyperpolarization (AHP) mediated by Kv1 channels (see Tubert et al., 2016, PMID: 27568555). Although such subthreshold depolarizations are not captured in current recordings from behaving animals, intracellular in vivo recordings have demonstrated an intrinsically generated AHP after subthreshold depolarization of SCIN caused by stimulation of excitatory afferents (PMID: 15525771). Additionally, when pause duration is plotted against the number of spikes elicited by thalamic input (Fig. 1G), we found that one elicited spike is followed by an interspike interval 1.4 times longer than the average spontaneous interspike interval. We acknowledge the potential involvement of additional factors, including a decrease of excitatory thalamic input coinciding with the pause, followed by a second volley of thalamic inputs (Fig. 1J-K, after observations by Matsumoto et al., 2001; PMID: 11160526), as well as the timing of elicited spikes relative to ongoing spontaneous firing (Fig. 1D-E). Dopaminergic modulation (Fig. 3) and differences among striatal regions (PMID: 24559678) may also contribute to the complexity of these dynamics.
(2) Terminology:
The use of "pause response" throughout the manuscript is misleading. The pause induced by thalamic input in brain slices is distinct from the pause observed in behavioral animals. Given the lack of a clear link between these two phenomena in the manuscript, it is essential to use more precise terminology throughout, including in the title, bullet points, and body of the manuscript.
While we acknowledge that our study does not include in vivo evidence, we believe ex vivo preparations have been instrumental in elucidating the mechanisms underlying the responses observed in vivo. We also agree with previous ex vivo studies in using consistent terminology. However, we will clarify the ex vivo nature of our work in the abstract and bullet points for greater transparency.
(3) Kv1 Blocker Specificity:
It is unclear how the authors ruled out the possibility that the Kv1 blocker did not act directly on SCINs. Could there be an indirect effect contributing to the burst-dependent pause? Clarification on this point would strengthen the interpretation of the results.
Thank you for letting us clarify this issue. In our previous work (Tubert et al., 2016) we showed that the Kv1.3 and Kv1.1 subunits are selectively expressed in SCIN throughout the striatum. Moreover, GABAergic transmission is blocked in our preparations. We are including a sentence to make this clearer in the manuscript (Results section, subheading “The pause response to thalamic stimulation requires activation of Kv1 channels”).
(4) Role of D1 Receptors:
While it is well-established that activating thalamic input to SCINs triggers dopamine release, contributing to SCIN pausing (as shown in Figure 3), it would be helpful to assess the extent to which D1 receptors contribute to this burst-dependent pause. This could be achieved by applying the D1 agonist SKF81297 after blocking nAChRs and D2 receptors.
Thank you for letting us clarify this point. We show that blocking D2R or nAChR reduces the pause only for strong thalamic stimulation eliciting 4 SCIN spikes (Figure 3G), whereas the D1/D5 agonist SKF81297 is able to reduce the pause induced by weaker stimulation as well (Figure 3C). In addition, the D1/D5 receptor antagonist SCH23390 does not modify the pause response (Figure 3C). This may indicate that nAChR-mediated dopamine release induced by thalamic-induced bursts more efficiently activates D2R compared to D5R. We speculate that, in this context, lack of D5R activation may be necessary to keep normal levels of Kv1.3 currents necessary for SCIN pauses.
(5) Clozapine's Mechanism of Action:
The restoration of the burst-dependent pause by clozapine following dopamine neuron lesioning is interesting, but clozapine acts on multiple receptors beyond D1 and D5.
Although it may be challenging to find a specific D5 antagonist or inverse agonist, it would be more accurate to state that clozapine restores the burst-dependent pause without conclusively attributing this effect to D5 receptors.
Thank you for your insightful observation. We acknowledge the difficulty of targeting dopamine receptors pharmacologically due to the lack of highly selective D1/D5 inverse agonists. We used SCH23390, which is a highly selective D1/D5 receptor antagonist devoid of inverse agonist effects, to block clozapine’s ability to restore SCIN pauses (Figure 6C). This indicates that the restoration of SCIN pauses by clozapine depends on D1/D5 receptors. Furthermore, in a previous study, we demonstrated that clozapine’s effect on restoring SCIN excitability in dyskinetic mice (a phenomenon mediated by Kv1 channels in SCIN; Tubert et al., 2016) was not due to its action on serotonin receptors (Paz, Stahl et al., 2022). While our data do not rule out the potential contribution of other receptors, such as muscarinic acetylcholine receptors, we believe they strongly support the role of D1/D5 receptors. To reflect this, we added a statement discussing the potential contribution of receptors beyond D1/D5 in the last paragraph of the Discussion.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
(1) The effect of MgTx was not consistent with the previous study (Tubert, 2016). I expected MgTx to increase the basal firing rate of cholinergic interneurons.
Thank you for highlighting this. In our previous study we used ACSF in the recording pipette, instead of the intracellular solution (higher in potassium) used in the present study. This is likely related to the higher spontaneous firing rates of SCIN observed in the present study, which made the SCIN response stand out. In addition, our previous study analyzed the effect of MgTx on the spontaneous firing frequency of SCIN isolated from major circuit regulation by adding CNQX and picrotoxin to the bath, while in this study we needed to preserve the thalamic input and used only picrotoxin in the bath. Given these differences, the two conditions are not strictly comparable but rather give complementary information.
(2) In the text, the authors claim that "SCINs recorded in the parkinsonian OFF-L-DOPA condition show an increase in membrane excitability that mimics changes acutely induced by SKF81297 in SCINs from control mice." However, the data for SKF81297 do not support this claim.
We modified the text to make it clearer that the cited phrase refers to a previous publication (PMID: 35535012) in which SCIN intrinsic excitability was characterized via analysis of responses to somatic current injection in whole-cell recordings. In the present study Fig. 3D shows SKF81297 effects on interspike intervals during spontaneous activity with a trend towards increased firing, and Fig. 4E a lack of effect on “burst duration” for responses with different numbers of spikes elicited by thalamic afferent stimulation.
(3) I recommend testing whether other receptors, such as D2R, contribute to the clozapine-induced pause response in the L-DOPA off state.
Thank you for your suggestion. We acknowledge that studying the role of D2R is important. However, our preliminary data suggest that a comprehensive follow-up study, which is beyond the scope of this manuscript, is necessary to understand their role.
Reviewer #2 (Recommendations for the authors):
(1) For Figure 1D-E, I understand that the authors are trying to state that the previous spontaneous spike contributes to a hyperpolarized window that induces a delay in the evoked spikes. However, it is almost impossible to discriminate between spontaneous and evoked spikes in this experiment. Furthermore, considering the tonic firing property, I highly suspect that even a sham control design (no optogenetic light) will give you a similar distribution as in Figure 1E (the longer in X1, the shorter in X2).
We agree that some spikes following stimulus onset may have occurred independently of the light stimulus, as it is also possible during behavioral tasks. We used the baseline recordings to estimate the effects of a sham stimulus as requested and included the data in Fig. 1E-F. As expected, the sham stimulation data showed a similar inverse relationship with the time elapsed from the preceding spike, but latencies were longer than with the stimulus (except for times close to the average ISI), suggesting that the optical stimulation increased the probability of evoking a spike (Fig. 1F). Remarkably, the pause following this threshold stimulation was significantly longer than the baseline ISI, as reported in the main text (Results section, last sentence of first paragraph).
(2) The authors used optogenetics to induce thalamic inputs to induce the pause after bursts. Considering CINs also receive inputs from different brain regions, e.g. cortex, does this phenomenon/pause after bursts also exist following cortical inputs?
We did not study the SCIN response to cortical inputs, but both thalamic and cortical inputs seem to drive SCIN pause-responses as observed by others (PMID: 24553950).
(3) The effect of D5R inverse agonism, further combined with D5R agonist and antagonist, faithfully reveals/confirms the increase of ligand-independent activity of D5R in LID reported previously. It would be ideal to also directly modulate intracellular cAMP (as in the 2022 paper) to confirm the rescue effects on the CIN pause response.
Please, see our response in the public review.
(4) In healthy conditions, the balance between D2R and D5R signaling (shown in Figure 6F left) switches the pause and no pause modes which potentially contributes to cortical-striatal plasticity. How about in LID off L-DOPA condition? Is it possible to rescue/modulate the pause on/off mode by D2R agonism in LID?
We haven’t tested the effect of D2 agonists yet, but this is scheduled for follow-up studies.
Reviewer #3 (Recommendations for the authors):
(1) The authors use the ratio of pause duration to baseline ISI to describe the pause, which is useful for detecting significant differences. However, it would be beneficial to also report the actual duration of the burst-dependent pause to provide readers with a clearer understanding of the variation in pauses.
In all figures we report the average baseline ISI duration for each experiment / experimental condition, allowing readers to estimate actual pause durations. We added in the main text actual average pause durations corresponding to Fig. 1H, which are representative of those observed along the study.
(2) In Figure 3D, a more detailed comparison would be helpful, as there appears to be a significant difference between the SKF81297 group and others.
We acknowledge that there might be a trend towards reduced ISIs, however, it was statistically non-significant (see legend of figure 3). In addition, the effect of SKF81297 seems unrelated to this trend, as its effect is also seen under the effect of ZD7288, which substantially prolongs the baseline ISI (Fig. 4E-F).
-
-
www.researchsquare.com www.researchsquare.com
-
Author response:
The following is the authors’ response to the current reviews.
Comments on revisions:
I thank the authors for addressing my comments.
- I believe that additional in vivo experiments, or the inclusion of controls for the specificity of the inhibitor, which the authors argue are beyond the scope of the current study, are essential to address the weaknesses and limitations stated in my current evaluation.
We respectfully acknowledge the reviewer's concern but would like to reiterate that demonstrating the specificity of the inhibitor is beyond the scope of this study. Alpelisib (BYL-719) is a clinically approved drug widely recognized as a specific inhibitor of p110α, primarily used in the treatment of breast cancer. Its selectivity for the p110α isoform has been extensively validated in the literature.
In our study, we used Alpelisib to assess whether pharmacological inhibition of p110α would produce effects similar to those observed in our genetic model, which is particularly relevant for the potential translational implications of our findings. Given the well-documented specificity of this inhibitor, we believe that additional controls to confirm its selectivity are unnecessary within the context of this study. Instead, our focus has been to investigate the functional role of p110α activity in macrophage-driven inflammation using the models described.
We appreciate the reviewer’s insight and hope this clarification addresses their concern.
- While the neutrophil depletion suggests neutrophils are not required for the phenotype, there are multiple other myeloid cells, in addition to macrophages, that could be contributing or accounting for the in vivo phenotype observed in the mutant strain (not macrophage specific).
We appreciate the reviewer's observation regarding the potential involvement of other myeloid cells. However, it is important to highlight that the inflammatory process follows a well-characterized sequential pattern. Our data clearly demonstrate that in the paw inflammation model:
· Neutrophils are effectively recruited, as evidenced by the inflammatory abscess filled with polymorphonuclear cells.
· However, macrophages fail to be recruited in the RBD model.
Given that this critical step is disrupted, it is reasonable to expect that any subsequent steps in the inflammatory cascade would also be affected. A precise dissection of the role of other myeloid populations would require additional lineage-specific models to selectively target each subset, which, as we have previously stated, would be the focus of an independent study.
While we cannot entirely exclude the contribution of other myeloid cells, our data strongly support the conclusion that macrophages are, at the very least, a key component of the observed phenotype. We explicitly address this point in the Discussion section, where we acknowledge the potential involvement of other myeloid populations.
- Inclusion of absolute cell numbers (in addition to the %) is essential. I do not understand why the authors are not including these data. Have they not counted the cells?
We appreciate the reviewer’s concern regarding the inclusion of absolute cell numbers. However, as stated in the Materials and Methods section, we analyzed 50,000 cells per sample, and the percentages reported in the manuscript are directly derived from this standardized count.
Our decision to present the data as percentages follows standard practices in flow cytometry-based analyses, as it allows for a clearer and more biologically relevant comparison of relative changes between conditions. This approach ensures consistency across samples and facilitates the interpretation of population dynamics during inflammation.
We would also like to clarify that all data are based on actual counts, and rigorous controls were implemented throughout the study to ensure accuracy and reproducibility. We hope this explanation addresses the reviewer’s concern and provides further clarity on our approach.
- Lastly, inclusion of representative staining and gating strategies for all immune profiling measurements carried out by flow cytometry is important. This point has not been addressed, not even in writing.
We appreciate the reviewer’s concern regarding the inclusion of absolute cell numbers. However, as stated in the Materials and Methods section, we analyzed 50,000 cells per sample, and the percentages reported in the manuscript are directly derived from this standardized count.
Our decision to present the data as percentages follows standard practices in flow cytometry-based analyses, as it allows for a clearer and more biologically relevant comparison of relative changes between conditions. This approach ensures consistency across samples and facilitates the interpretation of population dynamics during inflammation.
We would also like to clarify that all data are based on actual counts, and rigorous controls were implemented throughout the study to ensure accuracy and reproducibility. We hope this explanation addresses the reviewer’s concern and provides further clarity on our approach.
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
This study by Alejandro Rosell et al. reveals the immunoregulatory role of the RAS-p110α pathway in macrophages, specifically in regulating monocyte extravasation and lysosomal digestion during inflammation. Disrupting this pathway, through genetic tools or pharmacological intervention in mice, impairs the inflammatory response, leading to delayed resolution and more severe acute inflammation. The authors suggest that activating p110α with small molecules could be a potential therapeutic strategy for treating chronic inflammation. These findings provide important insights into the mechanisms by which p110α regulates macrophage function and the overall inflammatory response.
The updates made by the authors in the revised version have addressed the main points raised in the initial review, further improving the strength of their findings.
Reviewer #2 (Public review):
Summary:
Cell intrinsic signaling pathways controlling the function of macrophages in inflammatory processes, including in response to infection, injury or in the resolution of inflammation are incompletely understood. In this study, Rosell et al. investigate the contribution of RAS-p110α signaling to macrophage activity. p110α is a ubiquitously expressed catalytic subunit of PI3K with previously described roles in multiple biological processes including in epithelial cell growth and survival, and carcinogenesis. While previous studies have already suggested a role for RAS-p110α signaling in macrophage function, the cell intrinsic impact of disrupting the interaction between RAS and p110α in this central myeloid cell subset is not known.
Strengths:
Exploiting a sound previously described genetically engineered mouse model that allows tamoxifen-inducible disruption of the RAS-p110α pathway and using different readouts of macrophage activity in vitro and in vivo, the authors provide data consistent with their conclusion that alteration in RAS-p110α signaling impairs various but selective aspects of macrophage function in a cell-intrinsic manner.
Weaknesses:
My main concern is that for various readouts, the difference between wild-type and mutant macrophages in vitro or between wild-type and Pik3caRBD mice in vivo is modest, even if statistically significant. To further substantiate the extent of macrophage function alteration upon disruption of RAS-p110α signaling and its impact on the initiation and resolution of inflammatory responses, the manuscript would benefit from a more extensive assessment of macrophage activity and inflammatory responses in vivo.
Thank you for raising this point. We understand the reviewer’s concern regarding the modest yet statistically significant differences observed between wild-type and mutant macrophages in vitro, as well as between wild-type and Pik3caRBD mice in vivo. Our current study aimed to provide a foundational exploration of the role of RAS-p110α signaling in macrophage function and inflammatory response, focusing on a set of core readouts that demonstrate the physiological relevance of this pathway. While a more extensive in vivo assessment could offer additional insights into macrophage activity and the nuanced effects of RAS-p110α disruption, it would require an array of new experiments that are beyond the current scope.
However, we believe that the current data provide significant insights into the pathway’s role, highlighting important alterations in macrophage function and inflammatory processes due to RAS-p110α disruption. These findings lay the groundwork for future studies that can build upon our results with a more comprehensive analysis of macrophage activity in various inflammatory contexts.
In the in vivo model, all cells have disrupted RAS-p110α signaling, not only macrophages. Given that other myeloid cells besides macrophages contribute to the orchestration of inflammatory responses, it remains unclear whether the phenotype described in vivo results from impaired RAS-p110α signaling within macrophages or from defects in other haematopoietic cells such as neutrophils, dendritic cells, etc.
Thank you for raising this point. To address this, we have added a paragraph in the Discussion section acknowledging that RAS-p110α signaling disruption affects all hematopoietic cells (lines 461-470 in the discussion). However, we also provide several lines of evidence that support macrophages as the primary cell type involved in the observed phenotype. Specifically, we note that neutrophil depletion in chimera mice did not alter transendothelial extravasation, and that macrophages were the primary cell type showing significant functional defects in the paw edema model. These findings, combined with specific deficiencies in myeloid populations, suggest a predominant role of macrophages in the impaired inflammatory response, though we acknowledge the potential contributions of other myeloid cells.
Inclusion of information on the absolute number of macrophages and total immune cells (e.g. for the spleen analysis) would help determine whether the reduced frequency of macrophages represents an actual difference in their total number or rather reflects a relative decrease due to an increase in the number of other immune cells.
Thank you for this suggestion. We understand the value of presenting actual measurements; however, we opted to display normalized data to provide a clearer comparison between WT and RBD mice, as this approach highlights the relative differences in immune populations between the two groups. Normalizing data helps to focus on the specific impact of the RAS-p110α disruption by minimizing inter-sample variability that can obscure these differences.
To further address the reviewer’s concern regarding the interpretation of macrophage frequencies, we have included a pie chart that represents the relative proportions of the various immune cell populations studied within our dataset. Author response image 1 provides a visual overview of the immune cell distribution, enabling a clearer understanding of whether the observed decrease in macrophage frequency represents an actual reduction in total macrophage numbers or a shift in their relative abundance due to changes in other immune populations.
We hope this approach satisfactorily addresses the reviewer’s concerns by providing both a normalized dataset for clearer interpretation of genotype-specific effects and an overall immune profile that contextualizes macrophage frequency within the broader immune cell landscape.
Author response image 1.
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
As proof of concept data that activation of RAS-p110α signaling constitutes indeed a putative approach for treating chronic inflammation is not included in the manuscript, I suggest removing this implication from the abstract.
Thank you for this suggestion. We have now removed this implication from the abstract to maintain clarity and to better reflect the scope of the data presented in the manuscript.
Inclusion of a control in which RBD/- cells are also treated with BYL719, across experiments in which the inhibitor is used, would be important to determine, among other things, the specificity of the inhibitor.
We appreciate the reviewer’s suggestion to include RBD/- cells treated with BYL719 as an additional control. However, we would like to clarify that this approach would raise a different biological question, as treating RBD mice with BYL719 would not only address the specificity of the inhibitor but also examine the combined effects of genetic and pharmacologic disruptions on PI3K pathway signaling. Investigating this dual disruption falls outside the scope of our current study, which is focused specifically on the effects of RAS-p110α disruption.
It is also important to note that our RBD mouse model selectively disrupts RAS-mediated activation of p110α, while PI3K activation can still occur through other pathways, such as receptor tyrosine kinases (RTKs) and G protein-coupled receptors (GPCRs). Thus, inhibiting p110α with BYL719 would produce broader effects beyond the inhibition of RAS-PI3K signaling, impacting PI3K activation regardless of its upstream source.
In addition, incorporating this control would require us to repeat nearly all experiments in the manuscript, as it would necessitate generating and analyzing new samples for each experimental condition. Given the scope and resources involved, we believe this approach is unfeasible at this stage of the revision process.
We hope this explanation is satisfactory and that the current data in the manuscript provide a rigorous assessment of the RAS-p110α signaling pathway within the defined experimental scope.
Figure 3I is missing the statistical analysis (this is mentioned in the legend though).
Thank you for pointing this out. We apologize for the oversight. The statistical analysis for Figure 3I has now been added.
Gating strategies and representative staining should be included more generally across the manuscript.
Thank you for this suggestion. To address this, we have added a new supplementary figure (Figure 2-Supplement Figure 2) that illustrates the gating strategy along with a representative dataset. Additionally, a brief summary of the gating strategy has been included in the main text to further clarify the methodology.
It is recommended that authors show actual measurements rather than only data normalized to the control (or arbitrary units).
Thank you for this suggestion. We understand the value of presenting actual measurements; however, we opted to display normalized data to provide a clearer comparison between WT and RBD mice, as this approach highlights the relative differences in immune populations between the two groups. Normalizing data helps to focus on the specific impact of the RAS-p110α disruption by minimizing inter-sample variability that can obscure these differences.
To further address the reviewer’s concern regarding the interpretation of macrophage frequencies, we have included a pie chart that represents the relative proportions of the various immune cell populations studied within our dataset. Author response image 1 provides a visual overview of the immune cell distribution, enabling a clearer understanding of whether the observed decrease in macrophage frequency represents an actual reduction in total macrophage numbers or a shift in their relative abundance due to changes in other immune populations.
We hope this approach satisfactorily addresses the reviewer’s concerns by providing both a normalized dataset for clearer interpretation of genotype-specific effects and an overall immune profile that contextualizes macrophage frequency within the broader immune cell landscape.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This manuscript reports important findings that the methyltransferase METTL3 is involved in the repair of abasic sites and uracil in DNA, mediating resistance to floxuridine-driven cytotoxicity. Convincing evidence shows the involvement of m6A in DNA based on single cell imaging and mass spec data. The authors present evidence that the m6A signal does not result from bacterial contamination or RNA, but the text does not make this overly clear.
We thank the editors for recognizing the importance of our work and the relevance of METTL3 and 6mA in DNA repair. We agree the evidence presented can be regarded as convincing, in that it includes validation with orthogonal approaches and excludes the source of 6mA being RNA or bacterial contamination.
To clarify, the identification of 6mA in DNA, upon DNA damage, is based first on immunofluorescence observations using an anti-m6A antibody. In this setting, removal of RNA with RNase treatment fails to reduce the 6mA signal, excluding the possibility that the source of signal is RNA. In contrast, removal of DNA with DNase treatment removes all 6mA signal, strongly suggesting that the species carrying the N6-methyladenosine modification is DNA (Figure 3D, E). Importantly, in Figure 3F, G, we provide orthogonal, quantitative mass spectrometry data that independently confirm this finding. Mass spectrometry-liquid chromatography of DNA analytes conclusively shows the presence of 6mA in DNA upon treatment with DNA damaging agents and, based on exact mass, excludes RNA as the source.
Cells only show the 6mA signal when treated with DNA damaging agents, and the 6mA is absent from untreated cells (Figure 3D, E, H, I). This provides strong evidence that the 6mA signal is not a result of bacterial contamination in our cell lines. Furthermore, our cell lines are routinely tested for mycoplasma contamination. It is possible that stock solutions of DNA damaging agents could be contaminated, but this would need to be true for all individual drugs and stocks tested, which is highly unlikely. Moreover, the finding that the 6mA signal is not significantly different from untreated cells when a DNA damaging agent is combined with a METTL3 inhibitor (Figure 3H, I) provides strong evidence against bacterial contamination in our stocks.
In summary, we provide conclusive evidence, based on orthogonal methods, that the METTL3-dependent N6-methyladenosine modification is deposited in DNA, not RNA, in response to DNA damage and have now clarified these points in the results and discussion.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The authors sought to identify unknown factors involved in the repair of uracil in DNA through a CRISPR knockout screen.
Strengths:
The screen identified both known and unknown proteins involved in DNA repair resulting from uracil or modified uracil base incorporation into DNA. The conclusion is that the protein activity of METTL3, which converts A nucleotides to 6mA nucleotides, plays a role in the DNA damage/repair response. The importance of METTL3 in DNA repair, and its colocalization with a known DNA repair enzyme, UNG2, is well characterized.
Weaknesses:
This reviewer identified no major weaknesses in this study. The manuscript could be improved by tightening the text throughout, and more accurate and consistent word choice around the origin of U and 6mA in DNA. The dUTP nucleotide is misincorporated into DNA, and 6mA is formed by methylation of the A base present in DNA. Using words like 6mA "deposition in DNA" seems to imply it results from incorporation of a methylated dATP nucleotide during DNA synthesis.
The increased presence of 6mA during DNA damage could result from methylation at the A base itself (within DNA) or from incorporation of pre-modified 6mA during DNA synthesis. Our data do not directly discriminate between these two mechanisms, and we clarified this point in the discussion.
Reviewer #2 (Public review):
Summary:
In this work, the authors performed a CRISPR knockout screen in the presence of floxuridine, a chemotherapeutic agent that incorporates uracil and fluoro-uracil into DNA, and identified unexpected factors, such as the RNA m6A methyltransferase METTL3, as required to overcome floxuridine-driven cytotoxicity in mammalian cells. Interestingly, the observed N6-methyladenosine was embedded in DNA, which has been reported as DNA 6mA in mammalian genomes and is currently confirmed with mass spectrometry in this model. Therefore, this work consolidated the functional role of mammalian genomic DNA 6mA, and supported with solid evidence to uncover the METTL3-6mA-UNG2 axis in response to DNA base damage.
Strengths:
In this work, the authors took an unbiased, genome-wide CRISPR approach to identify novel factors involved in uracil repair with potential clinical interest.
The authors designed elegant experiments to confirm that METTL3 works through genomic DNA, adding the methylation to DNA (6mA) but not to RNA (m6A), in this base damage repair context. The authors employ different enzymes, such as RNase A, RNase H, and DNase, as well as liquid chromatography coupled to tandem mass spectrometry, to validate that METTL3 deposits 6mA in DNA in response to agents that increase genomic uracil.
They also have the Mettl3-KO and the METTL3 inhibition results to support their conclusion.
Weaknesses:
Although this study demonstrates that METTL3-dependent 6mA deposition in DNA is functionally relevant to DNA damage repair in mammalian cells, there are still several concerns and issues that need to be improved to strengthen this research.
First, in the whole paper, the authors never claim or mention the mammalian cell lines contamination testing result, which is the fundamental assay that has to be done for the mammalian cell lines DNA 6mA study.
Our cell lines are routinely tested for bacterial contamination, specifically mycoplasma, and we state this information in the revised manuscript.
Importantly, we do not observe 6mA in untreated cells, strongly suggesting that the 6mA signal observed is dependent on the presence of DNA damage and not caused by contamination in the cell lines (Figure 3D, E, H, I). While it is possible that stock solutions of DNA damaging agents could be contaminated, this would need to be the case for all individual drugs and stocks tested that induce 6mA, which is very unlikely. Finally, the finding that the 6mA signal is not significantly different from untreated cells when a DNA damaging agent is combined with a METTL3 inhibitor (Figure 3H, I) provides strong evidence against bacterial contamination in our drug stocks.
Second, in the whole work, the authors have not supplied any genomic sequencing data to support their conclusions. Although the sequencing of DNA 6mA in mammalian models is challenging, recent breakthroughs in sequencing techniques, such as DR-Seq or NT/NAME-seq, have lowered the bar and improved a lot in the 6mA sequencing assay. Therefore, the authors should consider employing the sequencing methods to further confirm the functional role of 6mA in base repair.
While we agree that it could be important to understand the precise genomic location of 6mA in relation to DNA damage, this is outside the scope of the current study. Moreover, this exercise may prove unproductive. If 6mA is enriched in DNA at damage sites or as DNA is replicated, the genomic mapping of 6mA is likely to be stochastic. If stochastic, it would be impossible to obtain the read depth necessary to map 6mA accurately.
Third, the authors used the METTL3 inhibitor and Mettl3-KO to validate the METTL3-6mA-UNG2 functional roles. However, a catalytic mutant and rescue of Mettl3 may be further experiments to confirm the conclusion.
We believe this to be an excellent suggestion from Reviewer #2 but we are unable to perform the proposed experiment at this time. We encourage future studies to explore the rescue experiment.
Reviewer #3 (Public review):
Summary:
The authors are showing evidence that they claim establishes the controversial epigenetic mark, DNA 6mA, as promoting genome stability.
Strengths:
The identification of a poorly understood protein, METTL3, and its subsequent characterization in DDR is of high quality and interesting.
Weaknesses:
(1) The very presence of 6mA (DNA) in mammalian DNA is still highly controversial and numerous studies have been conclusively shown to have reported the presence of 6mA due to technical artifacts and bacterial contamination. Thus, to my knowledge there is no clear evidence for 6mA as an epigenetic mark in mammals, and consequently, no evidence of writers and readers of 6mA. None of this is mentioned in the introduction. Much of the introduction can be reduced, but a paragraph clearly stating the controversy and lack of evidence for 6mA in mammals needs to be added, otherwise, the reader is given an entirely distorted view of the field.
These concerns must also be clearly in the limitations section and even in the results section which fails to nuance the authors' findings.
We agree with the reviewer that the presence and potential function of 6mA in mammalian DNA has been debated. Importantly, the debate regarding the presence and quantity of 6mA in DNA has been previously restricted to undamaged, baseline conditions. In complete agreement with this notion, we do not detect appreciable levels of 6mA in untreated cells. We revised the introduction section to present the debate about 6mA in DNA. We, however, want to highlight that our study provides, for the first time, convincing evidence (based on two orthogonal methods) that 6mA is present in DNA in response to a stimulus, DNA damage. We do not claim or provide any data that suggest 6mA is a baseline epigenetic mark.
(2) What is the motivation for using HT-29 cells? Moreover, the materials and methods do not state how the authors controlled for bacterial contamination, which has been the most common cause of erroneous 6mA signals to date. Did the authors routinely check for mycoplasma?
HT-29 is a cell line of colorectal origin and chemotherapeutic agents that introduce uracil and uracil derivatives in DNA, as those used in this study, are relevant for the treatment of colorectal cancer. As indicated above, we do not observe 6mA in untreated cells, strongly suggesting that the 6mA signal observed is dependent on DNA damage and not caused by a potential bacterial contamination (Figure 3D, E, H, I). Additionally, our cell lines are routinely tested for bacterial contamination, specifically mycoplasma.
(3) The single cell imaging of 6mA in various cells is nice. The results are confirmed by mass spec as an orthogonal approach. Another orthogonal and quantitative approach to assessing 6mA levels would be PacBio. Similarly, it is unclear why the authors have not performed dot-blots of 6mA for genomic DNA from the given cell lines.
We are confused by this point since an orthogonal approach to detect 6mA, mass spectrometry-liquid chromatography, was employed. This method does not use an antibody and confirms the increase of 6mA in DNA when cells were treated with DNA damaging agents. This data is presented in Figure 3F, G.
It is sensible to hypothesize that the localization of 6mA is consistent with DNA replication (like uracil deposition). In this event, the genomic mapping of 6mA is likely to be stochastic. This would make quantification with PacBio sequencing difficult because it would be very challenging to achieve the appropriate read depth to call a modified base.
Dot blots rely on an antibody and thus are not truly orthogonal to our immunofluorescence-based measurements. We preferred the mass spectrometry-liquid chromatography approach we took as a true orthogonal approach.
(4) The results of Figure 3 need further investigation and validation. If the results are correct the authors are suggesting that the majority of 6mA in their cell lines is present in the DNA, and not the RNA, which is completely contrary to every other study of 6mA in mammalian cells that I am aware of. This could suggest that the antibody is not, in fact, binding to 6mA, but to unmodified adenine, which would explain why the signal disappears after DNAse treatment. Indeed, binding of 6mA to unmethylated DNA is a commonly known problem with most 6mA antibodies and is well described elsewhere.
Based on this and the following comment, we are convinced that Reviewer #3 has overlooked two critical elements of our study:
First, the immunofluorescence work presented in Figure 3, showing 6mA signal in response to DNA damage, uses cells that were pre-extracted to remove excess cytoplasmic RNA. This method is often used in immunofluorescence experiments of this kind. The pre-extraction method removes most of the cytoplasmic content, and the majority of the cytoplasmic m6A RNA signal. Supplementary Figure 3D shows cells that have not been pre-extracted prior to staining. These images show the cytoplasmic m6A signal is abundant if we do not perform the pre-extraction step.
If the antibody used to label 6mA significantly reacted with unmodified adenine, we would expect a large signal in untreated or untreated and denatured conditions. In contrast, an increase in 6mA is not observed in either case.
Second, the orthogonal approach we employed, mass spectrometry coupled with liquid chromatography, measures 6mA DNA analytes specifically by exact mass. This approach does not depend on an antibody and yields results consistent with those from the immunofluorescence experiments.
(5) Given the lack of orthogonal validation of the observed DNA 6mA and the lack of evidence supporting the presence of 6mA in mammalian DNA and consequently any functional role for 6mA in mammalian biology, the manuscript's conclusions need to be toned down significantly, and the inherent difficulty in assessing 6mA accurately in mammals acknowledged throughout.
As discussed in response to prior comments, Figure 3 does provide two independent and orthogonal methods that demonstrate 6mA presence in DNA specifically, and not RNA, in response to DNA damage. Complementary and orthogonal datasets are presented using either immunofluorescence microscopy or mass spectrometry-liquid chromatography of extracted DNA. The latter method does not rely on an antibody and can discriminate 6mA DNA versus RNA based on exact mass. We revised the text to clarify that Figure 3F, G is a completely orthogonal approach.
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
The authors cited most of the related publications; however, the reviewer suggested that three 2015 papers in Cell (Dahua Chen's, Yang Shi's, and Chuan He's) and the 2016 Nature (Andrew Xiao's) article are worth citing here because those are the milestone works that reported, in the first wave, genomic DNA 6mA in eukaryotic and mammalian genomes.
Furthermore, in Tao P. Wu and Andrew Z. Xiao's 2016 Nature article, the result has already emphasized the genomic DNA 6mA is enriched in the H2A.X sites; therefore, that work indicated the link between DNA damage and repair and 6mA's functional role. The authors may add some comments or discussion on this point.
Last but not least, the authors may also need to discuss the reported evidence of DNA 6mA's function in mitochondria.
We thank the reviewer for these suggestions. We revised our introduction and included additional references and discussion points, as suggested by the reviewer.
Reviewer #3 (Recommendations for the authors):
Minor points:
(1) In general, the manuscript is too verbose, and the amount of text can be dramatically reduced/sharpened. The introduction in particular is too long.
We revised the manuscript and reduced text when appropriate.
(2) Each results section can also be condensed to improve clarity significantly. Indeed the results section reads like a 'Result & Discussion' section, which is then followed by a Discussion. Maybe the discussion section can be shortened to a 'conclusion'.
We revised the results section when appropriate and reworked the discussion.
Importantly, we revised the text related to Figure 3 as it does appear that Reviewer #3 did not appreciate key results present in this figure, specifically the orthogonal, mass spectrometry approach validating the discovery of 6mA DNA species (Figure 3F, G). We added a schematic as Figure 3F to further clarify this point as well.
(3) The accession number for sequencing data in GEO data should be provided.
The accession number is now provided in the manuscript: GSE282260.
(4) All figures are unnecessarily small and in some cases, supporting figures from the supplementary data should be moved into the main figure to improve clarity.
The figures are of high image quality and can be enlarged easily. If there are specific figures that the reviewer believes will improve clarity, we would be happy to move them.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Thank you for sharing a detailed review of our manuscript titled, Variations and predictability of epistasis on an intragenic fitness landscape. We have now carefully gone through the reviewers’ and the editor’s comments and have the following preliminary responses.
(1) Measurement noise in the folA fitness landscape. All three reviewers and the editors raise the important matter of incorporating measurement noise in the fitness landscape. The paper by Papkou and coworkers makes the fitness measurements of the landscape in six independent repeats. They show that the fitness data is highly correlated in each repeat, and use the weighted mean of the repeats to report their results. They do not study how measurement noise influences their findings. The results by Papkou and coworkers were our starting point, and hence, we built on the landscape properties reported in their study. As a result, we also analyse our results working with the same mean of the six independent measurements.
The main result of the work by Papkou and coworkers is that the largest subgraph in the landscape has 514 fitness peaks.
We revisit this result by quantifying how measurement noise changes this number. Doing so, we find that the subgraph contains only 127 statistically significant peaks. We define a sequence as a peak when its corresponding fitness is greater than that of all its one-distance neighbours with a p-value < 0.05. This shows that, as pointed out in the reviews, incorporating noise significantly changes how we view the landscape, a facet not included in Papkou et al. or in the current version of our manuscript.
Not incorporating measurement noise means that the entire landscape has 4055 peaks. When measurement noise is included in the analysis, this number reduces to 137, out of which 136 are high fitness backgrounds (functional).
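As an illustration of the peak criterion described above, the following is a minimal sketch, not our actual analysis code. It assumes that each genotype has a list of replicate fitness measurements, and it uses an exact one-sided permutation test on the difference of means as a stand-in for whichever statistical test is finally adopted. The function names, the toy data layout, and the choice to skip unmeasured neighbours are all hypothetical, and no multiple-testing correction is applied.

```python
from itertools import combinations
from statistics import mean

ALPHABET = "ACGT"


def one_mutant_neighbours(seq):
    """Yield every sequence that differs from seq at exactly one position."""
    for i, base in enumerate(seq):
        for alt in ALPHABET:
            if alt != base:
                yield seq[:i] + alt + seq[i + 1:]


def perm_pvalue_greater(a, b):
    """Exact one-sided permutation p-value for the hypothesis mean(a) > mean(b).

    Assumption: a permutation test is used here only for illustration.
    """
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    total = sum(pooled)
    observed = mean(a) - mean(b)
    hits = 0
    n_perm = 0
    for idx in combinations(range(len(pooled)), n_a):
        s = sum(pooled[i] for i in idx)
        diff = s / n_a - (total - s) / n_b
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
        n_perm += 1
    return hits / n_perm


def is_significant_peak(seq, replicates, alpha=0.05):
    """True if seq is significantly fitter than every measured one-mutant neighbour."""
    for nb in one_mutant_neighbours(seq):
        if nb not in replicates:
            continue  # assumption: unmeasured neighbours are ignored
        if perm_pvalue_greater(replicates[seq], replicates[nb]) >= alpha:
            return False
    return True
```

With six replicates per genotype, the smallest attainable p-value in this sketch is 1/924, so the 0.05 threshold can be reached only when the replicate sets are well separated.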
In the revised version of our manuscript, we will incorporate measurement noise in our analysis. Through this, we will also address the concern regarding the use of an arbitrary cut-off to study “fluid” epistasis. However, we note that arbitrary cut-offs to define DFEs have been recently used (Sane et al., PNAS, 2023).
We also note that previous work with large scale landscapes (Wu et al, eLife, 2016) also reported a fitness landscape with a single experiment, with no repeats.
(2) Global nonlinearities and higher-order epistasis leading to fluid epistasis. Attempts at building models for higher-order epistasis from empirical data have largely been confined to landscapes of a limited data size. For example, Sailer & Harms, Genetics, 2017 propose models for higher-order epistasis from seven empirical data sets, each with fewer than 100 data points. Another recent attempt (Park et al, Nat Comm, 2024) proposes rules for protein structure-function with 20 fitness landscapes. In this study, only one landscape which used fitness as a phenotype had ~160000 data points (of which only 42% were included for analysis). All other data sets which used fitness as a phenotype contained fewer than 10000 data points. While these statistical proposals of how higher-order epistasis operates exist, none of them relies on a large-scale, exhaustive network like the one proposed by Papkou and coworkers.
In the edited manuscript, we will replace our arbitrary cut-off with results of statistical tests carried out based on measurement noise.
Global non-linearities shape evolutionary responses. We would like to emphasize that the goal of this work is to study and understand how these global non-linearities result in patterns on a large fitness landscape by presenting the sum total of these fundamental factors in shaping statistical patterns.
While we understand that we may not have sufficiently explained the effects of global non-linearities on our results, we do not agree with the reviewer’s conclusion that our results are artifacts of these non-linearities. We will expand on the role of these nonlinearities on the patterns that we observe (like, fitness being bounded, as pointed out by reviewer 2, or differential impact of a mutation in functional vs. non-functional variants).
We also speculate that changing our arbitrary cut-off (selection coefficient of 0.05) to measurement noise will not alter our results qualitatively.
The question we address in our work is, therefore, how the nature of epistasis changes with genetic background over a large, exhaustive landscape. The nature of epistasis between two mutations is analysed in all 4^7 backgrounds. The causative agents for the change in epistasis will be context-dependent, depending on the precise nature of the two mutations and the background. For instance, a certain background might simply introduce a Stop codon in the sequence. Notwithstanding these precise, local mechanistic explanations, we seek to answer how epistasis changes statistically in a sequence. Investigating statistical patterns which explain switches in the nature of epistasis in deep, exhaustive landscapes is a long-term goal of this research.
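To make the background enumeration concrete, the sketch below lists the genetic backgrounds for a pair of focal sites and classifies pairwise epistasis in one background. It is illustrative only: it assumes a nine-site nucleotide sequence with two focal sites (so that seven sites remain free, which gives the 4^7 backgrounds mentioned above), the common additive definition of epistasis, and a cut-off of 0.05 applied to the epistasis value. All function names are hypothetical, and the cut-off is the quantity we intend to replace with a test based on measurement noise.

```python
from itertools import product

BASES = "ACGT"


def backgrounds(n_sites=9, focal=(0, 1)):
    """Yield every assignment of bases to the non-focal sites.

    Assumption: nine sites with two focal sites, leaving 4**7 backgrounds.
    """
    free = [i for i in range(n_sites) if i not in focal]
    for combo in product(BASES, repeat=len(free)):
        yield dict(zip(free, combo))


def epistasis(f_ref, f_a, f_b, f_ab):
    """Additive pairwise epistasis: deviation of the double mutant from additivity."""
    return f_ab - f_a - f_b + f_ref


def classify(eps, cutoff=0.05):
    """Label epistasis as positive, negative, or none (illustrative cut-off)."""
    if eps > cutoff:
        return "positive"
    if eps < -cutoff:
        return "negative"
    return "none"
```

Repeating the classification across all backgrounds for a given pair of mutations gives the distribution of epistasis types whose changes with background are the subject of the analysis.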
(3) Last, in our revised manuscript, we will address the reviewers’ other minor comments on the various aspects of the manuscript.
-
- Jan 2025
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This important work proposes a neural network model of interactions between the prefrontal cortex and basal ganglia to implement adaptive resource allocation in working memory, where the gating strategies for storage are adjusted by reinforcement learning. Numerical simulations provide convincing evidence for the superiority of the model in improving effective capacity, optimizing resource management, and reducing error rates, as well as solid evidence for its human-like performance. The paper could be strengthened further by a more thorough comparison of model predictions with human behavior and by improved clarity in presentation. This work will be of broad interest to computational and cognitive neuroscientists, and may also interest machine-learning researchers who seek to develop brain-inspired machine-learning algorithms for memory.
We thank the reviewers for their thorough and constructive comments, which have helped us clarify, augment and solidify our work. Regarding the suggestion to include a “more thorough comparison with human behavior”, we believe this comment reflects one reviewer’s suggestion to compare with sequential order effects. We now include a new section with simulations showing that the network exhibits clear recency effects in accordance with the literature, where such recency effects are known to be related to WM interference rather than passive decay. Overall, our work makes substantial contact with behavioral patterns that have been documented in the human literature (and which, as far as we know, have not been jointly captured by any one model), such as the shape of the error distributions, including probability of recall and variable precision; attraction to recently presented items; sensitivity to reinforcement history; set-size dependent chunking; recency effects; dopamine manipulation effects; as well as a range of human data linking capacity limitations to frontostriatal function. It also provides a theoretical proposal for the well-established phenomenon of capacity limitations in humans, suggesting that they arise due to difficulty in WM management.
Below we address each reviewer individually, responding to each comment and providing the relevant location in the paper that the changes and additions were made. Reviewer responses are included in blue/bold for clarity.
Public Reviews:
Reviewer 1:
Thank you for your comments. We appreciate your statements of the strengths of this paper and your suggestions to improve this paper.
First, the method section appears somewhat challenging to follow. To enhance clarity, it might be beneficial to include a figure illustrating the overall model architecture. This visual aid could provide readers with a clearer understanding of the overall network model.
Additionally, the structure depicted in Figure 2 could be potentially confusing. Notably, the absence of an arrow pointing from the thalamus to the PFC and the apparent presence of two separate pathways, one from sensory input to the PFC and another from sensory input to the BG and then to the thalamus, may lead to confusion. While I recognize that Figure 2 aims to explain network gating, there is room for improvement in presenting the content accurately.
As suggested, we added a figure (new figure 2) illustrating the overall model architecture before expanding it to show the chunking circuitry. This figure also shows the projections from thalamus to PFC (we preserve the previous figure 2, now figure 3, as an example sequence of network gating decisions, in more abstract form to help facilitate a functional understanding of the sequence of events without too much clutter). We also made several other general clarifications to the methods sections to make it more transparent and easier to follow, as per your suggestions.
Still, for the method part, it would enhance clarity to explicitly differentiate between predesigned (fixed) components and trainable components. Specifically, does the supplementary material state that synaptic connection weights in striatal units (Go&NoGo) are trained using XCAL, while other components, such as those in the PFC and lateral inhibition, are not trained (I found some sentences in 'Limitations and Future Directions')?
We have now explicitly specified learned and fixed components. We have further explained the role of XCAL and how striatal Go/NoGo weights are trained. We have also added clarification on how gating policies are learned via eligibility traces and synaptic tags.
I'm not sure about the training process shown in Figure 8. It appears that the training may not have been completed, given that the blue line representing the chunk stripe is still ascending at the endpoint. The weights depicted in panel d) seem to correspond with those shown in panels b) and c), no? Then, how is the optimization process determined to be finished? Alternatively, could it be stated that these weight differences approach a certain value asymptotically? It would be better to clarify the convergence criteria of the optimization process.
The training process has been clarified and we specify (in the last paragraph of the Base PBWM Model) how we determine when training is complete. We also can confirm that the network behavior has stabilized in learning even if the Go/NoGo weights continue to grow over time for the chunked layer (due to imperfect performance and reinforcement of the chunk gating strategy).
Reviewer 2:
Thank you for your comments. We appreciate your notes on the strengths of the paper and your suggestions to help improve the paper.
The model employs a spiking neural network, which is relatively complex. Additionally, while this paper validates the effectiveness of chunking strategies used by the brain to enhance working memory efficiency through computational simulations, further comparison with related phenomena observed in cognitive neuroscience experiments on limited working memory capacity, such as the recency effect, is necessary to verify its generalizability.
Thank you for proposing we add in more connections with human WM. Based on your specific recommendation, we have included the section “Network recapitulates human sequential effects in working memory”, where we discuss recency effects in human working memory and how our model recapitulates this effect. We have also made the connections to human data and human work more explicit throughout the manuscript (Figure 4c). As noted in response to the assessment, we believe our model does make contact with a wide variety of cognitive neuroscience data in human WM, such as the shape of the error distributions, including probability of recall and variable precision; attraction to recently presented items; sensitivity to reinforcement history; set-size dependent chunking; recency effects; and dopamine manipulation effects, as well as a range of human data linking capacity limitations to frontostriatal function. It also provides a theoretical proposal for the well-established phenomenon of capacity limitations in humans, suggesting that they arise due to difficulty in WM management.
Recommendations For The Authors:
Reviewer 1:
I appreciate the authors' clear discussion of the limitations of this work in the section "Limitations and Future Directions". The development of a comprehensive model framework to overcome these constraints should require a separate paper, though, I am curious if the authors have attempted any experiments, such as using two identically designed chunking layers, that could partially support the assumptions presented in the paper.
Expanding the number of chunking layers is a great future direction. We felt that it was most effective for this paper to begin with a minimal setup as a proof of concept. We hypothesize that, given our results, a reinforcement learning algorithm would be able to learn to select the best level of abstraction (degree of chunking) in more continuous form, but would require more experience across a range of tasks to do so.
I'm not sure whether it's appropriate that "Frontostriatal Chunking Gating..." precedes "Dopamine Balance is...", maybe it would be better to reverse the order thus avoiding the need to mention the role of dopamine before delving into the details. Additionally, including a summary at the end of the Introduction, outlining how the paper is organized, could provide readers with a clear roadmap of the forthcoming content.
We appreciate this suggestion. After careful thought, we wanted to preserve the order because we felt it was important to make the direct connection between set size and stripe usage following the discussion on performance based on increasing stripes.
The authors could improve the overall polish of the paper. The equations in the Method section are somewhat confusing: Eq. (2) appears incorrect, as it lacks a weight w_i and n should presumably be in the denominator. For Eq. (3), the comma should be replaced with ']'... It would be advisable to cross-reference these equations with the original O'Reilly and Frank paper for consistency.
Thank you for pointing out the errors in the method equations; those equations were indeed rendering incorrectly. We have fixed this problem.
Additionally, there are frequent instances of missing figure and reference citations (many '?'s), and it would be beneficial to maintain consistent citation formatting throughout the paper: sometimes citations are presented as "key/query coding (Traylor, Merullo, Frank, and Pavlick, 2024; see also Swan and Wyble, 2014)", while other times they are written as "function (O'Reilly & Frank, 2006)"...
Lastly, there is an empty '3.1' section in the supplementary material that should be addressed.
The citation issues were fixed. The supplementary information was cleaned and the missing section was removed. Thank you for mentioning these errors.
Reviewer 2:
Thank you for the following recommendations and suggestions. We respond to each individual point based on the numbering system used in your review.
(1) This paper utilizes the experimental paradigm of visual working memory, in which different visual stimuli are sequentially loaded into the working memory system, and the accuracy of memory for these stimuli is calculated.
The authors could further plot the memory accuracy curve as the number of items (N) increases, under both chunking and non-chunking strategies. This would allow for the examination of whether memory accuracy suddenly declines at a specific value of N (denoted as Nc), thereby determining the limited capacity of working memory within this experimental framework, which is about 4 different items or chunks. Additionally, it could be investigated whether the value of Nc is larger when the chunking strategy is applied.
We have included an additional plot (Probability of Recall) as a supplemental figure to Figure 5 to explore the probability of recall as a function of set size for both chunking and no chunking models. This plot shows that the chunking model increases probability of recall when set size exceeds allocated capacity (but that nevertheless both models show decreases in recall with set size, consistent with the literature).
(2) The primacy effect or recency effect observed in the experiments and traditional working memory models, including the slot model and the limited resource model, should be examined to see if it also appears in this model.
The literature on human working memory shows a prevalent recency effect (but not a primacy effect, which is thought to be due to episodic memory, and which is not included in our model). We have added a section showing that our model demonstrates clear recency effects.
(3) The construction of the model and the single neuron dynamics involved need further refinement and optimization:
Model Description: The details of the model construction in the paper need to be further elaborated to help other researchers better understand and apply the model in reproducing or extending research. Specifically:
a) The construction details of different modules in the model (such as Input signal, BG, striatum, superficial PFC, deep PFC) and the projection relationships between different modules. Adding a diagram to illustrate the network construction would be beneficial.
To aid in the understanding of the model construction and model components, we have included an additional figure (Figure 1: Base Model) that explains the key layers and components of the model. We have also altered the overall model figures to show more clearly that the inputs project to both PFC and striatum, to highlight that information is temporarily represented in superficial PFC layers even before striatal gating, which is needed for storage after the input decays.
We have expanded the methods and equations and we also provide a link to the model github for purposes of reproducibility and sharing.
A base model figure was added to specify key connections.
a) The numbers of excitatory and inhibitory neurons within different modules and the connections between neurons.
We added clarification on the type of connections between layers (specifying which are fixed and learned). We have also added the size of layers in a new appendix section “Layer Sizes and Inner Mechanics”
b) The dynamics of neurons in different modules need to be elaborated, including the description of the dynamic equations of variables (such as x) involved in single neuron equations.
Single neuron dynamics are explained in equations 1-4. Equations 5-6 explain how activation travels between layers. The specific inhibitory dynamics in the chunking layer are elaborated in Figure 4. PBWM Model and Chunking Layer Details. The Appendix section “Neural model implementational details” states the key equations, neural information and connectivity. Since there is a large corpus of background information underlying these models, we have linked the Emergent github and specifically the Computational Cognitive Neuroscience textbook, which has a detailed description of all equations. For the sake of paper length and understandability, we chose the most relevant equations that distinguish our model.
c) The selection of parameters in the model, especially those that significantly affect the model's performance.
The Appendix section “Hyperparameter Search” details some of the key parameters and why those values were chosen.
d) The model employs a sequential working memory paradigm, the forms of external stimuli involved in the encoding and recalling phases (including their mathematical expressions, durations, strengths, and other parameters) need to be elaborated further.
We appreciate this comment. We have expanded the Appendix section “Continuous Stimuli” to include the details of stimuli presentation (including durations etc).
(4) The figures in the paper need optimization. For example, the size of the schematic diagram in Figure 2 needs to be enlarged, while the size of text such as "present stimulus 1, 2, recall stimulus 1" needs to be reduced. Additionally, the citation of figures in the main text needs to be standardized. For example, Figure 1b, Figure 1c, etc., are not cited in the main text.
The task sequence figure (original Figure 2) has been modified and, following your suggestions, the text sizes have been adjusted.
(5) Section 3.1 in the appendix is missing.
Supplemental section 3.1 is removed.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
MacDonald et al. investigated the consequence of double knockout of substance P and CGRPα on pain behaviors using a newly created mouse model. The investigators used two methods to confirm knockout of these neuropeptides: traditional immunolabeling and a neat in vitro assay where sensory neurons from either wildtype or double knockout mice are co-cultured with substance P "sniffer cells", HEK cells stably expressing NKR1 (a substance P receptor), GCaMP6s and Gα15. It should be noted that functional assays confirming CGRPα knockout were not performed. Subsequently, the authors assayed double knockout (DKO) and wildtype (WT) mice in numerous behavioral assays using different pain models, including acute pain and itch stimuli, intraplantar injection of Complete Freund's Adjuvant, prostaglandin E2, capsaicin, AITC, oxaliplatin, as well as the spared nerve injury model. Surprisingly, the authors found that pain behaviors did not differ between DKO and WT mice in any of the behavioral assays or pain paradigms. Importantly, female and male mice were included in all analyses. These data are important and significant, as both substance P and CGRPα have been implicated in pain signaling, though the magnitude of the effect of a single knockout of either gene has been variable and/or small between studies.
The conclusions of the study are largely supported by the data; however, additional experimental controls and analyses would strengthen the authors' claims.
We thank the reviewer for their insightful comments and have answered them below.
(1) The authors note that single knockout models of either substance P or CGRPα have produced variable effects on pain behaviors that are study-dependent. Therefore, it would have strengthened the study if the authors included these single knockout strains in a side-by-side analysis (in at least some of the behavioral assays), as has been done in prior studies in the field when using double- or triple-knockout mouse models (for example, see PMID: 33771873). If in the authors' hands, single knockouts of either peptide also show no significant differences in pain behaviors, then the finding that double knockouts also do not show significant differences would be less surprising.
In our study, we found no phenotypic differences between WT and DKO mice, suggesting Substance P and CGRPα are largely dispensable for pain behavior. We agree that if we had observed significant changes in behavior, it would have been interesting to examine the effects of knocking out each gene individually to determine which peptide is responsible for the phenotype. However, given the double deletion had no effect, we can predict that loss of each alone would have no or minor effects. In line with this, a more recent study that comprehensively phenotyped the Calca KO mouse found no deficits in a range of danger-related behaviors (PMID: 34376756). Overall, as we are reporting negative data about the Double KO, we do not believe extensive studies of the single KOs are necessary to support the findings of our paper.
(2) It is unclear why the authors only show functional validation of substance P knockout using "sniffer" cells, but not CGRPα. Inclusion of this experiment would have added an additional layer of rigor to the study.
Imaging of CGRPα release is more challenging using the ‘sniffer’ approach because functional CGRP receptors require the expression of two genes: Calcrl (or Calcr) along with Ramp1. We have now succeeded in generating a new stable cell line expressing Calcrl and Ramp1, along with GCaMPs and human Gα15, and include new data in the revised Figure 1F-H and Figure Supplement 1B. These cells respond robustly to CGRPα, but not to SP. In contrast, the existing SP cell line responds to SP but not CGRPα. Capsaicin evokes a strong response in the CGRPα-responsive cells in co-culture with DRGs. This response is dramatically reduced in the DKO. These data therefore confirm that our mice have a loss of CGRPα signaling, as indicated by IHC.
(3) The authors should be a bit more reserved in the claims made in the manuscript. The main claim of the study is that "CGRPα and substance P are not required for pain transmission." However, the authors also note that neuropeptides can have opposing effects that may produce a net effect of no change. In my view, the data presented show that double knockout of substance P and CGRPα does not affect somatic pain behaviors, but do not preclude a role for either of these molecules in pain signaling more generally. Indeed, the authors also note that these neuropeptides could be involved in nociceptor crosstalk with the immune or vascular systems to promote headache. The authors only assayed pain responses to glabrous skin stimulation. How the DKO mice would behave in orofacial pain assays, migraine assays, visceral pain assays, or bone/joint pain assays, for example, was not tested. I do not suggest the authors include these experiments, only that they address the limitations/weaknesses of their study more thoroughly.
The reviewer makes an important point that we agree with. Our study assesses acute and chronic pain in peptide DKO mice lacking Substance P and CGRPα. Most of our data focuses on the hindpaw as pain in the paw is the gold-standard approach for phenotyping pain targets and numerous well-validated chronic pain models have been developed for this body site. However, to extend the conclusions to other tissues, we did also look at visceral pain and GI distress using acetic acid and LiCl models (Figure 2J and Figure 2 supplement). We agree with the reviewer that given the utility of CGRP monoclonal antibodies, migraine experiments would be interesting for future studies using these mice, a point we highlight in the discussion. Bone/joint pain is also clearly important from a translational perspective, but outside the scope of the current study.
(4) A more minor but important point: the authors do not describe the nature of the WT animals used. Are they littermates or a separately maintained colony of WT animals? The WT strain background should be included in the methods section.
The WT mice are C57BL/6J from The Jackson Laboratory. This has been added to the methods.
Reviewer #2 (Public Review):
Summary:
The paper aimed to examine the effect of co-ablating Substance P and CGRPα peptides on pain using Tac1 and Calca double knockout (DKO) mice. The authors observed no significant changes in acute, inflammatory, and neuropathic pain. These results suggest that Substance P and CGRPα peptides do not play a major role in mediating pain in mice. Moreover, they reveal that the lack of behavioral phenotype cannot be explained by the redundancy between the two peptides, which are often co-expressed in the same neuron.
Strengths:
The paper uses a straightforward approach to address a significant question in the field. The authors confirm the absence of Substance P and CGRPα peptides at the levels of DRG, spinal cord, and midbrain. Subsequently, they employ a comprehensive battery of behavioral tests to examine pain phenotypes, including acute, inflammatory, and neuropathic pain. Additionally, they evaluate neurogenic inflammation by measuring edema and extravasation, revealing no changes in DKO mice. The data are compelling, and the study's conclusions are well-supported by the results. The manuscript is succinct and well-presented.
We thank the reviewer for their enthusiasm for the importance of our work.
Reviewer #3 (Public Review):
In this study, the authors assessed the role of double global knockout of substance P and CGRPα in the transmission of acute and chronic pain. The authors first generated the double knockout (DKO) mice and validated their animal model. This was followed by a series of acute and chronic pain assessments to evaluate whether the global DKO of these neuropeptides is important in modulating acute and chronic pain behaviors. The authors found that Substance P and CGRPα are not required for the transmission of acute and chronic pain in these DKO mice, although both neuropeptides are strongly implicated in chronic pain. This study does provide more insight into the role of these neuropeptides in chronic pain processing; however, more work still needs to be done (see the comments below).
We thank the reviewer for their detailed and constructive feedback, and below outline the steps we have taken to answer their concerns.
(1) In assessing the double KO (result #1), why are different regions of the brain shown for substance P and CGRPα (for example, midbrain for substance P and amygdala for CGRPα)? Since the authors mentioned that these peptides are co-expressed in the brain (as in the introduction), shouldn't the same brain regions be shown for both IHC? It would be ideal if the authors could show both regions (midbrain and amygdala) in addition to the DRG and spinal cord for both peptides in their findings.
In addition, since this is a double KO, the authors should show more representative IHC-stained brain regions (spanning from the anterior to posterior).
We could not co-stain both SP and CGRP in the same sections as the DKO mouse has endogenous GFP and RFP fluorescence, limiting us to one channel (far red). Specifically, we use a Calca KO that is a Cre:GFP knock-in/knockout (Chen et al 2018, PMID: 30344042) and a Tac1 KO that is a tagRFP knock-in/knockout (Wu et al 2018, PMID: 29485996). This is why we show different brain sections.
(2) It is also unclear as to why the authors only assessed the loss of substance P signaling in the double KO mice. Shouldn't the same be done for CGRPα signaling? Either the authors assess this, or the authors have to provide clear explanations as to why only substance P signaling was assessed.
As noted in our response to Reviewer 1, imaging of CGRP release is more challenging using the ‘sniffer’ approach because functional CGRP receptors require the expression of two genes: Calcrl (or Calcr) along with Ramp1. We have now generated this cell line and performed the experiment (see revised Figure 1 and Figure 1 Supplement).
(3) Has these animals' naturalistic behavior been assessed after the double KO (food intake, sleep, locomotion, for example)? I think this is important as changes to these naturalistic behaviors can affect pain processes or outcomes.
We agree that assessment of naturalistic behavior including food intake, sleep and locomotion would be interesting to look at in DKO mice. However, our study is focused on acute and chronic pain behavior of these animals, and therefore a comprehensive phenotypic assessment of naturalistic home-cage behavior is outside the scope of our study.
(4) Figure 2H: The authors acknowledge that there is a trend to decrease with capsaicin-evoked coping-like responses. However, a close look at the graph suggests that the lack of significance could be driven by 1 mouse. Have the authors run an outlier test? Alternatively, the authors should consider adding more n to these experiments to verify their conclusions.
We were reluctant to add more animals in search of significance. Instead, we investigated the potential phenotype further by looking at cFos staining in the cord and found no differences (Figure 2, supplement 1). This result suggests loss of the two peptides does not grossly disrupt capsaicin-evoked pain signal transmission between the nociceptor and post-synaptic dorsal neurons in the spinal cord.
(5) Similarly, the values for WT in the evoked cFos activity (Figure 2- Suppl Figure 1) are pretty variable. Considering that the n number is low (n = 5), authors should consider adding more n.
Also, since the n number is low in this experiment (e.g., 5 vs 4), does this pass the normality test to run a parametric unpaired t-test? Either the authors increase their n numbers or run the appropriate statistical test.
As described in the statistical tables, the Shapiro-Wilk test indicates these data do pass the normality test. Therefore, we retain the use of the unpaired t test, which demonstrates no significant difference between the groups.
(6) In most of the results, authors ran a parametric test despite the low n number. Authors have to ensure that they are carrying out the appropriate statistical test for their dataset and n number.
We now provide a table of the statistical results, which provides detailed information about all statistical tests performed in this study. For experiments where we make a single comparison between the two distributions (WT vs DKO), we have run a Shapiro-Wilk test. Where the data from both groups pass the normality test, we retain the use of the unpaired t test. Where the Shapiro-Wilk test indicates data from either group are unlikely to be normally distributed, we now use a Mann-Whitney U test to compare the groups, as this non-parametric test makes no assumptions about the underlying distribution.
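The decision rule described above can be sketched in a few lines. This is only an illustration, not the authors' analysis script: it uses SciPy rather than the software used in the study, the 0.05 threshold for the Shapiro-Wilk test is an assumption, and the function name and example values are hypothetical.

```python
from scipy import stats

ALPHA = 0.05  # assumed threshold for the Shapiro-Wilk normality check


def compare_groups(wt, dko, alpha=ALPHA):
    """Choose an unpaired t test or a Mann-Whitney U test from Shapiro-Wilk results."""
    # Shapiro-Wilk is run on each group separately.
    _, p_norm_wt = stats.shapiro(wt)
    _, p_norm_dko = stats.shapiro(dko)

    if p_norm_wt > alpha and p_norm_dko > alpha:
        # Both groups pass the normality check: keep the parametric test.
        name = "unpaired t test"
        statistic, p_value = stats.ttest_ind(wt, dko)
    else:
        # Either group fails: use the rank-based test instead.
        name = "Mann-Whitney U test"
        statistic, p_value = stats.mannwhitneyu(wt, dko, alternative="two-sided")
    return name, statistic, p_value


# Hypothetical example values (arbitrary units), not data from the study.
wt_example = [9.8, 10.4, 11.1, 10.0, 10.7, 9.5]
dko_example = [10.1, 9.9, 10.9, 10.5, 9.7, 10.3]
skewed_example = [1.0, 1.1, 0.9, 1.0, 1.2, 25.0]  # one extreme value

print(compare_groups(wt_example, dko_example)[0])
print(compare_groups(skewed_example, dko_example)[0])
```

With small groups the Shapiro-Wilk test has limited power, so a rule like this mainly guards against clearly non-normal data such as the skewed example.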
Many experiments involved two factors (genotype and, e.g., temperature, drug, or time-point). These data were analyzed in the original submission using two-way ANOVA or repeated-measures two-way ANOVA, followed by post-hoc Sidak’s tests to compute p values adjusted for multiple comparisons. Because there is no widely agreed non-parametric alternative to two-way ANOVA that handles two factors and accounts for multiple comparisons, we retained two-way ANOVA, as is typical in the field for these kinds of experiments. We reasoned that this was the best course of action based on information provided by the statistical software used for this study - https://www.graphpad.com/support/faq/with-two-way-anova-why-doesnt-prism-offer-a-nonparametric-alternative-test-for-normality-test-for-homogeneity-of-variances-test-for-outliers/
We note that regardless of the test, our conclusion that there are no major changes in acute or chronic pain behaviors is clear and strongly supported.
(7) Along the same line of comment with the previous, authors should increase the n number for DKO for staining (Figure 4) as n number is only 3 and there is variability in the cFos quantification in the ipsilateral side.
We believe this is not necessary as the finding is clear that there is no difference.
(8) Authors should provide references for the statement made in Line 319-321, as the authors mentioned that there is accumulating evidence indicating that secretion of these neuropeptides from nociceptor peripheral terminals modulates immune cells and the vasculature in diverse tissues.
We now provide several references to primary papers and reviews supporting this statement.
(9) Authors state that the sample size used was similar to those from previous studies, but no references were provided. Also, even though the sample sizes used were similar, I believe that the right statistical test should be used to analyze the data.
We have now cited several classic studies phenotyping mouse KOs in pain in the methods that used similar sample sizes. As detailed above, we have taken the reviewer’s feedback on board and performed normality testing to ensure the correct statistical test is used for each experiment.
(10) In the discussion, the authors noted that knocking out of a gene remains the strongest test of whether the molecule is essential for a biological phenomenon. At the same time, it was acknowledged that Substance P infusion into the spinal cord elicits pain, but it is analgesic in the brain. The authors might want to expand more on this discussion, including how we can selectively assess the role of these neuropeptides in areas of interest. For example, knocking out both Substance P and CGRPα in selected areas instead of the global KO since there are reported compensatory effects.
This is highlighted in the closing paragraph: “Emerging approaches to image and manipulate these molecules (Girven et al., 2022; Kim et al., 2023), as well as advances in quantitating pain behaviors (Bohic et al., 2023; MacDonald and Chesler, 2023), may ultimately reveal the fundamental roles of neuropeptides in generating our experience of pain.” The Kim preprint (now published, and so the citation has been updated in the text) describes a method of inactivating neuropeptide transmission in select brain regions in a cell-type specific manner.
Recommendations for the authors:
Reviewer #2 (Recommendations For The Authors):
I do not have any major comments. My minor comments are as follows:
(1) What was the control group for all behavioral studies? Was it WT from an independent colony or one of the littermates was used for generating controls?
We used C57BL/6J mice from Jax. This is now mentioned in the methods.
(2) In Fig. 2H, it seems that the effect will become significant if several mice are added.
We are reluctant to add mice in search of significance. Sample sizes were determined before data collection, and the data were collected blind.
(3) There is no figure 3, but two figures 4.
Thank you. This has been corrected.
(4) Multiple typos in the legend for figure 4 (lines 234-254). Line 242 (& n=8 (3M, 3F)), line 243 (swelling and plasma), line 252 ((n=8 for) & n=6 for DKO (4M, 4F)).
Thank you. This has been corrected.
(5) In Figure 4 (lines 273-285), the contralateral side is mentioned in B but no images are shown.
Thank you. We removed the mention.
(6) Although ligand knockouts cannot be compared directly with receptor inhibition, the readers could benefit from discussing studies of receptor ablation and/or pharmacological inhibition.
We do discuss the classic studies of receptor KO, and the clinical data on receptor blockers here –
“However, selective antagonists of the Substance P receptor NKR1 failed to relieve chronic pain in human clinical trials (Hill, 2000). Although CGRP monoclonal antibodies and receptor blockers have proven effective for subsets of migraine patients, their usefulness for other types of pain in humans is unclear (De Matteis et al., 2020; Jin et al., 2018). In line with this, knockout mice deficient in Substance P, CGRPα or their receptors have been reported to display some pain deficits, but the analgesic effects are neither large nor consistent between studies (Cao et al., 1998; De Felipe et al., 1998; Guo et al., 2012; Salmon et al., 2001, 1999; Zimmer et al., 1998).”
Reviewer #3 (Recommendations For The Authors):
Minor comments:
(1) Figure 1E: What does chambers mean? Additionally, are the 12 chambers equally from the male and female samples (6 from male and 6 from female)?
We have changed this to ‘well’. Each replicate is an individual well from an 8-well chamber slide. In all these experiments, the wells are approximately evenly distributed by mouse, because from each mouse we cultured around 8 wells’ worth of DRGs.
(2) Figure 1D: What does low and high mean in the Hargreaves test?
These refer to low and high active intensity settings of the radiant heat stimulus: 40 and 55, respectively, in the intensity units used by the instrument. These values are now described in the methods.
(3) Figure 2-Suppl Figure 1: Authors should provide a bigger image of the image so that it is clearer to the readers.
We think the image is of a reasonable size and comparable to the images used elsewhere in the paper.
(4) Authors should consider labeling their supplementary figures in running numbers or combining supplementary figures together to avoid confusion. For example, Figure 2-Supplementary Figure 1 and Figure 2- Supplementary Figure 2 can be combined as just Supplementary Figure 2.
We agree with the reviewer this would be clearer, but we have followed eLife’s convention for labelling and numbering supplements.
(5) Figure 3 is mislabeled as Figure 4.
Thank you. We have corrected this.
(6) Only female mice were used in the CFA experiment, which does not go in line with the rest of the results which consist of both sexes.
We have repeated the experiment with additional male mice. To be consistent with the von Frey data, these were followed for 7 days, and so the figure now shows a 7-day time course.
(7) Typo in line 243. The word "and" is subscript.
Thank you. We have corrected this.
(8) There is a typo in the legend for Figure 4 where E is labeled I, G is labeled as F, and J is labeled as J.
Thank you. We have corrected this.
(9) Authors should specify what "several weeks" means (Line 263).
It means three weeks. We tested to 21 days. We will replace with three.
(10) Authors should specify what "one day" means (Line 267). For example, how many days after the intraplantar oxaliplatin treatment? Also, authors should justify why that specific time point was selected or have a reference for it.
This means one day (24 hours) after treatment. Please see PMID: 33693512. Two references are provided in the methods.
(11) Figure 4 legend: authors should again be specific on what "prolonged" entails (Line 277).
We have replaced prolonged with 30 minutes brushing. Specifically, 3 x 10 min stim period, with 1 min rest between stim. It is in the methods.
(12) In the methods section, authors state that both male and female mice were used for all experiments. However, only female mice were used in the CFA experiment (see minor comment #6). Authors should verify and correct this.
This is correct. We only used female mice for one of the groups. We have since repeated with males, now included in the data.
(13) Authors should be more specific in the methods section on how long the habituation was per day, how many days it lasted, and what the mice were habituated to (experimenter, room, chamber, etc.).
As noted in the methods, mice are habituated for at least an hour to the chambers, and thus implicitly to the room. We do not perform explicit habituation to the investigator such as repeated handling.
(14) Authors need to provide more information on the semi-automated procedure they are referring to in Line 397. Also, authors should also provide the criteria for cFos quantification (eg. Intensity, etc). If this has been published before, they should provide the reference.
We have added this. We used the ‘Find maxima’ and ‘Analyze particles’ functions in FIJI, followed by a manual curation step.
(15) How much acetone was applied and how was it applied to the paw? (Line 495)
We used the same applicator (1ml syringe with a well at the top) to generate a droplet of acetone that was used for all mice. This has been added to methods.
(16) Authors should specify the amount of capsaicin injected in μl (Line 500).
20 μl. We have added this.
(17) Authors should explain or reference why they are analyzing the 15 min interval between 5 and 20 minutes after injection (Line 507-508).
Acetic acid behaviour lasts around 30 mins in our hands. We chose the 15 minute interval because it reduces burdensome hand-scoring time by 50% versus scoring the whole 30 mins. We reasoned that in the first 5 mins post injection the animal's behaviour may be contaminated by stress related to handling, injection and return to the chamber. Thus, the window from 5 to 20 minutes provided a sensible time-frame for scoring the behavior when it is at its peak.
(18) Authors have to provide more information/explanation on how they decide on the conditioned taste aversion protocol. Like why they do 30 mins exposure to a single water-containing bottle followed 90 mins exposure to both bottles. If this has been published before, they should provide the reference.
We read dozens of different published protocols in the literature, and piloted one that was something of an amalgam of some of them with various adaptations of convenience. Because it worked on our first attempt, we stuck to it. The advantage of the CTA assay is that it is incredibly robust to changes in the specifics of the paradigm, evincing the clear survival value of learning to avoid tastes that make you sick.
(19) Authors again should provide more detail in their methods section.
a. Specify the time frame that they are assessing here (Line 533).
This can be seen in the Figure. 0 to 120 mins. We have added it to the methods.
b. How long were the mice allowed to recover post-SNI before mechanical allodynia was assessed (Line 545)?
This is apparent in the figures. 2 days to 21 days. We have added it to the methods.
c. How much of the oxaliplatin was injected into the mice?
40 μg / 40 μl (see PMID: 33693512)
Editors' note: Reviewers agreed that addressing the concerns about power, outliers, and statistics, as well as functional validation of CGRPα would raise the strength of evidence to compelling, and inclusion of comparison to single KO would raise it to exceptional.
Should you choose to revise your manuscript, please check to ensure full statistical reporting including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals. These should be reported for all key questions and not only when the p-value is less than 0.05.
www.biorxiv.org
Author response:
Regarding a future revised version, we plan to:
- refer to the "MoMac-VERSE" study according to the original report.
- correct incorrectly formatted references.
- modify the text to acknowledge the heterogeneity and variability in the response of primary cells to the GSK3 inhibitor.
- improve the explanation of the reanalysis of single-cell RNA-seq data in Figure 7 (ref. 47, GSE120833), and redraw the scRNA-seq graphs using different plot parameters (e.g., reduction = "umap.scvi") to provide a more user-friendly visualization that includes bona fide macrophage markers for each subpopulation.
- include statistical analyses in each of the figure legends.
- perform additional analyses (e.g., dose-response and kinetics of CHIR-99021 effects) and mechanistic studies (e.g., role of proteasome) to further dissect the reprogramming ability of the GSK3/MAFB axis.
www.biorxiv.org
Author response:
eLife Assessment
This study provides valuable insights into the behavioral, computational, and neural mechanisms of regime shift detection, by identifying distinct roles for the frontoparietal network and ventromedial prefrontal cortex in sensitivity to signal diagnosticity and transition probabilities, respectively. The findings are supported by solid evidence, including an innovative task design, robust behavioral modeling, and well-executed model-based fMRI analyses, though claims of neural selectivity would benefit from more rigorous statistical comparisons. Overall, this work advances our understanding of how humans adapt belief updating in dynamic environments and offers a framework for exploring biases in decision-making under uncertainty.
Thank you for reviewing our manuscript. We appreciate the editors’ assessment and the reviewers’ constructive comments. Below we address the reviewers’ comments. In particular, we addressed Reviewer 1’s comments on (1) neural selectivity by performing statistical comparisons and (2) parameter estimation by providing more details on how the system-neglect model was parameterized. We addressed Reviewer 2’s comments on (1) our neuroimaging results regarding frontoparietal network and (2) model comparisons.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The study examines human biases in a regime-change task, in which participants have to report the probability of a regime change in the face of noisy data. The behavioral results indicate that humans display systematic biases, in particular, overreaction in stable but noisy environments and underreaction in volatile settings with more certain signals. fMRI results suggest that a frontoparietal brain network is selectively involved in representing subjective sensitivity to noise, while the vmPFC selectively represents sensitivity to the rate of change.
Strengths:
(1) The study relies on a task that measures regime-change detection primarily based on descriptive information about the noisiness and rate of change. This distinguishes the study from prior work using reversal-learning or change-point tasks in which participants are required to learn these parameters from experiences. The authors discuss these differences comprehensively.
Thank you for recognizing our contribution to the regime-change detection literature and our effort in discussing our findings in relation to the experience-based paradigms.
(2) The study uses a simple Bayes-optimal model combined with model fitting, which seems to describe the data well.
Thank you for recognizing the contribution of our Bayesian framework and system-neglect model.
(3) The authors apply model-based fMRI analyses that provide a close link to behavioral results, offering an elegant way to examine individual biases.
Thank you for recognizing our execution of model-based fMRI analyses and effort in using those analyses to link with behavioral biases.
Weaknesses:
My major concern is about the correlational analysis in the section "Under- and overreactions are associated with selectivity and sensitivity of neural responses to system parameters", shown in Figures 5c and d (and similarly in Figure 6). The authors argue that a frontoparietal network selectively represents sensitivity to signal diagnosticity, while the vmPFC selectively represents transition probabilities. This claim is based on separate correlational analyses for red and blue across different brain areas. The authors interpret the finding of a significant correlation in one case (blue) and an insignificant correlation in the other (red) as evidence of a difference in correlations (between blue and red) but don't test this directly. This has been referred to as the "interaction fallacy" (Nieuwenhuis et al., 2011; Makin & Orban de Xivry 2019). Not directly testing the difference in correlations (but only the differences to zero for each case) can lead to wrong conclusions. For example, in Figure 5c, the correlation for red is r = 0.32 (not significantly different from zero) and the correlation for blue is r = 0.48 (different from zero). However, the difference between the two is 0.1, and it is likely that this difference itself is not significant. From a statistical perspective, this corresponds to an interaction effect that has to be tested directly. It is my understanding that analyses in Figure 6 follow the same approach.
Relevant literature on this point is:
Nieuwenhuis, S, Forstmann, B & Wagenmakers, EJ (2011). Erroneous analyses of interactions in neuroscience: a problem of significance. Nat Neurosci 14, 1105-1107. https://doi.org/10.1038/nn.2886
Makin TR, Orban de Xivry, JJ (2019). Science Forum: Ten common statistical mistakes to watch out for when writing or reviewing a manuscript. eLife 8:e48175. https://doi.org/10.7554/eLife.48175
There is also a blog post on simulation-based comparisons, which the authors could check out: https://garstats.wordpress.com/2017/03/01/comp2dcorr/
I recommend that the authors carefully consider what approach works best for their purposes. It is sometimes recommended to directly compare correlations based on Monte-Carlo simulations (cf Makin & Orban). It might also be appropriate to run a regression with the dependent variable brain activity (Y) and predictors brain area (X) and the model-based term of interest (Z). In this case, they could include an interaction term in the model:
Y = \beta_0 + \beta_1 \cdot X + \beta_2 \cdot Z + \beta_3 \cdot X \cdot Z
The interaction term reflects if the relationship between the model term Z and brain activity Y is conditional on the brain area of interest X.
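A minimal sketch of such an interaction model, fitted by ordinary least squares, is shown below. It assumes NumPy; the function name, the dummy coding of brain area, and the noise-free toy data are hypothetical and chosen only to show that the coefficient on X·Z recovers the difference in slopes between areas. It also treats observations as independent, which would need adjusting for repeated measures from the same subjects.

```python
import numpy as np


def fit_interaction(y, x, z):
    """Least-squares fit of Y = b0 + b1*X + b2*Z + b3*X*Z; returns [b0, b1, b2, b3]."""
    y, x, z = (np.asarray(v, dtype=float) for v in (y, x, z))
    # Design matrix: intercept, area dummy, model-based term, and their product.
    design = np.column_stack([np.ones_like(x), x, z, x * z])
    betas, *_ = np.linalg.lstsq(design, y, rcond=None)
    return betas


# Hypothetical toy data: X is a 0/1 dummy for brain area, Z a model-based term.
x_area = np.array([0, 0, 0, 0, 1, 1, 1, 1])
z_term = np.array([0.1, 0.4, 0.7, 1.0, 0.1, 0.4, 0.7, 1.0])
y_activity = 0.5 + 0.2 * x_area + 1.0 * z_term + 0.8 * x_area * z_term

betas = fit_interaction(y_activity, x_area, z_term)
print(betas)  # betas[3] estimates how much the Z slope differs between areas
```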
Thank you for this great suggestion. We tested the difference in correlation both parametrically and nonparametrically, and the two approaches led to the same conclusions. In our parametric test, we used the Fisher z transformation to transform the difference in correlation coefficients to a z statistic (Fisher, 1921). That is, for two correlation coefficients, r<sub>blue</sub> (the correlation between behavioral slope and neural slope estimated at change-consistent signals; sample size n<sub>blue</sub>) and r<sub>red</sub> (the correlation between behavioral slope and neural slope estimated at change-inconsistent signals; sample size n<sub>red</sub>), the z statistic of the difference in correlation is given by
z = \frac{\frac{1}{2}\ln\frac{1+r_{blue}}{1-r_{blue}} - \frac{1}{2}\ln\frac{1+r_{red}}{1-r_{red}}}{\sqrt{\frac{1}{n_{blue}-3} + \frac{1}{n_{red}-3}}}
We found that for two of the five ROIs in the frontoparietal network, namely the left IFG and left IPS, the difference in correlation was significant (one-tailed z test; left IFG: z=1.8355, p=0.0332; left IPS: z=2.3782, p=0.0087). For the remaining three ROIs, the difference in correlation was not significant (dmPFC: z=0.7594, p=0.2238; right IFG: z=0.9068, p=0.1822; right IPS: z=1.3764, p=0.0843). We chose a one-tailed test because we already knew that the correlation under the blue signals was significantly greater than 0. Hence the alternative hypothesis is that r<sub>blue</sub>–r<sub>red</sub>>0.
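The parametric comparison can be sketched with the Python standard library. This is an illustration rather than the authors' code: it assumes the standard Fisher z test for two correlations treated as independent, and the function name and the example inputs (r = 0.5 vs r = 0.0 with n = 30 each) are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_z_difference(r_a, n_a, r_b, n_b):
    """One-tailed test of H1: r_a > r_b using Fisher's z transformation."""
    z_a = math.atanh(r_a)  # atanh(r) equals 0.5 * ln((1 + r) / (1 - r))
    z_b = math.atanh(r_b)
    standard_error = math.sqrt(1.0 / (n_a - 3) + 1.0 / (n_b - 3))
    z = (z_a - z_b) / standard_error
    p_one_tailed = 1.0 - NormalDist().cdf(z)
    return z, p_one_tailed


# Hypothetical inputs, not values from the study.
z_example, p_example = fisher_z_difference(0.5, 30, 0.0, 30)
print(round(z_example, 3), round(p_example, 4))

# The reported one-tailed p values follow from the reported z statistics.
p_left_ifg = 1.0 - NormalDist().cdf(1.8355)
p_left_ips = 1.0 - NormalDist().cdf(2.3782)
print(round(p_left_ifg, 4), round(p_left_ips, 4))
```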
In our nonparametric test, we performed nonparametric bootstrapping to test for the difference in correlation. That is, we resampled the dataset with replacement (subject-wise) and used the resampled dataset to compute the difference in correlation. We repeated this procedure 100,000 times to obtain the distribution of the correlation difference. We then tested for significance and estimated the p-value based on this distribution. Consistent with our parametric tests, we found that the difference in correlation was significant in left IFG and left IPS (left IFG: r<sub>blue</sub>–r<sub>red</sub>=0.46, p=0.0496; left IPS: r<sub>blue</sub>–r<sub>red</sub>=0.5306, p=0.0041), but was not significant in dmPFC, right IFG, and right IPS (dmPFC: r<sub>blue</sub>–r<sub>red</sub>=0.1634, p=0.1919; right IFG: r<sub>blue</sub>–r<sub>red</sub>=0.2123, p=0.1681; right IPS: r<sub>blue</sub>–r<sub>red</sub>=0.3434, p=0.0631).
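A subject-wise bootstrap of this kind can be sketched as follows. The response does not state how the p-value was derived from the bootstrap distribution, so the rule used here (the share of resampled differences at or below zero) is an assumption, as are the function names, the toy data, and the reduced number of resamples.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_corr_difference(behavior, neural_blue, neural_red, n_boot=2000, seed=0):
    """Bootstrap r_blue - r_red by resampling subjects with replacement."""
    rng = random.Random(seed)
    n = len(behavior)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # same subjects for all three measures
        b = [behavior[i] for i in idx]
        blue = [neural_blue[i] for i in idx]
        red = [neural_red[i] for i in idx]
        try:
            diffs.append(pearson_r(b, blue) - pearson_r(b, red))
        except ZeroDivisionError:
            continue  # degenerate resample without variance; draw again
    # Assumed one-tailed p value: share of resamples with a difference <= 0.
    p_one_tailed = sum(d <= 0 for d in diffs) / n_boot
    return diffs, p_one_tailed


# Hypothetical toy data for 20 subjects: blue tracks behavior, red does not.
behavior = [i / 10 for i in range(20)]
neural_blue = [2 * b + 0.05 * (-1) ** i for i, b in enumerate(behavior)]
neural_red = [float((-1) ** i) for i in range(20)]

diffs, p_boot = bootstrap_corr_difference(behavior, neural_blue, neural_red)
print(p_boot)
```

Resampling whole subjects keeps each subject's behavioral and neural values paired, so the dependence between the two correlations is preserved in every resample.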
We will update these results in the revised manuscript. In summary, we found that the left IFG and left IPS in the frontoparietal network differentially responded to signals consistent with change (blue signals) compared with signals inconsistent with change (red signals). First, the neural sensitivity to signal diagnosticity measured when signals consistent with change appeared (blue signals) significantly correlated with individual subjects’ behavioral sensitivity to signal diagnosticity (r<sub>blue</sub>). By contrast, neural sensitivity to signal diagnosticity measured when signals inconsistent with change appeared did not significantly correlate with behavioral sensitivity (r<sub>red</sub>). Second, the difference in correlation, r<sub>blue</sub>–r<sub>red</sub>, was statistically significant between correlation obtained at signals consistent with change and correlation obtained at signals inconsistent with change.
Another potential concern is that some important details about the parameter estimation for the system-neglect model are missing. In the respective section in the methods, the authors mention a nonlinear regression using Matlab's "fitnlm" function, but it remains unclear how the model was parameterized exactly. In particular, what are the properties of this nonlinear function, and what are the assumptions about the subject's motor noise? I could imagine that by using the built-in function, the assumption was that residuals are Gaussian and homoscedastic, but it is possible that the assumption of homoscedasticity is violated, and residuals are systematically larger around p=0.5 compared to p=0 and p=1. Relatedly, in the parameter recovery analyses, the authors assume different levels of motor noise. Are these values representative of empirical values?
We thank the reviewer for this excellent point. The reviewer touched on model parameterization, assumption of noise, and parameter recovery analysis, which we answered below.
On our model was parameterized
We parameterized the model according to the system-neglect model in Eq. (2) and estimated the alpha parameter separately for each level of transition probability and the beta parameter separately for each level of signal diagnosticity. As a result, we had a total of 6 parameters (3 alpha and 3 beta parameters) in the model. The system-neglect model is then called by fitnlm so that these parameters can be estimated. The term ‘nonlinear’ regression in fitnlm refers to the fact that you can specify any model (in our case the system-neglect model) and estimate its parameters when calling this function. In our use of fitnlm, we assume that the noise is Gaussian and homoscedastic (the default option).
On the assumptions about subject’s motor noise
We wish to emphasize that we did not call the noise ‘motor’ because it can be estimation noise as well. Regardless, in the context of fitnlm, we assume that the noise is Gaussian and homoscedastic.
On the possibility that homoscedasticity is violated
In the revision, we plan to examine this possibility (residuals larger around p=0.5 compared with p=0 and p=1).
On whether the noise levels in parameter recovery analysis are representative of empirical values
To address the reviewer’s question, we conducted a new analysis using maximum likelihood estimation to estimate the noise level of each individual subject. We proceeded in the following steps. First, for each subject separately, we used the parameter estimates of the system-neglect model to compute the period-wise probability estimates of regime shift. As a reminder, we referred to a ‘period’ as the time when a new signal appeared during a trial (for a given transition probability and signal diagnosticity). Each trial consisted of 10 successive periods. Second, we computed the period-wise likelihood, the probability of observing the subject’s actual probability estimate given the probability estimate predicted by the system-neglect model and the noise level. Here we define noise as the standard deviation of a Gaussian distribution centered at the model-predicted probability estimate. We then summed over all periods the negative logarithm of likelihood and used MATLAB’s minimization algorithm (the ‘fmincon’ function) to obtain the noise estimate that minimized the sum of negative log likelihood (thus the noise estimate that maximized the sum of log likelihood). Across subjects, we found that the mean noise estimate was 0.1480 and ranged from 0.0816 to 0.3239. The noise estimate of each subject can be seen in the figure below.
Author response image 1.
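The noise-estimation step described above can be illustrated with a short sketch. It is not our MATLAB code: a golden-section search stands in for ‘fmincon’, the function names and search bounds are hypothetical, and the Gaussian is treated as untruncated. For this likelihood the minimizer also has a closed form, the root mean squared residual, which provides a convenient check.

```python
import math


def gaussian_nll(sigma, observed, predicted):
    """Negative log-likelihood summed over periods.

    Each observed probability estimate is modeled as Gaussian, centered at
    the model-predicted estimate with standard deviation sigma.
    """
    n = len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return n * math.log(sigma * math.sqrt(2 * math.pi)) + sse / (2 * sigma ** 2)


def fit_noise(observed, predicted, lo=1e-3, hi=1.0, tol=1e-8):
    """Golden-section search for the sigma in [lo, hi] that minimizes the NLL.

    The NLL is unimodal in sigma, so a bracketing search is sufficient here.
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if gaussian_nll(c, observed, predicted) < gaussian_nll(d, observed, predicted):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2
```

Given one subject's reported estimates and the corresponding model predictions, `fit_noise(observed, predicted)` returns that subject's noise estimate; it should agree with `sqrt(mean((observed - predicted)^2))` up to the search tolerance.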
Compared with our original parameter recovery analysis, where the maximum noise level was set at 0.1, our data indicated that some subjects’ noise was larger than this value. Therefore, we expanded our parameter recovery analysis to include noise levels beyond 0.1, up to 0.3. We found good parameter recovery across different levels of noise, with the Pearson correlation coefficient between the input parameter values used to simulate data and the estimated parameter values greater than 0.95 (Supplementary Fig. S3). The results will be updated in Supplementary Fig. S3.
Author response image 2.
Parameter recovery. We simulated probability estimates according to the system-neglect model. We used each subject’s parameter estimates as our choice of parameter values used in the simulation. Using simulated data, we estimated the parameters (𝛼 and 𝛽) in the system-neglect model. To examine parameter recovery, we plotted the parameter values we used to simulate the data against the parameter estimates we obtained based on simulated data and computed their Pearson correlation. Further, we added different levels of Gaussian white noise with standard deviation 𝜎 = 0.01, 0.05, 0.1, 0.2, 0.3 to the simulated data to examine parameter recovery and show the results in panels A, B, C, D, and E, respectively. For each noise level, we show the parameter estimates in the left two graphs. In the right two graphs, we plot the parameter estimates based on simulated data against the parameter values used to simulate the data. A. Noise 𝜎 = 0.01. B. Noise 𝜎 = 0.05. C. Noise 𝜎 = 0.1. D. Noise 𝜎 = 0.2. E. Noise 𝜎 = 0.3.
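The logic of a parameter recovery analysis (simulate with known parameters, add Gaussian noise, refit, then correlate inputs with estimates) can be sketched as below. To keep the example self-contained, a one-parameter linear model y = a·x is used as a stand-in; it is not the system-neglect model, and all names and values are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_slope(x, y):
    """Closed-form least-squares slope for the stand-in model y = a * x."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def recovery_correlation(true_params, sigma, seed=0):
    """Simulate one noisy data set per input parameter, refit, and correlate.

    Returns the Pearson correlation between the parameter values used to
    simulate the data and the estimates obtained from the simulated data.
    """
    rng = random.Random(seed)
    x = [i / 10 for i in range(1, 11)]  # 10 design points per simulated subject
    estimates = []
    for a in true_params:
        y = [a * xi + rng.gauss(0.0, sigma) for xi in x]  # add Gaussian white noise
        estimates.append(fit_slope(x, y))
    return pearson_r(true_params, estimates)
```

Calling `recovery_correlation` once per noise level (for example 0.01, 0.05, 0.1, 0.2, 0.3) shows how recovery degrades as noise grows; with a real model, `fit_slope` would be replaced by the actual fitting routine.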
We will update the parameter recovery section (p. 44) and Supplementary Figure S3 to incorporate these new results:
“We implemented 5 levels of noise with σ={0.01,0.05,0.1,0.2,0.3} and examined the impact of noise on parameter recovery for each level of noise. These noise levels covered the range of empirical noise levels we estimated from the subjects. To estimate each subject’s noise level, we carried out maximum likelihood estimation in the following steps. First, for each subject separately, we used the parameter estimates of the system-neglect model to compute the period-wise probability estimates of regime shift. As a reminder, we referred to a ‘period’ as the time when a new signal appeared during a trial (for a given transition probability and signal diagnosticity). Each trial consisted of 10 successive periods. Second, we computed the period-wise likelihood, the probability of observing the subject’s actual probability estimate given the probability estimate predicted by the system-neglect model and the noise level. Here we define noise as the standard deviation of a Gaussian distribution centered at the model-predicted probability estimate. We then summed over all periods the negative natural logarithm of likelihood and used MATLAB’s minimization algorithm (the ‘fmincon’ function) to obtain the noise estimate that minimized the sum of negative log likelihood (thus the noise estimate that maximized the sum of log likelihood). Across subjects, we found that the mean noise estimate was 0.1480 and ranged from 0.0816 to 0.3239 (Supplementary Figure S3).”
The main study is based on N=30 subjects, as are the two control studies. Since this work is about individual differences (in particular w.r.t. neural representations of noise and transition probabilities in the frontoparietal network and the vmPFC), I'm wondering how robust the results are. Is it likely that the results would replicate with a larger number of subjects? Can the two control studies be leveraged to address this concern to some extent?
It would be challenging to use the control studies to address the robustness concern. The control studies were designed to address the motor confounds. They were less suitable, however, for addressing the individual difference issue raised by the reviewer. We discuss why this is the case below.
The two control studies did not allow us to examine individual differences, in particular with respect to neural selectivity of noise and transition probability, and we therefore think the control studies are unlikely to help here. Having said that, it is possible to look at neural selectivity of noise (signal diagnosticity) in the first control experiment, where subjects estimated the probability of the blue regime in a task with no regime change (transition probability was 0). However, the absence of regime shifts in the first control experiment changed the nature of the task. Instead of always starting in the red regime, as in the main experiment, in the first control experiment we randomly picked the regime to draw the signals from. This also changed the meaning and the dynamics of the signals (red and blue) that would appear. In the main experiment the blue signal is a signal consistent with change, but in the control experiment this is no longer the case. In the main experiment, the frequency of blue signals is contingent upon both noise and transition probability, and blue signals are less frequent than red signals because of the small transition probabilities. But in the first control experiment, blue signals are not less frequent because the regime was blue in half of the trials. Due to these differences, we do not see how analyzing the control experiments could help in establishing robustness, because we do not have a good prediction as to whether and how the neural selectivity would be impacted by these differences.
We can address the issue of robustness through looking at the effect size. In particular, with respect to individual differences in neural sensitivity of transition probability and signal diagnosticity, since the significant correlation coefficients between neural and behavioral sensitivity were between 0.4 and 0.58 for signal diagnosticity in frontoparietal network (Fig. 5C), and -0.38 and -0.37 for transition probability in vmPFC (Fig. 5D), the effect size of these correlation coefficients was considered medium to large (Cohen, 1992). Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155-159.
It seems that the authors have not counterbalanced the colors and that subjects always reported the probability of the blue regime. If so, I'm wondering why this was not counterbalanced.
We are aware of the reviewer’s concern. The first reason we did not counterbalance (color counterbalancing and report blue/red regime balancing) was to avoid confusing the subjects in an already complicated task. Balancing these two variables also comes at the cost of sample size, which was the second reason we did not do it. Although we could have balanced these variables at the between-subject level so as not to increase task complexity, doing so could have introduced another confound, namely individual differences in how people respond to these variables. This is the third reason we were hesitant to counterbalance.
Reviewer #2 (Public review):
Summary:
This paper focuses on understanding the behavioral and neural basis of regime shift detection, a common yet hard problem that people encounter in an uncertain world. Using a regime-shift task, the authors examined cognitive factors influencing belief updates by manipulating signal diagnosticity and environmental volatility. Behaviorally, they have found that people demonstrate both over and under-reaction to changes given different combinations of task parameters, which can be explained by a unified system-neglect account. Neurally, the authors have found that the vmPFC-striatum network represents current belief as well as belief revision unique to the regime detection task. Meanwhile, the frontoparietal network represents cognitive factors influencing regime detection i.e., the strength of the evidence in support of the regime shift and the intertemporal belief probability. The authors further link behavioral signatures of system neglect with neural signals and have found dissociable patterns, with the frontoparietal network representing sensitivity to signal diagnosticity when the observation is consistent with regime shift and vmPFC representing environmental volatility, respectively. Together, these results shed light on the neural basis of regime shift detection especially the neural correlates of bias in belief update that can be observed behaviorally.
Strengths:
(1) The regime-shift detection task offers a solid ground to examine regime-shift detection without the potential confounding impact of learning and reward. Relatedly, the system-neglect modeling framework provides a unified account for both over or under-reacting to environmental changes, allowing researchers to extract a single parameter reflecting people's sensitivity to changes in decision variables and making it desirable for neuroimaging analysis to locate corresponding neural signals.
Thank you for recognizing our task design and our system-neglect computational framework in understanding change detection.
(2) The analysis for locating brain regions related to belief revision is solid. Within the current task, the authors look for brain regions whose activation covary with both current belief and belief change. Furthermore, the authors have ruled out the possibility of representing mere current belief or motor signal by comparing the current study results with two other studies. This set of analyses is very convincing.
Thank you for recognizing our control studies in ruling out potential motor confounds in our neural findings on belief revision.
(3) The section on using neuroimaging findings (i.e., the frontoparietal network is sensitive to evidence that signals regime shift) to reveal nuances in behavioral data (i.e., belief revision is more sensitive to evidence consistent with change) is very intriguing. I like how the authors structure the flow of the results, offering this as an extra piece of behavioral findings instead of ad-hoc implanting that into the computational modeling.
Thank you for appreciating how we showed that neural insights can lead to new behavioral findings.
Weaknesses:
(1) The authors have presented two sets of neuroimaging results, and it is unclear to me how to reason between these two sets of results, especially for the frontoparietal network. On one hand, the frontoparietal network represents belief revision but not variables influencing belief revision (i.e., signal diagnosticity and environmental volatility). On the other hand, when it comes to understanding individual differences in regime detection, the frontoparietal network is associated with sensitivity to change and consistent evidence strength. I understand that belief revision correlates with sensitivity to signals, but it can probably benefit from formally discussing and connecting these two sets of results in discussion. Relatedly, the whole section on behavioral vs. neural slope results was not sufficiently discussed and connected to the existing literature in the discussion section. For example, the authors could provide more context to reason through the finding that striatum (but not vmPFC) is not sensitive to volatility.
We thank the reviewer for the valuable suggestions.
With regard to the first comment, we wish to clarify that we did not find the frontoparietal network to represent belief revision. It was the vmPFC and ventral striatum that we found to represent belief revision (Fig. 3). For the frontoparietal network, we identified its involvement in our task through finding that its activity correlated with strength of change evidence (Fig. 4) and individual subjects’ sensitivity to signal diagnosticity (Fig. 5). Conceptually, these two findings reflect how individuals interpret the signals (signals consistent or inconsistent with change) in light of signal diagnosticity. This is because (1) strength of change evidence is defined as signals (+1 for signal consistent with change, and -1 for signal inconsistent with change) multiplied by signal diagnosticity and (2) sensitivity to signal diagnosticity reflects how individuals subjectively evaluate signal diagnosticity. At the theoretical level, these two findings can be interpreted through our computational framework in that both the strength of change evidence and sensitivity to signal diagnosticity contribute to estimating the likelihood of change (Eqs. 1 and 2). We added a paragraph to the Discussion to address this.
We will add on p. 35:
“For the frontoparietal network, we identified its involvement in our task through finding that its activity correlated with strength of change evidence (Fig. 4) and individual subjects’ sensitivity to signal diagnosticity (Fig. 5). Conceptually, these two findings reflect how individuals interpret the signals (signals consistent or inconsistent with change) in light of signal diagnosticity. This is because (1) strength of change evidence is defined as signals (+1 for signal consistent with change, and -1 for signal inconsistent with change) multiplied by signal diagnosticity and (2) sensitivity to signal diagnosticity reflects how individuals subjectively evaluate signal diagnosticity. At the theoretical level, these two findings can be interpreted through our computational framework in that both the strength of change evidence and sensitivity to signal diagnosticity contribute to estimating the likelihood of change (Eqs. 1 and 2).”
With regard to the second comment, we added discussion on the behavioral and neural slope comparison. We pointed out previous papers conducting similar analysis (Vilares et al., 2012; Ting et al., 2015; Yang & Wu, 2020), their findings and how they relate to our results. Vilares et al. found that sensitivity to prior information (uncertainty in prior distribution) in the orbitofrontal cortex (OFC) and putamen correlated with behavioral measure of sensitivity to prior. In the current study, transition probability acts as prior in the system-neglect framework (Eq. 2) and we found that ventromedial prefrontal cortex represents subjects’ sensitivity to transition probability. Together, these results suggest that OFC and vmPFC are involved in the subjective evaluation of prior information in both static (Vilares et al., 2012) and dynamic environments (current study). In addition, we added to the literature by showing that distinct from vmPFC in representing sensitivity to transition probability or prior, the frontoparietal network represents how sensitive individual decision makers are to the diagnosticity of signals in revealing the true state (regime) of the environment.
We will add on p. 36:
“In the current study, our psychometric-neurometric analysis focused on comparing behavioral sensitivity with neural sensitivity to the system parameters (transition probability and signal diagnosticity). We measured sensitivity by estimating the slope of behavioral data (behavioral slope) and neural data (neural slope) in response to the system parameters. Previous studies had adopted a similar approach (Vilares et al., 2012; Ting et al., 2015; Yang & Wu, 2020). For example, Vilares et al. (2012) found that sensitivity to prior information (uncertainty in prior distribution) in the orbitofrontal cortex (OFC) and putamen correlated with behavioral measure of sensitivity to the prior. In the current study, transition probability acts as prior in the system-neglect framework (Eq. 2) and we found that ventromedial prefrontal cortex represents subjects’ sensitivity to transition probability. Together, these results suggest that OFC and vmPFC are involved in the subjective evaluation of prior information in both static (Vilares et al., 2012) and dynamic environments (current study). In addition, we added to the literature by showing that distinct from vmPFC in representing sensitivity to transition probability or prior, the frontoparietal network represents how sensitive individual decision makers are to the diagnosticity of signals in revealing the true state (regime) of the environment.”
(2) More details are needed for behavioral modeling under the system-neglect framework, particularly results on model comparison. I understand that this model has been validated in previous publications, but it is unclear to me whether it provides a superior model fit in the current dataset compared to other models (e.g., a model without \alpha or \beta). Relatedly, I wonder whether the final result section can be incorporated into modeling as well - i.e., the authors could test a variant of the model with two \betas depending on whether the observation is consistent with a regime shift and conduct model comparison.
Thank you for the great suggestion.
To address the reviewer’s question on model comparison, we tested 4 variants of the system-neglect model and incorporated them into the final result section. The original system-neglect model and its four variants are:
– Original system-neglect model: 6 total parameters, 3 beta parameters (one for each level of signal diagnosticity) and 3 alpha parameters (one for each level of transition probability).
– M1: System-neglect model with signal-dependent beta parameters (alpha parameters, and beta parameters separately estimated at change-consistent and change-inconsistent signals): 9 total parameters, 3 beta parameters for change-consistent signals, 3 beta parameters for change-inconsistent signals, and 3 alpha parameters.
– M2: System-neglect model with signal-dependent alpha parameters (alpha parameters separately estimated at change-consistent and change-inconsistent signals, and beta parameters): 9 total parameters, 3 alpha parameters for change-consistent signals, 3 alpha parameters for change-inconsistent signals, and 3 beta parameters.
– M3: System-neglect model without alpha parameters (only the beta parameters): 3 total parameters, all are beta parameters (one for each level of signal diagnosticity).
– M4: System-neglect model without beta parameters (only the alpha parameters): 3 total parameters, all are alpha parameters (one for each level of transition probability).
We compared these four models with the original system-neglect model. In the figure below, we plot ∆AIC, where ∆AIC is the Akaike Information Criterion (AIC) of one of the new models minus the AIC of the original model. ∆AIC<0 indicates that the new model is better than the original model. By contrast, ∆AIC>0 suggests that the new model is worse than the original model.
Author response image 3.
When we separately estimated the beta parameters for change-consistent signals and change-inconsistent signals (M1), we found that its AIC was significantly smaller than that of the original model (p<0.01). The same was found for the model where we separately estimated the alpha parameters for change-consistent and change-inconsistent signals (M2). When we took out either the alpha (M3) or the beta parameters (M4), we found that these models were worse than the original model (p<0.01). In summary, we found that models where we separately estimated the alpha/beta parameters for change-consistent and change-inconsistent signals were better than the original model. This is consistent with the insight the neural data provided.
To show these results, we will add a new figure (Figure 7) in the revised manuscript.
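For readers unfamiliar with the comparison, the ∆AIC computation can be sketched as below. This assumes least-squares fits with Gaussian residuals, for which AIC equals n·ln(RSS/n) + 2k up to a constant that cancels in the difference. The function names and the residual sums of squares used in the usage note are hypothetical and are not values from our data.

```python
import math


def aic_gaussian(rss, n_obs, n_params):
    """AIC for a least-squares fit with Gaussian residuals.

    Additive constants shared by all models fitted to the same data are
    dropped, because they cancel when two AIC values are subtracted.
    """
    return n_obs * math.log(rss / n_obs) + 2 * n_params


def delta_aic(rss_new, k_new, rss_orig, k_orig, n_obs):
    """AIC of the new model minus AIC of the original model.

    Negative values favor the new model; positive values favor the original.
    """
    return aic_gaussian(rss_new, n_obs, k_new) - aic_gaussian(rss_orig, n_obs, k_orig)
```

For instance, a 9-parameter variant only yields ∆AIC<0 against the 6-parameter original if its residual sum of squares drops enough to offset the penalty of 2 × 3 = 6 for the three extra parameters, as in `delta_aic(rss_new=8.0, k_new=9, rss_orig=10.0, k_orig=6, n_obs=100)`.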
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
Insects and their relatives are commonly infected with microbes that are transmitted from mothers to their offspring. A number of these microbes have independently evolved the ability to kill the sons of infected females very early in their development; this male killing strategy has evolved because males are transmission dead-ends for the microbe. A major question in the field has been to identify the genes that cause male killing and to understand how they work. This has been especially challenging because most male-killing microbes cannot be genetically manipulated. This study focuses on a male-killing bacterium called Wolbachia. Different Wolbachia strains kill male embryos in beetles, flies, moths, and other arthropods. This is remarkable because how sex is determined differs widely in these hosts. Two Wolbachia genes have been previously implicated in male-killing by Wolbachia: oscar (in moth male-killing) and wmk (in fly male-killing). The genomes of some male-killing Wolbachia contain both of these genes, so it is a challenge to disentangle the two.
This paper provides strong evidence that oscar is responsible for male-killing in moths. Here, the authors study a strain of Wolbachia that kills males in a pest of tea, Homona magnanima. Overexpressing oscar, but not wmk, kills male moth embryos. This is because oscar interferes with masculinizer, the master gene that controls sex determination in moths and butterflies. Interfering with the masculinizer gene in this way leads the (male) embryo down a path of female development, which causes problems in regulating the expression of genes that are found on the sex chromosomes.
We would like to thank you for evaluating our manuscript.
Strengths:
The authors use a broad number of approaches to implicate oscar, and to dissect its mechanism of male lethality. These approaches include:
(1) Overexpressing oscar (and wmk) by injecting RNA into moth eggs.
(2) Determining the sex of embryos by staining female sex chromosomes.
(3) Determining the consequences of oscar expression by assaying sex-specific splice variants of doublesex, a key sex determination gene, and by quantifying gene expression and dosage of sex chromosomes, using RNASeq.
(4) Expressing oscar along with masculinizer from various moth and butterfly species, in a silkmoth cell line.
This extends recently published studies implicating oscar in male-killing by Wolbachia in Ostrinia corn borer moths, although the Homona and Ostrinia oscar proteins are quite divergent. Combined with other studies, there is now broad support for oscar as the male-killing gene in moths and butterflies (i.e. order Lepidoptera). So an outstanding question is to understand the role of wmk. Is it the master male-killing gene in insects other than Lepidoptera and if so, how does it operate?
Thank you for your comments. Wolbachia strains often carry wmk genes, but as observed in this study, the homologs in Homona showed no apparent MK ability. These showed strong male lethality in D. melanogaster, but it is still unclear whether the genes are the master male-killing gene in Diptera. It is also possible that the genes show toxicities in other lepidopteran insects as well as in other insect taxa. Further functional validation assays in different insects are warranted to clarify whether wmk shows toxicity in different insect taxa. We have also discussed the functions of wmk in the Discussion section (lines 301-306).
Weaknesses:
I found the transfection assays of oscar and masculinizer in the silkworm cell line (Figure 4) to be difficult to follow. There are also places in the text where more explanation would be helpful for non-experts (see recommendations).
Thank you for your suggestion. We have thoroughly revised the manuscript to address all the questions, comments and suggestions you raised in “recommendations”. In particular, we have revised the section on the transfection assays of Oscar and Masc in BmN-4 cells (result section “Hm-oscar suppresses the masculinizing functions of lepidopteran masc genes”, starting on line 214, and Fig. 4; materials and methods section “Transfection assays and quantification of BmIMP<sup>M</sup>”, starting on line 483). We have also provided more detailed explanations for non-experts in some contexts (in response to your recommendation). We believe that the resulting revisions have significantly improved the quality and comprehensiveness of our manuscript.
Reviewer #2 (Public review):
Summary:
Wolbachia are maternally transmitted bacteria that can manipulate host reproduction in various ways. Some Wolbachia induce male killing (MK), where the sons of infected mothers are killed during development. Several MK-associated genes have been identified in Homona magnanima, including Hm-oscar and wmk-1-4, but the mechanistic links between these Wolbachia genes and MK in the native host are still unclear.
In this manuscript, Arai et al. show that Hm-oscar is the gene responsible for Wolbachia-induced MK in Homona magnanima. They provide evidence that Hm-Oscar functions through interactions with the sex determination system. They also found that Hm-Oscar disrupts sex determination in male embryos by inducing female-type dsx splicing and impairing dosage compensation. Additionally, Hm-Oscar suppresses the function of Masc. The manuscript is well-written and presents intriguing findings. The results support their conclusions regarding the diversity and commonality of MK mechanisms, contributing to our understanding of the mechanisms and evolutionary aspects of Wolbachia-induced MK.
We would like to thank you for evaluating our manuscript.
Strengths/weaknesses:
(1) The authors found that transient overexpression of Hm-oscar, but not wmk-1-4, in Wolbachia-free H. magnanima embryos induces female-biased sex ratios. These results are striking and mirror the phenotype of the wHm-t infected line (WT12). However, Table 1 lists the "male ratio," while the text presents the "female ratio" with standard deviation. For consistency, the calculation term should be uniform, and the "ratio" should be listed for each replicate.
We have revised the first results section (Hm-oscar induces female-biased sex ratios, starting from line 147) accordingly to maintain the consistency in the calculation term. In the revised manuscript, the 'male ratio' is now consistently used, in alignment with Fig. 1. In addition, we have included all sex ratio information (number of males and females) in the supplementary data file for transparency and clarity.
(2) The error bars in Figure 3 are quite large, and the figure lacks statistical significance labels. The authors should perform statistical analysis to demonstrate that Hm-oscar-overexpressed male embryos have higher levels of Z-linked gene expression.
The large error bars on each chromosome (Fig. 3a-d) likely reflect the overall variation in expression levels across different transcripts. Accordingly, we have included statistical data for Figure 3 based on the Steel-Dwass test for expression levels. However, displaying statistical significance directly on the whisker plots would make the figure too cluttered due to the numerous combinations. Instead, we have provided all the statistical data in the supplementary data file. To further support the claim that Z-linked genes are more highly expressed in wHm-t-infected/Hm-Oscar-injected embryos, we have included the expression data for a Z-linked gene tpi, along with its statistical data in the revised manuscript (Fig. 3e, lines 210-212).
(3) The authors demonstrated that Hm-Oscar suppresses the masculinizing functions of lepidopteran Masc in BmN-4 cells derived from the female ovaries of Bombyx mori. They should clarify why this cell line was chosen and its biological relevance. Additionally, they should explain the rationale for evaluating the expression levels of the male-specific BmIMP variant and whether it is equivalent to dsx.
Thank you for your suggestion. We selected the BmN-4 cell line because previous studies have established it as a reliable model for investigating the functions of lepidopteran masc genes and the interactions between masc and Oscar genes (Katsuma et al., 2019; 2022). In addition, BmIMP<sup>M</sup> is a male-specific regulator of the male-type dsx, making it an ideal target for assessing the 'maleness' induced by transfection of the masc gene in female-derived BmN-4 cells (Suzuki et al., 2010; Katsuma et al., 2015). We have included more detailed background information in the revised manuscript and have thoroughly revised this section (Hm-oscar suppresses the masculinizing functions of lepidopteran masc genes, starting at line 214) and Figure 4 for better clarity.
(4) Although the authors show that Hm-oscar is involved in Wolbachia-induced MK in Homona magnanima and interacts with the sex determination system in lepidopteran insects, the precise molecular mechanism of Hm-oscar-induced MK remains unclear. Further studies are needed to elucidate how Hm-oscar regulates Homona magnanima genes to induce MK, though this may be beyond the scope of the current manuscript.
Based on our findings and previous studies in Homona, Ostrinia and Bombyx (Arai et al., 2023a; Katsuma et al., 2023; Kiuchi et al., 2014), we hypothesize that the molecular mechanisms underlying wHm-induced MK are likely linked to impaired dosage compensation caused by the inhibition of Masc function by the Hm-Oscar protein. While the precise mechanisms remain unclear, unbalanced Z-linked gene expression due to the impaired dosage compensation (i.e., 2-fold higher Z-linked gene expression compared to normal males) is known to be lethal for lepidopteran males (Kiuchi et al., 2014; Fukui et al., 2015; Visser et al., 2021). We have outlined this hypothesis in the Discussion section (lines 245-254).
Reviewer #3 (Public review):
Summary:
Overall, this is a clearly written manuscript with nice hypothesis testing in a non-model organism that addresses the mechanism of Wolbachia-mediated male killing. The authors aim to determine how five previously identified male-killing genes (encoded in the prophage region of the wHm Wolbachia strain) impact the native host, Homona magnanima moths. This work builds on the authors' previous studies in which:
(1) They tested the impact of these same wHm genes via heterologous expression in Drosophila melanogaster.
(2) They examined the activity of other male-killing genes (e.g., from the wFur Wolbachia strain in its native host: Ostrinia furnacalis moths).
Advances here include identifying which wHm gene most strongly recapitulates the male-killing phenotype in the native host (rather than in Drosophila), and the finding that the Hm-Oscar protein has the potential for male-killing in a diverse set of lepidopterans, as inferred by the cell-culture assays.
Strengths:
Strengths of the manuscript include the reverse genetics approaches to dissect the impact of specific male-killing loci, and the use of a "masculinization" assay in Lepidopteran cell lines to determine the impact of interactions between specific masc and oscar homologs.
We would like to thank you for evaluating our manuscript.
Weaknesses:
My major comments are related to the lack of statistics for several experiments (and the data normalization process), and opportunities to make the manuscript more broadly accessible.
Thank you for your suggestions. We have thoroughly revised the manuscript to provide clearer explanations for non-experts. In addition, we have included more detailed statistical data for Figure 3 and Figure 4 based on the Steel-Dwass tests. For Figure 3a-d, displaying statistical significance directly on the whisker plots would make the figure too cluttered due to the numerous combinations. Therefore, we have provided all the statistical data in the supplementary data file. To further support the claim that Z-linked genes are more highly expressed in _w_Hm-t-infected/Hm-Oscar-injected embryos, we have included the expression data for a Z-linked gene, tpi, along with its statistical data in the revised manuscript (Fig. 3e, lines 210-212). Regarding Figure 4, we have revised the figure based on the reviewer’s suggestions, and provided more detailed information on how the expression data were analyzed (Transfection assays and quantification of BmIMP<sup>M</sup>, lines 495-520). We have also included more detailed background information on the assay system (Hm-oscar suppresses the masculinizing functions of lepidopteran masc genes, lines 215-237). Although we did not observe statistical significance based on the Steel-Dwass test, likely due to limited replicates, the observed changes in the IMP gene expression remain clearly evident.
The manuscript I think would be much improved by providing more details regarding some of the genes and cross-lineage comparisons. I know some of this is reported in previous publications, but some summary and/or additional analysis would make this current manuscript much more approachable for a broader audience, and help guide readers to specific important findings. For example, a graphic and/or more detail on how the wmk/oscar homologs (within and across Wolbachia strains) differ (e.g., domains, percent divergence, etc) would be helpful for contextualizing some of the results. I recognize the authors discuss this in parts (e.g., lines 223-227), but it does require some bouncing between sections to follow. Similarly, the experiments presented in Figure 4 indicate that Hm-oscar has broad spectrum activity: how similar are the masc proteins from these various lepidopterans? Are they highly conserved? Rapidly evolving? Do the patterns of masc protein evolution provide any hints at how Oscar might be interacting with masc?
Thank you for your valuable suggestion. To address this, we have included a visualization of the structural differences between the Oscar and wmk homologs in Figure 1a of the revised manuscript. In addition, we have included more detailed information for these genes and revised the introduction (lines 110-114; 124-137) and discussion (lines 255-266) to provide a clearer and more comprehensive overview. We have also described the similarity of the Masc proteins and Oscar proteins that we used, which is now reflected in the revised Figure 4b and 4d. More detailed information on these proteins is available in the supplementary data. Notably, Masc proteins exhibit high sequence variability with conserved domains (Figure 4d). A previous study identified the N-terminal region of Masc as crucial for the Oscar function (Katsuma et al., 2022). The wide spectrum of the actions of Hm-Oscar likely stems from these conserved structures of Masc, but the effects might have undergone evolutionary tuning through interactions with the native host, as discussed in lines 293-294.
It is clear from Figure 1 that the combinations of wmk homologs do not cause male killing on their own. Did the authors test if any of the wmk homologs impact the MK phenotype of oscar? It looks like a previous study tested this in wFur (noted in lines 250-252), but given that the authors also highlight the differences between the wFur-oscar and Hm-oscar proteins, this may be worth testing in this system. Related to this, what is the explanation for why there would be 4 copies of wmk in Hm?
Thank you for your valuable suggestion. Unfortunately, we have not yet tested the effects of co-expression of wmk and Oscar. Due to a technical issue, the mixing of multiple constructs results in a reduced amount of mRNA (i.e., mixing wmk-3 and Hm-Oscar at the same concentration results in a 2-fold lower mRNA concentration for both genes compared to mono-injected groups). In addition, we have previously tested injecting mRNA at a twofold higher concentration (i.e., 2 µg/µl mRNA), which resulted in very low hatchability regardless of the genes. Katsuma et al. (2022) tested the effect of wmk on the sex determination system, but did not test the effect of co-injection/transfection of wmk and Oscar. Considering the results of this and previous studies (Katsuma et al., 2022; Arai et al., 2023), it is likely that the targets of the wmk and oscar genes are different (as discussed in lines 267-289). Co-injection of wmk and oscar may not produce additive effects. Nevertheless, we would like to test this in future studies using the Drosophila system as well.
As you point out, it is interesting that the moth-derived MK Wolbachia _w_Hm-t encodes four wmk genes, although they have no apparent effect on host survival. The exact functional relevance of these wmk homologs remains unclear. However, they may play a role in Wolbachia biology as transcriptional regulators, given that they encode HTH domains. Wolbachia generally encode several wmk homologs in their genome, regardless of whether they induce MK. This suggests that the functions of the wmk genes may be 'suppressed' in certain Wolbachia-host systems. The wmk and Hm-oscar genes are located within a prophage region, and some wmk genes are tandemly arrayed (as described in Arai et al., 2023). These wmk homologs may have increased in number by horizontal phage transfer, and the region containing wmk and adjacent sequences may act as a genomic island for virulence. So far, the function of wmk homologs has only been tested in D. melanogaster and H. magnanima, and further studies in other Wolbachia-host systems are highly warranted to test whether wmk exerts MK effects in other insect models. These points have been briefly discussed in the revised manuscript (lines 301-306; 318-320).
Why are some of the broods male-biased (2/3) rather than ~50:50? (Lines 170-175, Figure 2a). For example, there is a strong male bias in un-hatched oscar-injected and naturally infected embryos, whereas the control uninfected embryos have normal 50:50 sex ratios. It is difficult to interpret the rate of male-killing given that the sex ratios of different sets of zygotes are quite variable.
The observed male-biased sex ratios in unhatched embryos are due to the occurrence of MK during embryogenesis. In the unhatched groups, the skew towards males reflects the fact that the male embryos were targeted and killed by Wolbachia/Oscar, leading to a surplus of unhatched male embryos. Conversely, hatched individuals show a higher proportion of females because many of the males were eliminated during embryogenesis. Thus, the unhatched embryos are more male-biased, while the hatched individuals are more female-biased in the Hm-oscar/_w_Hm-t treated groups. We have revised the relevant section (Males are killed mainly at the embryonic stage, lines 179-186) and provided more detailed information to clarify this explanation.
Figure 2b - it appears there are both male and female bands in the HmOsc male lane. I think this makes sense (likely a partial phenotype due to the nature of the overexpression approach), but this is worth highlighting, especially in the context of trying to understand how much of the MK phenotype might be recapitulated through these methods. Related, there is no negative control for this PCR.
Thank you for your suggestion. As you noted, a faint dsx-M band is visible in the Hm-oscar treated group in Figure 2b. This is consistent with previous findings by Arai et al. (2023), which reported that male embryos with low-density _w_Hm-t showed double bands of dsx-M and dsx-F, similar to what we observed in this study. This information has been included in the revised manuscript in lines 196-198, as follows:
“Notably, male embryos expressing Hm-oscar also exhibited weak male-type dsx splicing in addition to the female-type splicing, resembling the previously observed pattern in male embryos infected with low-titer _w_Hm-t (Arai et al., 2023a).”
We also appreciate your comment regarding the missing negative control. We realised that the negative control lane had been lost during preparation of the figure, and the figure has now been revised to include it. We also included the relevant molecular marker information in both the figure legends and Figure 2b.
It appears the RNA-seq analysis (Figure 3) is based on a single biological replicate for each condition. And, there are no statistical comparisons that support the conclusions of a shift in dosage compensation. Finally, it is unclear what exactly is new data here: the authors note "The expression data of the wHm-t-infected and non-infected groups were also calculated based on the transcriptome data included in Arai et al. (2023a)" - So, are the data in Figure 3c and 3d a re-print of previous data? The level of dosage compensation inferred by visually comparing the control conditions in 3b and 3d does not appear consistent. With only one biological replicate library per condition, what looks like a re-print of previous data, and no statistical comparisons, this is a weakly supported conclusion.
Thank you for your suggestion. In this study, we generated the RNA-seq data for the Hm-oscar/GFP-injected groups, but did not sequence the _w_Hm-t-infected/NSR lines. Instead, the previously generated RNA-seq data of _w_Hm-t-infected/NSR (Arai et al., 2023) were re-analyzed (rather than simply reprinted) to evaluate whether the expression patterns of Hm-oscar-injected and _w_Hm-t-infected groups are similar. We have revised the Results section (Hm-oscar impairs dosage compensation in male embryos, lines 200-212), the Materials and methods section (Quantification of Z chromosome-linked genes, lines 454-456), and the figure legends to provide more precise information about this analysis.
Although we did not perform replicates for the RNA-seq comparisons, it is important to note that each RNA-seq sample contains 50-60 male/female individuals. We believe the results are still robust and clearly indicative of the trends we observe. This was further supported by the quantification of Hmtpi gene expression, which we have visualized in Figure 3e (and lines 210-212). As you noted, the expression patterns in Figure 3b (GFP injected) and Figure 3d (NSR) are not completely identical. This discrepancy may be due to the differences between injection treatments and natural infections. Nevertheless, both treatments are consistent in showing that gene expressions on the Z chromosome (Chr01 and Chr15) are not upregulated.
We have also added more detailed statistical data for Figure 3 based on the Steel-Dwass tests. For Figure 3a-d, however, showing the statistical significance directly on the whisker plots would create excessive clutter due to the numerous combinations of chromosomes. Instead, we have provided the full statistical data in the supplementary data file. Furthermore, to strengthen our conclusion that Z-linked genes are highly expressed in _w_Hm-t-infected/Hm-Oscar-injected embryos, we have included expression data for the Z-linked gene tpi, along with statistical data, in the revised manuscript (Fig. 3e, lines 210-212).
In Figure 4: There are no statistics to support the conclusions presented here. Additionally, the data have gone through a normalization process, but it is difficult to follow exactly how this was done. The control conditions appear to always be normalized to 100 ("The expression levels of BmImpM in the Masc and Hm-Oscar/Oscar co-transfected cells were normalized by setting each Masc-transfected cell as 100"). I see two problems with this approach:
(1) This has eliminated all of the natural variation in BmImpM expression, which is likely not always identical across cells/replicates.
(2) How then was the percentage of BmImpM calculated for each of the experimental conditions? Was each replicate sample arbitrarily paired with a control sample? This can lead to very different outcomes depending on which samples are paired with each other. The most appropriate way to calculate the change between experimental and control would be to take the difference between every single sample (6 total, 3 control, 3 experimental) and the mean of the control group. The mean of the control can then be set at 100 as the authors like, but this also maintains the variability in the dataset and then eliminates the issue of arbitrary pairings. This approach would also then facilitate statistical comparisons which is currently missing.
Thank you for your suggestion. As you pointed out in (1), the previous analysis did indeed eliminate the natural variation in BmIMP-M expression. In the revised manuscript and Figure 4, we have reanalyzed the data following your suggestion and have described the variation across replicates.
For (2), the data shown in the previous manuscript were normalized to 100 for each Masc-treated group. In doing so, each replicate sample was arbitrarily paired with a control sample from the same cell lot to account for variations that might occur due to differences in cell lots. However, following your recommendation, we have revised the figure to set the average of the Hm-masc treated group to 100, rather than using arbitrary pairings. More detailed normalization procedures have been provided in the section 'Transfection assays and quantification of BmIMP' (lines 483-520). Additionally, we have provided more detailed background information on the assay system in lines 218-223. Although we did not observe statistical significance based on the Steel-Dwass test, likely due to the limited number of replicates, the differences in IMP gene expression between the Masc-treated and Masc&Hm-oscar-treated groups remain evident.
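The normalization discussed above can be sketched as follows. This is a minimal Python illustration, assuming that every sample (control and experimental) is scaled by the mean of the Masc-only control group so that the control mean equals 100 while replicate-to-replicate variation is retained. The function name and the example values are hypothetical and are not taken from the manuscript; the authors' exact procedure is the one described in their Methods.

```python
from statistics import mean


def normalize_to_control_mean(control, treated, scale=100.0):
    """Scale all samples so that the control-group mean equals `scale`.

    Unlike pairing each treated replicate with one control replicate,
    this keeps the spread of the control replicates visible.
    """
    control_mean = mean(control)
    if control_mean == 0:
        raise ValueError("control mean is zero; cannot normalize")
    norm_control = [scale * x / control_mean for x in control]
    norm_treated = [scale * x / control_mean for x in treated]
    return norm_control, norm_treated


# Hypothetical relative expression values (three replicates per group).
masc_only = [1.8, 2.0, 2.2]
masc_plus_oscar = [0.4, 0.5, 0.6]

ctrl, trt = normalize_to_control_mean(masc_only, masc_plus_oscar)
print("control:", [round(v, 1) for v in ctrl])
print("treated:", [round(v, 1) for v in trt])
```

Because all six values are divided by the same control mean, the normalized control replicates still differ from one another, and group-level statistical tests can be applied to the normalized values.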
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Line 38: change to: 'Wolbachia are maternally transmitted'.
Revised accordingly (line 38).
Line 69: remove 'seemingly'.
Revised accordingly (line 69).
Paragraph starting line 123: I don't think this is so clear to a reader who is not familiar with the work and system. It would be helpful to more clearly explain that candidate male-killing genes from Wolbachia that infect Homona were inserted into Drosophila melanogaster, and that their expression was then induced, with interesting patterns (and that it can be a bit difficult to interpret the transgenic expression of genes from a moth male-killer that are inserted into a fly). Also, the sentence about the combined action of cifA and cifB in Drosophila cytoplasmic incompatibility is also confusing to a non-expert. I would suggest removing it.
Thank you for your suggestion. We have revised the paragraph (lines 124-139) to provide clearer background information, making it easier for non-experts to follow. We have also removed the sentence regarding the combined effect of cifA and cifB to improve the flow and overall clarity.
Line 170: what is the explanation for the male-biased sex ratio instead of 50-50?
The male-biased sex ratio occurs because MK happens during embryogenesis. Unhatched embryos include males that were killed by Wolbachia/Oscar, resulting in a higher proportion of unhatched male embryos. Conversely, the hatched individuals display a female bias, as most of the males were eliminated during embryogenesis. Thus, the unhatched embryos are more male-biased, while the hatched individuals are more female-biased in the Hm-oscar/_w_Hm-t treated groups. We have revised the section “Males are killed mainly at the embryonic stage” (lines 170-186) to include more detailed information explaining this phenomenon.
Line 190: please explain what are the Z chromosomes in Bombyx and Homona and Lepidoptera in general (chromosomes 1 and 15?), as this is not so clear for a non-expert.
Thank you for your suggestion. I have revised the section (lines 200-212) to include more precise background information about the chromosome constitutions in lines 202-204 as follows:
“Unlike other lepidopteran species, Tortricidae, including H. magnanima, generally possess a large Z chromosome that is homologous to B. mori chromosomes 1 (Z) and 15 (autosome).”
Line 222: please explain oscar diversity and classification in more detail, as this is not so clear for a non-expert.
Thank you for your suggestion. We have revised the sentences to provide clearer background information on the diversity of oscar genes (lines 255-264).
Figure 4: I found this difficult to follow. Why are there 2 rows (HmOscar and Oscar)? Does oscar here refer to oscar from Ostrinia? I am also a bit confused about the baseline control of Masc in these cell lines. If I understand Lepidoptera sex determination, then these cell lines are expressing high levels of female-specific piRNAs that suppress Masc. How specific are these piRNAs (i.e. do Bombyx piRNAs suppress Mascs from other Lepidoptera)? How much extra Masc will override endogenous piRNA? Information is lost by setting Masc expression to 100% in each separate comparison.
Yes, Oscar here indicates the _w_Fur-encoded oscar (Oscar from Ostrinia), which was tested to compare its function with that of the Homona-derived Hm-oscar gene. In addition, following the reviewer's suggestions, we have revised the figure and included more detailed information on how we adjusted the expression values in the M&M section.
A previous study (Shoji et al., 2017, RNA 23:86–97) demonstrated that the Fem piRNA (29 bp) in Bombyx mori requires a 17 bp complementary sequence from its 5' region for its function. However, in species other than B. mori, no significant homology (i.e., over 17 bp matches) was found between the B. mori Fem piRNA and the masc genes analyzed in this study. Therefore, it is likely that the Fem piRNA expressed in BmN-4 cells is unable to suppress the masculinizing function driven by masc genes in other lepidopteran species. In addition, we did not quantify the levels of piRNA in this system, but the expression levels of masc are probably too high to be suppressed.
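The kind of complementarity check described above can be sketched in Python. This is only an illustration, assuming perfect Watson-Crick pairing anchored at the 5' end of the piRNA and using the 17-nt threshold mentioned in the response; the function names and the sequences are hypothetical (the sequence below is not the real Fem piRNA), and this is not the authors' homology search.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA string (T is read as U)."""
    return rna.upper().replace("T", "U").translate(COMPLEMENT)[::-1]


def longest_5prime_match(pirna, target):
    """Length of the longest piRNA 5' prefix that is perfectly
    complementary to some stretch of the target sequence."""
    pirna = pirna.upper().replace("T", "U")
    target = target.upper().replace("T", "U")
    best = 0
    for length in range(1, len(pirna) + 1):
        if reverse_complement(pirna[:length]) in target:
            best = length
        else:
            break  # a longer prefix cannot match if a shorter one does not
    return best


# Hypothetical 29-nt piRNA and a made-up target carrying a 17-nt site.
pirna = "AUGGCUAACGUUCGAUCCGGAUUACGCAU"
target = "AAA" + reverse_complement(pirna[:17]) + "AAA"

REQUIRED = 17  # threshold taken from the response text
print(longest_5prime_match(pirna, target) >= REQUIRED)
```

A target lacking such a stretch would return a value below the threshold, which corresponds to the "no significant homology" outcome reported for the non-Bombyx masc genes.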
Figure 4 legend: spelling of Spodoptera.
Revised accordingly.
Reviewer #2 (Recommendations for the authors):
In Figure 2, what is the dsx splicing type for the hatched male in the Hm-oscar-injected group and the wHm-t infected line? Dsx-F or dsx-M?
Thank you for your suggestion. Unfortunately, we have not tested splicing in the hatched male neonates (1st instar larvae), partly due to difficulties in obtaining sufficient material for RNA extraction. Based on the previous publication in the Ostrinia system, where Oscar-bearing _w_Sca induces MK, the hatched males (ZZ) exhibit female-type dsx, as observed in the male embryos (Herran et al., 2022). The hatched Homona males may show double bands for dsx-M and dsx-F, as observed in this study.
The size of the markers (in kilobase pairs) should be indicated in Figure 2.
We have accordingly included the marker information in the revised Figure 2b and the figure legends.
In Figure 3, could the authors identify which genes exhibit higher expression levels in the Hm-oscar-injected group and the wHm-t infected line? Could they provide hints for the possible mechanism of male-killing?
In the RNA-seq data shown in Figure 3a-d, we observed that both the Hm-oscar-injected and _w_Hm-infected groups generally exhibited upregulated expression of Z-linked genes. Rather than the upregulation or downregulation of a specific gene, we consider that global upregulation of Z-linked genes, caused by improper dosage compensation, is lethal for males. The Z chromosome contains various genes involved in key biological processes such as endocrine function and detoxification, and disruption of these processes may contribute to male lethality. Additionally, in this revised manuscript, we have provided more detailed information on the expression level of the Z-linked gene tpi. We have also discussed the potential mechanisms of MK in the Discussion section (lines 245-254).
The format of the references should be consistent. Gene and species names should be italicized.
We have formatted the references and names accordingly.
Reviewer #3 (Recommendations for the authors):
The authors use the term "upstream" (e.g., Oscar suppressed the function of masculinizer, the upstream male sex determinant...), which was sometimes confusing. In many cases, it reads as though the masculinizer was upstream of oscar, but what I think the authors are trying to convey is that masculinizer is a primary sex-determining factor.
Thank you for your suggestion. We have accordingly revised the term.
Line 101: which insect is wFur from?
It is from Ostrinia furnacalis - line 104 has been revised.
Figure 1: it would be helpful to indicate the statistical results on the figure.
Accordingly, we have added statistical data (binomial test) for Figure 1. The data for the Steel-Dwass test have been included in the supplementary data.
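For readers unfamiliar with the binomial test mentioned here, a minimal Python sketch follows. It assumes the test compares observed female and male counts against an expected 1:1 sex ratio; that null hypothesis, the function name, and the example counts are assumptions for illustration and are not stated in the response.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k under success probability p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance guards against floating-point ties.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical brood: 30 females among 40 sexed individuals.
females, total = 30, 40
print(binomial_two_sided_p(females, total))
```

A small p-value indicates that the observed sex ratio is unlikely under a 1:1 expectation; comparisons among several treatment groups would still need a multiple-comparison procedure such as the Steel-Dwass test that the authors report separately.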
Figure 2b: please label the ladder on the gel.
Thank you for your suggestion. We have accordingly labeled the DNA ladder on the gel.
Author response:
The following is the authors’ response to the original reviews.
Response to the Joint Public Review:
We are indebted to eLife’s reviewing process for helping us improve our manuscript and for highlighting that our study provides new molecular insights into SFT pathogenesis.
Response to Reviewers:
(1) The authors state that "NAB2-STAT6 localization is exclusively driven by EGR1 binding" yet WT1 motifs are also consistently enriched. Can you please touch upon the potential involvement of WT1 (or lack thereof, and why)?
Our data suggest that EGR1 is the primary driver of NAB2-STAT6 localization. In fact, EGR1 is the most significantly enriched motif (Fig. 4) at NAB2-STAT6 binding sites and we detect an interaction between the fusion protein and EGR1 (Fig. 5). Conversely, we did not identify an interaction between NAB2-STAT6 and WT1. However, WT1 also belongs to the C2H2 zinc finger subclass and recognizes a motif bearing striking similarities to the EGR1/2 consensus. EGR1 has been previously described to bind WT1 motifs and to function as an activator of WT1 targets (as opposed to WT1 repressive abilities). See https://www.jbc.org/article/S0021-9258(20)74720-4/fulltext and https://www.sciencedirect.com/science/article/pii/S0378111901005935.
(2) In the description of Figure 5C the authors observe nuclear staining of both NAB2 and STAT6 following NAB2-STAT6 fusion induction. They interpret this as the fusion stimulates nuclear translocation of endogenous NAB2. This statement can only be rigorously made if the authors can unequivocally demonstrate that their antibody exclusively detects endogenous NAB2 and not the NAB2 portion of the fusion. As presented, a more likely interpretation is that the NAB2 staining detects NAB2-STAT6 fusion protein. Since there is some cytoplasmic NAB2 signal still present, the findings in Figure 5c do not support nor disprove nuclear translocation of endogenous NAB2. It may be prudent to remove this section. Figure 5B is currently the best direct evidence of nuclear translocation.
We agree with the reviewer that Fig. 5C does not rigorously show that NAB2-STAT6 fusion proteins drag endogenous NAB2 into the nucleus. The immunostaining reveals that wt NAB2 localization is overwhelmingly cytoplasmic at steady-state conditions (and prior to expression of the fusion protein). Instead, Figure 5B shows that endogenous NAB2 translocates to the nucleus upon NAB2-STAT6 expression. Additionally, figure 5A (along with Suppl. Fig. 5 E-F) demonstrates that endogenous NAB2 co-precipitates with NAB2-STAT6 fusions in nuclear extracts of U2OS and HEK293T cells. We have rephrased the paragraph accordingly.
(3) Figure 5D: for the interpretation of the presented data to hold up, namely, NAB1 nuclear translocation upon NAB2-STAT6 expression, it is important to demonstrate that NAB1 antibodies do not cross-react with NAB2 given the similarity between NAB1 and NAB2. Without such control, another likely interpretation of the results in Figure 5D is that NAB1 antibody detects the NAB2 portion of the overexpressed fusion protein. This needs to be acknowledged in the text.
We had similar concerns and therefore confirmed by immunoblot that the NAB1 antibody does not cross-react with NAB2 (see figure below). We overexpressed FLAG-NAB2, HA-NAB1 and GFP constructs in HEK293T cells and performed immunoprecipitation with either HA or FLAG from whole cell extracts, followed by western blot using anti-NAB2 and anti-NAB1 polyclonal antibodies. We did not observe cross-reactivity of these antibodies. We have acknowledged the antibody validation in the revised text.
Author response image 1.
(4) Also, to support the notion that NAB2-STAT6 fusion promotes nuclear translocation of the entire complex, an imaging approach detecting EGR1 similar to Figure 5C-D would be helpful. EGR1 staining also avoids the potential pitfall of NAB1/2 antibodies detecting NAB2-STAT6 overexpressed fusion instead of endogenous proteins.
We agree with the reviewer that this would be a helpful approach. Unfortunately, none of the commercially available EGR1 antibodies that we tested were suitable for immunocytochemistry, as they either failed to show a proper signal or were marred by high nonspecific background signal.
(5) The authors found increased mRNA expression of certain cytokines and secreted neuropeptides in SFTs. While this may be consistent with a secretory phenotype, additional evidence such as detection of elevated levels of these proteins in tumor lysates or in culture media is necessary to formally make this claim. Please rephrase.
We have rephrased our claims as suggested. The revised text is now as follows: “We also identified a distinct secretory gene signature associated with SFTs. In fact, IGF2 is the most upregulated gene, via activation of an intronic enhancer by EGR1. IGF2 was pinpointed as the cause of hypoglycemia occurring in a very small subset of SFTs (Doege–Potter syndrome)(52). Our data suggest that IGF2 (and IGF1) upregulation is a common feature of all SFTs. In addition to insulin-like growth factors, SFTs may secrete a host of peptides with diverse functions in neuronal processes, chemotaxis, and growth stimulation. The previously unrecognized neuronal features and the putative secretory phenotype of SFTs set them apart from mesenchymal malignancies and relate them to neuroendocrine malignancies such as pheochromocytoma, oligodendroglioma and neuroblastoma.”
(6) GSEA with 500 randomly selected genes from target datasets needs a more detailed description to clarify the method.
To improve clarity, we added the following description: “Gene set enrichment analysis (GSEA) was done with 500 randomly selected genes from the given set of genes across the C2 collection of the human molecular signatures database or custom signatures using the GSEA function in the clusterProfiler package in R (v4.6.2).”
(7) In the IP-MS description, please double check the NaCl concentration in the second extraction step - 0.5mM seems low. Also, in the IP part, a buffer recipe appears to have been incorrectly pasted.
We thank the reviewer for identifying this typo. Indeed, we used 0.5M NaCl instead of 0.5mM. We have corrected the co-IP buffer recipe accordingly.
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1:
The paper by Auer et al. makes several contributions: (1) The study developed a novel approach to map the microstructural organization of the human amygdala by applying radiomics and dimensionality reduction techniques to high-resolution histological data from the BigBrain dataset. (2) The method identified two main axes of microstructural variation in the amygdala, which could be translated to in vivo 7 Tesla MRI data in individual subjects. (3) Functional connectivity analysis using resting-state fMRI suggests that microstructurally defined amygdala subregions had distinct patterns of functional connectivity to cortical networks, particularly the limbic, frontoparietal, and default mode networks. (4) Meta-analytic decoding was used to suggest that the superior amygdala subregion's connectivity is associated with autobiographical memory, while the inferior subregion was linked to emotional face processing. (5) Overall, the data-driven, multimodal approach provides an account of amygdala microstructure and possibly function that can be applied at the individual subject level, potentially advancing research on amygdala organization.
We thank the Reviewer for the positive comments and insightful evaluation of the work.
(1.1) Although these are meritorious contributions there are some concerns that I will summarize below. The paper makes little-to-no contact with the monkey literature regarding the anatomy of amygdala subregions, their functionality, and their patterns of anatomical connectivity. This is surprising because such literature on non-human primates is a very important starting point for understanding the human amygdala. I recommend taking a careful look at the work by Helen Barbas, among others. There are too many papers to cite but a notable example is: Ghashghaei, H. T., Hilgetag, C. C., & Barbas, H. (2007). Sequence of information processing for emotions based on the anatomic dialogue between prefrontal cortex and amygdala. Neuroimage, 34(3), 905-923. The work of Amaral is also highly relevant.
As suggested, we included the important work of Amaral et al. as well as Ghashghaei et al. highlighting its contribution to mapping the intricate anatomy and function of the amygdala in non-human primates. We comment on this in the Introduction of the manuscript. Please see P.3.
“Early research on the amygdala in non-human primates has been instrumental in understanding its intricate structure, function and patterns of anatomical connectivity (Amaral and Price 1984; Ghashghaei et al. 2007). This foundational study highlights the amygdala’s different subdivisions, most notably the basomedial nucleus (BM), basolateral nucleus (BL), and central nucleus (Ce) (Amaral et al. 1992). Furthermore, this work describes a dense network between these subdivisions and the prefrontal cortex, most strongly found in the posterior orbitofrontal and anterior cingulate areas.”
(1.2) Furthermore, the authors subscribe to a model with LB, CM, and SF sectors. How does the SF sector relate to monkey anatomy?
The overall organization of these subregions is largely conserved between humans and monkeys, reflecting their evolutionary relationship. While the basic subregional organization is conserved, there are still some important structural and functional differences between human and monkey amygdalae. For example, the SF subregion, as often described in humans, includes parts of the cortical nuclei (VCo), anterior amygdaloid area (AAA), amygdalohippocampal transition area (AHi), amygdalopiriform transition area (APir) as well as the lateral olfactory tract (LOT). This remark was added in the Discussion, on P.12:
“However, this region has been previously described as consisting of three main subdivisions: LB, CM, and SF, each composed of smaller subnuclei with distinct connectivity patterns and functions (Amunts et al. 2005; Ball et al. 2007; Bzdok et al. 2013; de Olmos and Heimer 1999). These subregions are largely conserved between humans and monkeys, reflecting their evolutionary relationship. However, there are still some considerable differences such as in the SF subregion, where its description in monkeys additionally contains the lateral olfactory tract (LOT) (De Olmos 1990).”
(1.3) The authors use meta-analytical decoding via NeuroSynth. If the authors like those results of course they should keep them but the quality of coordinate reporting in the literature is insufficient to conclude much in the context of amygdala subregion function in my opinion. I believe the results reported are at most "somewhat suggestive".
We agree with the Reviewer that use of data from NeuroSynth poses unique challenges, particularly relating to investigations of a small structure such as the amygdala. However, to clarify, these analyses decode the cortex-wide functional connectivity patterns of amygdala subregions and not activations within subregions defined by our microanatomical analyses. Additionally, comments from Reviewer 2 suggested expanding the NeuroSynth decoding to the contralateral hemisphere. As such, we decided to keep this analysis in the main manuscript but rephrase the interpretation of these findings in the Discussion to emphasize their exploratory nature on P.13:
“Functional decoding of subregional functional connectivity patterns indicated possible dissociations in cognitive (e.g., memory) and affective (e.g., emotional face processing) functions of the amygdala, echoing previous accounts of this region’s involvement in associative processing of emotional stimuli. Notably, these findings link the functional connectivity profile of a subregion partially co-localizing with LB to emotional face processing. The LB subregion has been previously linked to associative processing related to the integration of sensory information (Bzdok et al. 2013; Ghods-Sharifi, St Onge, and Floresco 2009; Pessoa 2010; Winstanley et al. 2004; Boyer 2008), which is consistent with the association with visual emotional information processing identified in the present work.”
(1.4) Another significant concern has to do with the results in Figure 3. The red and yellow clusters identified are quite distinct but the differences in functional connectivity are very modest. Figure 3C reveals very similar functional connectivity with the networks investigated. This is very surprising, and the authors should include a careful comparison with related findings in the literature. Overall, there is limited comparison between the observed results and those obtained via other methods. On a more pessimistic note, the results of Figure 3 seem to question the validity of the general approach.
We agree with the Reviewer that there is considerable overlap between the functional connectivity profiles of the amygdala subregions. The amygdala is a relatively small structure, so its subregions are likely interconnected (Bzdok et al. 2013) and BOLD signal autocorrelation within this region is likely high. In addition, functional signals in the amygdala are affected by relatively lower signal-to-noise ratio (SNR), a limitation extending to temporobasal and mesiotemporal regions. Despite these challenges, our technique remained sensitive enough to detect subtle differences in connectivity patterns, even in this small group of subjects and in this restricted subcortical territory.
In the revised manuscript, we further highlight these caveats in the Discussion (P.13):
“Although these findings are promising, we also observe considerable overlap between functional connectivity networks of both our defined subregions. Indeed, the amygdala is a relatively small structure, leading to likely interconnectivity between its subregions and locally high signal autocorrelation. Functional connectivity and microstructure in the amygdala are certainly related, however previous work suggests they do not perfectly overlap (Bzdok et al. 2013). In addition, this region is affected by relatively low signal-to-noise ratio (SNR), as is observed in broader temporobasal and mesiotemporal territories.”
(1.5) Some statements in the Discussion feel unwarranted. For example, "significant dissociation in functional connectivity to prefrontal structures that support self-referential, reward-related, and socio-affective processes." This feels way beyond what can be stated based on the analyses performed.
We agree that this interpretation may reach beyond the analyses performed and reported findings. We have adjusted this portion of the text accordingly in our Discussion on functional connectivity findings (P.13):
“Qualitatively, we found that the subregion defined by the highest 25% of U1 values mainly overlapped with what is commonly defined as the superficial and centromedial subregions, whereas the lowest 25% U1 values subregion overlapped mostly with the laterobasal division. Interestingly, CM and SF characterized subregions showed significantly stronger functional connectivity to prefrontal structures. This finding aligns with previous work demonstrating unique affiliations between the CM subregion and anterior cingulate and frontal cortices (Kapp, Supple, and Whalen 1994; Barbour et al. 2010), as well as between the SF subregion and the orbitofrontal cortex (Goossens et al. 2009; Caparelli et al. 2017; Pessoa 2010; Klein-Flügge et al. 2022).”
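The quartile-based definition of subregions described in this passage can be illustrated with a minimal sketch. It assumes U1 is available as one value per amygdala voxel; the function name, the example values, and the use of inclusive quartile cut points are illustrative assumptions, not the authors' implementation.

```python
from statistics import quantiles


def split_by_u1(u1_values):
    """Return voxel indices in the lowest and highest 25% of U1 values."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(u1_values, n=4, method="inclusive")
    low = [i for i, v in enumerate(u1_values) if v <= q1]
    high = [i for i, v in enumerate(u1_values) if v >= q3]
    return low, high


# Hypothetical U1 values for eight voxels (illustrative only).
u1 = [0.1, 0.9, 0.4, 0.7, 0.2, 0.8, 0.5, 0.3]
low_idx, high_idx = split_by_u1(u1)
print(low_idx, high_idx)
```

Voxels between the two cut points are left unassigned, which mirrors the choice to contrast only the extremes of the U1 axis.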
Additionally, we have also edited our Discussion to ensure that our interpretations are grounded in the analyses conducted, while framing the findings as potential avenues for future work. Please see P.13.
“Functional decoding of functional connectivity results indicated possible dissociations in cognitive (e.g., memory) and affective (e.g., emotional face processing) functions of the amygdala, echoing previous accounts of this region’s functional specialization and subregional segregation of associative processing of emotional stimuli.”
Recommendations for the authors:
(1.6) Figure 1 has panels A-I but only A-D are discussed in the caption. The orientation of the slices is not indicated which makes it very hard to follow for most readers.
The subpanels are now referred to in the revised Results. We also added a notation on the orientation of the slices and described them accordingly in our Figure 1 description. (P.5-6):
“(A) The amygdala was segmented from the 100-micron resolution BigBrain dataset using an existing subcortical parcellation (Xiao et al. 2019). Slice orientation is consistent across all panels in this figure.”
(1.7) Some figure references in the text seem to be incorrect; please check that the text refers to the correct figure number and panel.
We thank the Reviewer for pointing this out. We thoroughly revised the correspondence between figure panel labels and their referencing in the text.
Reviewer #2:
This study bridges a micro- to macroscale understanding of the organization of the amygdala. First, using a data-driven approach, the authors identify structural clusters in the human amygdala from high-resolution post-mortem histological data. Next, they use multimodal imaging data to identify structural subunits of the amygdala and the functional networks in which they are involved. This approach is exciting because it permits the identification of both structural amygdalar subunits, and their functional implications, in individual subjects. There are, however, some differences in the macro and microscale levels of organization that should be addressed.
Strengths:
The use of data-driven parcellation on a structure that is important for human emotion and cognition, and the combination of this with high-resolution individual imaging-based parcellation, is a powerful and exciting approach, addressing both the need for a template-level understanding of organization as well as a parcellation that is valid for individuals. The functional decoding of rsfMRI permits valuable insight into the functional role of structural subunits. Overall, the combination of micro to macro, structure, and function, and general organization to individual relevance is an impressive holistic approach to brain mapping.
We thank the Reviewer for their constructive and helpful feedback on our work.
Weaknesses:
(2.1) UMAP 1, as calculated from the histological data, appears to correlate well across individuals, and decently with the MRI data, although the medial-lateral coordinate axis is an outlier. UMAP 2, on the other hand, does not appear to correlate well with imaging data or across individuals. This does pose a problem with the claim that this paper bridges micro- and macroscale parcellations. One might certainly expect, however, that different levels of organization might parcellate differently, but the authors should address this in the discussion and offer ways forward.
Data-driven methods hold several advantages for the quantitative extraction of signal from the underlying data in an observer-independent manner. However, these techniques are also sensitive to potential idiosyncrasies in the data. In the present work, our main analyses rely on the processing of a histological dataset (BigBrain), which provides a unique opportunity for high-resolution analysis of amygdala histology, and on the in vivo translation of findings using ultra-high field MRI. However, both datasets are limited by their small sample size (n=1 for BigBrain and n=10 for MICA-PNI). As a result, we speculate that signal variations captured by U2 may be sensitive to artifacts or subject-specific sources of variance. Moving forward, this hypothesis could be assessed in future work via the analysis of larger histological and neuroimaging datasets to better track recurring features picked up by U2 or the association of these unique topographies with behavioural markers.
As suggested, we included a section in our Discussion highlighting this shortcoming and the importance for larger datasets moving forward. Please see P.11-12.
“However, it is important to note that both datasets analyzed in this work are limited by their small sample size (n=1 for BigBrain and n=10 for MICA-PNI). We speculate that the signal variations captured by U2 may be sensitive to artifacts or subject-specific sources of variance, potentially explaining why it was not consistent between subjects and modalities. Moving forward, this hypothesis could be assessed in future work via the analysis of larger histological and neuroimaging datasets to better track recurring features picked up by U2 or the association of these unique topographies with behavioural markers.”
(2.2) It would be interesting to see functional decoding for the right amygdala. This could be included in the supplementary material. A discussion of differences in the results in the two hemispheres could be illuminating.
In accordance with the Reviewer’s suggestion, we added Supplementary figure S2 exploring the decoding of connectivity profiles of the right amygdala stratified by its cytoarchitectural embedding with UMAP.
Upon analysis, dissociations in functional connectivity patterns over the right amygdala were less evident, leading to overall similar functional decoding across the two clusters. We refer to this Supplementary Figure in our Discussion on P.13.
“For the right amygdala, dissociation in functional connectivity patterns were more subtle, leading to overall similar functional decoding across the two clusters. (Figure S2)”
(2.3) The authors acknowledge that this mapping matches some but not all subunits that have been previously described in the amygdala. It would be helpful to neuroanatomists if the authors could discuss these differences in more detail in the discussion, to identify how this mapping differs and what the implications of this are.
In our work, we focus on mapping the three well characterized amygdala subregions, specifically the superficial (SF), centromedial (CM) and laterobasal (LB) subdivisions. Qualitative histological accounts have indeed delineated multiple subunits within these subregions which we now describe in the revised manuscript. Due to the lower resolution of in vivo MRI data used in this work relative to post mortem histology, we focused our analyses on larger subregions that could be more reliably mapped to native quantitative T1 spaces of each participant. We now overview this issue in the Discussion. Please see P.12.
“Although qualitative histological accounts have indeed delineated multiple subunits within these general regions, the present work focuses on three subdivisions (Amunts et al. 2005) to account for resolution disparities when translating our findings to in vivo MRI data. The LB subdivision includes the basomedial nucleus (Bm), basolateral nucleus (BL), lateral nucleus (LA) and paralaminar nucleus (PL). Moving medially, the CM subdivision includes the central (Ce) and medial nuclei (Me), while the SF subdivision includes the anterior amygdaloid area (AAA), amygdalohippocampal transition area (AHi), amygdalopiriform transition area (APir), and ventral cortical nucleus (VCo) (Heimer et al. 1999). However, disagreement on the precise attribution of nuclei to broader subdivisions motivated our investigations of probabilistic subunits of the amygdala (Kedo et al. 2018). The development of new tools to segment amygdala subnuclei in vivo opens opportunities for future work to further validate our framework at the precision of these nuclei within subjects (Saygin et al. 2017).”
(2.4) The acronym UMAP is not explained. A brief explanation and description would be useful to the reader.
We moved the expanded acronym from the Methods to the first instance of the term UMAP in our paper, found in the Introduction. As suggested, we also added a sentence describing the technique. Please see P.6.
“We then applied Uniform Manifold Approximation and Projection (UMAP), a non-linear dimensionality reduction technique that preserves the local and global structure of high-dimensional data by projecting it into a lower-dimensional space (Becht et al. 2018), to the resulting 20-feature matrix to derive a 2-dimensional embedding of amygdala cytoarchitecture (Figure 1D).”
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Reviewer 1:
(1) Reward Interpretation and Skin Conductance Responses (SCR):
The reviewer raises a valid point, as the model from which we derive prediction errors describes predictive learning—specifically, the occurrence of shock—without incorporating additional reward learning effects. SCRs are used to fit the model’s hyperparameters but do not directly measure reward; rather, they serve as a marker of arousal.
In our paradigm, SCRs are measured during CS presentation and primarily reflect predictive learning, as they are closely linked to contingency awareness. The association between estimated prediction errors during unexpected US omissions and reward remains reliant on existing literature.
In the revised manuscript, we will further elaborate on these points to clarify the distinction between predictive learning and direct reward processing, while contextualizing our findings within the broader literature on reward signaling and fear extinction.
(2) Reinforcement Agent and SCR Modeling:
Notably, we do not use SCR as a personalized expectation measure due to its limited reliability at the individual level; instead, the model's hyperparameters are fitted on the entire SCR dataset, yielding per-trial prediction and prediction error estimates for each CS sequence rather than for individual participants.
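How a learning model yields per-trial predictions and prediction errors for a CS sequence can be sketched as follows. The response does not name the learning rule, so a Rescorla-Wagner update is assumed here purely for illustration; the function name, the learning rate, and the trial sequence are hypothetical.

```python
def rescorla_wagner(outcomes, alpha=0.3, v0=0.0):
    """Return per-trial predictions and prediction errors for one CS sequence.

    outcomes: 1 if the US (shock) occurred on a trial, 0 if it was omitted.
    alpha: learning rate, a hyperparameter that would be fitted to the
           pooled SCR data rather than to individual participants.
    """
    v = v0
    predictions, errors = [], []
    for outcome in outcomes:
        delta = outcome - v      # prediction error on this trial
        predictions.append(v)    # expectation held before the outcome
        errors.append(delta)
        v += alpha * delta       # update the expectation for the next trial
    return predictions, errors


# Hypothetical sequence: three reinforced trials, then an unexpected omission.
preds, errs = rescorla_wagner([1, 1, 1, 0], alpha=0.5)
print(preds)
print(errs)
```

Under this assumed rule, an unexpected US omission produces a negative prediction error whose magnitude equals the shock expectation built up on the preceding trials.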
(3) Clarity and Visualization of Results:
We recognize that the presentation of our results can be improved and will take steps to enhance figure clarity, also ensuring that trend-level results are clearly distinguished.
(4) Theoretical Context for Paradigm Phases:
Regarding the differences across experimental phases, we recognize the theoretical significance of these distinctions. However, our primary focus is on identifying commonalities in unexpected US omission responses across phases rather than emphasizing phase-specific differences. Nevertheless, we will provide a brief clarification on phase differences to enhance the manuscript’s interpretability.
(5) Cerebellum-VTA Connectivity Analysis:
Furthermore, we acknowledge that our conclusion regarding the modulation of the dopaminergic system by the cerebellum should be framed more cautiously. We will temper our claims to better reflect the bidirectional and potentially indirect nature of cerebellum-VTA interactions. Additionally, we plan to include PPI results using a cerebellar seed showing the VTA, potentially in the supplementary material.
Reviewer 2:
(1) Success of extinction learning based on Self-reports and SCRs?
The reviewer points to a problem that is inherent to extinction learning: the initial fear association is not erased, but merely inhibited, and is prone to return. Although the recall phase follows the extinction phase, we did not expect a complete inhibition of the conditioned response; instead, spontaneous recovery is expected. In fact, the spontaneous recovery observed in the recall phase provided us with an additional opportunity to investigate unexpected US omissions, which was our primary focus.
(2) Concerns on reliability of event-based contrasts using three events:
Regarding concerns about the reliability of analyses based on three events, we believe that the consistency between our parametric modulation analysis, which incorporates all events, and the three-event analysis results provides further support for the observed patterns. We are currently discussing additional analyses to further verify the reliability of using three events.
(3) Deviations from preregistration:
Finally, we will carefully review all deviations from our preregistration to ensure transparency. Any methodological or analytical changes will be explicitly addressed in the revised manuscript.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Reviewer 1:
Summary:
Identifying drugs that target specific disease phenotypes remains a persistent challenge. Many current methods are only applicable to well-characterized small molecules, such as those with known structures. In contrast, methods based on transcriptional responses offer broader applicability because they do not require prior information about small molecules. Additionally, they can be rapidly applied to new small molecules. One of the most promising strategies involves the use of “drug response signatures”: specific sets of genes whose differential expression can serve as markers for the response to a small molecule. By comparing drug response signatures with expression profiles characteristic of a disease, it is possible to identify drugs that modulate the disease profile, indicating a potential therapeutic connection.
This study aims to prioritize potential drug candidates and to forecast novel drug combinations that may be effective in treating triple-negative breast cancer (TNBC). Large consortia, such as the LINCS-L1000 project, offer transcriptional signatures across various time points after exposing numerous cell lines to hundreds of compounds at different concentrations. While this data is highly valuable, its direct applicability to pathophysiological contexts is constrained by the challenges in extracting consistent drug response profiles from these extensive datasets. The authors use their method to create drug response profiles for three different TNBC cell lines from LINCS.
To create a more precise, cancer-specific disease profile, the authors highlight the use of single-cell RNA sequencing (scRNA-seq) data. They focus on TNBC epithelial cells collected from 26 diseased individuals compared to epithelial cells collected from 10 healthy volunteers. The authors are further leveraging drug response data to develop inhibitor combinations.
Strengths:
The authors of this study contribute to an ongoing effort to develop automated, robust approaches that leverage gene expression similarities across various cell lines and different treatment regimens, aiming to predict drug response signatures more accurately. The authors are trying to address the gap that remains in computational methods for inferring drug responses at the cell subpopulation level.
Weaknesses:
One weakness is that the authors do not compare their method to previous studies. The authors develop a drug response profile by summarizing the time points, concentrations, and cell lines. The computational challenge of creating a single gene list that represents the transcriptional response to a drug across different cell lines and treatment protocols has been previously addressed. The Prototype Ranked List (PRL) procedure, developed by Iorio and co-authors (PNAS, 2010, doi:10.1073/pnas.1000138107), uses a hierarchical majority-voting scheme to rank genes. This method generates a list of genes that are consistently overexpressed or downregulated across individual conditions, which then hold top positions in the PRL. The PRL methodology was used by Aissa and co-authors (Nature Comm 2021, doi:10.1038/s41467-021-21884-z) to analyze drug effects on selective cell populations using scRNA-seq datasets. They combined PRL with Gene Set Enrichment Analysis (GSEA), a method that compares a ranked list of genes like PRL against a specific set of genes of interest. GSEA calculates a Normalized Enrichment Score (NES), which indicates how well the genes of interest are represented among the top genes in the PRL. Compared to the method described in the current manuscript, the PRL method allows for the identification of both upregulated and downregulated transcriptional signatures relevant to the drug’s effects. It also gives equal weight to each cell line’s contribution to the drug’s overall response signature.
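The enrichment score underlying GSEA, as described in the comment above, can be sketched in its simplest unweighted form. This is an illustrative assumption rather than the procedure of any specific package: the gene names are hypothetical, and the normalization step that turns the score into an NES (scaling by scores obtained from permuted gene sets) is not shown.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score of gene_set in ranked_genes."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene_set must cover some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up at genes in the set, step down at all other genes.
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        # Keep the largest deviation from zero seen so far.
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list, most upregulated gene first.
ranked = ["G1", "G2", "G3", "G4", "G5", "G6"]
top_es = enrichment_score(ranked, {"G1", "G2"})
bottom_es = enrichment_score(ranked, {"G5", "G6"})
print(top_es, bottom_es)
```

A positive score indicates that the gene set is concentrated at the top of the ranked list and a negative score that it is concentrated at the bottom, which is how a ranked list can capture both upregulated and downregulated signatures.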
The authors performed experimental validation of the top two identified drugs; however, the effect was modest. In addition, the effect on TNBC cell lines was cell-line specific as the identified drugs were effective against BT20, whose transcriptional signatures from LINCS were used for drug identification, but not against the other two cell lines analyzed. An incorrect choice of genes for the signature may result in capturing similarities tied to experimental conditions (e.g., the same cell line) rather than the drug’s actual effects. This reflects the challenges faced by drug response signature methods in both selecting the appropriate subset of genes that make up the signature and managing the multiple expression profiles generated by treating different cell lines with the same drug.
We appreciate the reviewer’s thoughtful feedback and their suggestion to refer to the Prototype Ranked List (PRL) manuscript. Unfortunately, because the PRL methodology is not implemented in an open-source package, direct comparison with our approach is challenging. Nonetheless, we investigated whether using ranks would yield similar results for the most likely active drug pairs identified by retriever. To do this, we calculated and compared the rankings of the average effect sizes provided by retriever. Although the Spearman correlation coefficient was high (ρ = 0.98), we observed that key genes are disadvantaged when using ranks compared to effect sizes. This difference is particularly evident in the gene set enrichment analysis, where using average ranks identified only one pathway as statistically significantly enriched. The code to replicate these analyses is available at https://github.com/dosorio/L1000-TNBC/blob/main/Code/.
Author response image 1.
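The contrast between ranks and effect sizes discussed above can be illustrated with a small sketch. It is not the authors' code (which is available at the repository they cite); the helper names and effect-size values are hypothetical, and Spearman's ρ is computed here as the Pearson correlation of average ranks.

```python
def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical effect sizes for five genes under two summaries.
a = [2.5, 0.1, -0.3, 0.2, -1.8]
b = [0.9, 0.2, -0.1, 0.3, -0.7]
rho = spearman(a, b)
print(ranks(a), rho)
```

In this toy case the two vectors have identical ranks even though the magnitudes differ several-fold, which shows how a high rank correlation can coexist with a loss of the effect-size information that distinguishes strongly responding genes.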
Given the similarity in purpose between retriever and the PRL approach, we have added the following statement to the introduction: “Previously, this goal was approached using a majority-voting scheme to rank genes across various cell types, concentrations, and time points. This approach generates a prototype ranked list (PRL) that represents the consistent ranks of genes across several cell lines in response to a specific drug.”
Regarding the experimental validation, we believe there is a misunderstanding about the evidence we provided. We would like to clarify that we used three different TNBC cell lines: CAL120, BT20, and DU4475. It’s important to note that CAL120 and DU4475 were not included in the signature generation process. Despite this, we observed effects that exceeded those expected under additivity, particularly in the CAL120 cell line (Figure 5, Panel F).
Reviewer 2:
Summary:
In their study, Osorio and colleagues present ‘retriever,’ an innovative computational tool designed to extract disease-specific transcriptional drug response profiles from the LINCS-L1000 project. This tool has been effectively applied to TNBC, leveraging single-cell RNA sequencing data to predict drug combinations that may effectively target the disease. The public review highlights the significant integration of extensive pharmacological data with high-resolution transcriptomic information, which enhances the potential for personalized therapeutic applications.
Strengths:
A key finding of the study is the prediction and validation of the drug combination QL-XII-47 and GSK-690693 for the treatment of TNBC. The methodology employed is robust, with a clear pathway from data analysis to experimental confirmation.
Weaknesses:
However, several issues need to be addressed. The predictive accuracy of ‘retriever’ is contingent upon the quality and comprehensiveness of the LINCS-L1000 and single-cell datasets utilized, which is an important caveat as these datasets may not fully capture the heterogeneity of patient responses to treatment. While the in vitro validation of the drug combinations is promising, further in vivo studies and clinical trials are necessary to establish their efficacy and safety. The applicability of these findings to other cancer types also warrants additional investigation. Expanding the application of ‘retriever’ to a broader range of cancer types and integrating it with clinical data will be crucial for realizing its potential in personalized medicine. Furthermore, as the study primarily focuses on kinase inhibitors, it remains to be seen how well these findings translate to other drug classes.
We thank the reviewer for their thoughtful and constructive feedback. We appreciate your insights and agree that several important considerations need to be addressed.
We recognize that the predictive accuracy of retriever depends on the LINCS-L1000 and single-cell datasets. These resources may not fully represent the complete range of transcriptional responses to disease and treatment across different patients. As you mentioned, this is an important limitation. However, we believe that by extrapolating the evaluation of the most likely active compound to each individual patient, we can help address this issue. This approach will provide valuable insights into which patients in the study are most likely to respond positively to treatment.
On the in-vitro validation of drug combinations, we agree that while promising, these results are not sufficient on their own to establish clinical efficacy. Additional in-vivo studies will be essential in assessing the therapeutic potential and safety of these combinations, and clinical trials will be an important next step to validate the translational impact of our findings.
Lastly, we appreciate the reviewer’s comment about the focus of our study on kinase inhibitors. This result was unexpected, as we tested the full set of compounds from the LINCS-L1000 project. We agree that exploring other top candidates, including different drug classes, will be important for assessing how broadly the retriever approach can be applied.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
In the present work the authors explore the molecular driving events involved in the establishment of constitutive heterochromatin during embryo development. The experiments have been carried out in a very accurate manner and clearly fulfill the proposed hypotheses.
Regarding the methodology, the use of: i) an efficient system for conversion of ESCs to 2C-like cells by Dux overexpression; ii) a global approach through IPOTD that reveals the chromatome at each stage of development and iii) the STORM technology that allows visualization of DNA decompaction at high resolution, helps to provide clear and comprehensive answers to the conclusion raised.
The contribution of the present work to the field is very important as it provides valuable information on chromatin-bound proteins at key stages of embryonic development that may help to understand other relevant processes beyond heterochromatin maintenance.
The study could be improved through a more mechanistic approach that focuses on how SMARCAD1 and TOPBP1 cooperate and how they functionally connect with H3K9me3, HP1b and heterochromatin regulation during embryonic development. For example, addressing why topoisomerase activity is required or whether it connects (or not) to SWI/SNF function and the latter to heterochromatin establishment, are questions that would help to understand more deeply how SMARCAD1 and TOPBP1 operate in embryonic development.
We would like to thank the reviewer for the positive evaluation of our work and the methodology we employed. We greatly appreciated the reviewer’s recognition of our study to “provide valuable information on chromatin-bound proteins at key stages of embryonic development that may help to understand other relevant processes beyond heterochromatin maintenance”. While we acknowledge the value of including mechanistic studies, such an addition would require a substantial amount of experimental work that exceeds our current resources.
Reviewer #1 (Recommendations For The Authors):
In my opinion, the authors could improve the study by deciphering -to a certain extent- the possible mechanism by which SMARCAD1 and TOPBP1 are cooperating in their system to establish H3K9me3 and consequently heterochromatin; and whether it is different (or not) from that already reported in yeast (ref 27). In fact, is it only SMARCAD1 that participates in this process or the whole SWI/SNF complex? Could the lack of SMARCAD1 compromise the proper assembly of the SWI/SNF complex? In this regard, a model describing the main findings of the study and the discussion of the possible mechanisms involved -based on the current bibliography- would be appreciated. This, although speculative, would illustrate the range of possibilities that could be operating in the maintenance of heterochromatin during embryonic development. In conclusion, it would be great if the authors could link -mechanistically- the dots connecting SMARCAD1, TOPBP1, H3K9me3/HP1/heterochromatin.
As suggested by the reviewer and to enrich the discussion, we have included some additional sentences and references in the revised discussion section.
As a minor point, in Figure 3A, left panel, it appears that the protein precipitating with H3K9me3 reacts with TOPBP1, but its molecular weight does not exactly match that of the TOPBP1 band found in the input. The authors should clarify this point, and it is also recommended that IPs and inputs are run in the same gel. Please replace Figure 3A right panel.
Following the reviewer’s suggestion and to improve the reading flow, we have restructured the order of the figures and removed the original Figure 3A. The revised Figure 3A-C panel illustrates the SMARCAD1 association with H3K9me3 in ESCs and 2C- cells, while capturing the reduced SMARCAD1-H3K9me3 association in 2C<sup>+</sup> cells.
Reviewer #2 (Public Review):
The manuscript by Sebastian-Perez describes determinants of heterochromatin domain formation (chromocenters) at the 2-cell stage of mouse embryonic development. They implement an inducible system for transition from ESC to 2C-like cells (referred to as 2C<sup>+</sup>) together with proteomic approaches to identify temporal changes in associated proteins. The conversion of ESCs to 2C<sup>+</sup> is accompanied by dissolution of chromocenter domains marked by HP1b and H3K9me3, which reform upon transition back from the 2C-like state. The innovation in this study is the incorporation of proteomic analysis to identify chromatin-associated proteins, which revealed SMARCAD1 and TOPBP1 as key regulators of chromocenter formation.
In the model system used, doxycycline induction of DUX leads to activation of EGFP reporter regulated by the MERVL-LTR in 2C<sup>+</sup> cells that can be sorted for further analysis. A doxycycline-inducible luciferase cell line is used as a control and does not activate the MERVL-LTR GFP reporter. The authors do see groups of proteins anticipated for each developmental stage that suggest the overall strategy is effective.
The major strengths of the paper involve the proteomic screen and initial validation. From there, however, the focus on TOPBP1 and SMARCAD1 is not well justified. In addition, how data is presented in the results section does not follow a logical flow. Overall, my suggestion is that these structural issues need to be resolved before engaging in comprehensive review of the submission. This may be best achieved by separating the proteomic/morphological analyses from the characterization of TOPBP1 and SMARCAD1.
We appreciate the reviewer’s positive evaluation of our inducible system to trigger the transition from ESCs to 2C-like cells, and the strength of the chromatin proteomics we conducted. In response to the reviewer’s suggestion, we have reorganized the order of the figures, particularly Figure 1 and Figure 2, and revised the text to improve readability and flow.
Reviewer #2 (Recommendations For The Authors):
There are some very interesting components to the study but, as noted, the narrative requires changes and the rationale for focusing on TOPBP1 and SMARCAD1 is not strong at present. Specific comments are noted below:
(1) Inclusion of authentic 2C cells for comparative chromocenter analysis (or at least a more fulsome discussion of how the system has been benchmarked in previous studies).
We have included more detail in the revised methods section, in the “Cell lines and culture conditions” paragraph. We have added: “The Dux overexpression system was benchmarked according to previously reported features. Dux overexpression resulted in the loss of DAPI-dense chromocenters and the loss of the pluripotency transcription factor OCT4 (fig. S1E) (6, 7), upregulation of specific genes of the 2-cell transcriptional program such as endogenous Dux, MERVL, and major satellites (MajSat) (fig. S1F) (6, 7, 11, 26, 58), and accumulation in the G2/M cell cycle phase (fig. S1G), with a reduced S phase consistent in several clonal lines (fig. S1H) (15).”
(2) In Figure 1A, the text indicates a loss of chromocenters, but it may be better described as decompaction because the DAPI/H3K9me3 staining shows diffuse/expanded structures (this is in fact how it is described in relation to Figure 2).
We have changed the text accordingly, now describing it as “decompaction”.
(3) Table S1 has 6 separate tabs but these are not specified in the text. It would be useful to separate the 397 proteins unique to Luc and 2C- cells since they form much of the basis for the remaining analysis. This approach also assumes it is the absence of a protein in the 2C<sup>+</sup> that accounts for the lack of chromocenters (noting there are 510 proteins unique to the 2C<sup>+</sup> state that are not discussed).
We have referenced the supplementary table as Table S1 in the text for simplicity. It includes: Table S1A - List of Protein Groups identified by mass spectrometry in -EdU, Luc, 2C- and 2C<sup>+</sup> cells; Table S1B - Input data for SAINT analysis; Table S1C - SAINT results of the comparison 2C- vs Luc and 2C<sup>+</sup> vs Luc; Table S1D - SAINT results of the comparison Luc vs 2C- and 2C<sup>+</sup> vs 2C-; Table S1E - SAINT results of the comparison Luc vs 2C<sup>+</sup> and 2C- vs 2C<sup>+</sup>; and Table S1F - Total number of PSM per protein in the different cells and conditions tested.
(4) Since there is no change in H3K9me3 levels, loss of SUV420H2 from 2C<sup>+</sup> chromatin (Figure 1G) coupled with potential changes in H4K20me3 could contribute to the morphological differences. SUV420H2 is known to regulate chromocenter clustering in a way that requires H4K20me3, but this is not addressed or cited (PUBMED: 23599346).
As suggested by the reviewer, we have added additional sentences and references in the revised manuscript.
(5) In Figure 1C, there does appear to be overlap between the 2C<sup>+</sup> and 2C- populations (while the Luc population is distinct) even though they are morphologically distinct when imaged in Figure 2A. The 2C- cells are thought to be an intermediate, low Dux expressing population.
Chromatome profiling through genome capture provides a snapshot of the chromatin-bound proteome in the analyzed samples (shown in revised Fig. 2B). As indicated by the reviewer and previously reported in the literature, 2C- cells are an intermediate population before reaching 2C<sup>+</sup> cells. For this study, we have focused on H3K9me3 morphological changes. Even though 2C- and 2C<sup>+</sup> cells are distinct with respect to H3K9me3 morphology (shown in revised Fig. 1B), analysis of the chromatome data from hundreds of chromatin-bound proteins revealed some overlap between these two populations. However, replicates from the same population tend to cluster together, for example, 2C<sup>+</sup> rep1 and 2C<sup>+</sup> rep3, and 2C- rep1 and 2C- rep2. Collectively, these data suggest that a defined subset of coordinated changes in the chromatome likely triggers the transition from 2C- to 2C<sup>+</sup> cells. Further experimental investigation of the chromatome dataset during the 2C-like transition would be interesting; however, we believe it is beyond the scope of this study.
(6) Data with SUV39H1 and 2 is difficult to accommodate; what about other H3K9 methyltransferases or proteins such as TRIM28 (KAP1) and SETDB1 (this comes up in the discussion but is not assessed in the results section).
We agree that investigating the role of TRIM28 (KAP1) and SETDB1 in this experimental setting could be of interest; however, we believe that these experiments go beyond the scope of the presented study.
(7) Rationale for choosing TOPBP1 needs to be improved. How do TOPBP1 levels relate to TOPI/TOP2A/TOP2B levels across the 3 cell populations? By what criteria does topoisomerase inhibitor treatment increase 2C<sup>+</sup> like cells? Moreover, to what extent will inhibiting topoisomerases lead to global heterochromatin and cell cycle changes regardless of cell type.
Following the reviewer’s suggestion, we have included some additional references throughout the text to strengthen our rationale for selecting TOPBP1, given its well-established critical role in DNA replication and repair. Additionally, we have revised the results and discussion sections to include new sentences that propose a potential mechanism by which topoisomerase inhibitors may indirectly recruit TOPBP1 to facilitate DNA repair, ultimately leading to an increase in 2C<sup>+</sup> cells.
(8) Likewise, the decision to look at SMARCAD1 based solely on its interaction with TOPBP1 seems somewhat arbitrary and it did not seem to come up as of interest in the iPOTD analysis. Moreover, they were not able to validate the interaction with their own analyses.
We have revised the text to clarify the connection further.
(9) The flow of results is confusing. The first section concludes with a focus on TOPBP1 and SMARCAD1, then progresses to morphological characterization of heterochromatin regions in the next two sections before returning to TOPBP1 and SMARCAD1. It seems like it would make more sense to describe the model system and morphological characterization at the beginning of the results section and then transition to the proteomic analysis and characterization of TOPBP1 and SMARCAD1 (with the expectation that the rationale be improved).
As suggested by the reviewer, we have reordered the figures, particularly Figure 1 and Figure 2, and rephrased the text to improve the overall reading flow.
(10) There has been considerable work done on characterizing chromatin structure, epigenetic changes, and morphology during early embryonic development. It is therefore difficult to see how validating some of these changes in the inducible model adds much in the way of new knowledge. It may, but this is not articulated in the current text.
As detailed before, we have rephrased the text to improve the overall reading flow, which we hope has improved the understanding of the impact of our results.
(11) It is difficult to disentangle broader effects of both TOPBP1 and SMARCAD1 from those described here; they may induce phenotypes, but these may not be unique to this model system.
We agree with the reviewer, but addressing this point would require additional experiments that go beyond the scope of the presented study.
(12) One of the issues with this assay is global chromatin recovery; it is not focused on heterochromatin compartments. The statement "We identified a total of 2396 proteins, suggesting an efficient pull-down of chromatin-associated factors (fig. S2D and Table S1)" does not demonstrate efficiency. Additional functional annotation would be required to establish this claim, including what fraction are known chromatin-associated proteins (with a focus on the heterochromatin compartment).
We have changed the text accordingly. The resulting statement reads as: “We identified a total of 2396 proteins, suggesting an effective pull-down of putative chromatin-associated factors (fig. S2D and Table S1)”.
Reviewer #3 (Public Review):
The manuscript entitled "SMARCAD1 and TOPBP1 contribute to heterochromatin maintenance at the transition from the 2C-like to the pluripotent state" by Sebastian-Perez et al. adopted the iPOTD method to compare the chromatin-bound proteome in ESCs and 2C-like cells generated by Dux overexpression. The authors identified 397 chromatin-bound proteins enriched only in ESC and 2C- cells, among which they further investigated TOPBP1 due to its potential role in controlling chromocenter reorganization. SMARCAD1, a known interacting protein of TOPBP1, was also investigated in parallel. The authors observed increased size and decreased number of H3K9me3-heterochromatin foci in Dux-induced 2C<sup>+</sup> cells. Interestingly, depletion of TOPBP1 or SMARCAD1 also led to increased size and decreased number of H3K9me3 foci. However, depletion of these proteins did not affect entry into or exit from the 2C-like state. Nevertheless, the authors showed that both TOPBP1 and SMARCAD1 are required for early embryonic development.
Although this manuscript provides new insights into the features of 2C-like cells regarding H3K9me3-heterochromatin reorganization, it remains largely descriptive at this stage. It does not provide new insights into the following important aspects: 1) how SMARCAD1 associates with H3K9me3 and contributes to heterochromatin maintenance, 2) how TOPBP1 regulates the expression of SMARCAD1 and facilitates its localization in heterochromatin foci, 3) whether the remodelling of chromocenters is causally related to the mutual transitions between ESCs and 2C-like cells. Furthermore, some results are over-interpreted. Additional experiments and analyses are needed to increase the strength of mechanistic insights and to support all claims in the manuscript.
We would like to thank the reviewer for their positive and thorough evaluation of our manuscript. We have revised the text and hope that the overall flow is now clearer. Moreover, while we acknowledge the value of including mechanistic studies, such an addition would require a substantial amount of experimental work that exceeds our current resources.
Reviewer #3 (Recommendations For The Authors):
Major points:
(1) Fig.2: the DNA decompaction of the chromatin fibers shown in 2C<sup>+</sup> cells may be more related to a relaxed 3D chromatin conformation (Zhu, NAR 2021; Olbrich, Nat Commun 2021) than chromatin accessibility. The authors should discuss this point.
As suggested by the reviewer, we have included some additional sentences and references in the revised manuscript to address this concern.
(2) Chemical inhibition of topoisomerases resulted in an increase in the percentage of 2C<sup>+</sup> cells. Does depletion of TOPBP1 also result in an increased percentage of 2C<sup>+</sup> cells? Please include this result in Fig. 3E. Additionally, it should be noted that DDR and p53 have been reported to activate Dux (Atashpaz, eLife 2020; Grow, Nat Genet 2021), and thus may contribute to the increased percentage of 2C<sup>+</sup> cells observed upon topoisomerase inhibition. This point should be discussed in the manuscript.
To address this concern, we have included some additional sentences and references in the revised manuscript.
(3) Fig 3A: the TOPBP1 band in the IP sample is questionable, and therefore the conclusion that TOPBP1 is associated with H3K9me3 is difficult to draw from Fig 3A. Additionally, the authors mentioned that association of TOPBP1 and SMARCAD1 is undetected in ESCs, likely due to the suboptimal efficiency of available antibodies. As these are key conclusions in this study, the authors are suggested to try other commercially available TOPBP1 antibodies (e.g., Abcam #ab-105109, used by ElInati, PNAS 2017) or knock-in tags to perform the co-IP experiment.
Following the reviewer’s suggestion and to improve the reading flow, we have restructured the order of figures and removed the original Figure 3A. The revised Figure 3A-C panel illustrates the SMARCAD1 association with H3K9me3 in ESCs and 2C- cells, while capturing the reduced SMARCAD1-H3K9me3 association in 2C<sup>+</sup> cells.
(4) Fig. 3C-D, Fig. S3D: the authors claimed reduction of both SMARCAD1 expression and its co-localization with H3K9me3 foci in 2C<sup>+</sup> cells, but did not perform mechanistic studies. It is important to know if TOPBP1 expression also decreases in 2C<sup>+</sup> cells. Additionally, it is unclear if the reduced co-localization of SMARCAD1 with H3K9me3 foci results from its altered nuclear localization or simply from reduced expression level? In either case, please provide some mechanistic insights.
While we acknowledge the value of including mechanistic studies, such an addition would require a substantial amount of experimental work that exceeds our current resources.
(5) Fig. 3K, Fig. S4D-E: does SMARCAD1 expression decrease upon TOPBP1 depletion? Statistical analysis of SMARCAD1 intensity in Fig. S4E is needed, and a Western blot analysis is strongly suggested. Additionally, it is unclear if the reduced co-localization of SMARCAD1 with H3K9me3 foci results from its altered nuclear localization or simply from reduced expression level? In Fig. 3K, TOPBP1-depleted cells appear to show decreased size and increased number of H3K9me3 foci, which is inconsistent with Fig. S4B-C. The authors should clarify this discrepancy. Furthermore, statistics should be performed to determine whether Smarcad1/Topbp1 knockdown could further increase the size and decrease the number of H3K9me3 foci in 2C<sup>+</sup> cells. This would provide additional evidence for the involvement of these proteins in heterochromatin maintenance.
We did not observe Smarcad1 downregulation after Topbp1 knockdown (shown in fig. S4A). In Figs. S4B and S4C, we observed that the number of H3K9me3 foci decreased, and their area became larger after knocking down either Smarcad1 or Topbp1, compared to scramble controls. These results align with the reviewer’s comment. Additionally, it should be noted that these findings were derived from the quantification of tens of cells and hundreds of foci, as indicated in the figure legend. This resulted in statistical significance after applying the test indicated in the figure legend.
(6) Fig. 3J is suggested to be moved to Fig. 4. Additionally, performing immunostaining of SMARCAD1, TOPBP1, and H3K9me3 during pre-implantation development would provide valuable information on their protein-level dynamics, interactions, and functions in early embryos. This would further strengthen the conclusions drawn in the manuscript.
We agree that performing these additional experiments would provide valuable information; however, this would require a substantial amount of experimental work that exceeds our current resources.
(7) Fig. 4 and Fig. S5: the authors observed reduced H3K9me3 signal in the Smarcad1 MO embryos at the 8-cell stage, but claim that they failed to examine Topbp1 MO embryos at the 8-cell stage due to their developmental arrest at the 4-cell stage. However, based on Fig. 4A, not all Topbp1 MO embryos were arrested at the 4-cell stage, and it is still possible to examine the H3K9me3 signal in 8-cell Topbp1 MO embryos, which is critical for demonstrating its function in early embryos. Also, how to interpret the increased HP1b signal in Topbp1 MO embryos?
For Topbp1 silencing, we observed an even more severe phenotype compared to Smarcad1 MO. All the Topbp1 MO-injected embryos (100 %) arrested at the 4-cell stage and did not develop further (shown in Fig. 4A and 4B). Therefore, the severity of the Topbp1 morpholino phenotype posed a technical challenge in evaluating the H3K9me3 signal in 8-cell Topbp1 MO embryos, as none of the injected embryos developed beyond the 4-cell stage.
We believe the increased HP1b signal in Topbp1 MO embryos could indicate potential alterations in chromatin organization and heterochromatin stability. Specifically, we observed remodeling of heterochromatin in both 2-cell and 4-cell Topbp1 MO arrested embryos compared to controls, as evidenced by the spreading and increased HP1b signal (shown in fig. S5F-S5I). Further investigations could enhance our understanding of the underlying defects in Topbp1 knockdown embryos, extending beyond heterochromatin-related errors.
Minor points:
(1) Page 4, the third row from the bottom: please revise the sentence.
We have reviewed the text and it now reads correctly in the revised manuscript.
(2) Fig. 1C: The authors claimed "Luc replicates clustered separately from 2C<sup>+</sup> and 2C- conditions", however, Luc rep3 is apparently clustered with 2C conditions.
(3) The GFP signal in Fig. S1E is confusing.
(4) Please include ESC in Fig. 2D-E. Also label the colors in Fig. 2E.
As indicated in the figure legend of the revised Fig. 1F: “Cells with a GFP intensity score > 0.2 are colored in green. Black dots indicate 2C- cells and green dots indicate 2C<sup>+</sup> cells.”
(5) Fig. 2G: Transposition of the heatmap (show genes in rows) is suggested to improve readability.
(6) Page 7, the third row from the bottom: incorrect citation of Fig. 1K.
Thank you for spotting this incorrect citation. We have corrected it in the revised manuscript.
(7) Page 8, row 15, Fig. S3D should be cited to support the decreased expression of SMARCAD1 in 2C<sup>+</sup> cells.
We have cited the corresponding supplementary figure S3D in the mentioned sentence.
(8) Fig. 2H: what is the difference between "2C-" and "ESC-like"?
We use “2C-” to refer to cells that do not express the GFP reporter during the transition from ESCs to 2C<sup>+</sup> cells. We use “ESC-like” to refer to cells that do not express the GFP reporter during exit, that is, cells that have gone from a sorted and purified 2C<sup>+</sup> state to a GFP-negative state.
(9) Fig. S4A-C: compared with shTopbp1#2, shTopbp1#1 appears to be slightly more effective in knockdown, but less dramatic changes in the size/number of H3K9me3 foci.
(10) Fig. 4: please show the effectiveness of Topbp1 MO by Immunostaining of TOPBP1.
(11) Fig. 4C: please label the developmental stage as in Fig. 4E and 4G.
We have added an “8-cell” label in Figure 4C, as suggested by the reviewer.
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Summary:
In this study, Zhao and colleagues investigate inflammasome activation by E. tarda infections. They show that E. tarda induces the activation of the NLRC4 inflammasome as well as the non-canonical pathway in human THP1 macrophages. Further dissecting NLRC4 activation, they find that T3SS translocon components eseB, eseC and eseD are necessary for NLRC4 activation and that delivery of purified eseB is sufficient to trigger NAIP-dependent NLRC4 activation. Sequence analysis reveals that eseB shares homology within the C-terminus with T3SS needle and rod proteins, leading the authors to test if this region is necessary for inflammasome activation. They show that the eseB CT is required and that it mediates interaction with NAIP. Finally, they show that homologs of eseB in other bacteria also share the same sequence and that they can activate NLRC4 in a HEK293T cell overexpression system.
Strengths:
This is a very nice study that convincingly shows that eseB and its homologs can be recognized by the human NAIP/NLRC4 inflammasome. The experiments are well designed, controlled and described, and the paper is convincing as a whole.
Weaknesses:
The authors need to discuss their study in the context of previous papers that have shown an important role for E. tarda flagellin in inflammasome activation and test whether flagellin and/or E. tarda T3SSs needle or rod can activate NLRC4.
The authors show that eseB and its homologs can activate NLRC4, but there are also other translocon proteins, such as YopB or PopB, that are very different and share little homology with eseB. It would be nice to include a section comparing the different type 3 secretion systems. Are there 2 different families of T3SSs, those that feature translocon components that are recognized by NAIP-NLRC4 and those that cannot be recognized?
(1) The authors need to discuss their study in the context of previous papers that have shown an important role for E. tarda flagellin in inflammasome activation and test whether flagellin and/or E. tarda T3SSs needle or rod can activate NLRC4.
According to the reviewer’s suggestion, we added the relevant discussion (lines 326-334) and carried out additional experiments to examine whether E. tarda flagellin, needle, and rod could activate NLRC4. The relevant results are shown in Figure S3, Figure S5, and lines 226-230 and 269-274.
(2) The authors show that eseB and its homologs can activate NLRC4, but there are also other translocon proteins, such as YopB or PopB, that are very different and share little homology with eseB. It would be nice to include a section comparing the different type 3 secretion systems. Are there 2 different families of T3SSs, those that feature translocon components that are recognized by NAIP-NLRC4 and those that cannot be recognized?
According to the reviewer’s suggestion, additional experiments were performed to examine the NLRC4-activating potentials of 14 translocator proteins that share low sequence identities with EseB. The relevant results and discussion are shown in Figure S8 and lines 289-301; 364-372, and 377-379.
Reviewer #2 (Public Review):
Summary:
This work by Zhao et al. demonstrates the role of the Edwardsiella tarda type 3 secretion system translocon in activating human macrophage inflammation and pyroptosis. The authors show the requirement of both the bacterial translocon proteins and particular host inflammasome components for E. tarda-induced pyroptosis. In addition, the authors show that the C-terminal region of the translocon protein, EseB, is both necessary and sufficient to induce pyroptosis when present in the cytoplasm. The most terminal region of EseB was determined to be highly conserved among other T3SS-encoding pathogenic bacteria and a subset of these exhibited functionally similar effects on inflammasome activation. Overall, the data support the conclusions and interpretations and provide interesting insights into interactions between bacterial T3SS components and the host immune system.
Strengths:
The authors use established and reliable molecular biology and bacterial genetics strategies to characterize the roles of the bacterial T3SS translocon and host inflammasome pathways in E. tarda-induced pyroptosis in human macrophages. These observations are naturally expanded upon by demonstrating the specific regions of EseB that are required for inflammasome activation and the conservation of this sequence among other pathogenic bacteria.
Weaknesses:
The functional assessment of EseB homologues is limited to inflammasome activation at the protein level but does not include the effects on cell viability as shown for E. tarda EseB. Confirmation that EseB homologues have similar effects on cell death would strengthen this portion of the manuscript.
According to the reviewer’s suggestion, the effects of representative EseB homologs on cell death were examined in the revised manuscript (Figure 5D, Figure S7 and line 289).
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
I only have a few suggestions on how to improve the study:
Activation of caspase-4 requires entry into the host cytosol. Can this be observed with E. tarda and is it T3SS dependent? The fact that deleting the translocon components abrogates all GSDMD activation (see Fig. 2D) suggests that Casp4 activation also requires an active T3SS. It would be useful for the reader to include some more information on the cellular biology of E. tarda.
In our study, we found that E. tarda could enter THP-1 cells (Figure S1), and host cell entry was not affected by deletion of eseB-D (Δ_eseB-D_) in the T3SS system (Figure 2B, C). Additional experiments showed that Δ_eseB-D_ abolished the ability of E. tarda to activate Casp4 (Figure S2), implying that Casp4 activation required an active T3SS. Relevant changes in the revised manuscript: lines 223 and 224, 341-342.
The data presented by the authors suggest that eseB is sensed by NLRC4 when overexpressed. They do not, however, prove that eseB is the main factor driving NLRC4 activation during an infection, since deficiency in eseB also abrogated translocation of other potential activators of NLRC4, e.g. flagellin and T3SS needle and rod subunits. I would thus find it essential to properly test if E. tarda flagellin can activate NLRC4 by comparing a WT and a flagellin-deficient strain, and/or by transfecting or expressing E.t. flagellin in these cells, as well as testing whether E.t. rod and needle subunits act as NLRC4 activators. This is important as previous studies suggested that flagellin is the main activator of cytotoxicity during E. tarda infection.
Previous studies have shown that flagellin is required for E. tarda-induced macrophage death in fish [1] but not in mice [2]. In the revised manuscript, we performed additional experiments to examine whether E. tarda flagellin, needle, and rod could activate NLRC4. The relevant results are shown in Figure S3, Figure S5, and lines 226-230, 269-274, and 326-334.
References
(1) Xie HX, Lu JF, Rolhion N, Holden DW, Nie P, Zhou Y, et al. Edwardsiella tarda-induced cytotoxicity depends on its type III secretion system and flagellin. Infect Immun. 2014;82(8):3436-45. doi: 10.1128/IAI.01065-13.
(2) Chen H, Yang D, Han F, Tan J, Zhang L, Xiao J, et al. The bacterial T6SS effector EvpP prevents NLRP3 inflammasome activation by inhibiting the Ca<sup>2+</sup>-dependent MAPK-JNK pathway. Cell Host Microbe. 2017;21(1):47-58. doi: 10.1016/j.chom.2016.12.004.
Figure 5/S4, please list the names of the eseB homologs. It is cumbersome to have to access GenBank with the accession number to be able to understand what proteins the authors define as homologs of eseB.
The names were added to the revised Table S2, Figure 5 and Figure S6 (the original Figure S4).
The authors mention that other translocon proteins, such as YopB/D and PopB/D, were suggested to cause inflammasome activation. How do these compare to eseB and its homologs? Do they share the CT motif?
Additional experiments were performed to compare the inflammasome activation abilities of EseB and other translocator proteins including YopD and PopD. The relevant results and discussion are shown in Figure S8 and lines 289-301, 364-372, and 377-379.
It would be nice to show that there are potentially two groups of translocon proteins, one group sharing homology to needle subunits within the CT region and another that is different. A quick look at the sequence of these proteins suggests that they are quite different and much larger than eseB.
In our study, additional experiments with more translocator proteins indicated that the possession of EseB T6R-like terminal residues does not necessarily guarantee that a protein can activate the NLRC4 inflammasome. Relevant results and discussion are shown in lines 289-301, 364-372, and 377-379.
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
In this manuscript, Satouh et al. report giant organelle complexes in oocytes and early embryos. Although these structures have often been observed in oocytes and early embryos, their exact nature has not been characterized. The authors named these structures "endosomal-lysosomal organelles form assembly structures (ELYSAs)". ELYSAs contain organelles such as endosomes, lysosomes, and probably autophagic structures. ELYSAs are initially formed in the perinuclear region and then migrate to the periphery in an actin-dependent manner. When ELYSAs are disassembled after the 2-cell stage, the V-ATPase V1 subunit is recruited to make lysosomes more acidic and active. The ELYSAs are most likely the same as the "endolysosomal vesicular assemblies (ELVAs)", reported by Elvan Böke's group earlier this year (Zaffagnini et al. doi.org/10.1016/j.cell.2024.01.031). However, it is clear that Satouh et al. identified and characterized these structures independently. These two studies could be complementary. Although the nature of the present study is generally descriptive, this paper provides valuable information about these giant structures. The data are mostly convincing, and only some minor modifications are needed for clarification and further explanation to fully understand the results.
Reviewer #2 (Public Review):
Satouh et al report the presence of spherical structures composed of endosomes, lysosomes, and autophagosomes within immature mouse oocytes. These endolysosomal compartments have been named as Endosomal-LYSosomal organellar Assembly (ELYSA). ELYSAs increase in size as the oocytes undergo maturation. ELYSAs are distributed throughout the oocyte cytoplasm of GV stage immature oocytes but these structures become mostly cortical in the mature oocytes. Interestingly, they tend to avoid the region which contains metaphase II spindle and chromosomes. They show that the endolysosomal compartments in oocytes are less acidic and therefore non-degradative but their pH decreases and becomes degradative as the ELYSAs begin to disassemble in the embryos post-fertilization. This manuscript shows that lysosomal switching does not happen during oocyte development, and the formation of ELYSAs prevents lysosomes from being activated. Structures similar to these ELYSAs have been previously described in mouse oocytes (Zaffagnini et al, 2024) and these vesicular assemblies are important for sequestering protein aggregates in the oocytes but facilitate proteolysis after fertilization. The current manuscript, however, provides further details of endolysosomal disassembly post-fertilization. Specifically, the V1-subunit of V-ATPase targeting the ELYSAs increases the acidity of lysosomal compartments in the embryos. This is a well-conducted study and their model is supported by experimental evidence and data analyses.
Reviewer #3 (Public Review):
Fertilization converts a cell defined as an egg to a cell defined as an embryo. An essential component of this switch in cell fate is the degradation (autophagy) of cellular elements that serve a function in the development of the egg but could impede the development of the embryo. Here, the authors have focused on the behavior during the egg-to-embryo transition of endosomes and lysosomes, which are cytoplasmic structures that mediate autophagy. By carefully mapping and tracking the intracellular location of well-established marker proteins, the authors show that in oocytes endosomes and lysosomes aggregate into giant structures that they term Endosomal LYSosomal organellar Assembl[ies] (ELYSA). Both the size distribution of the ELYSAs and their position within the cell change during oocyte meiotic maturation and after fertilization. Notably, during maturation, there is a net actin-dependent movement towards the periphery of the oocyte. By the late 2-cell stage, the ELYSAs are beginning to disintegrate. At this stage, the endo-lysosomes become acidified, likely reflecting the activation of their function to degrade cellular components.
This is a carefully performed and quantified study. The fluorescent images obtained using well-known markers, using both antibodies and tagged proteins, support the interpretations, and the quantification method is sophisticated and clearly explained. Notably, this type of quantification of confocal z-stack images is rarely performed and so represents a real strength of the study. It provides sound support for the conclusions regarding changes in the size and position of the ELYSAs. Another strength is the use of multiple markers, including those that indicate the activity state of the endo-lysosomes. Altogether, the manuscript provides convincing evidence for the existence of ELYSAs and also for regulated changes in their location and properties during oocyte maturation and the first few embryonic cell cycles following fertilization.
At present, precisely how the changes in the location and properties of the ELYSAs affect the function of the endo-lysosomal system is not known. While the authors' proposal that they are stored in an inactive state is plausible, it remains speculative. Nonetheless, this study lays the foundation for future work to address this question.
Minor point: l. 299. If I am not mistaken, there is a typo. It should read that the inhibitors of actin polymerization prevent redistribution from the cytoplasm to the cortex during maturation.
Minor point: A few statements in the Introduction would benefit from clarification. These are noted in the comments to the authors.
We sincerely appreciate the editorial board of eLife and the reviewers for their helpful and constructive comments on our manuscript. We are pleased that the reviewers acknowledged that we identified and characterized this assembly structure independently. In the revised manuscript, we have carefully considered the reviewers’ comments and conducted additional analysis to address each of them.
Regarding the typographical errors, we revised the description to fit with our findings and the reviewers’ comments. We also found that the primer sequence was correct, and we carefully checked the accuracy of the entire manuscript.
We hope that the revised version will now be deemed suitable for publication in eLife.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Q. 1) The authors state in the Abstract that ELYSAs contain autophagosome-like membranes in the outer layer. However, this seems to be just speculation based on the LC3 staining results and is not directly shown. Are there autophagosome-like double membrane structures in ELYSAs?
We appreciate this comment. We also agree with this concern; however, it was difficult to assert that they are autophagosomes based on the observation of the electron micrographs. For this reason, we rephrased it to be “Most ELYSAs are also positive for an autophagy regulator, LC3.” (line 33). In addition, we revised the notation to LC3-positive structures in the Results and Discussion sections (lines 165–169, 286).
Q. 2) The data in Figure 2A, showing a decrease in the number of LAMP1 structures, seems to contradict the data in Figure 1B, showing an apparent increase in LAMP1 structures. Please explain this discrepancy. If the authors did not count structures just below the plasma membrane, please explain the rationale for this.
We really appreciate the valuable comment. Regarding the number of LAMP1-positive structures, it is not suitable for comparison with Figure 1B, etc., as pointed out by the reviewer, since the distribution of the LAMP1 signal differs from plane to plane. To avoid any potential confusion, we added new images of the Z-projection of the immunostained images that can better reflect the number of positive structures in the whole oocyte/embryo in Figure 2.
In addition, as the reviewer pointed out, there is a technical difficulty in measuring the LAMP1-positive signal on the plasma membrane or just below it. We explained how and why we had to delete plasma membrane signals in our response #21.
Q. 3) The actin dependence is not observed in Figure 5C. What is the difference between Figure 5C and 5E? Please explain further.
We apologize for the lack of clarity; Figures 5C and 5E show the average number of LAMP1-positive structures (5C) and the percentage of the sum of granule volumes in LAMP1-positive structures (5E), respectively, after classifying the LAMP1-positive granules by their diameters.
We removed Figure 5E for the sake of conciseness since we already mentioned a similar fact in Figure 5C. To clarify the corresponding explanations, we moved figures that were not classified by diameter to Supplementary Figure 8 to improve readability. Moreover, we have rewritten the main text on lines 200–211.
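For readers who want to see how the two summaries differ, the following is a minimal sketch, not the authors' analysis pipeline. It assumes each LAMP1-positive object has already been segmented and is described by a (diameter, volume) pair; the function name `summarize_by_diameter` and the bin edges are hypothetical.

```python
def summarize_by_diameter(objects, bin_edges_um):
    """Count objects per diameter class and report each class's share of total volume.

    objects: list of (diameter_um, volume_um3) pairs (hypothetical input format).
    bin_edges_um: ascending edges; class i covers [edge[i], edge[i+1]).
    """
    n_bins = len(bin_edges_um) - 1
    counts = [0] * n_bins
    volumes = [0.0] * n_bins
    for diameter, volume in objects:
        for i in range(n_bins):
            if bin_edges_um[i] <= diameter < bin_edges_um[i + 1]:
                counts[i] += 1
                volumes[i] += volume
                break  # each object belongs to at most one class
    total = sum(volumes)
    percent_volume = [100.0 * v / total if total > 0 else 0.0 for v in volumes]
    return counts, percent_volume


# Illustrative numbers only: two small granules and one large structure.
counts, percent_volume = summarize_by_diameter(
    [(0.5, 0.25), (0.5, 0.25), (2.0, 8.0)], [0.0, 1.0, 3.0]
)
```

In this toy input the small class holds most of the objects but only a small share of the volume, which is why a count-based panel and a volume-based panel can look different for the same data.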
Q. 4) While the actin inhibitors reduce the number of peripheral LAMP1 structures (Figure 5F), they do not affect their number in the central region (Figure 5G). How can the authors conclude that actin inhibitors inhibit the migration of LAMP1 structures?
We appreciate the comment. As pointed out, the number of large LAMP1-positive structures in the medial region did not change. Therefore, we have avoided the description that ELYSAs migrate from the middle region to the cell periphery and have unified the description of whether large structures in the periphery occur. Please refer to the subsection title (line 188), the following descriptions (lines 189–199), the related description in the Results (lines 200–211), and the title and the legend of Figure 5.
Q. 5) The authors show that the V1A subunit associates with the surface of LAMP1 structures as punctate structures (Figure 6B). What are these V1A-positive structures? Is V1A recruited to some specific domains of ELYSAs, or are V1A-positive active lysosomes recruited to ELYSAs? Please provide an interpretation of these data. The phrase "The V1-subunit of V-ATPase is targeted to these structures" (line 262) is not appropriate because it is indistinguishable whether only the V1 subunits are recruited or active lysosomes containing the V1 subunit are recruited.
Thank you for the valuable comment. Indeed, our analysis, including the analysis of Fig. 8 described on line 262, did not clarify whether free V1A-mCherry molecules accessed the ELYSA periphery or whether lysosomes with V1A-mCherry molecules newly merged into the ELYSA. Therefore, we added this interpretation to lines 232–234 of the Results and revised the Discussion as "The number of membrane structures positive for V1A-mCherry increase upon ELYSA disassembly, indicating further acidification of the endosomal/lysosomal compartment" (lines 292–294).
Q. 6) Why did the authors use LysoSensor as a marker for ELYSA instead of LAMP1 in Figure 8 and 9? Some reasons should be given.
There is a clear technical reason for this: when LAMP1-EGFP was expressed in a zygote, it largely migrated to the plasma membrane before and after the 2-cell stage, making it difficult to capture the change of ELYSAs. To circumvent this difficulty, we used LysoSensor to visualize ELYSAs instead of LAMP1-EGFP. This explanation was added to lines 258–260.
Q. 7) In Figure 9A, it is not clear whether the activity of LysoSensor-positive structures is lower at this stage compared to other stages. It may be shown in Figure S7, but the data are not clearly visible. A direct comparison would be ideal.
A new analysis similar to that shown in Fig. 9 for early 2-cells and 4-cells was performed and added to Figure S7. To support direct comparison, the ranges of axes were set to be similar.
As a result, the quantified MagicRed signal on the isolated LysoSensor-positive punctate structure in MII oocyte was nearly the same as that in early 2-cells and 4-cells. In early 2-cells, LysoSensor gave a signal at the cellular boundary, where MagicRed staining was not observed, confirming that MagicRed activity is higher in the interior than in the cell periphery in post-fertilization embryos. We have included an additional description in the main text (lines 280–282).
Q. 8) In the phrase "pregnant mare serum gonadotropin or an anti-inhibin antibody" (line 382), is "or" correct?
When inducing superovulatory stimulation, an anti-inhibin antibody (distributed as CARD HyperOva) can be used as a substitute for PMSG (after additional stimulation with hCG), which results in the production of eggs of similar quality to those obtained with PMSG. This was used in most experiments. To amend the lack of clarity, a reference (Takeo and Nakagata, PLoS One, 2015) was added to the description of HyperOva (line 417).
Q. 9) In almost all graphs, please indicate what the X-axis is indicating (not just "number") so that readers can understand what number is being represented without reading the legends.
We revised the axis titles in all figures.
Q. 10) Since grayscale images provide better contrast than color images, it is recommended that single-color images be shown in grayscale.
We replaced all single-color images with grayscale images.
Reviewer #2 (Recommendations For The Authors):
Specific comments:
Q. 11) Figure 1 and S1- Both Rab5 and Rab7 co-localize with LAMP1. However, there seems to be a lot of LAMP1-free Rab5 dots as compared to the Lamp1-free Rab7. As a result, LAMP1 and Rab7 are co-localized more frequently than LAMP1 and Rab5 (video1). Could it be that early endosomes (Rab5+) are yet to be incorporated into ELYSAs? If so, a brief discussion of this phenomenon would be nice.
Thank you very much for the comment. We agree with the reviewer’s interpretation. In accordance with this suggestion, we clearly stated in the main text: “Although small punctate structures that are RAB5-positive but LAMP1-negative also spread over the cytosol, most giant structures were positive for RAB5 and LAMP1 (Video 1)” (lines 91–93). In the Discussion section, a brief statement was included: “Considering the large number of RAB5-positive and LAMP1-negative punctate structures in MII oocytes, these layers may also reflect the assembly mechanism of the ELYSA” (lines 318–320).
Q. 12) Video 3 (and Figure 6) clearly shows the dynamics of LAMP1-labelled vesicles during maturation, which is impressive. In contrast to the live cell imaging after LAMP1 mRNA injection, Figure 1 used anti-LAMP1 Ab to detect endogenous levels of LAMP1. It appears that mRNA microinjection causes LAMP1 overexpression causing more (but smaller) vesicles to form. It should be easy to quantify and compare the vesicles in Figure 1 and 6
We appreciate the comment. As mentioned, injections of EGFP-LAMP1 mRNA are useful for the visualization of LAMP1 dynamics during the maturation phase from GV to MII by live cell imaging, which is not feasible with immunostaining. However, the fluorescence emitted by EGFP-LAMP1 is only a few tenths of that of antibody staining, and because of the technical difficulty of microinjection into GV oocytes, only about one in ten oocytes yielded a signal-to-noise ratio sufficient for imaging. In addition, live cell imaging of oocytes in Figure 6 had to be carried out with very low excitation light exposure to reduce the toxicity. It was also performed with a low magnification lens and a longer step size in the z-axis. For these reasons, in examining the point raised, we performed an additional 3D object analysis, in the same way as in Figure 2, on the data of IVM oocytes injected with EGFP-LAMP1 mRNA using the same lens as in Figure 1 and with a longer exposure time than in live imaging. The results were compared with the MII data of Figures 1 and 2.
As a result, as shown in the new Figure S8, more objects with a diameter of 0.2–0.4 µm were found than in the immunostaining data, which fits the reviewer’s point. In addition, the counts were lower for the 0.6–1.0 µm diameter, but there was no significant difference in the number of larger LAMP1 positive structures corresponding to the ELYSA size. We consider that this was appropriate for the original purpose of characterizing the ELYSA formation process. A description of these points has been added to lines 221–225.
Q. 13) In Figure 4A and B- Seems like not all LAMP1-positive structures were LC3-positive. Is there any size or location within the oocyte that determines LC3 positivity?
We appreciate the valuable comment. To answer this comment, we proceeded with a new 3D object-based co-localization analysis on LAMP1 and LC3, determined the number, volume, and distribution within the oocyte, and incorporated the results as Supplementary Figure 6. To examine the positivity, we further analyzed the percentage of double-positive structures among all the LAMP1-positive structures. The results showed that their average diameter significantly shifted from 2.36 µm (GV) to 3.78 µm (MII). Moreover, it was clearly indicated that LAMP1-positive structures smaller than 2 µm in diameter are rarely positive for LC3. In terms of location, measuring the distance of the double-positive structures from the oocyte center (the cellular geometric center) indicated that they tend to be observed at the periphery of both stages of oocytes (more than 80% in > 30 µm in the MII oocyte). Of note, no clear tendency of double positivity was observed. A description of these points has been added to lines 174–186.
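The two quantities described in this reply (the share of LAMP1-positive structures that are also LC3-positive, and the share of those double-positive structures lying beyond a given distance from the cell center) can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' software: the record layout, the function name `summarize_double_positive`, and the example coordinates are hypothetical, and only the 30 µm cutoff is taken from the text.

```python
import math


def summarize_double_positive(structures, center_um, periphery_cutoff_um=30.0):
    """Return (% of LAMP1+ structures that are LC3+, % of those that are peripheral).

    structures: list of dicts with 'centroid_um' (x, y, z) and 'lc3_positive' (bool);
    every entry is assumed to be a segmented LAMP1-positive object.
    center_um: geometric center of the oocyte, same coordinate system.
    """
    double_pos = [s for s in structures if s["lc3_positive"]]
    pct_double = 100.0 * len(double_pos) / len(structures) if structures else 0.0
    # Euclidean distance from the cell center decides peripheral vs. interior.
    peripheral = [
        s for s in double_pos
        if math.dist(s["centroid_um"], center_um) > periphery_cutoff_um
    ]
    pct_peripheral = 100.0 * len(peripheral) / len(double_pos) if double_pos else 0.0
    return pct_double, pct_peripheral


# Illustrative coordinates only.
example = [
    {"centroid_um": (35.0, 0.0, 0.0), "lc3_positive": True},
    {"centroid_um": (10.0, 0.0, 0.0), "lc3_positive": True},
    {"centroid_um": (0.0, 32.0, 0.0), "lc3_positive": False},
    {"centroid_um": (0.0, 0.0, 5.0), "lc3_positive": False},
]
pct_double, pct_peripheral = summarize_double_positive(example, (0.0, 0.0, 0.0))
```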
Q. 14) In discussion, line 256- Small ELYSAs are formed in GV oocytes. Since you haven't checked the smaller-sized, growing oocytes, I suggest rephrasing this sentence as 'are present' rather than 'are formed'.
We agree with the reviewer’s suggestion and changed it to "present" (line 287).
Q. 15) Line 188- ELISA should instead be ELYSA
Thank you for pointing this out. We have found a few more typographical errors, and all of them have been corrected (lines 213 and 321).
Reviewer #3 (Recommendations For The Authors):
Q. 16) Line 42: What do you mean by 'zygotic gene expression following the degradation of the cellular components of each maternal and paternal gamete'? ZGA requires this degradation? Please provide supporting references from the literature.
We apologize for the confusing wording. We meant to say that both ZGA and degradation of parental components are required. To avoid misunderstanding, we have revised the text to “zygotic gene expression as well as the degradation of the cellular components of each maternal and paternal gamete” and inserted a new reference (line 44).
Q. 17) 50: MII means metaphase II, not meiosis II.
We corrected the clerical mistake (line 50).
Q. 18) 51: Define LC3.
We added the definition of LC3 (lines 51–52).
Q. 19) 60: 'lysosomal activity in oocytes is upregulated by sperm-derived factors as the oocytes grow and mature'. As written, the sentence implies that oocytes grow and mature after fertilization. This may be true for maturation, but I would be surprised to learn that there is growth of the oocyte after fertilization.
We appreciate this valuable comment.
C. elegans lives mainly as a hermaphrodite, which contains two U-shaped gonad arms including the ovary, spermatheca, and uterus in the body. Oocytes grow in the ovary, mature upon receiving major sperm proteins secreted from sperm, and are ovulated into the spermatheca for fertilization. In 2017, Kenyon’s group reported that major sperm proteins act as sperm-secreted hormones to upregulate the lysosomal activity in oocytes during oocyte growth and maturation. To avoid misunderstanding, we have revised our manuscript to 'lysosomal activity in oocytes is upregulated by major sperm proteins secreted from sperms as the oocytes grow and mature' (lines 61–66).
Q. 20) 94 and Figure 1B: While it is clear that many LAMP1 foci at the late 2-cell stage do not also contain RAB5, it seems that the majority of RAB5 loci also stain for LAMP1. This may be a minor point in the context of the paper but could be clarified.
We could not easily agree with the suggestion because of the possibility that the images might give different impressions on each plane. Therefore, as a way to verify this point, we attempted to quantify the co-localization by reconstructing the 3D puncta information based on the two types of antibody staining data. Unfortunately, as shown in Fig. 1AB, Rab5 had a high cytoplasmic background, and although we were able to extract peaks, we could not reliably recalibrate the three-dimensional punctate structure (please refer to the new Supplementary Fig. 6). Therefore, co-localization on each other's punctate structure (LAMP1/RAB5 vs. RAB5/LAMP1) could not be verified. The validation using specific planes also showed large differences between planes, with overlapping punctate structures counted separately in adjacent planes, making reliable quantification difficult. This is an issue that will be addressed in the future.
On the other hand, the newly added Z-projection figure (Fig. 1AB) shows that RAB5-positive and LAMP1-negative punctate structures tend to accumulate along the LAMP1-positive punctate structures larger than 1 µm at the late 2-cell stage in all observed embryos; we added this statement on lines 99–101.
Q. 21) 100-102 and Figure 2A: Does the decrease in the total number of LAMP1 foci refer just to cytoplasmic or also to membrane foci? If the former, what was the reason for not including the membrane in the analysis?
We appreciate the critical question. The LAMP1 signal on the plasma membrane interfered with the measurement of the signals just below the plasma membrane. The increased signal on the plasma membrane, as shown in Fig. 2E, seemed to be caused by the migration of the LAMP1 signals post-fertilization, which was also reported in a previous paper by Zaffagnini et al. (2024), published in Cell.
In our analysis, oocytes are giant cells, and confocal imaging has a technical limitation in obtaining the same fluorescent intensity along the z-axis. However, 3D-object analysis requires thresholding based on absolute values. As a result of this situation, the presence of the plasma membrane signal caused punctate structures located close to the membrane to be captured and recognized as a single, very large LAMP1-positive structure, resulting in the loss of the punctate structure that should be measured.
To avoid this issue, we have used several programs to correct the fluorescence difference along the z-axis; nonetheless, these attempts were unsuccessful. Therefore, as described in the Materials and Methods section, we applied only background subtraction at each z-position and then manually removed the plasma membrane signal (which was thin and continuous at the edges). Furthermore, when the plasma membrane and punctate structure signals overlapped, we paid attention not to remove the signals but to separate them. Thus, we believe that the decrease in the number and volume of LAMP1-positive structures after fertilization is still a phenomenon associated with the shift of LAMP1 to the plasma membrane.
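The interaction between z-dependent intensity and an absolute threshold described above can be illustrated with a small sketch. It is a toy example under stated assumptions, not the procedure in the Materials and Methods: the per-slice median is used here as a hypothetical background estimate (the reply does not say how background was computed), and the function names and pixel values are invented for illustration.

```python
from statistics import median


def subtract_background_per_slice(stack):
    """Subtract a per-slice background estimate from every pixel, clipping at zero.

    stack: list of z-planes, each a list of rows of intensities.
    The median of each plane stands in for the background (an assumption).
    """
    corrected = []
    for plane in stack:
        background = median(v for row in plane for v in row)
        corrected.append([[max(v - background, 0) for v in row] for row in plane])
    return corrected


def threshold(stack, cutoff):
    """Apply one absolute cutoff to the whole stack, as 3D object analysis requires."""
    return [[[v >= cutoff for v in row] for row in plane] for plane in stack]


# Two planes with the same punctum but a different overall brightness level.
raw = [[[10, 10, 10, 60]], [[2, 2, 2, 52]]]
mask_raw = threshold(raw, 55)
mask_corrected = threshold(subtract_background_per_slice(raw), 25)
```

With the raw values, a single cutoff detects the punctum only in the brighter plane; after per-slice background subtraction the same punctum is detected in both planes. A continuous membrane signal would still need separate handling, as the reply explains.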
Q. 22) Figure 2B, F, G: As the x-axis does not represent a continuous variable, adjacent data points should not be connected by a line. The histogram representations in A, C, and E are much easier to understand. I suggest presenting all data in this format.
We revised the line graphs to bar graphs. Besides, to make the significance among populations clearer, the significances are now expressed using alphabetical indicators.
Q. 23) Figure 2B, C: It seems that the values for the different stages are expressed relative to the value at MII. Why not use the GV value at the base-line? This would follow the developmental trajectory of the oocyte/embryo more directly and would not (I believe) change the conclusions.
We appreciate the comment. We meant to express that ELYSA develops most in the MII phase and that it decreases after fertilization, so considering the reviewer’s suggestion, we expressed GV-MII changes based on GV and changes after fertilization based on the MII phase (Fig. 2C, D).
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Summary:
This paper details a study of endothelial cell vessel formation during zebrafish development. The results focus on the role of aquaporins, which mediate the flow of water across the cell membrane, leading to cell movement. The authors show that actin and water flow together drive endothelial cell migration and vessel formation. If any of these two elements are perturbed, there are observed defects in vessels. Overall, the paper significantly improves our understanding of cell migration during morphogenesis in organisms.
Strengths:
The data are extensive and are of high quality. There is a good amount of quantification with convincing statistical significance. The overall conclusion is justified given the evidence.
Weaknesses:
There are two weaknesses, which if addressed, would improve the paper.
(1) The paper focuses on aquaporins, which while mediating water flow, cannot drive directional water flow. If the osmotic engine model is correct, then ion channels such as NHE1 are the driving force for water flow. Indeed this water is shown in previous studies. Moreover, NHE1 can drive water intake because the export of H+ leads to increased HCO3 due to the reaction between CO2+H2O, which increases the cytoplasmic osmolarity (see Li, Zhou and Sun, Frontiers in Cell Dev. Bio. 2021). If NHE cannot be easily perturbed in zebrafish, it might be of interest to perturb Cl channels such as SWELL1, which was recently shown to work together with NHE (see Zhang, et al, Nat. Comm. 2022).
(2) In some places the discussion seems a little confusing where the text goes from hydrostatic pressure to osmotic gradient. It might improve the paper if some background is given. For example, mention water flow follows osmotic gradients, which will build up hydrostatic pressure. The osmotic gradients across the membrane are generated by active ion exchangers. This point is often confused in literature and somewhere in the intro, this could be made clearer.
Reviewer #1 (Recommendations For The Authors):
(1) The paper focuses on aquaporins, which while mediating water flow, cannot drive directional water flow. If the osmotic engine model is correct, then ion channels such as NHE1 are the driving force for water flow. Indeed this water is shown in previous studies. Moreover, NHE1 can drive water intake because the export of H+ leads to increased HCO3 due to the reaction between CO2+H2O, which increases the cytoplasmic osmolarity (see Li, Zhou and Sun, Frontiers in Cell Dev. Bio. 2021). If NHE cannot be easily perturbed in zebrafish, it might be of interest to perturb Cl channels such as SWELL1, which was recently shown to work together with NHE (see Zhang, et al, Nat. Comm. 2022).
We thank Reviewer #1 for this very important comment and the suggestion to examine the function of ion channels in establishing an osmotic gradient to drive directional flow. We have taken on board the reviewer’s suggestion and examined the expression of NHE1 and SWELL1 in endothelial cells using published scRNAseq of 24 hpf ECs (Gurung et al, 2022, Sci. Rep.). We found that slc9a1a, slc9a6a, slc9a7, slc9a8, lrrc8aa and lrrc8ab are expressed in different endothelial subtypes. To examine the function of NHE1 and SWELL1 in endothelial cell migration, we used the pharmacological compounds, 5-(N-ethyl-N-isopropyl)amiloride (EIPA) and DCPIB, respectively. While we were unable to observe an ISV phenotype after EIPA treatment at 5, 10 and 50 µM, we were able to observe impaired ISV formation after DCPIB treatment that was very similar to that observed in Aquaporin mutants. We were very encouraged by these results and proceeded to perform more detailed experiments whose results have yielded a new figure (Figure 6) and are described and discussed in lines 266 to 289 and 396 to 407, respectively, in the revised manuscript.
(2) In some places the discussion seems a little confusing where the text goes from hydrostatic pressure to osmotic gradient. It might improve the paper if some background is given. For example, mention water flow follows osmotic gradients, which will build up hydrostatic pressure. The osmotic gradients across the membrane are generated by active ion exchangers. This point is often confused in literature and somewhere in the intro, this could be made clearer.
Thank you for pointing out the deficiency in explaining how osmotic gradients drive water flow to build up hydrostatic pressure. We have clarified this in lines 50, 53 - 54 and 385.
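As general background on the point about osmotic gradients and hydrostatic pressure, the scale of the effect can be estimated with the van 't Hoff relation for dilute solutions, ΔΠ = R·T·Δc. The sketch below is a textbook approximation and not a calculation from the manuscript; the function name and the example values (a 1 mOsm/L difference at 300 K) are illustrative assumptions.

```python
GAS_CONSTANT = 8.314462618  # J / (mol K)


def osmotic_pressure_pa(delta_osmolarity_mol_per_m3, temperature_k):
    """Van 't Hoff estimate of the osmotic pressure difference across a membrane.

    delta_osmolarity_mol_per_m3: osmolarity difference; 1 mOsm/L equals 1 mol/m^3.
    temperature_k: absolute temperature in kelvin.
    Returns the pressure difference in pascals (ideal, dilute-solution approximation).
    """
    return GAS_CONSTANT * temperature_k * delta_osmolarity_mol_per_m3


# Illustrative values only: a 1 mOsm/L imbalance at 300 K.
pressure_pa = osmotic_pressure_pa(1.0, 300.0)
```

Under this approximation, even a small osmolarity imbalance maintained by ion exchangers corresponds to a pressure difference on the order of kilopascals, which is the sense in which water following an osmotic gradient can build up hydrostatic pressure.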
The two recommendations listed above would improve the paper. They are however not mandatory. The paper would be acceptable with some clarifying rewrites. I am not an expert on zebrafish genetics, so it might be difficult to perturb ion channels in this model organism. Have the authors tried to perturb ion channels in these cells?
We hope that our attempts at addressing Reviewer 1’s comments are satisfactory and sufficient to clarify the concerns outlined.
Reviewer #2 (Public Review):
Summary:
Directional migration is an integral aspect of sprouting angiogenesis and requires a cell to change its shape and sense a chemotactic or growth factor stimulus. Kondrychyn I. et al. provide data that indicate a requirement for zebrafish aquaporins 1 and 8, in cellular water inflow and sprouting angiogenesis. Zebrafish mutants lacking aqp1a.1 and aqp8a.1 have significantly lower tip cell volume and migration velocity, which delays vascular development. Inhibition of actin formation and filopodia dynamics further aggravates this phenotype. The link between water inflow, hydrostatic pressure, and actin dynamics driving endothelial cell sprouting and migration during angiogenesis is highly novel.
Strengths:
The zebrafish genetics, microscopy imaging, and measurements performed are of very high quality. The study data and interpretations are very well-presented in this manuscript.
Weaknesses:
Some of the mechanobiology findings and interpretations could be strengthened by more advanced measurements and experimental manipulations. Also, a better comparison and integration of the authors' findings, with other previously published findings in mice and zebrafish would strengthen the paper.
We thank Reviewer #2 for the critique that the paper can be strengthened by more advanced measurements and experimental manipulations. One of the technical challenges that we face is how to visualize and measure water flow directly in the zebrafish. We have therefore taken indirect approaches to assess water abundance in endothelial cells in vivo. One approach was to measure the diffusion of GEM nanoparticles in tip cell cytoplasm in wildtype and Aquaporin mutants, but results were inconclusive. The second was to measure the volume of tip cells, which should reflect water in/outflow. As the second approach produced clear and robust differences between wildtype ECs, ECs lacking Aqp1a.1 and Aqp8a.1 and ECs overexpressing Aqp1a.1 (revised Fig. 5), we decided to present these data in this manuscript.
We have also taken Reviewer 2’s advice to better incorporate previously published data in our discussion (see below and lines 374 to 383 of the revised manuscript).
Reviewer #2 (Recommendations For The Authors):
I have a few comments that the authors may address to further improve their manuscript analysis, quality, and impact.
Major comments:
(1) Citation and discussion of published literature
The authors have failed to cite and discuss recently published results on the role of aqp1a.1 and aqp8a.1 in ISV formation and caliber in zebrafish (Chen C et al. Cardiovascular Research 2024). That study showed a similar impairment of ISV formation when aqp1a.1 is absent but demonstrated a stronger phenotype on ISV morphology in the absence of aqp8a.1 than the current manuscript by Kondrychyn I et al. Furthermore, Chen C et al show an overall decrease in ISV diameter in single aquaporin mutants suggesting that the cell volume of all ECs in an ISV is affected equally. Given this published data, are ISV diameters affected in single and double mutants in the current study by Kondrychyn I et al? An overall effect on ISVs would suggest that aquaporin-mediated cell volume changes are not an inherent feature of endothelial tip cells. The authors need to analyse/compare and discuss all differences and similarities of their findings to what has been published recently.
We apologise for having failed to cite and discuss the recently published paper by Chen et al. This has been corrected and discussed in lines 374 to 383.
In the paper by Chen et al, the authors describe a role of Aqp1a.1 and Aqp8a.1 in regulating ISV diameter (ISV diameter was analysed at 48 hpf) but they did not examine the earlier stages of sprouting angiogenesis between 20 to 30 hpf, which is the focus of our study. We therefore cannot directly compare the ISV phenotypes with theirs. Nevertheless, we recognise that there are differences in ISV phenotypes from 2 dpf. For example, they did not observe incompletely formed or missing ISVs at 2 and 3 dpf, which we clearly observe in our study. This could be explained by differences in the mutations generated. In Chen et al., the sgRNA used targeted the end of exon 2 that resulted in the generation of a 169 amino acid truncated aqp1a.1 protein. However, in our approach, our sgRNA targeted exon 1 of the gene that resulted in a truncated aqp1a.1 protein that is 76 amino acid long. As for the aqp8a.1 zebrafish mutant that we generated, our sgRNA targeted exon 1 of the gene that resulted in a truncated protein that is 73 amino acids long. In Chen et al., the authors did not generate an aqp8a.1 mutant but instead used a crispant approach, which leads to genetic mosaicism and high experimental variability.
Following the reviewer’s suggestion, we have now measured the diameters of arterial ISVs (aISVs) and venous ISVs (vISVs) in aqp1a.1<sup>-/-</sup>, aqp8a.1<sup>-/-</sup> and aqp1a.1<sup>-/-</sup>;aqp8a.1<sup>-/-</sup> zebrafish. In our lab, we always make a distinction between aISVs and vISVs as their diameters are significantly different from each other. The results are in Fig S11A. While we corroborate a decrease in diameter in both aISVs and vISVs in single aqp1a.1<sup>-/-</sup> and double aqp1a.1<sup>-/-</sup>;aqp8a.1<sup>-/-</sup> zebrafish, we observed a slight increase in diameter in both aISVs and vISVs in aqp8a.1<sup>-/-</sup> zebrafish at 2 dpf. We also measured the diameter of aISV and vISV in Tg(fli1ep:aqp1a.1-mEmerald) and Tg(fli1ep:aqp8a.1-mEmerald) zebrafish at 2 dpf (Fig S11B) and unlike in Chen et al., we could not detect a difference in the diameter between control and aqp1a.1- or aqp8a.1-overexpressing endothelial cells.
We would also like to point out that, because ISVs are incompletely formed or are missing in aqp1a.1<sup>-/-</sup>;aqp8a.1<sup>-/-</sup> zebrafish (Fig. 3G – L), blood flow is most likely altered in the zebrafish trunk of these mutants, and this can have a secondary effect on blood vessel calibre or diameter. In fact, we often observed wider ISVs adjacent to unperfused ISVs (Fig. 3J) as more blood flow enters the lumenized ISV. Therefore, to determine the cell autonomous function of Aquaporin in mediating cell volume changes in vessel diameter regulation, one would need to perform cell transplantation experiments where we would measure the volume of single aqp1a.1<sup>-/-</sup>;aqp8a.1<sup>-/-</sup> endothelial cells in wildtype embryos with normal blood flow. As this is beyond the scope of the present study, we have not done this experiment during the revision process.
(2) Expression of aqp1a.1 and aqp8a.1
The quantification shown in Figure 1G shows a relative abundance of expression between tip and stalk cells. However, it seems aqp8a.1 is almost never detected in most tip cells. The authors could show in addition, the % of Tip and stalk cells with detectable expression of the 2 aquaporins. It seems aqp8a.1 is really weakly or not expressed in the initial stages. Of course the protein may have a different dynamic from the RNA.
We would like to clarify that aqp8a.1 mRNA is not detected in tip cells of newly formed ISVs at 20 hpf. At 22 hpf, it is expressed in both tip cells (22 out of 23 tip cells analysed) and stalk cells of ISVs. This is clarified in lines 107 - 109. We also include below a graph showing that although aqp8a.1 mRNA is expressed in tip cells, its expression is higher in stalk cells.
Author response image 1.
Could the authors show endogenously expressed or tagged protein by antibody staining? The analysis of the Tg(fli1ep:aqp8a.1-mEmerald)rk31 zebrafish line is a good complement, but unfortunately, it does not reveal the localization of the endogenously expressed protein. Do the authors have any data supporting that the endogenously expressed aqp8a.1 protein is present in sprouting tip cells?
We tested several antibodies against AQP1 (Alpha Diagnostic International, AQP11-A; ThermoFisher Scientific, MA1-20214; Alomone Labs, AQP-001) and AQP8 (Sigma Aldrich, SAB 1403559; Alpha Diagnostic International, AQP81-A; Alomone Labs, AQP-008) but unfortunately none worked. As such, we do not have data demonstrating endogenous expression and localisation of Aqp1a.1 and Aqp8a.1 proteins in endothelial cells.
Could the authors perform F0 CRISPR/Cas9 mediated knockin of a small tag (i.e. HA epitope) in zebrafish and read the endogenous protein localization with anti-HA Ab?
CRISPR/Cas9 mediated in-frame knock-in of a tag into a genomic locus is a technical challenge that our lab has not established. We therefore cannot do this experiment within the revision period.
Given the double mutant phenotypic data shown, is aqp8a.1 expression upregulated and perhaps more important in aqp1a.1 mutants?
In our analysis of aqp1a.1 homozygous zebrafish, there is a slight downregulation in aqp8a.1 expression (Fig. S5C). Because the loss of Aqp1a.1 leads to a stronger impairment in ISV formation than the loss of Aqp8a.1 (see Fig. S6F, G, I and J), we believe that Aqp1a.1 has a stronger function than Aqp8a.1 in EC migration during sprouting angiogenesis.
Regarding the regulation of expression by the Vegfr inhibitor Ki8751, does this inhibitor affect Vegfr/ERK signalling in zebrafish and the sprouting of ISVs significantly?
ki8751 has been demonstrated to inhibit ERK signalling in tip cells in the zebrafish by Costa et al., 2016 in Nature Cell Biology. In our experiments, treatment with 5 µM ki8751 for 6 hours from 20 hpf also inhibited sprouting of ISVs.
The data presented suggest that tip cells overexpressing aqp1a.1-mEmerald (Figure 2C) need more than 6 times longer to migrate the same distance as tip cells expressing aqp8a.1-mEmerald (Figure 2D). How does this compare with cells expressing only Emerald? A similar time difference can be seen in Movie S1 and Movie S2. Is it just a coincidence? Could aqp8a.1, when expressed at similar levels to aqp1a, be more functional and induce faster cell migration? These experiments were interpreted only for the localization of the proteins, but not for the potential role of the overexpressed proteins on function. Chen C et al. Cardiovascular Research 2024 also has some Aqp overexpression data.
The still images prepared for Fig. 2 C and D were selected to illustrate the localization of Aqp1a.1-mEmerald and Aqp8a.1-mEmerald at the leading edge of migrating tip cells. We did not notice that the tip cell overexpressing Aqp1a.1-mEmerald (Figure 2C) needed more than 6 times longer to migrate the same distance as the tip cell expressing Aqp8a.1-mEmerald (Figure 2D), which the reviewer astutely detected. To ascertain whether there is a difference in migration speed between Aqp1a.1-mEmerald and Aqp8a.1-mEmerald overexpressing endothelial cells, we measured tip cell migration velocity of three ISVs from Tg(fli1ep:aqp1a.1-mEmerald) and Tg(fli1ep:aqp8a.1-mEmerald) zebrafish during the period of ISV formation (24 to 29 hpf) using the Manual Tracking plugin in Fiji. As shown in the graph, there is no significant difference in the migration speed of ECs overexpressing Aqp1a.1-mEmerald and Aqp8a.1-mEmerald, suggesting that Aqp8a.1-overexpressing cells migrate at a similar rate as Aqp1a.1-overexpressing cells. As we have not generated a Tg(fli1ep:mEmerald) zebrafish line, we are unable to determine whether endothelial cells migrate faster in Tg(fli1ep:aqp1a.1-mEmerald) and Tg(fli1ep:aqp8a.1-mEmerald) zebrafish compared to endothelial cells expressing only mEmerald. As for the observation that tip cells overexpressing aqp1a.1-mEmerald (Figure 2C) need more than 6 times longer to migrate the same distance as tip cells expressing aqp8a.1-mEmerald, we can only surmise that it is coincidental that the images selected “showed” faster migration of one ISV from Tg(fli1ep:aqp8a.1-mEmerald) zebrafish. We do not know whether Aqp1a.1 and Aqp8a.1 are overexpressed to the same levels in Tg(fli1ep:aqp1a.1-mEmerald) and Tg(fli1ep:aqp8a.1-mEmerald) zebrafish.
We would also like to point out that when we analysed the lengths of ISVs at 28 hpf in aqp1a.1<sup>-/-</sup> and aqp8a.1<sup>-/-</sup> zebrafish, ISVs were shorter in aqp1a.1<sup>-/-</sup> zebrafish compared to aqp8a.1<sup>-/-</sup> zebrafish (Fig. S6 F to J). These results indicate that the loss of Aqp1a.1 function causes slower migration than the loss of Aqp8a.1 function, and suggest that Aqp1a.1 induces faster endothelial cell migration than Aqp8a.1.
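To make the velocity measurement concrete, the following is a minimal sketch of how a mean migration speed can be derived from manually tracked positions. It assumes that each track is exported as a list of (x, y) coordinates in µm sampled at a fixed frame interval; the function name `mean_speed`, the parameter `minutes_per_frame` and the coordinates are hypothetical and are not taken from the study or from the Fiji plugin.

```python
import math


def mean_speed(track, minutes_per_frame):
    """Return the mean speed (µm/min) along a track.

    track: list of (x, y) positions in µm, one per frame (assumed format).
    minutes_per_frame: time between consecutive frames (assumed constant).
    """
    if len(track) < 2:
        raise ValueError("a track needs at least two positions")
    # Total path length is the sum of frame-to-frame displacements.
    path_length = sum(math.dist(p, q) for p, q in zip(track, track[1:]))
    elapsed = (len(track) - 1) * minutes_per_frame
    return path_length / elapsed


# Hypothetical coordinates for illustration only (not data from the study).
example_track = [(0.0, 0.0), (0.0, 3.0), (4.0, 6.0)]
example_speed = mean_speed(example_track, minutes_per_frame=5.0)
```

Speeds computed this way for each tracked tip cell could then be compared between the two transgenic lines with an appropriate statistical test.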
Author response image 2.
The data on Aqps expression after the Notch inhibitor DBZ seems unnecessary, and is at the moment not properly discussed. It is also against what is set in the field. aqp8a.1 levels seem to increase only 24h after DBZ, not at 6h, and still authors conclude that Notch activation inhibits aqp8a.1 expression (Line 138-139). In the field, Notch is considered to be more active in stalk cells, where aqp8a.1 expression seems higher (not lower). Maybe the analysis of tip vs stalk cell markers in the scRNAseq data, and their correlation with Hes1/Hey1/Hey2 and aqp1 vs aqp8 mRNA levels will be more clear than just showing qRT-PCR data after DBZ.
As our scRNAseq data did not include ECs from earlier during development when ISVs are developing, we have analysed scRNAseq data of 24 hpf endothelial cells published by Gurung et al., 2022 in Scientific Reports during the revision of this manuscript. However, we are unable to detect separate clusters of tip and stalk cells. As such, we are unable to correlate hes1/hey1/hey2 expression (which would be higher in stalk cells) with that of aqp1a.1/aqp8a.1. Also, we have decided to remove the DBZ-treatment results from our manuscript as we agree with the two reviewers that they are unnecessary.
The paper would also benefit from some more analysis and interpretation of available scRNAseq data in development/injury/disease/angiogenesis models (zebrafish, mice or humans) for the aquaporin genes characterized here. To potentially raise a broader interest at the start of the paper.
We thank the reviewer for suggesting examining aquaporin genes in other angiogenesis/disease/regeneration models to expand the scope of aquaporin function. We will do this in future studies.
(3) Role of aqp1a.1 and aqp8a.1 on cytoplasmic volume changes and related phenotypes
In Figure 5 the authors show that Aqp1/Aqp8 mutant endothelial tip cells have a lower cytoplasmic volume than tip cells from wildtype fish. If aquaporin-mediated water inflow occurs locally at the leading edge of endothelial tip cells (Figure 2, line 314-318), why doesn't cytoplasmic volume expand specifically only at that location (as shown in immune cells by Boer et al. 2023)? Can the observed reduction in cytoplasmic volume simply be a side-effect of impaired filopodia formation (Figure 4F-I)?
We believe that water influx not only expands filopodia but also the leading front of tip cells (see bracket region in Fig. 4D), where Aqp1a.1-mEmerald/Aqp8a.1-mEmerald accumulate (Fig. 2), to generate an elongated protrusion and forward expansion of the tip cell. The decrease in cytoplasmic volume observed in the aqp1a.1;aqp8a.1 double mutant zebrafish is a result of decreased formation of these elongated protrusions at the leading front of migrating tip cells as shown in Fig. 4E (compare to Fig. 4D), not from just a decrease in filopodia number. In fact, in the method used to quantify cell volume, mEmerald/EGFP localization is limited to the cytoplasm and does not label filopodia well (compare mEmerald/EGFP in green with membrane tagged-mCherry in Fig. 5A - C). The volume measured therefore reflects cytoplasmic volume of the tip cell, not filopodia volume.
Do the authors have data on cytoplasmic volume changes of endothelial tip cells in latrunculin B treated fish? The images in Figures 6 A,B suggest that there is a difference in cell volume upon lat b treatment only.
No, unfortunately we have not performed single cell labelling and measurement of tip cells in Latrunculin B-treated embryos. We can speculate that as there is a decrease in actin-driven membrane protrusions in this experiment, one would also expect a decrease in cell volume as the reviewer has observed.
(4) Combined loss of aquaporins and actin-based force generation.
Lines 331-332 " we show that hydrostatic pressure is the driving force for EC migration in the absence of actin-based force generation"....better leave it more open and stick to the data. The authors show that aquaporin-mediated water inflow partially compensates for the loss of actin-based force generation in cell migration. Not that it is the key driving/rescuing force in the absence of actin-based force.
We have changed it to “we show that hydrostatic pressure can generate force for EC migration in the absence of actin-based force generation” in line 348.
(5) Aquaporins and their role in EC proliferation
In the study by Phng LK et al. 2013, the authors have shown that proliferation is not affected when actin polymerization or filopodia formation is inhibited. However, in the current manuscript by Kondrychyn I. et al. this has not been analysed carefully. In Movie S4 the authors indicate by arrows tip cells that fail to invade the zebrafish trunk demonstrating a severe defect of sprouting initiation in these mutants. Yet, when only looking at ISVs that reach the dorsal side in Movie S4, it appears that they are comprised of fewer EC nuclei/ISV than the ISVs in Movie S3. At the beginning of DLAV formation, most ISVs in control Movie S3 consist of 3-4 EC nuclei, while in double mutants Movie S4 it appears to be only 2-3 EC nuclei. At the end of the Movie S4, one ISV on the left side even appears to consist of only a single EC when touching the dorsal roof. The authors provide convincing data on how the absence of aquaporin channels affects sprouting initiation and migration speed, resulting in severe delay in ISV formation. However, the authors should also analyse EC proliferation, as it may also be affected in these mutants, and may also contribute to the observed phenotype. We know that effects on cell migration may indirectly change the number of cells and proliferation at the ISVs, but this has not been carefully analysed in this paper.
We thank the reviewer for highlighting the lack of information on EC number and division in the aquaporin mutants. We have now quantified EC number in ISVs that are fully formed (i.e. connecting the DA or PCV to the DLAV) at 2 and 3 dpf and the results are displayed in Figure S10A and B. At 2 dpf, there is a slight but significant reduction in EC number in both aISVs and vISVs in aqp1a.1<sup>-/-</sup> zebrafish and an even greater reduction in the double aqp1a.1<sup>-/-</sup>;aqp8a.1<sup>-/-</sup> zebrafish. No significant change in EC number was observed in aqp8a.1<sup>-/-</sup> zebrafish. EC number was also significantly decreased at 3 dpf for aqp1a.1<sup>-/-</sup>, aqp8a.1<sup>-/-</sup> and aqp1a.1<sup>-/-</sup>;aqp8a.1<sup>-/-</sup> zebrafish. The decrease in EC number per ISV may therefore contribute to the observed phenotype.
We have also quantified the number of cell divisions during sprouting angiogenesis (from 21 to 30 hpf) to assess whether the lack of Aquaporin function affects EC proliferation. This analysis shows that there is no significant difference in the number of mitotic events between aqp1a.1<sup>+/-</sup>; aqp8a.1<sup>+/-</sup> and aqp1a.1<sup>-/-</sup>;aqp8a.1<sup>-/-</sup> zebrafish (Figure S10 C), suggesting that the reduction in EC number is not caused by a decrease in EC proliferation.
These new data are reported on lines 198 to 205 of the manuscript.
Minor comments:
- Figure 3K data seems not to be necessary and even partially misleading after seeing Figure 3E. Fig. 3E represents the true strength of the phenotype in the different mutants.
Figure 3K has been removed from Figure 3.
- Typo Figure 3L (VII should be VI).
Thank you for spotting this typo. VII has been changed to VI.
- Line 242: The word "required" is too strong because there is vessel formation without Aqps in endothelial cells.
This has been changed to “ …Aqp1a.1 and Aqp8a.1 regulate sprouting angiogenesis…” (lines 238 - 239).
- From Figure S2, the doublets cluster should be removed.
We have performed a new analysis of 24 hpf, 34 hpf and 3 dpf endothelial cell scRNAseq data (the previous analysis did not include 24 hpf endothelial cells). The doublets cluster is not included in the UMAP analysis.
- Better indicate the fluorescence markers/alleles/transgenes used for imaging in Figures 6A-D.
The transgenic lines used for this experiment are now indicated in the figure (this figure is now Figure 7).
Reviewer #3 (Public Review):
Summary:
Kondrychyn and colleagues describe the contribution of two Aquaporins Aqp1a.1 and Aqp8a.1 towards angiogenic sprouting in the zebrafish embryo. By whole-mount in situ hybridization, RNAscope, and scRNA-seq, they show that both genes are expressed in endothelial cells in partly overlapping spatiotemporal patterns. Pharmacological inhibition experiments indicate a requirement for VEGFR2 signaling (but not Notch) in transcriptional activation.
To assess the role of both genes during vascular development the authors generate genetic mutations. While homozygous single mutants appear normal, aqp1a.1;aqp8a.1 double mutants exhibit defects in EC sprouting and ISV formation.
At the cellular level, the aquaporin mutants display a reduction of filopodia in number and length. Furthermore, a reduction in cell volume is observed indicating a defect in water uptake.
The authors conclude that polarized water uptake mediated by aquaporins is required for the initiation of endothelial sprouting and (tip) cell migration during ISV formation. They further propose that water influx increases hydrostatic pressure within the cells which may facilitate actin polymerization and formation of membrane protrusions.
Strengths:
The authors provide a detailed analysis of Aqp1a.1 and Aqp8a.1 during blood vessel formation in vivo, using zebrafish intersomitic vessels as a model. State-of-the-art imaging demonstrates an essential role of aquaporins in different aspects of endothelial cell activation and migration during angiogenesis.
Weaknesses:
With respect to the connection between Aqp1/8 and actin polymerization/filopodia formation, the evidence appears preliminary and the authors' interpretation is guided by evidence from other experimental systems.
Reviewer #3 (Recommendations For The Authors):
Figure 1 H, J:
The differential response of aqp1/-8 to ki8751 vs DBZ after 6h treatment is quite obvious. Why do the authors show the effect after 24h? The effect is more likely than not indirect.
We agree with the reviewer and we have now removed 24 hour Ki8751 treatment and all DBZ treatments from Figure 1.
Figure 2:
According to the authors' model anterior localization of Aqp1 protein is critical. The authors perform transient injections to mosaically express Aqp fusion proteins using an endothelial (fli1) promoter. For the interpretation, it would be helpful to also show the mCherry-CAAX channel in separate panels. From the images, it is not possible to discern how many cells we are looking at. In particular the movie in panel D may show two cells at the tip of the sprout. A marker labelling cell-cell junctions would help. Furthermore, the authors are using a strong exogenous promoter, thus potentially overexpressing the fusion protein, which may lead to mislocalization. For Aqp1a.1 an antibody has been published to work in zebrafish (e.g. Kwong et al., Plos1, 2013).
We would like to clarify that we generated transgenic lines - Tg(fli1ep:aqp1a.1-mEmerald) and Tg(fli1ep:aqp8a.1-mEmerald) - to visualize the localization of Aqp1a.1 and Aqp8a.1 in endothelial cells, and the images displayed in Fig. 2 are from the transgenic lines (not transient, mosaic expression).
To aid visualization and interpretation, we have now added an mCherry-CAAX only channel to accompany the Aqp1a.1/Aqp8a.1-mEmerald channel in Fig. 2A and B. To discern how many cells there are in the ISVs at this stage, we have crossed Tg(fli1ep:aqp1a.1-mEmerald) and Tg(fli1ep:aqp8a.1-mEmerald) zebrafish to TgKI(tjp1a-tdTomato)<sup>pd1224</sup> (Levic et al., 2021) to visualize ZO1 at cell-cell junctions. However, because tjp1a-tdTomato is expressed in all cell types including the skin that lies just above the ISV and the signal in ECs in ISVs is very weak at 22 to 25 hpf, it was very difficult to obtain good quality images that can properly delineate cell boundaries to determine the number of cells in the ISVs at this early stage. Instead, we have annotated endothelial cell boundaries based on more intense mCherry-CAAX fluorescence at cell-cell borders, and from the mosaic expression of mCherry-CAAX that is intrinsic to the Tg(kdrl:ras-mCherry)<sup>s916</sup> zebrafish line.
In Fig. 2D, there are two endothelial cells in the ISV during the period shown but there is only 1 cell occupying the tip cell position i.e. there is one tip cell in this ISV. Unlike the mouse retina where it has been demonstrated that two endothelial cells can occupy the tip cell position side-by-side (Pelton et al., 2014), this is usually not observed in zebrafish ISVs. This is demonstrated in Movie S3, where it is clear that one nucleus (belonging to the tip cell) occupies the tip of the growing ISV. The accumulation of intracellular membranes is often observed in tip cells that may serve as a reservoir of membranes for the generation of membrane protrusions at the leading edge of tip cells.
We agree that by generating transgenic Tg(fli1ep:aqp1a.1-mEmerald) and Tg(fli1ep:aqp8a.1-mEmerald) zebrafish, Aqp1a.1 and Aqp8a.1 are overexpressed, which may affect their localization. The eel anti-Aqp1a.1 antibody used in (Kwong et al., 2013) was a gift from Dr. Gordon Cramb, Univ. of St Andrews, Scotland and it was first published in 2001. This antibody is not available commercially. Instead, we have tried several other antibodies against AQP1 (Alpha Diagnostic International, AQP11-A; ThermoFisher Scientific, MA1-20214; Alomone Labs, AQP-001) and AQP8 (Sigma Aldrich, SAB 1403559; Alpha Diagnostic International, AQP81-A; Alomone Labs, AQP-008) but unfortunately none worked. As such, we cannot compare localization of Aqp1a.1-mEmerald and Aqp8a.1-mEmerald with the endogenous proteins.
Figure 3:
E: the quantification is difficult to read. Wouldn't it be better to set the y-axis in % of the DV axis? (see also Figure S6).
We would like to show the absolute length of the ISVs, and to illustrate that the ISV length decreases from anterior to posterior of the zebrafish trunk. We have increased the size of Fig. 3E to enable easier reading of the bars.
K: This quantification appears arbitrary.
We have removed this panel from Figure 3.
G-J: The magenta channel is difficult to see. Is the lifeact-mCherry mosaic? In panel J there appears to be a nucleus between the sprout and the DLAV. It would be helpful to crop the contralateral side of the image.
No, the Tg(fli1:Lifeact-mCherry) line is not mosaic. The “missing” vessels are not because of mosaicism of the transgene but because of truncated ISVs, which are a phenotype of the loss of Aquaporin function. We have changed the magenta channel to grey and hope that by doing so, the reviewer will be able to see the shape of the blood vessels more clearly. We would like to leave the contralateral side in the images, as it shows that the defective vessel is only on one side of the body. Furthermore, when we tried to remove it (reducing the number of Z-stacks), the neighbouring ISV looked incomplete because the embryos were not mounted flat. To clarify what the nucleus between the sprout and the DLAV is, we have indicated that it is that of the contralateral ISV.
L: I do not quite understand the significance of the different classes of phenotypes. Do the authors propose different morphogenetic events or contexts of how these differences come about?
Here, we report the different types of ISV phenotypes that we observe in 3 dpf aqp1a.1<sup>-/-</sup>; aqp8a.1<sup>-/-</sup> zebrafish (Fig. 3 and Fig. S7). As demonstrated in Fig. 4, most of the phenotypes can be explained by the delayed emergence of tip cells from the dorsal aorta and slower tip cell migration. However, in some instances, we also observed retraction of tip cells (Movie S4) and failure of tip cells to emerge from the dorsal aorta or endothelial cell death (see attached figure on page 14), which can give rise to the Class II phenotype. In the dominant class I phenotype (in contrast to class II), secondary sprouting from the posterior cardinal vein is unaffected, and the secondary sprout migrates dorsally passing the level of horizontal myoseptum but cannot complete the formation of vISV (it stops beneath the spinal cord). The Class III phenotype appears to result from a failure of the secondary sprout to fuse with the regressed primary ISV. In the Class IV phenotype, the ventral EC does not maintain a connection to the dorsal aorta. We did not examine how Class III and IV phenotypes arise in detail in this current study.
Author response image 3.
Figure 4:
This figure nicely demonstrates the defects in cell behavior in aqp mutants.
In panel F it would be helpful to show the single channels as well as the merge.
We have now added single channels for PLCd1PH and Lifeact signal in panels F and G.
In Figure 1 the authors argue that the reduction of Aqp1/8 by VEGFR2 inhibition may account for part of that phenotype. In turn, the aqp phenotype seems to resemble incomplete VEGFR2 inhibition. The authors should check whether expression Aqp1Emerald can partially rescue ki8751 inhibition.
To address the reviewer’s comment, we have treated Tg(fli1ep:Aqp1-Emerald) embryos with ki8751 from 20 hpf for 6 hours but we were unable to observe a rescue in sprouting. It could be because VEGFR2 inhibition also affects other downstream signalling pathways that also control cell migration as well as proliferation.
Based on previous studies (Loitto et al.; Papadopoulos et al.) the authors propose that also in ISVs aquaporin-mediated water influx may promote actin polymerization and thereby filopodia formation. However, while the effect on filopodia number and length is well demonstrated, the underlying cause is less clear. For example, filopodia formation could be affected by reduced cell polarization. This can be tested by using a transgenic golgi marker (Kwon et al., 2016).
We have examined tip cell polarity of wildtype, aqp1a.1<sup>-/-</sup> and aqp8a.1<sup>-/-</sup> embryos at 24-26 hpf by analysing Golgi position relative to the nucleus. We were unable to analyze polarity in aqp1a.1<sup>rk28/rk28</sup>; aqp8a.1<sup>rk29/rk29</sup> embryos as they exist in an mCherry-containing transgenic zebrafish line (the Golgi marker is also tagged to mCherry). The results show that tip cell polarity is similar, if not more polarised, in aqp1a.1<sup>-/-</sup> and aqp8a.1<sup>-/-</sup> embryos when compared to wildtype embryos (Fig. S10D). This new data is discussed in lines 234 to 237.
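As an illustration of how Golgi position relative to the nucleus can be turned into a polarity readout, the sketch below computes the angle between the nucleus-to-Golgi vector and the direction of sprout migration. This is one common way to score polarity and is an assumption here, not necessarily the scoring used for Fig. S10D; the function name `polarity_angle`, the 2D coordinates and the choice of a migration-direction vector are all hypothetical.

```python
import math


def polarity_angle(nucleus, golgi, direction):
    """Angle (degrees) between the nucleus->Golgi vector and a direction vector.

    nucleus, golgi: (x, y) centroid positions (assumed to be in the same units).
    direction: (dx, dy) vector of sprout migration, e.g. pointing dorsally.
    0 degrees means the Golgi lies directly ahead of the nucleus;
    180 degrees means it lies directly behind.
    """
    vx, vy = golgi[0] - nucleus[0], golgi[1] - nucleus[1]
    dx, dy = direction
    norm_v, norm_d = math.hypot(vx, vy), math.hypot(dx, dy)
    if norm_v == 0 or norm_d == 0:
        raise ValueError("vectors must have non-zero length")
    cos_angle = (vx * dx + vy * dy) / (norm_v * norm_d)
    # Clamp to guard against floating-point values slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))


# Hypothetical positions for illustration only (not data from the study).
front_angle = polarity_angle((0.0, 0.0), (0.0, 2.0), (0.0, 1.0))
side_angle = polarity_angle((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
```

Under this convention, a distribution of angles clustered near 0 degrees would indicate tip cells polarised towards the direction of migration.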
Figure 5:
Panel D should be part of Figure 4.
Panel 5D is now in panel J of Figure 4 and described in lines 231 and 235.
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
This paper is an elegant, mostly observational work, detailing observations that polysome accumulation appears to drive nucleoid splitting and segregation. Overall I think this is an insightful work with solid observations.
Thank you for your appreciation and positive comments. In our view, an appealing aspect of this proposed biophysical mechanism for nucleoid segregation is its self-organizing nature and its ability to intrinsically couple nucleoid segregation to biomass growth, regardless of nutrient conditions.
Strengths:
The strengths of this paper are the careful and rigorous observational work that leads to their hypothesis. They find the accumulation of polysomes correlates with nucleoid splitting, and that the nucleoid segregation occurring right after splitting correlates with polysome segregation. These correlations are also backed up by other observations:
(1) Faster polysome accumulation and DNA segregation at faster growth rates.
(2) Polysome distribution negatively correlating with DNA positioning near asymmetric nucleoids.
(3) Polysomes form in regions inaccessible to similarly sized particles.
These above points are observational, I have no comments on these observations leading to their hypothesis.
Thank you!
Weaknesses:
It is hard to state weaknesses in any of the observational findings, and furthermore, their two tests of causality, while not being completely definitive, are likely the best one could do to examine this interesting phenomenon.
It is indeed difficult to prove causality in a definitive manner when the proposed coupling mechanism between nucleoid segregation and gene expression is self-organizing, i.e., does not involve a dedicated regulatory molecule (e.g., a protein, RNA, metabolite) that we could have depleted through genetic engineering to establish causality. We are grateful to the reviewer for recognizing that our two causality tests are the best that can be done in this context.
Points to consider / address:
Notably, demonstrating causality here is very difficult (given the coupling between transcription, growth, and many other processes) but an important part of the paper. They do two experiments toward demonstrating causality that help bolster - but not prove - their hypothesis. These experiments have minor caveats, my first two points.
(1) First, "Blocking transcription (with rifampicin) should instantly reduce the rate of polysome production to zero, causing an immediate arrest of nucleoid segregation". Here they show that adding rifampicin does indeed lead to polysome loss and an immediate halting of segregation - data that does fit their model. This is not definitive proof of causation, as rifampicin also (a) stops cell growth, and (b) stops the translation of secreted proteins. Neither of these two possibilities is ruled out fully.
That’s correct; cell growth also stops when gene expression is inhibited, which is consistent with our model in which gene expression within the nucleoid promotes nucleoid segregation and biomass growth (i.e., cell growth), inherently coupling these two processes. This said, we understand the reviewer’s point: the rifampicin experiment doesn’t exclude the possibility that protein secretion and cell growth drive nucleoid segregation. We are assuming that the reviewer is envisioning an alternative model in which sister nucleoids would move apart because they would be attached to the membrane through coupled transcription-translation-protein secretion (transertion) and the membrane would expand between the separating nucleoids, similar to the model proposed by Jacob et al in 1963 (doi:10.1101/SQB.1963.028.01.048). There are several observations arguing against this cell elongation/transertion model.
(1) For this alternative mechanism to work, membrane growth must be localized at the middle of the splitting nucleoids (i.e., midcell position for slow growth and ¼ and ¾ cell positions for fast growth) to create a directional motion. To our knowledge, there is no evidence of such localized membrane incorporation. Furthermore, even if membrane growth was localized at the right places, the fluidity of the cytoplasmic membrane (PMID: 6996724, 20159151, 24735432, 27705775) would be problematic. To circumvent the membrane fluidity issue, one could potentially evoke an additional connection to the rigid peptidoglycan, but then again, peptidoglycan growth would have to be localized at the middle of the splitting nucleoid. However, peptidoglycan growth is dispersed early in the cell division cycle when the nucleoid splitting happens in fast growing cells and only appears to be zonal after the onset of cell constriction (PMID: 35705811, 36097171, 2656655).
(2) Even if we ignore the aforementioned caveats, Paul Wiggins’s group ruled out the cell elongation/transertion model by showing that the rate of cell elongation is slower than the rate of chromosome segregation (PMID: 23775792). In the revised manuscript, we will clarify this point and provide confirmatory data showing that the cell elongation rate is indeed slower than the nucleoid segregation rate, indicating that it cannot be the main driver.
(3) Furthermore, our correlation analysis comparing the rate of nucleoid segregation to the rate of either cell elongation or polysome accumulation argues that polysome accumulation plays a larger role than cell elongation in nucleoid segregation. These data were already shown in Figure 1H and Figure 1 – figure supplement 3 of the original manuscript but were not highlighted in this context. We will revise the text to clarify this point.
(4) The asymmetries in nucleoid compaction that we described in our paper are predicted by our model. We do not see how they could be explained by cell growth or protein secretion.
(5) We also show that polysome accumulation at ectopic sites (outside the nucleoid) results in correlated nucleoid dynamics, consistent with our proposed mechanism. These nucleoid dynamics cannot be explained by cell growth or protein secretion (transertion).
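The rate comparison in point (2) can be sketched as follows. The example estimates each rate as the least-squares slope of a measurement over time and then compares the two slopes. The linear-fit approach, the function name `fitted_rate` and all numerical values are assumptions made for illustration; they are not the analysis or the data of the manuscript.

```python
def fitted_rate(times, values):
    """Least-squares slope of values against times (units of value per unit time)."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need two or more paired measurements")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    s_tt = sum((t - mean_t) ** 2 for t in times)
    if s_tt == 0:
        raise ValueError("time points must not all be identical")
    s_tv = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return s_tv / s_tt


# Hypothetical measurements for illustration only (not data from the study).
minutes = [0.0, 1.0, 2.0, 3.0]
cell_length_um = [2.0, 2.1, 2.2, 2.3]          # cell elongation
nucleoid_distance_um = [0.0, 0.3, 0.6, 0.9]    # sister nucleoid separation

elongation_rate = fitted_rate(minutes, cell_length_um)
segregation_rate = fitted_rate(minutes, nucleoid_distance_um)
# If segregation is faster than elongation, elongation alone cannot account for it.
elongation_is_slower = elongation_rate < segregation_rate
```

In practice such a comparison would be made per cell across a population, which is what allows the correlation analysis mentioned in point (3).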
(1a) As rifampicin also stops all translation, it also stops translational insertion of membrane proteins, which in many old models has been put forward as a possible driver of nucleoid segregation, and perhaps independent of growth. This should at least be mentioned in the discussion, or if there are past experiments that rule this out it would be great to note them.
It is not clear to us how the attachment of the DNA to the cytoplasmic membrane could alone create a directional force to move the sister nucleoids. We agree that old models have proposed a role for cell elongation (providing the force) and transertion (providing the membrane tether). Please see our response above for the evidence (from the literature and our work) against it. This was mentioned in the introduction and Results section, but we agree that this was not well explained. We will add experimental data and revise the text to clarify these points.
(1b) They address at great length in the discussion the possibility that growth may play a role in nucleoid segregation. However, this is testable - by stopping surface growth with antibiotics. Cells should still accumulate polysomes for some time, it would be easy to see if nucleoids are still segregated, and to what extent, thereby possibly decoupling growth and polysome production. If successful, this or similar experiments would further validate their model.
We reviewed the literature and could not find a drug that stops cell growth without stopping gene expression. Any drug that affects the membrane integrity or potential stops gene expression, which requires ATP. However, our experiment in which we drive polysome accumulation at ectopic sites decouples polysome accumulation from cell growth. In this experiment, by redirecting most chromosomal gene expression to a single plasmid-encoded gene, we reduce the rate of cell growth but still create a large accumulation of polysomes at an ectopic location. This ectopic polysome accumulation is sufficient to affect nucleoid dynamics in a correlated fashion. In the revised manuscript, we will clarify this point and add model simulations to show that our experimental observations are predicted by our model.
(2) In the second experiment, they express excess TagBFP2 to delocalize polysomes from midcell. Here they again see the anticorrelation of the nucleoid and the polysomes, and in some cells, it appears similar to normal (polysomes separating the nucleoid) whereas in others the nucleoid has not separated. The one concern about this data - and the differences between the "separated" and "non-separated" nuclei - is that the over-expression of TagBFP2 has a huge impact on growth, which may also have an indirect effect on DNA replication and termination in some of these cells. Could the authors demonstrate these cells contain 2 fully replicated DNA molecules that are able to segregate?
We will perform the requested experiment.
(3) What is not clearly stated and is needed in this paper is to explain how polysomes do (or could) "exert force" in this system to segregate the nucleoid: what a "compaction force" is by definition, and what mechanism causes this to arise (what causes the "force"), as the "compaction force" arises from new polysomes being added into the gaps between them caused by thermal motions.
They state, "polysomes exert an effective force", and they note their model requires "steric effects (repulsion) between DNA and polysomes" for the polysomes to segregate, which makes sense. But this makes it unclear to the reader what is giving the force. As written, it is unclear if (a) these repulsions alone are making the force, or (b) it is the accumulation of new polysomes in the center, adding more "repulsive" material, that causes the nucleoids to move. If polysomes are concentrated more between nucleoids, and the polysome concentration does not increase, the DNA will not be driven apart (as in the first case). However, in the second case (which seems to be their model), the addition of new material (new polysomes) into a sterically crowded space is not exerting force - it is filling in the gaps between the molecules in that region, space that needs to arise somehow (like via Brownian motion). In other words, if the polysome region is crowded with polysomes, space must be made between these polysomes for new polysomes to be inserted, and this space must be made by thermal (or ATP-driven) fluctuations of the molecules. Thus, if polysome accumulation drives the DNA segregation, it is not "exerting force", but rather the addition of new polysomes is iteratively rectifying gaps being made by Brownian motion.
We apologize for the understandable confusion. In our picture, the polysomes and DNA (conceptually considered as small plectonemic segments) basically behave as dissolved particles. If these particles were noninteracting, they would simply mix. However, both polysomes and DNA segments are large enough to interact sterically. So as density increases, steric avoidance implies a reduced conformational entropy and thus a higher free energy per particle. We argue (based on Miangolarra et al. PNAS 2021 PMID: 34675077 and Xiang et al. Cell 2021 PMID: 34186018) that the demixing of polysomes and DNA segments occurs because DNA segments pack better with each other than they do with polysomes. This raises the free energy cost associated with DNA-polysome interactions compared to DNA-DNA interactions. We model this effect by introducing a term in the free energy, χ_np, which we refer to as a repulsion between DNA and polysomes, though as explained above it arises from entropic effects. At realistic cellular densities of DNA and polysomes, this repulsive interaction is strong enough to cause the DNA and polysomes to phase separate.
This same density-dependent free energy that causes phase separation can also give rise to forces, just in the way that a higher pressure on one side of a wall can give rise to a net force on the wall. Indeed, the “compaction force” we refer to is fundamentally an osmotic pressure difference. At some stages during nucleoid segregation, the region of the cell between nucleoids has a higher polysome concentration, and therefore a higher osmotic pressure, than the regions near the poles. This results in a net poleward force on the sister nucleoids that drives their migration toward the poles. This migration continues until the osmotic pressure equilibrates. Therefore, both phase separation (due to the steric repulsion described above) and nonequilibrium polysome production and degradation (which creates the initial accumulation of polysomes around midcell) are essential ingredients for nucleoid segregation.
This will be clarified in the revised text, with the support of additional simulation results.
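The pressure analogy in the preceding paragraph can be written compactly. The following is a minimal sketch and not the model used in the manuscript: it assumes an idealized dilute (van 't Hoff) limit in which osmotic pressure is proportional to the local polysome concentration. The symbols c_mid, c_pole and the cross-sectional area A are introduced here purely for illustration.

```latex
% Illustrative assumption: dilute-limit (van 't Hoff) osmotic pressure.
% Steric crowding in a real cell would add higher-order terms in c.
\Pi(c) \approx c\,k_{B}T

% Net poleward force on a nucleoid of cross-sectional area A when the
% polysome concentration is higher at midcell than near the pole:
F \approx \left[\Pi(c_{\mathrm{mid}}) - \Pi(c_{\mathrm{pole}})\right] A
  \approx \left(c_{\mathrm{mid}} - c_{\mathrm{pole}}\right) k_{B}T\,A

% Migration stops once the pressures equilibrate:
\Pi(c_{\mathrm{mid}}) = \Pi(c_{\mathrm{pole}}) \;\Rightarrow\; F = 0
```

In this simplified picture the force vanishes as soon as the two concentrations match, which mirrors the statement that migration continues until the osmotic pressure equilibrates.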
The authors use polysome accumulation and phase separation to describe what is driving nucleoid segregation. Both terms are accurate, but it might help the less physically inclined reader to have one term, or have what each of these means explicitly defined at the start. I say this most especially in terms of "phase separation", as the currently huge momentum toward liquid-liquid interactions in biology causes the phrase "phase separation" to often evoke a number of wider (and less defined) phenomena and ideas that may not apply here. Thus, a simple clear definition at the start might help some readers.
Phase separation means that the DNA-polysome steric repulsion is strong enough to drive their demixing, which creates a compact nucleoid. As mentioned in a previous point, this effect is captured in the free energy by the χ_np term, which is an effective repulsion between DNA and polysomes, though as explained above it arises from entropic effects.
In the revised manuscript, we will illustrate this with our theoretical model by initializing a cell with a diffuse nucleoid and low polysome concentration. For the sake of simplicity, we assume that the cell does not elongate. We observe that the DNA-polysome steric repulsion is sufficient to compact the nucleoid and place it at mid-cell.
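To make the role of an effective repulsion term concrete, the sketch below uses a textbook symmetric Flory-Huggins free energy as a stand-in. This is an assumption made for illustration: the manuscript's free energy may have a different form, and treating χ_np as a Flory-Huggins interaction parameter between equal-sized particles is a simplification. The function names are hypothetical.

```python
import math


def mixing_free_energy(phi, chi):
    """Flory-Huggins-type free energy per site, in units of kT.

    phi is the volume fraction of one component (say, DNA segments) and
    1 - phi that of the other (polysomes). chi plays the role of an
    effective repulsion between the two components.
    """
    entropy_term = phi * math.log(phi) + (1 - phi) * math.log(1 - phi)
    repulsion_term = chi * phi * (1 - phi)
    return entropy_term + repulsion_term


def curvature(phi, chi):
    """Second derivative of the free energy with respect to phi."""
    return 1 / phi + 1 / (1 - phi) - 2 * chi


def is_unstable(phi, chi):
    """A uniform mixture demixes spontaneously where the curvature is negative."""
    return curvature(phi, chi) < 0


def spinodal_chi(phi):
    """Smallest chi at which a uniform mixture of composition phi is unstable."""
    return 1 / (2 * phi * (1 - phi))


if __name__ == "__main__":
    for chi in (1.5, 2.5):
        state = "demixes" if is_unstable(0.5, chi) else "stays mixed"
        print(f"chi = {chi}: an equal mixture {state}")
```

In this toy model, mixing entropy wins when the repulsion is weak, and the uniform state becomes unstable once the repulsion exceeds a composition-dependent threshold. That is the sense in which a sufficiently strong repulsion term drives demixing.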
(4) Line 478. "Altogether, these results support the notion that ectopic polysome accumulation drives nucleoid dynamics". Is this right? Should it not read "results support the notion that ectopic polysome accumulation inhibits/redirects nucleoid dynamics"?
We think that this is correct; the ectopic polysome accumulation drives nucleoid dynamics. In our theoretical model, we can introduce polysome production at fixed sources to mimic the experiments where ectopic polysome production is achieved by high plasmid expression (Fig. 6). The model is able to recapitulate the two main phenotypes observed in experiments. These new simulation results will be added to the revised manuscript.
(5) It would be helpful to clarify what happens as the RplA-GFP signal decreases at midcell in Figure 1- is the signal then increasing in the less "dense" parts of the cell? That is, (a) are the polysomes at midcell redistributing throughout the cell? (b) is the total concentration of polysomes in the entire cell increasing over time?
It is a redistribution—the RplA-GFP signal remains constant in concentration from cell birth to division (Figure 1 – Figure Supplement 1E). This will be clarified in the revised text.
(6) Line 154. "Cell constriction contributed to the apparent depletion of ribosomal signal from the mid-cell region at the end of the cell division cycle (Figure 1B-C and Movie S1)" - It would be helpful if when cell constriction began and ended was indicated in Figures 1B and C.
Good idea. We will add markers to indicate the start of cell constriction. We will also indicate that cell birth and division correspond to the first and last images/timepoint in Fig. 1B and C, respectively.
(7) In Figure 7 they demonstrate that radial confinement is needed for longitudinal nucleoid segregation. It should be noted (and cited) that past experiments of Bacillus L-forms in microfluidic channels showed a clear requirement for rod shape (and a given width) in the positioning and the spacing of the nucleoids.
Wu et al, Nature Communications, 2020. "Geometric principles underlying the proliferation of a model cell system" https://dx.doi.org/10.1038/s41467-020-17988-7
Good point. We will add this reference. Thank you.
(8) "The correlated variability in polysome and nucleoid patterning across cells suggests that the size of the polysome-depleted spaces helps determine where the chromosomal DNA is most concentrated along the cell length. This patterning is likely reinforced through the displacement of the polysomes away from the DNA dense region"
It should be noted this likely functions not just in one direction (polysomes dictating DNA location), but also in the reverse - as the footprint of compacted DNA should also exclude (and thus affect) the location of polysomes.
We agree that the effects could go both ways at this early stage of the story. We will revise the text accordingly.
(9) Line 159. "Rifampicin is a transcription inhibitor that causes polysome depletion over time. This indicates that all ribosomal enrichments consist of polysomes and therefore will be referred to as polysome accumulations hereafter". Here and throughout this paper they use the term polysome, but cells also have monosomes (and 2-somes, etc). Rifampicin stops the assembly of all of these, and thus the loss of localization could occur from both. Thus, is it accurate to state that all transcription events occur in polysomes? Or are they grouping all of the n-somes into one group?
In the discussion, we noted that our term “polysomes” also includes monosomes for simplicity, but we agree that the term should have been defined much earlier. This will be done in the revised manuscript.
Thank you for the valuable comments and suggestions!
Reviewer #2 (Public review):
Summary:
The authors perform a remarkably comprehensive, rigorous, and extensive investigation into the spatiotemporal dynamics between ribosomal accumulation, nucleoid segregation, and cell division. Using detailed experimental characterization and rigorous physical models, they offer a compelling argument that nucleoid segregation rates are determined at least in part by the accumulation of ribosomes in the center of the cell, exerting a steric force to drive nucleoid segregation prior to cell division. This evolutionarily ingenious mechanism means cells can rely on ribosomal biogenesis as the sole determinant for the growth rate and cell division rate, avoiding the need for two separate 'sensors,' which would require careful coupling.
Terrific summary! Thank you for your positive assessment.
Strengths:
In terms of strengths, the paper is very well written, the data are of extremely high quality, and the work is of fundamental importance to the field of cell growth and division. This is an important and innovative discovery enabled through a combination of rigorous experimental work and innovative conceptual, statistical, and physical modeling.
Thank you!
Weaknesses:
In terms of weaknesses, I have three specific thoughts.
Firstly, my biggest question (and this may or may not be a bona fide weakness) is how unambiguously the authors can be sure their ribosomal labeling is reporting on polysomes, specifically. My reading of the work is that the loss of spatial density upon rifampicin treatment is used to infer that spatial density corresponds to polysomes, yet this feels like a relatively indirect way to get at this question, given rifampicin targets RNA polymerase and not translation. It would be good if a more direct way to confirm polysome dependence were possible.
The heterogeneity of ribosome distribution inside E. coli cells has been attributed to polysomes by many labs (PMID: 25056965, 38678067, 22624875, 31150626, 34186018, 10675340). The attribution is also consistent with single-molecule tracking experiments showing that slow-moving ribosomes (polysomes) are excluded by the nucleoid whereas fast-diffusing ribosomes (free ribosomal subunits) are distributed throughout the cytoplasm (PMID: 25056965, 22624875).
Furthermore, inhibition of translation initiation with kasugamycin treatment, which decreases the pool of polysomes, results in a homogenization of ribosomes and expansion of the nucleoid (see Author response image 1). This further supports the rifampicin experiments. Given that the attribution of ribosome heterogeneity to polysomes is well accepted in the field, we would prefer to not include these kasugamycin data in the revised manuscript because long-term exposure to this drug leads to nucleoid re-compaction (PMID: 25250841 and PMID: 34186018). This secondary effect may possibly be due to a dysregulated increase in synthesis of naked rRNAs (PMID: 14460744, PMID: 2114400, and PMID: 2448483) or ribosome aggregation, which we are currently investigating.
Author response image 1.
Effects of kasugamycin treatment on the intracellular distribution of ribosomes and nucleoids. Representative single cell (CJW7323) growing in M9gluCAAT. Kasugamycin (3 mg/mL) was added at time = 0 min. Shown is the early response (0-30 min) to the drug, characterized by the homogenization of the ribosomal RplA-GFP fluorescence and the expansion of the HupA-mCherry-labeled nucleoids. For each segmented cell, the RplA-GFP and HupA-mCherry signals were normalized by the average fluorescence.
Second, the authors invoke a phase separation model to explain the data, yet it is unclear whether there is any particular evidence supporting such a model, whether they can exclude simpler models of entanglement/local diffusion (and/or perhaps this is what is meant by phase separation?) and it's not clear if claiming phase separation offers any additional insight/predictive power/utility. I am OK with this being proposed as a hypothesis/idea/working model, and I agree the model is consistent with the data, BUT I also feel other models are consistent with the data. I also very much do not think that this specific aspect of the paper has any bearing on the paper's impact and importance.
We appreciate the reviewer’s comment, but the output of our reaction-diffusion model is a bona fide phase separation (spinodal decomposition). So, we feel that we need to use the term when reporting the modeling results. Inside the cell, the situation is more complicated. As the reviewer points out, there likely are entanglements (not considered in our model) and other important factors (please see our discussion on the model limitations). This said, we will revise our text to clarify our terms and proposed mechanism.
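For readers unfamiliar with spinodal decomposition, the sketch below shows the generic phenomenon in one dimension. It is not the reaction-diffusion model of the manuscript: it is a single-field Cahn-Hilliard equation with a standard double-well free energy, periodic boundaries, and no production or degradation terms. All parameter values and function names are arbitrary choices for illustration.

```python
import random


def laplacian(values, dx):
    """Second difference on a periodic 1D grid."""
    n = len(values)
    return [
        (values[(i + 1) % n] - 2 * values[i] + values[i - 1]) / dx**2
        for i in range(n)
    ]


def cahn_hilliard_step(phi, dt, dx=1.0, kappa=1.0, mobility=1.0):
    """One explicit Euler step of d(phi)/dt = M * lap(phi^3 - phi - kappa * lap(phi))."""
    lap_phi = laplacian(phi, dx)
    # Chemical potential of a double-well free energy plus a gradient penalty.
    mu = [p**3 - p - kappa * l for p, l in zip(phi, lap_phi)]
    lap_mu = laplacian(mu, dx)
    return [p + dt * mobility * l for p, l in zip(phi, lap_mu)]


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def simulate(n=64, steps=4000, dt=0.01, seed=0):
    """Start from a nearly uniform mixture and let small fluctuations grow."""
    rng = random.Random(seed)
    phi = [0.01 * (rng.random() - 0.5) for _ in range(n)]
    initial = list(phi)
    for _ in range(steps):
        phi = cahn_hilliard_step(phi, dt)
    return initial, phi


if __name__ == "__main__":
    start, end = simulate()
    print("variance before:", variance(start))
    print("variance after: ", variance(end))
    print("total amount conserved:", abs(sum(end) - sum(start)) < 1e-8)
```

The point of the toy is qualitative: a uniform state that is thermodynamically unstable amplifies small fluctuations into distinct domains while conserving the total amount of material. Entanglements and the other cellular factors mentioned above are outside its scope.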
Finally, the writing and the figures are of extremely high quality, but the sheer volume of data here is potentially overwhelming. I wonder if there is any way for the authors to consider stripping down the text/figures to streamline things a bit? I also think it would be useful to include visually consistent schematics of the question/hypothesis/idea each of the figures is addressing to help keep readers on the same page as to what is going on in each figure. Again, there was no figure or section I felt was particularly unclear, but the sheer volume of text/data made reading this quite the mental endurance sport! I am completely guilty of this myself, so I don't think I have any super strong suggestions for how to fix this, but just something to consider.
We agree that there is a lot to digest. We will add schematics and a didactic simulation. We will also try to streamline the text.
Reviewer #3 (Public review):
Summary:
Papagiannakis et al. present a detailed study exploring the relationship between DNA/polysome phase separation and nucleoid segregation in Escherichia coli. Using a combination of experiments and modelling, the authors aim to link physical principles with biological processes to better understand nucleoid organisation and segregation during cell growth.
Strengths:
The authors have conducted a large number of experiments under different growth conditions and physiological perturbations (using antibiotics) to analyse the biophysical factors underlying the spatial organisation of nucleoids within growing E. coli cells. A simple model of ribosome-nucleoid segregation has been developed to explain the observations.
Weaknesses:
While the study addresses an important topic, several aspects of the modelling, assumptions, and claims warrant further consideration.
Thank you for your feedback. Please see below for a response to each concern.
Major Concerns:
Oversimplification of Modelling Assumptions:
The model simplifies nucleoid organisation by focusing on the axial (long-axis) dimension of the cell while neglecting the radial dimension (cell width). While this approach simplifies the model, it fails to explain key experimental observations, such as:
(1) Inconsistencies with Experimental Evidence:
The simplified model presented in this study predicts that translation-inhibiting drugs like chloramphenicol would maintain separated nucleoids due to increased polysome fractions. However, experimental evidence shows the opposite-separated nucleoids condense into a single lobe post-treatment (Bakshi et al 2014), indicating limitations in the model's assumptions/predictions. For the nucleoids to coalesce into a single lobe, polysomes must cross the nucleoid zones via the radial shells around the nucleoid lobes.
We do not think that the results from chloramphenicol-treated cells are inconsistent with our model. Our proposed mechanism predicts that nucleoids will condense in the presence of chloramphenicol, consistent with experiments. It also predicts that nucleoids that were still relatively close at the time of chloramphenicol treatment could fuse if they eventually touched through diffusion (thermal fluctuation) to reduce their interaction with the polysomes and minimize their conformational energy. Fusion is, however, not expected for well-separated nucleoids since their diffusion is slow in the crowded cytoplasm. This is consistent with our experimental observations: In the presence of a growth-inhibitory concentration of chloramphenicol (70 μg/mL), nucleoids in relatively close proximity can fuse, but well-separated nucleoids condense and do not fuse. Since the growth rate inhibition is not immediate upon chloramphenicol treatment, many cells with well-separated condensed nucleoids divide during the first hour. As a result, the non-fusion phenotype is more obvious in non-dividing cells, achieved by pre-treating cells with the cell division inhibitor cephalexin (50 μg/mL). In these polyploid elongated cells, well-separated nucleoids condensed but did not fuse, not even after an hour in the presence of chloramphenicol (as illustrated in Author response image 2).
In Bakshi et al, 2014, nucleoid fusion was shown for a single cell in which the sister nucleoids were relatively close to each other at the time of chloramphenicol treatment. Population statistics were provided for the relative length and width of the nucleoids, but not for the fusion events. So, it is unclear whether the illustrated fusion was universal or not. Also, we note that Bakshi et al (2014) used a chloramphenicol concentration of 300 μg/mL, which is 20-fold higher than the minimal inhibitory concentration for growth, compared to 70 μg/mL in our experiments.
Author response image 2.
Effects of chloramphenicol treatment on the intracellular distribution of ribosomes and nucleoids in non-dividing cells. Exponentially growing cells (M9glyCAAT at 30°C) were pre-treated with cephalexin for one hour before being spotted on a 1% agarose pad for time-lapse imaging. The agarose pad contained M9glyCAAT, cephalexin, and chloramphenicol. (A) Phase contrast, RplA-GFP fluorescence and HupA-mCherry fluorescence images of a representative single cell. Three timepoints are shown, including the first image after spotting on the agarose pad (at 0 min), 30 minutes and one hour of chloramphenicol treatment. (B) One-dimensional profiles of the ribosomal (RplA-GFP) and nucleoid (HupA-mCherry) fluorescence from the cells shown in panel A. These intensity profiles correspond to the average fluorescence along the medial axis of the cell considering a 6-pixel region (0.4 μm) centered on the central line of the cell. The fluorescence intensity is plotted along the relative cell length, scaled from 0 to 100% between the two poles, illustrating the relative nucleoid length (L_DNA/L_cell) that was plotted by Bakshi et al in 2014 (PMID: 25250841).
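The profile construction described in this legend can be sketched as follows. This is a simplified illustration, not the analysis pipeline used for the figure: it assumes the cell image has already been straightened so that rows run across the cell width and columns along its length, whereas a real medial axis is generally curved. The function names are hypothetical, and the optional normalization by the mean follows the description in the legend of Author response image 1.

```python
def medial_profile(image, band=6):
    """Average a band of rows centred on the midline of a straightened cell.

    image is a list of rows (cell width), each a list of pixel intensities
    (cell length). band=6 mirrors the 6-pixel region quoted in the legend.
    """
    n_rows = len(image)
    band = min(band, n_rows)
    start = (n_rows - band) // 2
    rows = image[start:start + band]
    n_cols = len(image[0])
    return [sum(row[c] for row in rows) / len(rows) for c in range(n_cols)]


def normalize_by_mean(profile):
    """Divide each value by the average of the profile."""
    mean = sum(profile) / len(profile)
    return [v / mean for v in profile]


def relative_positions(n_points):
    """Positions along the cell, scaled from 0 to 100% between the poles."""
    return [100.0 * i / (n_points - 1) for i in range(n_points)]


if __name__ == "__main__":
    # Tiny made-up image: 8 rows across the width, 5 columns along the length.
    toy_image = [[10, 20, 40, 20, 10] for _ in range(8)]
    profile = normalize_by_mean(medial_profile(toy_image))
    for position, value in zip(relative_positions(len(profile)), profile):
        print(f"{position:5.1f}%  {value:.2f}")
```

Plotting against relative rather than absolute length lets cells of different sizes be compared on one axis, which is what makes a ratio such as L_DNA/L_cell readable directly from the profile.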
(2) The peripheral localisation of nucleoids observed after A22 treatment in this study and others (e.g., Japaridze et al., 2020; Wu et al., 2019), which conflicts with the model's assumptions and predictions. The assumption of radial confinement would predict nucleoids to fill up the volume or ribosomes to go near the cell wall, not the nucleoid, as seen in the data.
The reviewer makes a good point that DNA attachment to the membrane through transertion likely contributes to the nucleoid being peripherally localized in A22 cells. We will revise the text to add this point. However, we do not think that this contradicts the proposed nucleoid segregation mechanism based on phase separation and out-of-equilibrium dynamics described in our model. On the contrary, by attaching the nucleoid to the cytoplasmic membrane along the cell width, transertion might help reduce the diffusion and thus exchange of polysomes across nucleoids. We will revise the text to discuss transertion over radial confinement.
(3) The radial compaction of the nucleoid upon rifampicin or chloramphenicol treatment, as reported by Bakshi et al. (2014) and Spahn et al. (2023), also contradicts the model's predictions. This is not expected if the nucleoid is already radially confined.
We originally evoked radial confinement to explain the observation that polysome accumulations do not equilibrate between DNA-free regions. We agree that transertion is an alternative explanation. Thank you for bringing it to our attention. However, please note that this does not contradict the model. In our view, it actually supports the 1D model by providing a reasonable explanation for the slow exchange of polysomes across DNA-free regions. The attachment of the nucleoid to the membrane along the cell width may act as a diffusion barrier. We will revise the text and the title of the manuscript accordingly.
(4) Radial Distribution of Nucleoid and Ribosomal Shell:
The study does not account for well-documented features such as the membrane attachment of chromosomes and the ribosomal shell surrounding the nucleoid, observed in super-resolution studies (Bakshi et al., 2012; Sanamrad et al., 2014). These features are critical for understanding nucleoid dynamics, particularly under conditions of transcription-translation coupling or drug-induced detachment. Work by Yongren et al. (2014) has also shown that the radial organisation of the nucleoid is highly sensitive to growth and the multifork nature of DNA replication in bacteria.
We will discuss the membrane attachment. Please see the previous response.
The omission of organisation in the radial dimension and the entropic effects it entails, such as ribosome localisation near the membrane and nucleoid centralisation in expanded cells, undermines the model's explanatory power and predictive ability. Some observations have been previously explained by the membrane attachment of nucleoids (a hypothesis proposed by Rabinovitch et al., 2003, and supported by experiments from Bakshi et al., 2014, and recent super-resolution measurements by Spahn et al.).
We agree—we will add a discussion about membrane attachment in the radial dimension. See previous responses.
Ignoring the radial dimension and membrane attachment of nucleoid (which might coordinate cell growth with nucleoid expansion and segregation) presents a simplistic but potentially misleading picture of the underlying factors.
As mentioned above, we will discuss membrane attachment in the revised manuscript.
This reviewer suggests that the authors consider an alternative mechanism, supported by strong experimental evidence, as a potential explanation for the observed phenomena:
Nucleoids may transiently attach to the cell membrane, possibly through transertion, allowing for coordinated increases in nucleoid volume and length alongside cell growth and DNA replication. Polysomes likely occupy cellular spaces devoid of the nucleoid, contributing to nucleoid compaction due to mutual exclusion effects. After the nucleoids separate following ter separation, axial expansion of the cell membrane could lead to their spatial separation.
This “membrane attachment/cell elongation” model is reminiscent of the hypothesis proposed by Jacob et al in 1963 (doi:10.1101/SQB.1963.028.01.048). There are several lines of evidence arguing against it as the major driver of nucleoid segregation:
(Below is a slightly modified version of our response to a comment from Reviewer 1—see page 3)
(1) For this alternative model to work, axial membrane expansion (i.e., cell elongation) would have to be localized at the middle of the splitting nucleoids (i.e., midcell position for slow growth and ¼ and ¾ cell positions for fast growth) to create a directional motion. To our knowledge, there is no evidence of such localized membrane incorporation. Furthermore, even if membrane growth was localized at the right places, the fluidity of the cytoplasmic membrane (PMID: 6996724, 20159151, 24735432, 27705775) would be problematic. To get around this fluidity issue, one could evoke a potential connection to the rigid peptidoglycan, but then again, peptidoglycan growth would have to be localized at the middle of the splitting nucleoid to “push” the sister nucleoids apart from each other. However, peptidoglycan growth is dispersed prior to cell constriction (PMID: 35705811, 36097171, 2656655).
(2) Even if we ignore the aforementioned caveats, Paul Wiggins’s group ruled out the cell elongation/transertion model by showing that the rate of cell elongation is slower than the rate of chromosome segregation (PMID: 23775792). In the revised manuscript, we will provide additional data showing that the cell elongation rate is indeed slower than the nucleoid segregation rate.
(3) Furthermore, our correlation analysis comparing the rate of nucleoid segregation to the rate of either cell elongation or polysome accumulation argues that polysome accumulation plays a larger role than cell elongation in nucleoid segregation. These data were already shown in the original manuscript (Figure 1I and Figure 1 – figure supplement 3) but were not highlighted in this context. We will revise the text to clarify this point.
(4) The membrane attachment/cell elongation model does not explain the nucleoid asymmetries described in our paper (Figure 3), whereas they can be recapitulated by our model.
(5) The cell elongation/transertion model cannot predict the aberrant nucleoid dynamics observed when chromosomal expression is largely redirected to plasmid expression. In the revised manuscript, we will add simulation results showing that these nucleoid dynamics are predicted by our model.
In light of these arguments, we do not believe that a mechanism based on membrane attachment and cell elongation is the major driver of nucleoid segregation. However, we do believe that it may play a complementary role (see “Nucleoid segregation likely involves multiple factors” in the Discussion). We will revise this section to clarify our thoughts and mention the potential role of transertion.
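The kind of comparison described in point (3) above can be sketched as follows. The response does not state which correlation coefficient was used, so the Pearson coefficient here is an assumption, and the function and argument names are hypothetical. No measured values are included.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def compare_drivers(segregation_rate, elongation_rate, polysome_rate):
    """Correlate per-cell segregation rates with each candidate driver.

    Each argument is a list with one value per cell, in the same cell order.
    """
    return {
        "cell elongation": pearson_r(segregation_rate, elongation_rate),
        "polysome accumulation": pearson_r(segregation_rate, polysome_rate),
    }
```

Given per-cell rate lists, `compare_drivers` returns one coefficient per candidate driver. A markedly higher coefficient for one driver is the type of evidence the response refers to, although correlation alone does not establish causation.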
Incorporating this perspective into the discussion or future iterations of the model may provide a more comprehensive framework that aligns with the experimental observations in this study and previous work.
As noted above, we will revise the text to mention transertion.
Simplification of Ribosome States:
Combining monomeric and translating ribosomes into a single 'polysome' category may overlook spatial variations in these states, particularly during ribosome accumulation at the mid-cell. Without validating uniform mRNA distribution or conducting experimental controls such as FRAP or single-molecule measurements to estimate the proportions of ribosome states based on diffusion, this assumption remains speculative.
Indeed, for simplicity, we adopt an average description of all polysomes with an average diffusion coefficient and interaction parameters, which is sufficient for capturing the fundamental mechanism underlying nucleoid segregation. To illustrate that considering multiple polysome species does not change the physical picture, we consider an extension of our model, which contains three polysome species, each with a different diffusion coefficient (D_P = 0.018, 0.023, or 0.028 μm²/s), reflecting that polysomes with more ribosomes will have a lower diffusion coefficient. Simulation of this model reveals that the different polysome species have essentially the same concentration distribution, suggesting that the average description in our minimal model is sufficient for our purposes. We will present these new simulation results in the revised manuscript.
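A rough feel for how similar the three species are can be obtained from the one-dimensional relation between mean-squared displacement and time, ⟨x²⟩ = 2Dt. The sketch below is a back-of-the-envelope estimate for free diffusion only, not the authors' simulation. The 1 μm distance is a hypothetical value chosen for illustration, while the diffusion coefficients are those quoted above.

```python
def diffusion_time(distance_um, d_um2_per_s):
    """Time (s) for the 1D mean-squared displacement 2*D*t to reach distance^2."""
    return distance_um**2 / (2 * d_um2_per_s)


# Diffusion coefficients quoted in the response (um^2/s).
POLYSOME_D = [0.018, 0.023, 0.028]

# Hypothetical travel distance (um), chosen only for illustration.
DISTANCE_UM = 1.0

for d in POLYSOME_D:
    t = diffusion_time(DISTANCE_UM, d)
    print(f"D = {d:.3f} um^2/s -> about {t:.1f} s to diffuse {DISTANCE_UM} um")
```

Because the three coefficients differ by less than a factor of two, the corresponding timescales are of the same order. This fits the reported finding that the species share essentially the same concentration distribution, although the estimate ignores the steric interactions that the model includes.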
www.biorxiv.org
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
The manuscript is dedicated heavily to cell type mapping and identification of sub-type markers in the human testis but does not present enough results from cross-investigation between NOA cases versus control. Their findings are mostly based on transcriptome and the authors do not make enough use of the scATAC-seq data in their analyses as they put forward in the title. Overall, the authors should do more to include the differential profile of NOA cases at the molecular level - specific gene expression, chromatin accessibility, TF binding, pathway, and signaling that are perturbed in NOA patients that may be associated with azoospermia.
Strengths:
(1) The establishment of single-cell data (both RNA and ATAC) from the human testicular tissues is noteworthy.
(2) The manuscript includes extensive mapping of sub-cell populations with some claimed as novel, and reports marker gene expression.
(3) The authors present inter-cellular cross-talks in human testicular tissues that may be important in adequate sperm cell differentiation.
Weaknesses:
(1) A low sample size (2 OA and 3 NOA cases). There are no control samples from healthy individuals.
Thank you for your comments. We recognize that the small sample size in this study somewhat limits its generalizability. However, in transcriptomic research, limited sample sizes are a common issue due to the complexities involved in acquiring samples, particularly in studies about the reproductive system. Healthy testicular tissue samples are difficult to obtain, and studies (doi: 10.18632/aging.203675) have used obstructive azoospermia as a control group in which spermatogenesis and development are normal.
(2) Their argument about interactions between germ and Sertoli cells is not based on statistical testing.
Thank you for your comments. Due to limited funding, we have not yet conducted in-depth validation experiments, but we plan to carry out related experiments at a later stage. We hope that the publication of this study will help to obtain more financial support to further investigate the interactions between germ cells and Sertoli cells.
(3) Rationale/logic of the study. This study, in its present form, seems to be more about the role of sub-Sertoli population interactions in sperm cell development and does not provide enough insights about NOA.
Thank you for your comments. In Figure 6, we conducted an in-depth analysis and comparison of the differences between the Sertoli cell subtypes and the germ cell subtypes involved in spermatogenesis in the OA and NOA groups. The results revealed that in the NOA group, especially in the NOA3 group, which has a lower sperm count compared to NOA2 and NOA1, there is a significant loss of Sertoli cell subtypes including SC3, SC4, SC5, SC6, and SC8. The NOA1 group, with a sperm count close to that of the OA group, also had a Sertoli cell profile similar to the OA group. The NOA2 group, with a sperm count between that of NOA1 and NOA3, also exhibited an intermediate profile of Sertoli cell subtypes. Therefore, we suggest that changes in Sertoli cell subtypes, rather than just the total number of Sertoli cells, are a key factor affecting sperm count. We believe that through these analyses, we can provide in-depth insights into NOA, and we hope that the publication of this study will help obtain more funding support to further validate and expand on these findings.
(4) The authors do not make full use of the scATAC-seq data.
Thank you for your comments. We have added an analysis of the scATAC-seq data, which is shown in the revised manuscript.
Reviewer #2 (Public Review):
Summary:
Shimin Wang et al. investigated the role of Sertoli cells in mediating spermatogenesis disorders in non-obstructive azoospermia (NOA) through stage-specific communications. The authors utilized scRNA-seq and scATAC-seq to analyze the molecular and epigenetic profiles of germ cells and Sertoli cells at different stages of spermatogenesis.
Strengths:
By understanding the gene expression patterns and chromatin accessibility changes in Sertoli cells, the authors sought to uncover key regulatory mechanisms underlying male infertility and identify potential targets for therapeutic interventions. They emphasized that the absence of the SC3 subtype would be a major factor contributing to NOA.
Weaknesses:
Although the authors used cutting-edge techniques to support their arguments, it is difficult to find conceptual and scientific advances compared to Zeng S et al.'s paper (Zeng S, Chen L, Liu X, Tang H, Wu H, and Liu C (2023) Single-cell multi-omics analysis reveals dysfunctional Wnt signaling of spermatogonia in non-obstructive azoospermia. Front. Endocrinol. 14:1138386.). Overall, the authors need to improve their manuscript to demonstrate the novelty of their findings in a more logical way.
Thank you for your detailed review of our work. We greatly appreciate your feedback and have made revisions to our manuscript accordingly.
Regarding the novelty of our research, we believe our study offers conceptual and scientific advances in several ways:
We have systematically revealed the stage-specific roles of Sertoli cell subtypes in different stages of spermatogenesis, particularly emphasizing the crucial role of the SC3 subtype in non-obstructive azoospermia (NOA). Additionally, we identified that other Sertoli cell subtypes (SC1, SC2, SC3...SC8, etc.) also collaborate in a stage-specific manner with different subpopulations of spermatogenic cells (SSC0, SSC1/SSC2/Diffed, Pa...SPT3). These findings provide new insights into the understanding of spermatogenesis disorders.
Compared to the study by Zeng S et al., our research not only focuses on the functional alterations in Sertoli cells but also comprehensively analyzes the interaction patterns between Sertoli cells and spermatogenic cells using scRNA-seq and scATAC-seq technologies. We uncovered several novel regulatory networks that could serve as potential targets for the diagnosis and treatment of NOA.
We sincerely appreciate your constructive comments and will continue to explore this area further, aiming to make a more significant contribution to the understanding of NOA mechanisms.
Reviewer #3 (Public Review):
Summary:
This study profiled the single-cell transcriptome of human spermatogenesis and provided many potential molecular markers for developing testicular puncture-specific marker kits for NOA patients.
Strengths:
Perform single-cell RNA sequencing (scRNA-seq) and single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq) on testicular tissues from two OA patients and three NOA patients.
Weaknesses:
Most results are analytical and lack specific experiments to support these analytical results and hypotheses.
Thank you for your thorough review of our work. We highly value your feedback and have made revisions to our manuscript accordingly. Indeed, we have conducted immunofluorescence (IF) experiments to validate the data obtained from single-cell sequencing and have expanded the sample size to enhance the reliability of our results. To better present these validation experiments, we have reorganized and renamed the sample information, making it easier for you to understand which samples were used in the specific experiments. Following the publication of this paper, we plan to secure additional funding to deepen our research, particularly in the area of experimental validation. We sincerely appreciate your support and insightful suggestions, which have greatly helped guide our future research directions.
Reviewer #1 (Recommendations For The Authors):
(1) The authors should include results from cross-investigation comparing NOA/OA patients versus controls.
Thank you for your comments. In this study, OA was the control group. Healthy testicular tissue samples are difficult to obtain, and studies (doi: 10.18632/aging.203675) have used OA as a control group in which spermatogenesis and development are normal.
(2) In Table S1, the authors should also include the metric for scATAC-seq, and do more to show the findings the authors obtained in RNA is replicated with chromatin accessibility.
Thank you for your comments. We have added Table S2, which includes the metric for scATAC-seq.
(3) A single sample from each OA and NOA group may not be enough to confirm colocalization. The authors should include results from all available samples and use quantitative measures.
Thank you for your comments. I apologize that the sample size in this study was less than three and we could not conduct quantitative analysis. We will increase the sample size and conduct corresponding experiments in subsequent research.
(4) The Methods section does not include enough description to follow how the analyses were carried out, and is missing information on some of the key procedures such as velocity and cell cycle analyses.
Thank you for your comments. The methods for the velocity and cell cycle analyses were added in the revised manuscript. The description is as follows:
“Velocity analysis
RNA velocity analysis was conducted using scVelo's (version 0.2.1) generalized dynamical model. The spliced and unspliced mRNA was quantified by Velocity (version 0.17.17).”
“Cell cycle analysis
To quantify the cell cycle phases for individual cells, we employed the CellCycleScoring function from the Seurat package. This function computes cell cycle scores using established marker genes for cell cycle phases as described in a previous study by Nestorowa et al. (2016). Cells showing a strong expression of G2/M-phase or S-phase markers were designated as G2/M-phase or S-phase cells, respectively. Cells that did not exhibit significant expression of markers from either category were classified as G1-phase cells.”
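The phase-assignment rule described above can be sketched in a few lines of plain Python. This is a simplified illustration and not the Seurat implementation: the marker lists are a small illustrative subset, the score is a marker mean minus the cell's overall mean (Seurat uses binned control genes), and the function names and expression values are hypothetical.

```python
from statistics import mean

# Illustrative subsets only; the full S and G2/M marker lists are much longer.
S_MARKERS = ["MCM2", "PCNA"]
G2M_MARKERS = ["TOP2A", "MKI67"]


def phase_score(expr, markers):
    """Mean marker expression minus the cell's mean expression (simplified score)."""
    present = [expr[g] for g in markers if g in expr]
    if not present:
        return 0.0
    return mean(present) - mean(expr.values())


def assign_phase(expr):
    """Assign G1 when neither score is positive, otherwise the higher-scoring phase."""
    s_score = phase_score(expr, S_MARKERS)
    g2m_score = phase_score(expr, G2M_MARKERS)
    if s_score <= 0 and g2m_score <= 0:
        return "G1"
    return "S" if s_score > g2m_score else "G2M"


# Hypothetical expression values for three toy cells.
s_cell = {"MCM2": 5, "PCNA": 4, "TOP2A": 0, "MKI67": 0, "ACTB": 1}
g2m_cell = {"MCM2": 0, "PCNA": 0, "TOP2A": 6, "MKI67": 4, "ACTB": 1}
g1_cell = {"MCM2": 0, "PCNA": 0, "TOP2A": 0, "MKI67": 0, "ACTB": 3}
```

The key design point is the fallback: a cell is called G1 not because of G1 markers but because neither the S nor the G2/M score is above background.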
(5) For the purpose of transparency, the authors should upload codes used for analyses so that each figure can be reproduced. All raw and processed data should be made publicly available.
Thank you for your comments. We have deposited scRNA-seq and scATAC-seq data in NCBI. ScRNA-seq data have been deposited in the NCBI Gene Expression Omnibus with the accession number GSE202647, and scATAC-seq data have been deposited in the NCBI database with the accession number PRJNA1177103.
Reviewer #2 (Recommendations For The Authors):
The detailed points the authors need to improve are attached below.
The results presented in the study have several weaknesses:
In Figure 1A, HE staining results should be provided for all patients who underwent single-cell analysis.
Thank you very much for your valuable suggestions. In Figure 1, we present the HE staining results paired with the single-cell data, covering all patients involved in the single-cell analysis.
- Saying "identification of novel potential molecular markers for distinct cell types" seems unsupported by the data.
Thank you for your comments. I'm sorry for the inaccuracy of my description. We have revised this sentence. The description is as follows: These findings indicate that the scRNA-seq data from this study can serve for cellular classification.
- The methods suggest an integrated analysis of scRNA-seq and scATAC-seq, but from the figures, it seems like separate analyses were performed. It's necessary to have data showing the integrated analysis.
Thank you for your comments. We have added an integrated analysis of scRNA-seq and scATAC-seq. The results are shown in Figure S2.
Figure 2 does not seem to well cover the diversity of germ cell subtypes. The main content appears to be about the differentiation process, and it seems more focused on SSCs (stem cell types), but the intended message is not clearly conveyed.
Thank you for your comments. Figure S1 revealed the diversity of germ cell subtypes. The second part of the results described the integrated findings from Figures 2 and S1.
- In Figure 2B, pseudotime could be shown, and I wonder if the pseudotime in this analysis shows a similar pattern as in Figure 2D.
Thank you for your comments. Figure 2B shows the pseudotime analysis of the 12 germ cell subpopulations. Figure 2D shows the RNA velocity of the 12 germ cell subpopulations. Both methods are used for cell trajectory analysis. The pseudotime in Figure 2B showed a similar pattern to that in Figure 2D.
- While staining occurs within one tissue, saying they are co-expressed seems inaccurate as the staining locations are clearly distinct. For example, the staining patterns of A2M and DDX4 (a classical marker) are quite different, so it's hard to claim A2M as a new potential marker just because it's expressed. Also, TSSK6 was separately described as having a similar expression pattern to DDX4, but from the IF results, it doesn't seem similar.
Thank you for your comments. We have revised the Figure.
- It was described that A2M (expressed in SSC0-1), and ASB9 (expressed in SSC2) have open promoter sites in SSC0, SSC2, and Diffing_SPG, but it doesn't seem like they are only open in the promoters of those cell types. For example, there doesn't seem to be a peak in Diffing for either gene. The promoter region of the tracks is not very clear, so overall figure modification seems necessary.
Thank you for your comments. We have revised the Figure.
- The ATAC signal scale for each genomic region should be included, and clear markings for the TSS location and direction of the genes are needed.
Thank you for your comments. We have revised the figure, which is shown in the revised manuscript.
Figure 3A mostly shows the SSC2 in the G2/M phase, so it seems questionable to call SSC0/1 quiescent. Also, I wonder if the expression of EOMES and GFRA1 is well distinguished in the SSC subtypes as expected.
Thank you for your comments. We will validate in subsequent experiments whether the expression of EOMES and GFRA1 is clearly distinguished in the SSC subtypes.
- In Figure 3C, it would be good to have labels indicating what the x and y axes represent. The figure seems complex, and the description does not seem to fully support it.
Thank you for your comments. We have added labels indicating what the x and y axes represent in the Figure 3C. The x and y axes represent spliced and unspliced mRNA ratios, respectively.
- While TFs are the central focus, it's disappointing that scATAC-seq was not used.
Thank you for your comments. TFs analysis using scATAC-seq will be carried out in the future.
Figure 4: It would be good to have a more detailed discussion of the differences between subtypes, such as through GO analysis. The track images need modification like marking the peaks of interest and focusing more on the promoter region, similar to the previous figures.
Thank you for your comments. GO analysis results were put in Figure S5. The description is as follows:
As shown in Figure S5, SC1 were mainly involved in cell differentiation, cell adhesion and cell communication; SC2 were involved in cell migration, and cell adhesion; SC3 were involved in spermatogenesis, and meiotic cell cycle; SC4 were involved in meiotic cell cycle, and positive regulation of stem cell proliferation; SC5 were involved in cell cycle, and cell division; SC6 were involved in obsolete oxidation−reduction process, and glutathione derivative biosynthetic process; SC7 were involved in viral transcription and translational initiation; SC8 were involved in spermatogenesis and sperm capacitation.
In Figure 5, it would be good to have criteria for the novel Sertoli cell subtype presented. CCDC62 is presented as a representative marker for the SC8 cluster, but from Figure 4C, it seems to be quite expressed in the SC3 cluster as well. Therefore, in Figure 5E's protein-level check, it's unclear if this truly represents a novel SC8 subtype.
Thank you for your comments. CCDC62 expression was higher in the SC8 cluster than in SC3. Since some molecular markers were not commercially available, CCDC62 was selected as the SC8 marker for immunofluorescence verification. Immunofluorescence results showed that CCDC62 is a novel SC8 marker.
- It might have been more meaningful to use SOX9 as a control and show that markers in the same subtype are expressed in the same location.
Thank you for your comments. To determine PRAP1, BST2, and CCDC62 as new markers for the SC subtype, we co-stained them with SOX9 (a well-known SC marker).
- Figures 4 and 5 could potentially be combined into one figure.
Thank you for your comments. Since combining Figures 4 and 5 into a single image would cause the image to be unclear, two images are used to show it.
In Figure 6, it would be good to support the results with more NOA patient data.
Thank you for your comments. Patient clinical and laboratory characteristics have been presented in Table 1.
- Rather than claiming the importance of SC3 based on 3 single-cell patient data, it would be better to validate using public data with SC3 signature genes (e.g., showing the correlation between germ cell and SC3 ratios).
Thank you for your comments. I'm sorry I didn't find public data with SC3 signature genes. In the future, we will verify the importance of SC3 through in vivo and in vitro experiments.
- 462: It seems to be referring to Figure 6G, not 6D.
Thank you for your comments. We have revised it. The description is as follows: As shown in Figure 6G, State 1 SC3/4/5 tended to be associated with PreLep, SSC0/1/2, and Diffing and Diffed-SPG sperm cells (R > 0.72).
In Figure 7, the spermatogenesis process is basically well-known, so it would be better to emphasize what novel content is being conveyed here. Additionally, emphasizing the importance of SC3 in the overall process based on GO results leaves room for a better approach.
Thank you for your valuable suggestions. Regarding Figure 7, we recognize that the spermatogenesis process is well-known, and we will focus on highlighting the novel content, particularly the role and significance of the SC3 subtype in spermatogenesis disorders. As for the importance of SC3 in the overall process based on GO results, we have validated this in Figure 8 through co-staining experiments between Sertoli cells and spermatogenic cells in OA and NOA groups. The results demonstrate a significant correlation between the number of SC3-positive cells and SPT3 spermatogenic cells, particularly in the NOA5-P8 group, where both SC3 and SPT3 cell counts are notably lower than in the NOA4-P7 group. This further supports the critical role of SC3 in the spermatogenesis process. Your suggestions have prompted us to refine our data presentation and more clearly emphasize the novel aspects of our research. We will continue to strive to ensure that every part of our research contributes meaningfully to the academic community. Thank you again for your guidance.
In Figure 8, only the contents of the IF-stained proteins are listed, which seems slightly insufficient to constitute a subsection on its own. It might have been better to conclude by emphasizing some subtypes.
Thank you for your comments. We have combined this part of the results with other results into one section. The description is as follows:
“Co-localization of subpopulations of Sertoli cells and germ cells
To determine the interaction between Sertoli cells and spermatogenesis, we applied CellPhoneDB to infer cellular interactions according to a ligand-receptor signalling database. As shown in Figure 6G, compared with other cell types, germ cells mainly interacted with Sertoli cells. We further performed Spearman correlation analysis to determine the relationship between Sertoli cells and germ cells. As shown in Figure 6H, State 1 SC3/4/5 tended to be associated with PreLep, SSC0/1/2, and Diffing and Diffed-SPG sperm cells (R > 0.72). Interestingly, SC3 was significantly positively correlated with all sperm subpopulations (R > 0.5), suggesting an important role for SC3 in spermatogenesis and that SC3 is involved in the entire process of spermatogenesis. Subsequently, to understand whether the functions of germ cells and Sertoli cells correspond to each other, GO term enrichment analysis of germ cells and Sertoli cells was carried out (Figure S3, S4). We found that the functions could be divided into 8 categories, namely, material energy metabolism, cell cycle activity, the final stage of sperm cell formation, chemical reaction, signal communication, cell adhesion and migration, stem cells and sex differentiation activity, and stress reaction. These different events were labeled with different colors in order to quickly capture the important events occurring in the cells at each stage.
As shown in Figure S3, we discovered that SSC0/1/2 was involved in SRP-dependent cotranslational protein targeting to membrane, and cytoplasmic translation; Diffing SPG was involved in cell division and cell cycle; Diffed SPG was involved in cell cycle and RNA splicing; Pre-Leptotene was involved in cell cycle and meiotic cell cycle; Leptotene_Zygotene was involved in cell cycle and meiotic cell cycle; Pachytene was involved in cilium assembly and spermatogenesis; Diplotene was involved in spermatogenesis and cilium assembly; SPT1 was involved in cilium assembly and flagellated sperm motility; SPT2 was involved in spermatid development and flagellated sperm motility; SPT3 was involved in spermatid development and spermatogenesis. As shown in Figure S4, SC1 were mainly involved in cell differentiation, cell adhesion and cell communication; SC2 were involved in cell migration, and cell adhesion; SC3 were involved in spermatogenesis, and meiotic cell cycle; SC4 were involved in meiotic cell cycle, and positive regulation of stem cell proliferation; SC5 were involved in cell cycle, and cell division; SC6 were involved in obsolete oxidation−reduction process, and glutathione derivative biosynthetic process; SC7 were involved in viral transcription and translational initiation; SC8 were involved in spermatogenesis and sperm capacitation. The above analysis indicated that the functions of 8 Sertoli cell subtypes and 12 germ cell subtypes were closely related.
To further verify that Sertoli cell subtypes have "stage specificity" for each stage of sperm development, we first performed HE staining using testicular tissues from OA3-P6, NOA4-P7 and NOA5-P8 samples. The results showed that the OA3-P6 group showed some sperm, with reduced spermatogenesis, thickened basement membranes, and a high number of Sertoli cells without spermatogenic cells. The NOA4-P7 group had no sperm initially, but a few malformed sperm were observed after sampling, leading to the removal of affected seminiferous tubules. The NOA5-P8 group showed no sperm in situ (Figure 7A). Immunofluorescence staining in Figure 7B was performed using these tissues for validation. ASB9 (SSC2) was primarily expressed in a wreath-like pattern around the basement membrane of testicular tissue, particularly in the OA group, while ASB9 was barely detectable in the NOA group. SOX2 (SC2) was scattered around SSC2 (ASB9), with nuclear staining, while TF (SC1) expression was not prominent. In NOA patients, SPATS1 (SC3) expression was significantly reduced. C9orf57 (Pa) showed nuclear expression in testicular tissues, primarily extending along the basement membrane toward the spermatogenic center, and was positioned closer to the center than DDX4, suggesting its involvement in germ cell development or differentiation. BEND4, identified as a marker for SC5, showed a developmental trajectory from the basement membrane toward the spermatogenic center. ST3GAL4 was expressed in the nucleus, forming a circular pattern around the basement membrane, similar to A2M (SSC1), though A2M was more concentrated around the outer edge of the basement membrane, creating a more distinct wreath-like arrangement. In cases of impaired spermatogenesis, this arrangement becomes disorganized and loses its original structure. SMCP (SC6) was concentrated in the midpiece region of the bright blue sperm cell tail.
In the OA group, SSC1 (A2M) was sparsely arranged in a rosette pattern around the basement membrane, but in the NOA group, it appeared more scattered. SSC2 (ASB9) expression was not prominent. BST2 (SC7) was a transmembrane protein primarily localized on the cell membrane. In the OA group, A2M (SSC1) was distinctly arranged in a wreath-like pattern around the basement membrane, with expression levels significantly higher than ASB9 (SSC2). TSSK6 (SPT3) was primarily expressed in OA3-P6, while CCDC62 (SC8) was more abundantly expressed in NOA4-P7, with ASB9 (SSC2) showing minimal expression. Taken together, germ cells of a particular stage tended to co-localize with Sertoli cells of the corresponding stages. Germ cells and Sertoli cells at each differentiation stage were functionally heterogeneous and stage-specific (Figure 8). This suggests that each stage of sperm development requires the assistance of Sertoli cells to complete the corresponding stage of sperm development.”
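The Spearman correlation between Sertoli and germ cell subtype abundances mentioned in the quoted text can be illustrated with a minimal sketch. It assumes the inputs are per-sample subtype proportions; the function names and the numbers are invented for illustration and are not data from the study, and the authors' actual implementation is not described in this response.

```python
from math import sqrt
from statistics import mean


def ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-sample proportions (five samples), NOT values from the study.
sc3_fraction = [0.30, 0.27, 0.18, 0.09, 0.02]
spt3_fraction = [0.22, 0.20, 0.11, 0.04, 0.01]
r = spearman(sc3_fraction, spt3_fraction)
```

With only a handful of samples, a high coefficient of this kind is easy to obtain by chance, which is why the reviewers ask for validation on additional data.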
Reviewer #3 (Recommendations For The Authors):
The authors revealed 11 germ cell subtypes and 8 Sertoli cell subtypes through single-cell analysis of two OA patients and three NOA patients, and found that the Sertoli cell SC3 subtype (marked by SPATS1) plays an important role in spermatogenesis. It also suggests that Notch1/2/3 signaling and integrins are involved in germ cell-Sertoli cell interactions. This is an interesting and useful article that at least gives us a comprehensive understanding of human spermatogenesis. It provides a powerful tool for further research on NOA. However, there are still some issues and questions that need to be addressed.
(1) How to collect testicular tissue, please explain in detail. Extract which part of testicular tissue. It's better to make a schematic diagram.
Thank you for your comments. The process is as follows: Testicular tissues were obtained separately from two OA patients (OA1-P1 and OA2-P2) and three NOA patients (NOA1-P3, NOA2-P4, NOA3-P5) using microdissection testicular sperm extraction.
(2) Whether the tissues of these patients are extracted simultaneously or separately, separated into single cells, and stored, and then single cell analysis is performed simultaneously. Please be specific.
Thank you for your comments. The testicular tissues of these patients were extracted separately, then separated into single cells, and single cell analysis was performed simultaneously.
(3) When performing single-cell analysis, cells from two OA patients were analyzed individually or combined. The same problem occurred in the cells of three NOA patients.
Thank you for your comments. Cells from two OA patients and three NOA patients were analyzed individually.
(4) Can you specifically point out the histological differences between OA and NOA in Figure 1A? This makes it easier for readers to understand the structure change between OA and NOA. Please also label representative supporting cells.
Thank you for your comments. We have revised the description, which is shown in the revised manuscript.
(5) The authors demonstrate that "We speculate that this lack of differentiation may be due to the intense morphological changes occurring in the sperm cells during this period, resulting in relatively minor differences in gene expression." Please provide some verification of this hypothesis? For example, use immunofluorescence staining to observe morphological changes in sperm cells.
Thank you for your comments. Due to limited funds, we will verify this hypothesis in future studies.
(6) The authors demonstrate that " As shown in Figure 5E, we discovered that PRAP1, BST2, and CCDC62 were co-expressed with SOX9 in testes tissues." The staining in Figure 5D is unclear, and it is difficult to explain that SOX9 is co-expressed with PRAP1 BST2 CCDC62 based on the current staining results. The staining patterns of SOX9 (green) and SOX9 (red) are also different. (SOX9 (red) appears as dots, while the background for SOX9 (green) is too dark to tell whether its staining is also in the form of dots.) In summary, increasing the clarity of the staining makes it more convincing. Alternatively, use high magnification to display these results.
Thank you for your comments. I have re-stained the samples and updated this part of the immunofluorescence staining results. Please refer to the files named Figure 1, Figure 2, Figure 5, and Figure 8.
(7) In Figure 8, the author emphasized the co-localization of Sertoli cells and Germ cells at corresponding stages and did a lot of staining, but it was difficult to distinguish the specific locations of co-localization, which was similar to Figure 5E. If possible, please mark specific colocalizations with arrows or use high magnification to display these results, in order to facilitate readers to better understand.
Thank you for your comments. We have re-stained and updated this part of the data. Please refer to the immunofluorescence staining data in the updated Figure 8.
(8) The authors emphasize that macrophages may play an important role in spermatogenesis. Therefore, adding relevant macrophage staining to observe the differences in macrophage expression between NOA and OA should better support this idea.
Thank you for your comments. Macrophage-related experiments will be further explored in the future.
(9) Notch1/2/3 signaling and integrin were discovered to be involved in germ cell-Sertoli cell interaction. However there are currently no concrete experiments to support this hypothesis. At least simple verification experiments are needed.
Thank you for your comments. Due to limited funding, studies will be carried out in the future.
(10) Data availability statements should not be limited to the corresponding author, especially for big data analysis. This is crucial to the credibility of this data (Have the scRNA-seq and scATAC-seq in this study been deposited in GEO or other databases, and when will they be released to the public?) The data for such big data analysis needs to be saved in GEO or other databases in advance so that more research can use it.
Thank you for your comments. We have deposited scRNA-seq and scATAC-seq data in NCBI. “ScRNA-seq data have been deposited in the NCBI Gene Expression Omnibus with the accession number GSE202647, and scATAC-seq data have been deposited in the NCBI database with the accession number PRJNA1177103.”
Author response:
The following is the authors’ response to the original reviews.
Joint Public Review:
Strengths:
The paper is solidly based on the ability of the authors to master molecular simulations of highly complex systems. In my opinion, this paper shows no major weaknesses. The simulations are carried out in a technically sound way. Comparative analyses of different systems provide valuable insights, even within the well-known limitations of MD. Plus, the authors further investigate why xCas9 exhibits improved recognition of the TGG PAM sequence compared to SpCas9 via well-tempered metadynamics simulations focusing on the binding of R1335 to the G3 nucleobase and the DNA backbone in both SpCas9 and xCas9. In this context, the authors provide a free-energy profiling that helps support their final model.
The implementation of FEP calculations to mimic directed evolution improvement of DNA binding is also interesting, original and well-conducted.
We thank the reviewer for their positive evaluation of our computational strategy. To further substantiate our findings, we have incorporated additional molecular dynamics and Free Energy Perturbation (FEP) calculations for the system bound to GAT. These results corroborate our previous observations obtained with AAG, reinforcing our conclusions.
Overall, my assessment of this paper is that it represents a strong manuscript, competently designed and conducted, and highly valuable from a technical point of view.
Weaknesses:
To make their impact even more general, the authors may consider expanding their discussion on entropic binding to other cases that have been presented in the literature recently (such as e.g. the identification of small molecules for Abeta peptides, or the identification of "fuzzy" mechanisms of binding to protein HMGB1). The point on flexibility helping adaptability and expansion of functional properties is important, and should probably be given more evidence and more direct links with a wider picture.
We have expanded our discussion on the role of entropy in favoring TGG binding to xCas9. To this end, we performed entropy calculations using the Quasi-Harmonic approximation (details provided in the Materials and Methods section). This analysis reveals that R1335 in xCas9 experiences an entropy increase compared to SpCas9, enhancing its adaptability and interaction with the DNA. This analysis and its explanation are detailed on pages 8-9.
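For readers unfamiliar with the quasi-harmonic approximation, the following minimal sketch shows the standard textbook form of the calculation: each eigenvalue of the mass-weighted coordinate covariance matrix is mapped to an effective frequency, and the entropy of the corresponding quantum harmonic oscillator is summed over modes. It assumes the eigenvalues are already available in SI units (kg·m²); the function names and the example eigenvalues are hypothetical, and this is not necessarily the exact protocol the authors used.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
HBAR = 1.054571817e-34  # reduced Planck constant, J*s


def mode_entropy(eigenvalue, temperature=300.0):
    """Entropy (J/K) of one quasi-harmonic mode.

    eigenvalue: eigenvalue of the mass-weighted covariance matrix, in kg*m^2.
    """
    omega = math.sqrt(K_B * temperature / eigenvalue)  # effective frequency
    x = HBAR * omega / (K_B * temperature)
    # x / (e^x - 1) - ln(1 - e^-x), written with expm1 for numerical stability
    return K_B * (x / math.expm1(x) - math.log(-math.expm1(-x)))


def quasi_harmonic_entropy(eigenvalues, temperature=300.0):
    """Sum the mode entropies over all positive eigenvalues."""
    return sum(mode_entropy(ev, temperature) for ev in eigenvalues if ev > 0)


# Hypothetical eigenvalues: the "flexible" set has larger positional fluctuations.
rigid_modes = [1e-47, 2e-47]
flexible_modes = [4e-47, 8e-47]
```

Larger fluctuations give larger eigenvalues, lower effective frequencies, and therefore higher entropy, which is the qualitative link between side-chain flexibility and entropy that the response relies on.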
Additionally, we have enriched the Discussion section by clarifying how DNA binding is entropically favored in xCas9, thereby facilitating the recognition of alternative PAM sequences. A refined explanation is also included in the Conclusions section, where we contextualize xCas9 within a broader evolutionary framework of protein-DNA recognition. This highlights how structural flexibility can enable sequence diversity while maintaining high specificity.
Recommendations for the authors:
Overall, this is a very interesting and elegant manuscript with compelling results that shed light on the atomistic determinants of genetic-editing technologies.
Since the paper proposes new findings that may be helpful for experimentalists, it would be interesting if the authors point out (in their discussion/conclusions) specific amino acids to mutate/target for future tests by the experimental community. This should just appear as an open hypothesis/proposal for new experiments.
In the Conclusions, we have incorporated a discussion on how modifications in the PAM-binding cleft can enhance the recognition of alternative PAM sequences. As an illustrative example, we reference the recently developed SpRY Cas9 variant, which is capable of recognizing a broader range of PAMs. This variant includes mutations within the PAM-binding cleft that likely increase the flexibility of the interacting residues, as suggested by recent cryo-EM structures (Hibshman et al. Nat. Commun. 2024). The importance of fine-tuning the flexibility of the PAM-interacting cleft for engineering strategies has also been highlighted in the abstract.
Overall, in light of the reviewer’s comments and in consideration of our findings, we revised the manuscript title to: “Flexibility in PAM Recognition Expands DNA Targeting in xCas9.” This new title better highlights the key findings from our research and contextualizes them within the broader goal of expanding DNA targeting capabilities, a critical priority for developing enhanced CRISPR-Cas systems.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
This study by Wu et al. provides valuable computational insights into PROTAC-related protein complexes, focusing on linker roles, protein-protein interaction stability, and lysine residue accessibility. The findings are significant for PROTAC development in cancer treatment, particularly breast and prostate cancers.
The authors' claims about the role of PROTAC linkers and protein-protein interaction stability are generally supported by their computational data. However, the conclusions regarding lysine accessibility could be strengthened with more in-depth analysis. The use of the term "protein functional dynamics" is not fully justified by the presented work, which focuses primarily on structural dynamics rather than functional aspects.
Strengths:
(1) Comprehensive computational analysis of PROTAC-related protein complexes.
(2) Focus on critical aspects: linker role, protein-protein interaction stability, and lysine accessibility.
Weaknesses:
(1) Limited examination of lysine accessibility despite its stated importance.
(2) Use of RMSD as the primary metric for conformational assessment, which may overlook important local structural changes.
Reviewer #1 (Recommendations for the authors):
(1) The authors' claims about the role of PROTAC linkers and protein-protein interaction stability are generally supported by their computational data. However, the conclusions regarding lysine accessibility could be strengthened with more in-depth analysis. Expand the analysis of lysine accessibility, potentially correlating it with other structural features such as linker length.
We thank the reviewer for the suggestions! We performed a time-dependent correlation analysis to correlate the dihedral angles of the PROTACs with the Lys-Gly distance (Figures 6 and S17). We included a detailed explanation on page 16:
“To further examine the correlation between PROTAC rotation and the Lys-Gly interaction, we performed a time-dependent correlation analysis. This analysis showed that PROTAC rotation translates motion over time, leading to the Lys-Gly interaction, with a correlation peak around 60-85 ns, marking the time of the interaction (Figure 6 and Figure S17). In addition, the pseudo dihedral angles also showed a high correlation (0.85 in the case of dBET1) with Lys-Gly distance. This indicated that degradation complex undergoes structural rearrangement and drives the Lys-Gly interaction.”
(2) The use of the term "protein functional dynamics" is not fully justified by the presented work, which focuses primarily on structural dynamics rather than functional aspects. Consider changing "protein functional dynamics" to "protein dynamics" to more accurately reflect the scope of the study.
We thank the reviewer for suggesting more accurate terminology! We agree that keeping “protein functional dynamics” in the title would require us to focus on how the overall protein dynamics link to function. The function is directly related to PROTAC-induced structural dynamics, as commonly seen in protein structure-function relationships, but it is not our main focus. Therefore, we replaced “functional” with “structural” in the title.
(3) Incorporate more local and specific characterization methods in addition to RMSD for a more comprehensive conformational assessment.
We thank the reviewer for the suggestion. We performed a time-dependent correlation analysis to understand how the rotation of PROTACs can translate to the Lys-Gly interaction. In addition, we performed a dihedral entropy analysis for each dihedral angle in the linker of the PROTACs to better examine the flexibility of each PROTAC.
We included a detailed explanation on page 18: “Our dihedral entropies analysis showed that dBET57 has ~0.3 kcal/mol lower entropies than the other three linkers, suggesting dBET57 is less flexible than other PROTACs (Figure S18).”
Reviewer #2 (Public review):
Summary:
The manuscript reports the computational study of the dynamics of PROTAC-induced degradation complexes. The research investigates how different linkers within PROTACs affect the formation and stability of ternary complexes between the target protein BRD4BD1 and Cereblon E3 ligase, and the degradation machinery. Using computational modeling, docking, and molecular dynamics simulations, the study demonstrates that although all PROTACs form ternary complexes, the linkers significantly influence the dynamics and efficacy of protein degradation. The findings highlight that the flexibility and positioning of Lys residues are crucial for successful ubiquitination. The results also discussed the correlated motions between the PROTAC linker and the complex.
Strengths:
The field of PROTAC discovery and design, characterized by its limited research, distinguishes itself from traditional binary ligand-protein interactions by forming a ternary complex involving two proteins. The current understanding of how the structure of PROTAC influences its degradation efficacy remains insufficient. This study investigated the atomic-level dynamics of the degradation complex, offering potentially valuable insights for future research into PROTAC degradability.
Reviewer #2 (Recommendations for the authors):
(1) Regarding the modeling of the ternary complex, the BRD4 structure (3MXF) is from humans, whereas the CRBN structure in 4CI3 is derived from Gallus gallus. Is there a specific reason for not using structures from the same species, especially considering that human CRBN structures are available in the Protein Data Bank (e.g., 8OIZ, 4TZ4)?
We appreciate the reviewer’s insightful comment regarding our choice of BRD4 and CRBN crystal structures from two different species. Our initial selection of 4CI3 for the CRBN structure was based on its high resolution and its publication in Nature. Furthermore, the Gallus gallus CRBN structure shares a high degree of sequence and structural similarity with Homo sapiens CRBN, especially in the ligand-binding region. At the time of our study, we were aware of 4TZ4 as a Homo sapiens CRBN structure; however, we did not use it because no publication or detailed experimental description was associated with it. Additionally, PDB 8OIZ was not yet publicly available at the time.
(2) Based on the crystal structure (PDB ID: 6BNB) discussed in Reference 6, the ternary complex of dBET57 exhibits a conformation distinct from other PROTACs, with CRBN adopting an "open" conformation. Using the same CRBN structure for dBET57 as for other PROTACs might result in inaccurate docking outcomes.
Thank you for the reviewer’s comment! As noted by the authors in Reference 6, the observed open conformation of CRBN in the dBET57 ternary complex may result from the high salt crystallization conditions, which could drive structural rearrangement, and crystal contacts that may induce this conformation. The authors also mentioned that this open conformation could, in part, reflect CRBN’s intrinsic plasticity. However, they acknowledged that further studies are needed to determine whether this conformational flexibility is a characteristic feature of CRBN that enables it to accommodate a variety of substrates. Despite these observations, we believe that the compatibility of the observed BRD4<sup>BD1</sup> binding conformation with both open and closed CRBN states suggests that these conformational changes are all possible. Therefore, we believe using the same initial CRBN structure for dBET57 as for other PROTACs can still reasonably reveal the dynamic nature of the ternary complex and would not significantly affect the accuracy of our docking outcomes either.
(3) Figure 2 displays only a single frame from the simulations, which might not provide a comprehensive representation. Could a contact frequency heatmap of PROTAC with the proteins be included to offer a more detailed view?
We thank the reviewer for the suggestion! We performed a contact map analysis to observe the average distance between the PROTACs and BRD4<sup>BD1</sup> over 400 ns of MD simulation (new Figure S4 added).
We included a detailed explanation on pages 8 and 9: “The residues contact map throughout the 400ns MD simulation also showed different pattern of protein-protein interactions, indicating that the linkers were able to adopt different conformations (Figure S4).”
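For readers who want to see what such a contact analysis involves, the following is a minimal sketch and not the authors' actual pipeline. It assumes the trajectory has already been loaded into NumPy arrays; the function name `contact_statistics`, the array shapes, and the 4.5 Å cutoff are illustrative assumptions. For each residue it reports the time-averaged minimum distance to the PROTAC and the fraction of frames in which the residue is in contact.

```python
import numpy as np

def contact_statistics(ligand_xyz, residue_xyz, cutoff=4.5):
    """Per-residue mean distance and contact frequency over a trajectory.

    ligand_xyz  : (n_frames, n_ligand_atoms, 3) PROTAC coordinates in angstrom
    residue_xyz : (n_frames, n_residues, 3) one representative point per residue
    cutoff      : contact threshold in angstrom (hypothetical value)
    """
    ligand_xyz = np.asarray(ligand_xyz, dtype=float)
    residue_xyz = np.asarray(residue_xyz, dtype=float)

    # Pairwise distances, shape (n_frames, n_residues, n_ligand_atoms)
    diff = residue_xyz[:, :, None, :] - ligand_xyz[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # Closest ligand atom to each residue in each frame
    min_dist = dist.min(axis=2)

    mean_distance = min_dist.mean(axis=0)                   # average over frames
    contact_frequency = (min_dist < cutoff).mean(axis=0)    # fraction of frames in contact
    return mean_distance, contact_frequency
```

Plotting `contact_frequency` (or `mean_distance`) per residue for each PROTAC gives the kind of heatmap the reviewer asked for, and it summarizes the whole trajectory rather than a single frame.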
(4) The conclusions in Figure 3 and S11 are based on a single 400 ns trajectory. The reproducibility of these results is therefore uncertain.
We thank the reviewer for the suggestion! We added one more MD simulation with a different random seed for each PROTAC to ensure the reproducibility of the results. The results are shown in Figure S21, and the details for each MD run are updated in Table 1.
(5) Figure 4 indicates significant differences between the first and last 100 ns of the simulations. Does this suggest that the simulations have not converged? If so, how can the statistical analysis presented in this paper be considered reliable?
We thank the reviewer for the question. The simulation was initiated with a 10-15 Å gap between BRD4 and Ub to monitor the movement of the degradation machinery and the Lys-Gly interaction. The significant changes in the pseudo dihedral angles in Figure 4 show that the large-scale movement of the degradation complex can initiate Lys-Gly binding. They do not reflect unstable sampling, because the system remains very stable once BRD4 comes close to Ub.
(6) In Figure 5, the dihedral angle of dBET57_#9MD1 is marked on a peptide bond. Shouldn't this angle have a high energy barrier for rotation?
We thank the reviewer for catching the error! Indeed, the dihedral angles were mistakenly marked on the peptide bond. We reworked the figure and double-checked our dihedral correlation analysis. The updated dihedral angle selection and the corresponding correlation coefficients are shown in Figure 5.
(7) Given that crystal structures for dBET 70, 23, and 57 are available, why is there a need to model the complex using protein-protein docking?
We thank the reviewer for the feedback. Only dBET23 has a crystal structure of the complete ternary complex, containing the PROTAC and both proteins, whereas complete ternary complex structures are not available for dBET1, dBET57 and dBET70. Although dBET70 has a crystal structure, the conformation of its PROTAC is not resolved, and thus we decided to still perform protein-protein docking with dBET70.
We included the explanation on page 8: “Only dBET23 crystal structure is available with the PROTAC and both proteins, while the experimentally determined ternary complexes of dBET1, dBET57 and dBET70 are not available.”
(8) On page 9, it is mentioned that "only one of the 12 PDB files had CRBN bound to DDB1 (PDB ID 4TZ4)." However, there are numerous structures of the DDB1-CRBN complex available, including those used for docking like 4CI3, as well as 4CI1, 4CI2, 8OIZ, etc.
We thank the reviewers for the comment! We acknowledged the existence of several DDB1-CRBN complex crystal structures, such as PDB IDs 4CI1, 4CI2, 4CI3, and the more recent 8OIZ. For our study, we chose to use 4TZ4 to maintain consistency in complex construction and to align with the methodology established in a previously published JBC paper (https://doi.org/10.1016/j.jbc.2022.101653), which successfully utilized the same structure for a similar construct. At the time our study was conducted, the 8OIZ structure had not yet been released. We appreciate your suggestion and will consider incorporating alternative structures in future studies to further investigate our findings.
(9) Table 2 is first referenced on page 8, while Table 1 is mentioned first on page 10. The numbering of these tables should be reversed to reflect their order of appearance in the text.
We thank the reviewer for catching the error! We switched the order of Table 1 and Table 2.
Reviewer #3 (Public review):
The authors offer an interesting computational study on the dynamics of PROTAC-driven protein degradation. They employed a combination of protein-protein docking, structural alignment, atomistic MD simulations, and post-analysis to model a series of CRBN-dBET-BRD4 ternary complexes, as well as the entire degradation machinery complex. These degraders, with different linker properties, were all capable of forming stable ternary complexes but had been shown experimentally to exhibit different degradation capabilities. While in the initial models of the degradation machinery complex, no surface Lys residue(s) of BRD4 were exposed sufficiently for the crucial ubiquitination step, MD simulations illustrated protein functional dynamics of the entire complex and local side-chain arrangements to bring Lys residue(s) to the catalytic pocket of E2/Ub for reactions. Using these simulations, the authors were able to present a hypothesis as to how linker property affects degradation potency. They were able to roughly correlate the distance of Lys residues to the catalytic pocket of E2/Ub with observed DC50/5h values. This is an interesting and timely study that presents interesting tools that could be used to guide future PROTAC design or optimization.
Reviewer #3 (Recommendations for the authors):
(1) My most important comment refers to the MM/PBSA analysis, the results of which are shown in Figure S9: binding affinities of -40 to -50 kcal/mol are unrealistic. This would correspond to a dissociation constant of 10^-37 M. This analysis needs to be removed or corrected.
We thank the reviewer for the comment! MM/PBSA analysis indeed cannot give a realistic binding free energy. It does not include the configurational entropy loss, which should be a large positive value. In addition, while the implicit PBSA solvent model computes solvation free energy, the absolute values may not be very accurate. However, because this is a commonly used energy calculation, and some readers may like to see quantitative values to ensure that the systems have stable intermolecular attractions, we kept the analysis in the SI. We edited the figure legend, moved Figure S10 to SI page 19, and added sentences to clearly state that the calculations did not include the configurational entropy loss: “Note that the energy calculations focus on non-bonded intermolecular interactions and solvation free energy calculations using MM/PBSA, where the configuration entropy loss during protein binding was not explicitly included.”
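The reviewer's arithmetic can be checked with the standard relation ΔG° = RT ln(Kd / 1 M). The sketch below is only an illustration of that relation; the function names and the temperature of 298.15 K are assumptions, not values taken from the manuscript.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def log10_kd(dg_kcal_per_mol, temperature_k=298.15):
    """log10 of the dissociation constant (in M, 1 M standard state)
    implied by a binding free energy, using dG = RT ln(Kd)."""
    return dg_kcal_per_mol / (R_KCAL * temperature_k * math.log(10))

def kd_molar(dg_kcal_per_mol, temperature_k=298.15):
    """Dissociation constant in M for a given binding free energy."""
    return 10.0 ** log10_kd(dg_kcal_per_mol, temperature_k)

for dg in (-40.0, -50.0):
    print(f"dG = {dg:6.1f} kcal/mol -> log10(Kd/M) = {log10_kd(dg):6.2f}")
```

At room temperature each 1.36 kcal/mol corresponds to roughly one order of magnitude in Kd, so -50 kcal/mol implies a Kd on the order of 1e-37 M. This is why MM/PBSA energies without the entropic term are better read as relative interaction energies than as absolute affinities.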
(2) I think that the analysis of what in the different dBETx makes them cause different degradation potency is underdeveloped. The dihedral angle analysis (Figure 4B) did not explain the observed behavior in my opinion. Please add additional, clearer analysis as to what structural differences in the dBETx make them sample very different conformations.
We thank the reviewer for the suggestions! Following the suggestion, we performed a dihedral entropy analysis for each dihedral angle in the linker region of each PROTAC to examine its flexibility. Because each PROTAC has a different linker, we now clearly label them in a new Figure S18 on SI page 27. Low dihedral entropies indicate a more rigid linker, which makes it harder for a PROTAC to rearrange and facilitate the protein structural dynamics necessary for ubiquitination.
We added a detailed explanation on page 18: “Our dihedral entropy analysis showed that dBET57 has ~0.3 kcal/mol lower configuration entropies than the other dBETs with three different linkers, suggesting that dBET57 is less flexible than the other PROTACs (Figure S18).”
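One common way to estimate the entropy of a single dihedral angle is to histogram its values over the trajectory and apply the Gibbs/Shannon formula, S = -R Σ p_i ln p_i. The sketch below illustrates that idea only. The response does not state the estimator, bin width, or temperature the authors used, so the 5-degree bins (72 bins), 298.15 K, and the function name are all assumptions.

```python
import numpy as np

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def dihedral_entropy(angles_deg, n_bins=72, temperature_k=298.15):
    """Return T*S in kcal/mol for one dihedral angle time series.

    Uses a discrete histogram estimate, S = -R * sum(p * ln p).
    n_bins and temperature_k are illustrative choices, not the authors'.
    """
    # Wrap angles into [0, 360) so that -170 and 190 fall in the same bin
    angles = np.mod(np.asarray(angles_deg, dtype=float), 360.0)
    counts, _ = np.histogram(angles, bins=n_bins, range=(0.0, 360.0))
    p = counts / counts.sum()
    p = p[p > 0]                      # empty bins contribute nothing
    entropy = -R_KCAL * np.sum(p * np.log(p))
    return temperature_k * entropy
```

A dihedral locked in one rotamer gives a value near zero, while a freely rotating one approaches RT ln(n_bins). Because the absolute number depends on the bin width, only differences between linkers computed with the same settings are meaningful.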
(3) "The movement of the degradation machinery correlated with rotations of specific dihedrals of the linker region in dBETs (Figure 5).": this is not sufficiently clear from the figure. Definitely not in a quantitative way.
We thank the reviewer for the suggestions! To further understand the correlation between the PROTAC dihedral angles and the movement of the degradation machinery, we performed a time-dependent correlation analysis to correlate the dihedral angles of the PROTACs with the Lys-Gly distance (Figures 6 and S17).
We included a detailed explanation on page 16:
“To further examine the correlation between PROTAC rotation and the Lys-Gly interaction, we performed a time-dependent correlation analysis. This analysis showed that PROTAC rotation translates motion over time, leading to the Lys-Gly interaction, with a correlation peak around 60-85 ns, marking the time of the interaction (Figure 6 and Figure S17). In addition, the pseudo dihedral angles also showed a high correlation (0.85 in the case of dBET1) with Lys-Gly distance. This indicated that degradation complex undergoes structural rearrangement and drives the Lys-Gly interaction.”
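A time-dependent correlation of this kind can be illustrated with a time-lagged Pearson correlation between two time series, for example a linker dihedral angle and the Lys-Gly distance. The sketch below is one plausible reading of the analysis, not the authors' implementation; the function name and inputs are hypothetical. Because angles are periodic, a real analysis would typically use unwrapped angles or their sine and cosine rather than raw values.

```python
import numpy as np

def lagged_correlation(x, y, max_lag):
    """Pearson correlation between x(t) and y(t + lag) for lag = 0..max_lag.

    x, y    : 1-D time series sampled at the same frames
    max_lag : largest lag in frames (must leave at least 2 overlapping points)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.ndim != 1:
        raise ValueError("x and y must be 1-D arrays of equal length")
    if max_lag > len(x) - 2:
        raise ValueError("max_lag is too large for the series length")

    lags = np.arange(max_lag + 1)
    corr = np.empty(len(lags))
    for i, lag in enumerate(lags):
        a = x[: len(x) - lag]   # earlier signal (e.g. dihedral angle)
        b = y[lag:]             # later signal (e.g. Lys-Gly distance)
        corr[i] = np.corrcoef(a, b)[0, 1]
    return lags, corr
```

Multiplying the lag at which `corr` peaks by the frame spacing gives the delay between a linker rotation and the response in the distance, which is the kind of quantity a correlation peak at a given time reports.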
(4) Cartoons are needed at multiple stages throughout the paper to enhance the clarity of what the modeled complexes looked like (e.g. which subunits they contained).
We thank the reviewers for the suggestions. We added and remade several Figures with cartoons to better represent the stages. We also used higher resolution and included clearer labels for each protein system.
(5) The difference between CRL4A E3 ligase and CRBN E3 ligase is not clear to the non-expert reader.
We thank the reviewer for the comment! The terms "CRL4A E3 ligase" and "CRBN E3 ligase" refer to different levels of description of the protein complexes. To clarify them, we added a couple of sentences to the Figure 1 legend so that non-expert readers can clearly see the difference.
As illustrated in Figure 1,
- CRL4A E3 ligase refers to the full E3 ligase complex, which includes all protein components such as CRBN, DDB1, CUL4A, and RBX1.
- CRBN E3 ligase, on the other hand, is a more colloquial term typically used to describe just the CRBN protein, often in isolation from the full CRL4A complex.
(6) Figure 1, legend: unclear why it's E3 in A and E2 in B.
We thank the reviewer for the question! E3 ligase in Figure 1A refers to CRBN E3 ligase, where researchers also simply term it CRBN. We have added a sentence to specify that CRBN E3 ligase is also termed CRBN for simplicity. In Figure 1B, E2 was unclear in the sentences. The full name of E2 should be E2 ubiquitin-conjugating enzyme. Because the name is a bit long, researchers also call it E2 enzyme. We have corrected it and used E2 enzyme to make it clearer.
(7) "Although the protein-protein binding affinities were similar, other degraders such as dBET1 and dBET57 had a DC50/5h of about 500 nM". It's unclear what experimental data supports the assertion that the protein-protein binding affinities are similar.
We thank the reviewer for the question. Indeed, the statement is unclear.
We corrected the sentence on page 6: “Although utilizing the exact same warheads, other degraders such as dBET1 and dBET57 had a DC<sub>50/5h</sub> of about 500 nM.”
(8) Was the construction of the degradation machinery complex guided by experimental data (maybe cryo-EM or tomography)? If not, what is the accuracy of the starting complex for MD? This may impact the reliability of the obtained results.
Thank you for your insightful comments! Yes, the construction of the degradation machinery complex was guided by available high-resolution crystal structures, which were selected to maintain consistency and align with the methodology established in a previously published JBC paper (https://doi.org/10.1016/j.jbc.2022.101653).
We acknowledged that static crystal structures represent only a single snapshot of the system and may not capture the full conformational flexibility of the complex. To address this limitation, we performed MD simulations using multiple starting structures. This approach allowed us to explore a broader conformational landscape and reduced the dependence on any single starting configuration, thereby enhancing the reliability of the results.
We hope this clarifies the robustness of our methodology and the steps taken to ensure accuracy in our simulations.
(9) "With quantitative data, we revealed the mechanism underlying dBETx-induced degradation machinery": I think this may be too strong of an assertion. The authors may have developed a mechanistic hypothesis that can be tested experimentally in the future.
We thank the reviewer for the suggestion. This is indeed a strong assertion and needs to be modified. We edited the sentence on page 7: “With quantitative data, we revealed the importance of the structural dynamics of dBETx-induced motions, which arrange positions of surface lysine residues of BRD4<sup>BD1</sup> and the entire degradation machinery.”
(10) Figure S2: are the RMSDs calculated over all residues? Or just the BRD4 residues? Given that the structures are aligned with respect to CRBN, the reported RMSD numbers might be artificially low since there are many more CRBN residues than there are BRD4 residues. Also, why weren't the crystal structures used for dBET 23 and 70 for the modeling? Wouldn't you want to use the most accurate possible structures? Simulations were run for 23. Why not for 70?
We thank the reviewer for the suggestion. We added a sentence to more clearly explain the RMSD calculations in Figure S2: “The structural superposition is performed based on the backbone of CRBN and RMSD calculation is conducted based on the backbone of BRD4<sup>BD1</sup>.”
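The distinction between the atoms used for superposition and the atoms used for the RMSD can be made concrete with a small sketch. It uses the Kabsch algorithm to fit on one atom subset (standing in for the CRBN backbone) and then measures the RMSD on another (standing in for the BRD4<sup>BD1</sup> backbone). The function names and index arrays are hypothetical, and this is not the authors' code.

```python
import numpy as np

def kabsch(mobile, reference):
    """Rotation R and translation t minimizing |mobile @ R.T + t - reference|."""
    mob_c = mobile.mean(axis=0)
    ref_c = reference.mean(axis=0)
    h = (mobile - mob_c).T @ (reference - ref_c)      # 3x3 covariance matrix
    u, _, vt = np.linalg.svd(h)
    # Guard against an improper rotation (reflection)
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    trans = ref_c - rot @ mob_c
    return rot, trans

def rmsd_after_fit(model, reference, fit_idx, rmsd_idx):
    """Superpose on fit_idx atoms, then report RMSD over rmsd_idx atoms only."""
    model = np.asarray(model, dtype=float)
    reference = np.asarray(reference, dtype=float)
    rot, trans = kabsch(model[fit_idx], reference[fit_idx])
    moved = model @ rot.T + trans
    diff = moved[rmsd_idx] - reference[rmsd_idx]
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))
```

Computing the RMSD only over the second subset means the value reflects how differently that protein is placed relative to the aligned partner, and it is not diluted by the many well-superposed atoms used for the fit.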
Although dBET70 has a crystal structure, the structure of its PROTAC is not resolved, and thus we decided to still perform protein-protein docking with dBET70. dBET1 and dBET57 do not have crystal structures of their ternary complexes.
We included the explanation on page 8: “Only dBET23 crystal structure is available with the PROTACs and both proteins, while the experimentally determined ternary complexes of dBET1, PROTACs of dBET57 and dBET70 are not available.”
a. And there are no crystal structures available for 1 and 57? If so, please clearly say that. Otherwise please report the RMSD.
We thank the reviewer for the suggestion. We included the explanation on page 8: “Only dBET23 crystal structure is available with the PROTACs and both proteins, while the experimentally determined ternary complexes of dBET1, PROTACs of dBET57 and dBET70 are not available.”
(11) Table 2 is referenced before Table 1.
We thank the reviewer for catching the error! We switched the order for Table 1 and Table 2.
(12) Figure S3 is not referenced in the main paper.
We thank the reviewer for catching the error! We now refer to Figure S3 on page 7.
(13) Minor comments on grammar and sentence structure:
a. It should be "binding of a ternary complex"
b. "Our shows the importance": word missing.
c. "...providing insights into potential orientations for ubiquitination. observe whether the preferred conformations are pre-organized for ubiquitination." Word or words missing.
We thank the reviewer for catching the errors! We corrected grammatical errors and unclear sentences throughout the paper and revised the sentences to make them easily understandable for non-expert readers.
-
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife assessment
This work describes a convincingly validated non-invasive tool for in vivo metabolic phenotyping of aggressive brain tumors in mice brains. The analysis provides a valuable technique that tackles the unmet need for patient stratification and hence for early assessment of therapeutic efficacy. However, wider clinical applicability of the findings can be attained by expanding the work to include more diverse tumor models.
We thank the Editors for their comments. This concern was also raised by Reviewer 1 in the Public Review, where we address it in more detail – please refer to comment PR-R1.C1. In brief, we agree that a more clinically relevant model should provide more translatable results to patients, and acknowledge this better in the revised manuscript: page 18 (lines 14-17), “While patient-derived xenografts and de novo models would be more suited to recapitulate human GBM heterogeneity and infiltration features, and genetic manipulation of glycolysis and mitochondrial oxidation pathways potentially relevant to ascertain DGE-DMI sensitivity for their quantification, (…)”. However, we also believe that the potential of DGE-DMI for application to different glioblastoma models or patients is demonstrated clearly enough with the two immunocompetent models we chose, extensively reported in the literature as reliable models of glioblastoma.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
This work introduces a new imaging tool for profiling tumor microenvironments through glucose conversion kinetics. Using GL261 and CT2A intracranial mouse models, the authors demonstrated that tumor lactate turnover mimicked the glioblastoma phenotype, and differences in peritumoral glutamate-glutamine recycling correlated with tumor invasion capacity, aligning with histopathological characterization. This paper presents a novel method to image and quantify glucose metabolites, reducing background noise and improving the predictability of multiple tumor features. It is, therefore, a valuable tool for studying glioblastoma in mouse models and enhances the understanding of the metabolic heterogeneity of glioblastoma.
Strengths:
By combining novel spectroscopic imaging modalities and recent advances in noise attenuation, Simões et al. improve upon their previously published Dynamic Glucose-Enhanced deuterium metabolic imaging (DGE-DMI) method to resolve spatiotemporal glucose flux rates in two commonly used syngeneic GBM mouse models, CT2A and GL261. This method can be standardized and further enhanced by using tensor PCA for spectral denoising, which improves kinetic modeling performance. It enables the glioblastoma mouse model to be assessed and quantified with higher accuracy using imaging methods.
The study also demonstrated the potential of DGE-DMI by providing spectroscopic imaging of glucose metabolic fluxes in both the tumor and tumor border regions. By comparing these results with histopathological characterization, the authors showed that DGE-DMI could be a powerful tool for analyzing multiple aspects of mouse glioblastoma, such as cell density and proliferation, peritumoral infiltration, and distant migration.
Weaknesses:
(1) Although the paper provides clear evidence that DGE-DMI is a potentially powerful tool for the mouse glioblastoma model, it fails to use this new method to discover novel features of tumors. The data presented mainly confirm tumor features that have been previously reported. While this demonstrates that DGE-DMI is a reliable imaging tool in such circumstances, it also diminishes the novelty of the study.
PR-R1.C1 – We thank the Reviewer for the detailed analysis and reply below to each point. PR-R1.C1.1 - novelty: We thank the Reviewer for the comments and understand their perspective. While we acknowledge that our paper is more methodologically oriented, we also believe that significant methodological advances are critical for new discoveries. This was our main motivation and is demonstrated in the present work, showing the ability to map in vivo metabolic fluxes in mouse glioma, a “hot topic” and very desirable in the cancer field.
PR-R1.C1.2 – additional tumor features: To strengthen the biological relevance of this methodologic novelty, we have now included immune cell infiltration among the tumor features assessed, besides perfusion, histopathology, cellularity and cell proliferation. For this, we performed iba-1 immunostaining for microglia/macrophages, now included in Fig. 2-B. These new results demonstrate significantly higher microglia/macrophage infiltration in CT2A tumors compared to GL261, particularly at the tumor border. This is very consistent with the respective tumor phenotypes, namely differences in cell density and cellularity between the 2 cohorts and across pooled cohorts, as we now report: page 9 (lines 10-18), “Such phenotype differences were reflected in the regional infiltration of microglia/macrophages: significantly higher at the CT2A peritumoral rim (PT-Rim) compared to GL261, and slightly higher in the tumor region as well (Fig 2B). Further quantitative regional analysis of Tumor-to-PT-Rim ROI ratios revealed: (i) 47% lower cell density (p=0.004) and 32% higher cell proliferation (p=0.026) in GL261 compared to CT2A (Fig 2C, Table S3); and (ii) strong negative correlations in pooled cohorts between microglia/macrophage infiltration and cellularity (R=-0.91, p<0.001) or cell density (R=-0.77, p=0.016), suggesting more circumscribed tumor growth with higher peripheral/peritumoral infiltration of immune cells.”; and page 16 (lines 13-19), “GL261 tumors were examined earlier after induction than CT2A (17±0 vs. 30±5 days, p = 0.032), displaying similar volumes (57±6 vs. 60±14, p = 0.813) but increased vascular permeability (8.5±1.1 vs 4.3±0.5 10<sup>3</sup>/min: +98%, p=0.001), more disrupted stromal-vascular phenotypes and infiltrative growth (5/5 vs 0/5), consistent with significantly lower tumor cell density (4.9±0.2 vs. 8.2±0.3 10<sup>-3</sup> cells/µm<sup>2</sup>: -40%, p<0.001) and lower peritumoral rim infiltration of microglia/macrophages (2.1±0.7 vs. 10.0±2.3 %: -77%, p=0.008)”.
PR-R1.C1.3 – new tumor features and DGE-DMI: Importantly, such regional differences in cellularity/cell density and immune cell infiltration between the two cohorts were remarkably mirrored by the lactate turnover maps (Fig 3-C), as we now report in the manuscript: page 12 (lines 6-15), “GL261 tumors accumulated significantly less lactate in the core (1.60±0.25 vs 2.91±0.33 mM: -45%, p=0.013) and peritumor margin regions (0.94±0.09 vs 1.46±0.17 mM: 36%, p=0.025) than CT2A – Fig 3 A-B, Table S1. Consistently, tumor lactate accumulation correlated with tumor cellularity in pooled cohorts (R=0.74, p=0.014). Then, lower tumor lactate levels were associated with higher lactate elimination rate, k<sub>lac</sub> (0.11±0.1 vs 0.06±0.01 mM/min: +94%, p=0.006) – Fig 3B – which in turn correlated inversely with peritumoral rim infiltration of microglia/macrophages in pooled cohorts (R=-0.73, p=0.027) – Fig 3-C. Further analysis of Tumor/P-Margin metabolic ratios (Table S3) revealed: (i) +38% glucose (p=0.002) and -17% lactate (p=0.038) concentrations, and +55% higher lactate consumption rate (p=0.040) in the GL261 cohort; and (ii) lactate ratios across those regions reflected the respective cell density ratios in pooled cohorts (R=0.77, p=0.010) – Fig 3-C”. This is a novel, relevant feature compared to our previous work, as highlighted in our discussion: page 17 (lines 1-8), “Tumor vs peritumor border analyses further suggest that lactate metabolism reflects regional histologic differences:
lactate accumulation mirrors cell density gradients between and across the two cohorts; whereas lactate consumption/elimination rate coarsely reflects cohort differences in cell proliferation, and inversely correlates with peritumoral infiltration by microglia/macrophages across both cohorts. This is consistent with GL261’s lower cell density and cohesiveness, more disrupted stromal-vascular phenotypes, and infiltrative growth pattern at the peritumor margin area, where less immune cell infiltration is detected and relatively lower cell division is expected [43]”.
We trust that these new features recovered from DGE-DMI (Fig 2-B and Fig 3-C) show its potential for new discoveries in glioblastoma.
(2) When using DGE-DMI to quantitatively map glycolysis and mitochondrial oxidation fluxes, there is no comparison with other methods to directly identify the changes. This makes it difficult to assess how sensitive DGE-DMI is in detecting differences in glycolysis and mitochondrial oxidation fluxes, which undermines the claim of its potential for in vivo GBM phenotyping.
PR-R1.C2: We thank the reviewer for raising this important point. The validity of the method for mapping specific metabolic kinetics in mouse glioma was reported in our previous work, using the same animal models, as specified in the introduction (page 4, lines 10-13): “we recently (…) propose[d] Dynamic Glucose-Enhanced (DGE) 2H-MRS [31], demonstrating its ability to quantify glucose fluxes through glycolysis and mitochondrial oxidation pathways in vivo in mouse GBM (…)”. Therefore, this was not reproduced in the present work.
In brief, our DGE-DMI results are very consistent with our previous study, where DGE single voxel deuterium spectroscopy was performed in the same tumor models with higher temporal resolution and SNR (as stated on page 16, lines 9-10: glycolytic lactate synthesis rate, 0.59±0.04 vs. 0.55±0.07 mM/min; glucose-derived glutamate-glutamine synthesis rate, 0.28±0.06 vs. 0.40±0.08 mM/min), which in turn matched well the values reported by others for glucose consumption rate through:
(i) glycolysis, in different tumor models including mouse lymphoma in vivo (0.99 mM/min, by DGE-DMI; Kreis et al. 2020), rat breast carcinoma in situ (1.43 mM/min, using a biochemical assay; Kallinowski et al. 1988), and even perfused GBM cells (1.35 fmol min<sup>−1</sup> cell<sup>−1</sup>, according to hyperpolarized 13C-MRS; Jeong et al. 2017), very similar to our previous in vivo measurements in GL261 tumors: 0.50 ± 0.07 mM min<sup>−1</sup> = 1.25 ± 0.16 fmol min<sup>−1</sup> cell<sup>−1</sup> (Simoes et al. 2022);
(ii) mitochondrial oxidation, very similar to previous in vivo measurements in mouse GBM xenografts (0.33 mM min<sup>−1</sup>, using 13C spectroscopy; Lai et al. 2018), and particularly to our in situ measurements in cell culture (GL261, 0.69 ± 0.09 fmol min<sup>−1</sup> cell<sup>−1</sup>; CT2A, 0.44 ± 0.08 fmol min<sup>−1</sup> cell<sup>−1</sup>), which were remarkably similar to the in vivo measurements in the respective tumors (GL261, 0.32 ± 0.10 mM min<sup>−1</sup> = 0.77 ± 0.23 fmol min<sup>−1</sup> cell<sup>−1</sup>; CT2A, 0.51 ± 0.11 mM min<sup>−1</sup> = 0.60 ± 0.12 fmol min<sup>−1</sup> cell<sup>−1</sup>) (Simoes et al. 2022).
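Each pairing of a tissue-level rate (mM/min) with a per-cell rate (fmol/min/cell) in the list above implies a cell density. The sketch below back-calculates that density as a rough consistency check. It is illustrative only: the function name and the back-calculation are assumptions made here, and the response does not state the cell densities or the conversion procedure actually used.

```python
def implied_cells_per_ul(rate_mm_per_min, rate_fmol_per_min_per_cell):
    """Cell density (cells per microliter) implied by a pair of rates.

    rate_mm_per_min: tissue-level rate in mM/min (mmol per liter per minute)
    rate_fmol_per_min_per_cell: per-cell rate in fmol/min/cell
    """
    mol_per_liter_per_min = rate_mm_per_min * 1e-3
    mol_per_cell_per_min = rate_fmol_per_min_per_cell * 1e-15
    cells_per_liter = mol_per_liter_per_min / mol_per_cell_per_min
    return cells_per_liter * 1e-6  # 1 L = 1e6 uL


# Rate pairs quoted in the response (mean values only, uncertainties ignored).
pairs = {
    "GL261 glycolysis": (0.50, 1.25),
    "GL261 oxidation": (0.32, 0.77),
    "CT2A oxidation": (0.51, 0.60),
}

for label, (tissue_rate, cell_rate) in pairs.items():
    density = implied_cells_per_ul(tissue_rate, cell_rate)
    print(f"{label}: ~{density:.2e} cells/uL")
```

The two GL261 pairs imply nearly the same density, as would be expected if a single cell density per cohort was used for the conversion; the CT2A pair implies a higher one.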
(3) The study only used intracranial injections of two mouse glioblastoma cell lines, which limits the application of DGE-DMI in detecting and characterizing de novo glioblastomas. A de novo mouse model can show tumor growth progression and is more heterogeneous than a cell line injection model. Demonstrating that DGE-DMI performs well in a more clinically relevant model would better support its claimed potential usage in patients.
PR-R1.C3: We agree that a more clinically relevant model, such as the one suggested by the Reviewer, would in principle be better suited to provide more translatable results to patients. We however believe that the potential of DGE-DMI for application to different glioblastoma models or patients, with GBM or any other types of brain tumors for that matter, is demonstrated clearly enough with the two syngeneic models we chose, given their robustness and general acceptance in the literature as reliable immunocompetent models of GBM, and for their different histologic and metabolic properties. This way we could fully focus on the novel metabolic imaging method, as compared to our previous single-voxel approach. While both tumor cohorts (GL261 and CT2A) were studied at more advanced stages of tumor progression, the metabolic differences depicted are consistent with the histopathologic features reported, as discussed in the manuscript; namely, the lower glucose oxidation rates. We have now modified the manuscript to highlight this point: page 18 (lines 12-14), “While patient-derived xenografts and de novo models would be more suited to recapitulate human GBM heterogeneity and infiltration features, and genetic manipulation of glycolysis and mitochondrial oxidation pathways could be relevant to ascertain DGE-DMI sensitivity for their quantification, (…)”.
Reviewer #2 (Public Review):
Summary:
In this work, the authors attempt to noninvasively image metabolic aspects of the tumor microenvironment in vivo, in 2 mouse models of glioblastoma. The tumor lesion and its surrounding appearance are extensively characterized using histology to validate/support any observations made with the metabolic imaging approach. The metabolic imaging method builds on a previously used approach by the authors and others to measure the kinetics of deuterated glucose metabolism using dynamic 2H magnetic resonance spectroscopic imaging (MRSI), supported by de-noising methods.
Strengths:
Extensive histological evaluation and characterization.
Measurement of the time course of isotope labeling to estimate absolute flux rates of glucose metabolism.
Weaknesses:
(1) The de-noising method appears essential to achieve the high spatial resolution of the in vivo imaging to be compatible with the dimensions of the tumor microenvironment, here defined as the immediately adjacent rim of the mouse brain tumors. There are a few challenges with this approach. Often denoising methods applied to MR spectroscopy data have merely a cosmetic effect but the actual quantification of the peaks in the spectra is not more accurate than when applied directly to original non-denoised data. It is not clear if this concern is applicable to the denoising technique applied here. However, even if this is not an issue, no denoising method can truly increase the original spatial resolution at which data were acquired. A quick calculation estimates that the spatial resolution of the 2H MRSI used here is 30-40 times too low to capture the much smaller tumor rim volume, and therefore there is concern that normal brain tissue and tumor tissue will be the dominant metabolic signal in so-called tumor rim voxels. This means that the conclusions on metabolic features of the (much larger) tumor are much more robust than the observations attributed to the (much smaller) tumor microenvironment/tumor rim.
PR-R2.C1: We thank the Reviewer for the constructive comments regarding resolution and tumor rim, and denoising. These issues were raised more extensively in the section Recommendations For The Authors, where they are addressed in detail (RA-R2.C2). In summary, we agree with the Reviewer that no denoising method can increase the nominal resolution; nor was that our purpose. Thus, we clarify the relevance of spectral matrix interpolation in MRSI, and how our display resolution should in principle provide a better approximation to the ground truth than the nominal resolution, relevant for ROI analysis in the tumor margin. While we further show relevant correlations between metabolic maps and histologic features in tumor core and margin, we agree with the reviewer that our observations in the tumor core are more robust than those in the margin, and acknowledge this in the Discussion: page 19, lines 6-10: “Therefore, further DGE-DMI preclinical studies aimed at detecting and quantifying relatively weak signals, such as tumor glutamate-glutamine, and/or increase the nominal spatial resolution to better correlate those metabolic results with histology findings (e.g. in the tumor margin), should improve basal SNR with higher magnetic field strengths, more sensitive RF coils, and advanced DMI pulse sequences [55]).”
(2) To achieve their goal of high-level metabolic characterization the authors set out to measure the deuterium labeling kinetics following an intravenous bolus of deuterated glucose, instead of the easier measurement of steady-state after the labeling has leveled off. These dynamic data are then used as input for a mathematical model of glucose metabolism to derive fluxes in absolute units. While this is conceptually a well-accepted approach there are concerns about the validity of the included assumptions in the metabolic model, and some of the model's equations and/or defining of fluxes, that seem different than those used by others.
PR-R2.C2: These concerns about the metabolic model were also raised in more detail in the section Recommendations For The Authors, where they are addressed more extensively – please refer to RA-R2.C3 (glucose infusion protocol) and RA-R2.C4 (equations). In brief, we explain that the total volume injected (100uL/25g animal) is standard for i.v. administration in mice, and clarify this better in the manuscript (page 24, line 23). We also explain the differences between our kinetic model and the original one reported by Kreis et al. (Radiology 2020), who quantified glycolysis kinetics in a subcutaneous mouse model of lymphoma that is exclusively glycolytic, thus estimating the maximum glucose flux rate from the lactate synthesis rate (Vmax = Vlac). Instead, we extended this model to account for glucose flux rates for lactate synthesis (Vlac) and also for glutamate-glutamine synthesis (Vglx) in mouse glioblastoma, where Vmax = Vlac + Vglx, also acknowledging its simplistic approach in the Discussion (page 20, lines 22-24: “(…) metabolic fluxes [estimations] through glycolysis and mitochondrial oxidation (…) could potentially benefit from an improved kinetic model simultaneously assessing cerebral glucose and oxygen metabolism, as recently demonstrated in the rat brain with a combination of 2H and 17O MR spectroscopy [62] (…)”).
Reviewer #3 (Public Review):
Summary:
Simoes et al enhanced dynamic glucose-enhanced (DGE) deuterium spectroscopy with Deuterium Metabolic Imaging (DMI) to characterize the kinetics of glucose conversion in two murine models of glioblastoma (GBM). The authors combined spectroscopic imaging and noise attenuation with histological analysis and showcased the efficacy of metabolic markers determined from DGE DMI to correlate with histological features of the tumors. This approach is also potent to differentiate the two models from GL261 and CT2A.
Strengths:
The primary strength of this study is to highlight the significance of DGE DMI in interrogating the metabolic flux from glucose. The authors focused on glutamine/glutamate and lactate. They attempted to correlate the imaging findings with in-depth histological analysis to depict the link between metabolic features and pathological characteristics such as cell density, infiltration, and distant migration.
Weaknesses:
(1) A lack of genetic interrogation is a major weakness of this study. It was unclear what underlying genetic/epigenetic aberrations in GL261 and CT2A account for the metabolic difference observed with DGE DMI. A correlative metabolic confirmation using mass spectrometry of the two tumor specimens would give insight into the observed imaging findings.
PR-R3.C1: We thank the Reviewer for the helpful comments, which we break down below.
PR-R3.C1.1 - genetic interrogation/manipulation: While we did not have access to conditional models for genetic manipulation of key enzymes of each metabolic pathway, we did assess the mitochondrial function in each cell line, showing a significantly higher respiration buffer capacity and more efficient metabolic plasticity between glycolysis and mitochondrial oxidation in GL261 cells compared to CT2A (Simoes et al. NIMG:Clin 2022). This could drive e.g. more active recycling of lactate through mitochondrial metabolism in GL261 cells, aligned with our observations of increased glucose-derived lactate consumption rate in those tumors compared to CT2A. We have now included this in the discussion (page 17, lines 8-12): “our results suggest increased lactate consumption rate (active recycling) in GL261 tumors with higher vascular permeability, e.g. as a metabolic substrate for oxidative metabolism [44] promoting GBM cell survival and invasion [45], aligned with the higher respiration buffer capacity and more efficient metabolic plasticity of GL261 cells than CT2A [31].”
PR-R3.C1.2 - correlation with post-mortem metabolic assessment: implementing this validation step would require additional equipment that was also not accessible to us: a focalized irradiator, to instantly halt all metabolic reactions during animal sacrifice. We do believe that DGE-DMI could guide further studies of such nature, aimed at validating the spatio-temporal dynamics of regional metabolite concentrations in mouse brain tumors. Thus, the importance of end-point validation is now stressed more clearly in the manuscript (page 20, lines 13-16): “(…) mapping pathway fluxes alongside de novo concentrations (…) may be determinant for the longitudinal assessment of GBM progression, with end-point validation (…)”.
These concerns and recommendations were also raised by the Reviewer in the Recommendations to Authors section, where we address them more extensively – please see RA-R1.C3 and RA-R1.C2, respectively.
(2) A better depiction of the imaging features and tumor heterogeneity would support the authors' multimodal attempt.
PR-R3.C2: We agree with the Reviewer that including more imaging features would improve the non-invasive characterization of each tumor. Due to the RF coil design and time constraints, we did not acquire additional data, such as diffusion MRI to assess tissue microstructure. Instead, our multi-modal protocol included two dynamic MRI studies on each animal, for multiparametric assessment of tumor volume, metabolism and vascular permeability, using 1H-MRI, 2H-spectroscopy during 2H-labelled glucose injection, and 1H-imaging during Gd-DOTA injection, respectively. Rather than aiming at tumor radiomics, we focused on the dynamic assessment of tumor metabolic turnover with heteronuclear spectroscopy, which is challenging per se and particularly in mouse brain tumors, given their very small size. For such multi-modal studies we used a previously developed dual-tuned RF coil: the deuterium coil (2H) was positioned in the mouse head, for optimal SNR; whereas the proton coil (1H) had suboptimal performance compared to a conventional single-tuned coil, and was used only for basic localization and adjustments, reference imaging and tumor volumetry (T2-weighted), and DCE-T1 MRI (T1-weighted). The latter was analyzed pixel-wise to assess spatial correlations between tumor permeability and metabolic metrics, as shown in Fig S3. The limited T2w MRI data collected were analyzed only for tumor volume assessment; no additional imaging features (e.g. kurtosis/skewness) were extracted, since such assessment did not show any differences between the two tumor cohorts in our previous study (Simoes et al NIMG:Clin 2022).
(3) Integration of the various cell types in the tumor microenvironment, as allowed with the resolution of DGE DMI, will explain the observed difference between GL261 and CT2A. Is there a higher percentage of infiltrative "other cells" observed in GL261 tumor?
PR-R3.C3: While DGE-DMI resolution is far larger than brain and brain tumor cell sizes, we have now performed additional analysis to assess the percentage of microglia/macrophages in both cohorts. The results are now included in the manuscript, namely Fig. 2B, as previously explained in PR-R1.1. Interestingly though, we observed a lower percentage of infiltrative "other cells" in GL261 tumors compared to CT2A, which we discuss in the manuscript: pages 19-20 (lines 20-24 and 1-4), “Finally, our results are indicative of higher microglia/macrophage infiltration in CT2A than GL261 tumors, which is inconsistent with another study reporting higher immunogenicity of GL261 tumors than CT2A for microglia and macrophage populations [56]. Such discrepancy could be related to methodologic differences between the two studies, namely the endpoint-guided assessment of tumor growth (bioluminescence vs MRI, more precise volumetric estimations) and the stage when tumors were studied (GL261 at 23-28 vs 16-18 days post-injection, i.e. less time for immune cell to infiltration in our case), presence/absence of a cell transformation step (GFP-Fluc engineered vs we used original cell lines), or perhaps media conditioning effects during cell culture due to the different formulations used (DMEM vs RPMI).”
(4) This underlying technology with DGE DMI is capable of identifying more heterogeneous GBM tumors. A validation cohort of additional in vivo models will offer additional support to the potential clinical impact of this study.
PR-R3.C4: We agree with the Reviewer that applying DGE-DMI to more clinically-relevant models of human brain tumors will enhance its translational impact to patients, as also suggested by Reviewer 1 and addressed in PR-R1.C3. We also believe that the feasibility and potential of DGE-DMI for application to different glioblastoma models or patients, with GBM or any other primary or secondary brain tumors, is clearly demonstrated in our work, using two reliable and well-described immunocompetent models of GBM. In any case, we have now modified the manuscript to better acknowledge this point: page 18 (lines 14-16), “(…) patient-derived xenografts and de novo models would be more suited to recapitulate human GBM heterogeneity and infiltration features (…)”.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
(1) The authors utilize longitudinal MRI to track tumor volumes but perform DMI at endpoint with late-stage tumors. Their previous publication applied metabolic imaging in tumors before the presence of necrosis. It would be valuable to perform longitudinal DMI to examine the evolution of glucose flux metabolic profile over time in the same tumor.
RA-R1.C1: We thank the Reviewer for the very useful comments on our manuscript. We agree – in this work, we aimed at “extending” our previous DGE-2H single-voxel methodology to multivoxel (DMI), thoroughly demonstrating (1) its in vivo application to the same immunocompetent models of glioblastoma and (2) the ability to depict their phenotypic differences, and therefore (3) the potential for the metabolic characterization of more advanced models of GBM and/or their progression stages. We believe these objectives were achieved. Our results indeed open several possibilities, from longitudinal assessment of the spatio-temporal metabolic changes during GBM progression (and treatment-response) to its application to other models recapitulating more closely the human disease. Now that we have comprehensively demonstrated a protocol for DGE-DMI acquisition, processing and analysis in mouse GBM (a very challenging methodology), and demonstrated it in different mouse GBM cell lines, new studies can be designed to tackle more specific questions, like the one suggested here by the Reviewer. We have modified the manuscript to make this point clearer: page 20 (lines 15-17), “This may be determinant for the longitudinal assessment of GBM progression, with end-point validation; and/or treatment-response, to help selecting among new therapeutic modalities targeting GBM metabolism (…)”; page 21 (lines 5-8), “(…) we report a DGE-DMI method for quantitative mapping of glycolysis and mitochondrial oxidation fluxes in mouse GBM, highlighting its importance for metabolic characterization and potential for in vivo GBM phenotyping in different models and progression stages.”.
(2) The authors demonstrate a promising correlation between metabolic phenotypes in vivo and key histopathological features of GBM at the endpoint. Directly assessing metabolites involved in glucose fluxes on endpoint tumor samples would strengthen this correlation.
RA-R1.C2: While we acknowledge the Reviewer’s point, there were two main limitations to implementing such validation step in our protocol:
(1) Since we performed dynamic experiments, at the end of each study most 2H-glucose-derived metabolites were already below their maximum concentration (or barely detectable in some cases), as depicted by the respective kinetic curves (Fig 1-D and Fig S7), and thus no longer detectable in the tissues. Importantly, DGE-DMI could guide further studies towards selecting the ideal time point for validating different metabolite concentrations in specific brain regions.
(2) Such validation would require sacrificing the animals with a focalized irradiator (which we did not have), to instantly halt all metabolic reactions. Only then could we collect and analyze the metabolic profile of specific brain regions, either by in vitro MS or high-resolution NMR following extraction, or by ex vivo HRMAS analysis of the intact tissue, as reported previously by some of the authors for validation of glucose accumulation in different regions of mouse GL261 tumors (Simões et al. NMRB 2010: https://doi.org/10.1002/nbm.1421). Importantly, even if we did have access to a focalized irradiator, such protocols for metabolic characterization would compromise tissue integrity and thus the histopathologic analysis performed in this study.
We do agree with the importance of end-point validation and therefore stress it more clearly in the revised manuscript (page 20, lines 14-16): “(…) mapping pathway fluxes alongside de novo concentrations (…) may be determinant for the longitudinal assessment of GBM progression, with end-point validation (…)”.
(3) Genetic manipulation of key players in the metabolic pathways studied in this paper (glycolysis and mitochondrial oxidation) would offer a strong validation for the sensitivity of DGE-DMI in accurately distinguishing metabolites (lactate, glutamate-glutamine) and their dynamics.
RA-R1.C3: Thank you for this comment, we agree. This would be particularly relevant in the context of treatment-response monitoring. While such models were not available to us (conditional spatio-temporal manipulation of metabolic pathway fluxes), we believe our results can still demonstrate this point: we previously used in vivo DGE 2H-MRS to show evidence of decreased glucose oxidation fraction (Vglx/Vlac) in GL261 tumors under acute hypoxia (FiO2=12 %) compared to regular anesthesia conditions (FiO2=31 %), consistent with the inhibition of OXPHOS due to lower oxygen tension (Simoes et al. NIMG:Clin 2022). In the present work, enhanced glycolysis in tumors vs peritumoral brain regions was clearly observed in all the animals studied, from both cohorts, as shown in Fig 1-B and Fig S4. Moreover, the spectral background (before glucose injection) is limited to a single peak in all the voxels: basal DHO, used as internal reference for spatio-temporal quantification of glucose, glutamine-glutamate, and lactate, all de novo and extensively characterized in healthy and glioma-bearing rodent brain (Lu et al. JCBFM 2018; Zhang et al. NMR Biomed 2024; de Feyter et al. SciAdv 2018; Batsios et al. ClinCancerRes 2022; Simoes et al. NIMG:Clin 2022) and other rodent tumors (Kreis et al. Radiology 2020; Montrazi et al. SciRep 2023). We have modified the manuscript to clarify this point (page 18, lines 14-17): “(…) patient-derived xenografts and de novo models would be more suited to recapitulate human GBM heterogeneity and infiltration features, and genetic manipulation of glycolysis and mitochondrial oxidation pathways could be relevant to ascertain DGE-DMI sensitivity for their quantification (…)”.
(4) Please explain more why DGE-DMI can distinguish different glucose metabolites and how accurate it is.
RA-R1.C4: DGE-DMI is the imaging extension of our previous work based on single-voxel deuterium spectroscopy, therefore relying on the same fundamental technique and analysis pipeline but moving from a temporal analysis to a spatio-temporal analysis for each metabolite, and thus dealing with more data. Unlike conventional proton spectroscopy (1H), only metabolites carrying the deuterium label (2H) will be detected in this case, including the natural abundance DHO (~0.03%), the deuterated glucose injected and its metabolic derivatives, namely deuterated lactate and deuterated glutamate-glutamine. Due to their different molecular structures, the deuterium atoms will resonate at specific frequencies (chemical shifts, ppm) during a 2H magnetic resonance spectroscopy experiment, as illustrated in Fig 1-A. The method is fully reproducible and accurate, and has been extensively reported in the literature from high-resolution NMR spectroscopy to in vivo spectroscopic imaging of different nuclei, such as proton (1H), deuterium (2H), carbon (13C), phosphorus (31P), and fluorine (19F). Since the fundamental principles of DMI and its application to brain tumors have been very well described in the flagship article by de Feyter et al., we have now highlighted this in the manuscript: page 4 (lines 4-7), “Deuterium metabolic imaging (DMI) has been (…) demonstrated in GBM patients, with an extensive rationale of the technique and its clinical translation [18], and more recently in mouse models of patient-derived GBM subtypes (…)”.
(5) When mapping glycolysis and mitochondrial oxidation fluxes, add a control method to compare the reliability of DGE-DMI.
RA-R1.C5: This concern (“lack of a control method”) was also raised by the Reviewer in the Public Reviews section, where we already address it (PR-R1.C2).
(6) If using peritumoral glutamate-glutamine recycling as a marker of invasion capacity, what would be the correct rate of the presence of secondary brain lesions?
RA-R1.C6: While our results suggest the potential of peritumoral glutamate-glutamine recycling as a marker for the presence of secondary brain lesions, this remains to be ascertained with higher sensitivity for glutamate-glutamine detection. Therefore, we cannot make further conclusions in this regard.
To make this point clear, we state in different sections of the discussion: page 19 (lines 1-2), “(…) recycling of the glutamate-glutamine pool may reflect a phenotype associated with secondary brain lesions.”; and page 19 (lines 6-10), “Therefore, further DGE-DMI preclinical studies aimed at detecting and quantifying relatively weak signals, such as tumor glutamate-glutamine, and/or increase spatial resolution to correlate those metabolic results with histology findings (e.g. in the tumor margin), should improve basal SNR with higher magnetic field strengths, more sensitive RF coils, and advanced DMI pulse sequences [55]).”
(7) There are duplicated Vlac in Figure S3 B.
RA-R1.C7: This was a typo that has now been corrected. Thank you.
(8) Figure 4, it would be better to add a metabolic map of a tumor without secondary brain lesions to compare.
RA-R1.C8: We fully agree and have modified Fig 4 accordingly, together with its legend. In particular, we have included tumors C4 (without secondary lesions) vs G4 (with) for this “comparison”, since details of their histology, including the secondary lesions, are provided in Fig 2.
(9) Full name of SNR and FID should be listed when first mentioned.
RA-R1.C9: Agreed and modified accordingly, on pages 6-7 (lines 22-1), “signal-to-noise-ratio (SNR)”, and page 19 (lines 5-6), “free induction decay (FID)”.
(10) Page 2, Line 14: (59±7 mm3) is not needed in the abstract.
RA-R1.C10: As requested we have removed this specification from the Abstract.
(11) Page 4, Line 22: Closing out the Introduction section with a statement on broader implications of the present work would enhance the effectiveness of the section.
RA-R1.C11: We have added an additional sentence in this regard – pages 4-5 (lines 24-2): “Since DMI is already performed in humans, including glioblastoma patients [18], DGE-DMI could be relevant to improve the metabolic mapping of the disease.”
(12) Define all acronyms to facilitate comprehension. For example, principal component analysis (PCA) and signal-to-noise ratio (SNR).
RA-R1.C12: Thank you for the comment. We have now defined all the acronyms when first used, including PCA (page 4, line 11: “Marchenko-Pastur Principal Component Analysis (MP-PCA)”) and SNR (pages 6-7, lines 22-1, as indicated above in RA-R1.C9).
(13) Some elements within the figures have lower resolution, specifically bar graphs.
RA-R1.C13: We apologize for this oversight. All the Figures have been revised accordingly, to correct this problem. Thank you.
(14) Page 13, Line 8: "underly" should be spelled "underlie."
RA-R1.C14: The typo has been corrected on page 15 (line 8), thank you.
(15) Page 14, Line 13: "better vascular permeability" would be more effectively phrased as "increased vascular permeability."
RA-R1.C15: This has also been corrected on page 16 (line 14), thank you.
Reviewer #2 (Recommendations For The Authors):
(1) I strongly suggest adding a scale bar in the histology figures.
RA-R2.C1: Thank you for spotting our oversight! This has now been added as requested to Fig 2.
(2) The 2H MRSI data were acquired at a nominal resolution of 2.25 x 2.27 x 2.25 mm^3, resulting in a nominal voxel volume of 11.5 uL. (In reality, this is larger due to the point spread function leading to signal bleeding from adjacent voxels.) If we estimate the volume of the tumor rim, as indicated by the histology slides, as (generously) ~ 50 um in width, 3.2 mm long (the diagonal of a 2.25 x 2.25 mm^2 square), and 2.27 mm high, we get a volume of 0.36 uL. Therefore the native spatial resolution of the 2H MRSI is at least 30 times larger than the volume occupied by the tumor rim/microenvironment. Normal tissue and tumor tissue will contribute the majority of the metabolic signal of that voxel. I feel an opposite approach could have been pursued: find out the spatial resolution needed to characterize the tumor rim based on the histology, then use a de-noising method to bring the SNR of those data to be acceptable. (This is just a thought experiment that assumes de-noising actually works to improve quantification for MRS data instead of merely cosmetically improving the data; so far the jury is still out on that, in my view.)
RA-R2.C2 – We thank the Reviewer for the detailed analysis and reply below to each point.
RA-R2.C2.1 – spatial resolution and tumor rim: Our nominal voxel volume was indeed 11.5 uL, defined in-plane by the PSF which explains signal bleeding effects, as in any other imaging modality. The DMI raw data were Fourier interpolated before reconstruction, rendering a final in-plane resolution of 0.56 mm (0.72 uL voxel volume). The tumor rim (margin) analyzed was roughly 0.1 mm wide (please note, not 0.05 mm), as explained in the methods section (page 28, line 16) and now more clearly defined with the scale bars in Fig 2. According to the Reviewer’s analysis, this would correspond to 0.1*3.2*2.27 = 0.73 uL, which we approximated with 1 voxel (0.72 uL), as displayed in Fig 3-A. Importantly, it has long been demonstrated that Fourier interpolation provides a better approximation to the ground truth compared to the nominal resolution, and even to more standard image interpolation performed after FT - see for instance Vikhoff-Baaz B et al. (MRI 2001. 19: 1227-1234), now cited in the Methods section: page 24, line 24 ([69]). While we do agree that both normal brain and tumor should contribute significantly to the metabolic signal in this relatively small region, we rely on extensive literature to maintain that despite its smoothing effect, the display resolution provides a better approximation to the ground truth and is therefore more suited than the nominal resolution for ROI analysis in this region. Still, we acknowledge this potential limitation in the Discussion: page 19, lines 6-10: “Therefore, further DGE-DMI preclinical studies aimed at detecting and quantifying relatively weak signals, such as tumor glutamate-glutamine, and/or increase the nominal spatial resolution to better correlate those metabolic results with histology findings (e.g. in the tumor margin), should improve basal SNR with higher magnetic field strengths, more sensitive RF coils, and advanced DMI pulse sequences [55]).”
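The volume arithmetic exchanged between the Reviewer and the authors can be retraced with a short sketch. It uses only the dimensions quoted above; the in-plane interpolation factor of 4 is an assumption inferred here from 2.25 mm / 0.56 mm and is not stated explicitly in the response, and all variable names are illustrative.

```python
import math

# Dimensions quoted in the exchange. 1 mm^3 equals 1 uL, so a product of
# three lengths in millimetres is already a volume in microliters.
NOMINAL_IN_PLANE_MM = 2.25
SLICE_MM = 2.27
INTERP_FACTOR = 4  # assumption: inferred from 2.25 mm / 0.56 mm, not stated


def box_volume_ul(a_mm, b_mm, c_mm):
    """Volume of a rectangular box in microliters."""
    return a_mm * b_mm * c_mm


# Nominal voxel and the Fourier-interpolated (display) voxel.
nominal_voxel = box_volume_ul(NOMINAL_IN_PLANE_MM, NOMINAL_IN_PLANE_MM, SLICE_MM)
display_in_plane = NOMINAL_IN_PLANE_MM / INTERP_FACTOR
display_voxel = box_volume_ul(display_in_plane, display_in_plane, SLICE_MM)

# Rim length taken as the diagonal of one nominal in-plane voxel,
# which the Reviewer rounds to 3.2 mm.
rim_diagonal = math.hypot(NOMINAL_IN_PLANE_MM, NOMINAL_IN_PLANE_MM)
rim_50um = box_volume_ul(0.05, 3.2, SLICE_MM)   # Reviewer's rim width
rim_100um = box_volume_ul(0.10, 3.2, SLICE_MM)  # authors' rim width

print(f"nominal voxel:  {nominal_voxel:.2f} uL")
print(f"display voxel:  {display_voxel:.2f} uL")
print(f"rim (0.05 mm):  {rim_50um:.2f} uL")
print(f"rim (0.10 mm):  {rim_100um:.2f} uL")
print(f"nominal voxel / 0.05 mm rim: {nominal_voxel / rim_50um:.0f}x")
```

Under these assumptions the sketch is consistent with both sets of figures: the nominal voxel is roughly 30 times the volume of a 0.05 mm rim, while a 0.1 mm rim is close in volume to one interpolated display voxel. It says nothing about the point spread function, which remains set by the nominal acquisition.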
RA-R2.C2.2 – metabolic and histologic features at the tumor rim: Furthermore, we also performed ROI analysis of lactate metabolic maps in tumor and peritumoral rim areas, which closely reflected regional differences in cellularity and cell density, and immune cell infiltration between the 2 tumor cohorts and across pooled cohorts, as explained in the Public Review section - PR-R1.1 – and as we now report in the manuscript: page 12 (lines 6-16), “GL261 tumors accumulated significantly less lactate in the core (1.60±0.25 vs 2.91±0.33 mM: -45%, p=0.013) and peritumor margin regions (0.94±0.09 vs 1.46±0.17 mM: -36%, p=0.025) than CT2A – Fig 3 A-B, Table S1. Consistently, tumor lactate accumulation correlated with tumor cellularity in pooled cohorts (R=0.74, p=0.014). Then, lower tumor lactate levels were associated with higher lactate elimination rate, k<sub>lac</sub> (0.11±0.1 vs 0.06±0.01 mM/min: +94%, p=0.006) – Fig 3B – which in turn correlated inversely with peritumoral margin infiltration of microglia/macrophages in pooled cohorts (R=-0.73, p=0.027) - Fig 3-C. Further analysis of Tumor/P-Margin metabolic ratios (Table S3) revealed: (i) +38% glucose (p=0.002) and -17% lactate (p=0.038) concentrations, and +55% higher lactate consumption rate (p=0.040) in the GL261 cohort; and (ii) lactate ratios across those regions reflected the respective cell density ratios in pooled cohorts (R=0.77, p=0.010) – Fig 3-C”; page 17 (lines 1-8), “Tumor vs peritumor border analyses further suggest that lactate metabolism reflects regional histologic differences: lactate accumulation mirrors cell density gradients between and across the two cohorts; whereas lactate consumption/elimination rate coarsely reflects cohort differences in cell proliferation, and inversely correlates with peritumoral infiltration by microglia/macrophages across both cohorts.
This is consistent with GL261’s lower cell density and cohesiveness, more disrupted stromal-vascular phenotypes, and infiltrative growth pattern at the peritumor margin area, where less immune cell infiltration is detected and relatively lower cell division is expected [43]”.
RA-R2.C2.3 – alternative method: Regarding the alternative method suggested by the Reviewer, we have tested a similar approach in another region (tumor) and it did not work, as explained in the Discussion section (page 19, lines 5-6) and Fig S11. Essentially, Tensor PCA performance improves with the number of voxels and therefore limiting it to a subregion hinders the results. In any case, if we understand correctly, the Reviewer suggests a method to further interpolate our data in the spatial dimension, which would deviate even more from the original nominal resolution and thus sounds counter-intuitive given the Reviewer’s initial comment about the latter. More importantly, we would like to emphasize the importance of spectral denoising in this work, which the Reviewer questioned. There are several methods reported in the literature, most of them demonstrated only for MRI. We previously demonstrated how MPPCA denoising objectively improved the quantification of DCE-2H MRS in mouse glioma by significantly reducing the CRLBs: 19% improved fitting precision. In the present study, Tensor PCA denoising was applied to DGE-DMI, which led to an objective 63% increase in pixel detection based on the quality criteria defined, unambiguously reflecting the improved quantification performance due to higher spectral quality.
(3) Concerns re. the metabolic model: 2g/kg of glucose infused over 120 minutes already leads to hyperglycemia in plasma. Here this same amount is infused over 30 seconds... such a supraphysiological dose could lead to changes in metabolite pool sizes -which are assumed to not change since they are not measured, and also fractional enrichment which is not measured at all. Such assumptions seem incompatible with the used infusion protocol.
RA-R2.C3: We understand the concern. However, the protocol was reproduced exactly as originally reported by Kreis et al (Radiology 2020), who performed the measurements in mice and measured the fraction of deuterium enrichment (f=0.6). Since we also worked with mice, we adopted the same value for our model. The total volume injected was 100uL/25g animal, and adjusted for animal weight (96uL/24g average – Table S1), as we reported before (Simões et al. NIMG:Clin 2022), which is standard for i.v. bolus administration in mice as it corresponds to ~10% of the total blood volume. This volume is therefore easily diluted and not expected to introduce significant changes in the metabolic pool sizes. Continuous infusion protocols, on the other hand, administer higher volumes, easily approaching the mL range when performed over periods as long as 120 min. This would indeed be incompatible with our bolus infusion protocol. We have now clarified this in the manuscript – page 24 (line 23): “i.v. bolus of 6,6′-<sup>2</sup>H<sub>2</sub>-glucose (2 mg/g, 4 µL/g injected over 30 s (…)”.
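As a quick check of the weight-adjusted bolus described in this reply, the following sketch scales the injected volume and glucose dose by body weight. It uses only the per-gram figures quoted above (2 mg/g, 4 µL/g); the function name is hypothetical.

```python
# Sketch: bolus volume and glucose dose scaled by body weight, using the
# figures quoted in the response (2 mg/g glucose, 4 uL/g injected volume).

DOSE_MG_PER_G = 2.0      # glucose dose per gram of body weight
VOLUME_UL_PER_G = 4.0    # injected volume per gram of body weight

def bolus(weight_g: float) -> tuple[float, float]:
    """Return (injected volume in uL, glucose dose in mg) for one animal."""
    return VOLUME_UL_PER_G * weight_g, DOSE_MG_PER_G * weight_g

for w in (25.0, 24.0):
    vol, dose = bolus(w)
    print(f"{w:.0f} g animal: {vol:.0f} uL, {dose:.0f} mg glucose")
```

The 25 g and 24 g cases reproduce the 100 uL and 96 uL volumes mentioned in the reply.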
(4) Vmax = Vlac + Vglx. This is incorrect: Vmax = Vlac.
RA-R2.C4: Thank you for raising this concern. As indicated in RA-R2.C3, our model (Simões et al. NIMG:Clin 2022) was adapted from the original model proposed by Kreis et al. (Radiology 2020), where the authors quantified glycolysis kinetics in a subcutaneous mouse model of lymphoma, which is exclusively glycolytic, and thus estimated the maximum glucose flux rate from the lactate synthesis rate (Vmax = Vlac). However, we extended this model to account for glucose flux rates for lactate synthesis (Vlac) and also for glutamate-glutamine synthesis (Vglx), where Vmax = Vlac + Vglx, as explained in our 2022 paper. While our kinetic model is rather simplistic compared to others - reported by 13C-MRS under continuous glucose infusion in healthy mouse brain (Lai et al. JCBFM 2018) and mouse glioma (Lai et al. IJC 2018) – as we acknowledge in the Discussion (page 20, lines 22-24: “(…) metabolic fluxes [estimations] through glycolysis and mitochondrial oxidation (…) could potentially benefit from an improved kinetic model simultaneously assessing cerebral glucose and oxygen metabolism, as recently demonstrated in the rat brain with a combination of 2H and 17O MR spectroscopy [62] (…)”), our Vlac and Vglx results are consistent with our previous DGE 2H-MRS findings in the same glioma models, and well aligned with the literature, as discussed in PR-R1.C2.1.
(5) Some other items that need attention: 0.03 % is used as the value for the natural abundance of DHO. The natural abundance of 2H in water can vary somewhat regionally, but I have never seen this value reported. The highest seen is 0.015%.
RA-R2.C5: The Reviewer is referring to the natural abundance of deuterium in hydrogen: 1 in ~6400 is D, i.e. 0.015 %. With 2 hydrogen atoms per water molecule, about 1 in ~3200 water molecules is DHO, i.e. 0.03%. Indeed, the latter can vary slightly depending on the geographical region, as nicely reported by Ge et al (Front Oncol 2022), who showed a 16.35 mM natural abundance of DHO in the local tap water of St. Louis MO, USA (55500/16.35 ≈ 3394, i.e. 1/3394 ≈ 0.029%).
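The abundance argument can be worked through numerically. The sketch below assumes a deuterium fraction of 1 in 6400 and a water concentration of 55,500 mM, both taken from this reply; the variable names are illustrative.

```python
# Sketch of the natural-abundance arithmetic for DHO in water.
# Assumptions: ~1 in 6400 hydrogen atoms is deuterium, and pure water is
# ~55,500 mM; both figures are taken from the response text.

D_FRACTION = 1 / 6400          # deuterium fraction of hydrogen (~0.015 %)
WATER_MM = 55_500              # approximate molar concentration of water, in mM

# Each water molecule carries two hydrogens, so the chance that a molecule
# is DHO is about twice the per-atom deuterium fraction.
dho_fraction = 2 * D_FRACTION
dho_mm = dho_fraction * WATER_MM

# Reverse direction: fraction implied by a measured DHO concentration.
measured_mm = 16.35            # value quoted from Ge et al.
measured_fraction = measured_mm / WATER_MM

print(f"expected DHO: {dho_fraction:.4%} ({dho_mm:.1f} mM)")
print(f"measured DHO: {measured_fraction:.4%} (1 in {1 / measured_fraction:.0f})")
```

With these inputs the expected DHO level is roughly 17 mM, and the quoted 16.35 mM works out to about 1 in 3394 molecules, i.e. close to 0.03%.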
(6) Based on the color scale bar in Figure 1, the HDO concentration appears to go as high as 30 mM. Even if this number is off because of the previous concern (HDO), it appears to be a doubling of the HDO concentration. Is this real? What would be the origin of that? No study using [6,6'-2H2]-glucose that I'm aware of has reported such an increase in HDO.
RA-R2.C6: As explained before (RA-R2.C3 and RA-R2.C4), we based our protocol and model on Kreis et al (Radiology 2020), who reported ~10 mM basal DHO levels rising to ~27 mM after 90 min, which is well within the ~30 mM range we report over a longer period (132 min).
Similar DHO levels were mapped with DGE-DMI in mouse pancreatic tumors (Montrazi et al. SciRep 2023).
(7) "...the central spectral matrix region selected (to discard noise regions outside the brain, as well as the olfactory bulb and cerebellum)". This reads as if k-space points correspond one-to-one with imaging pixels, which is not the case.
RA-R2.C7: We rephrased the sentence to avoid such potential misinterpretation, specifically: page 25 (lines 19-21): “Each dataset was averaged to 12 min temporal resolution and the noise regions outside the brain, as well as the olfactory bulb and cerebellum, were discarded (…)”.
(8) The use of the term "glutamate-glutamine recycling" is not really appropriate since these metabolites are not individually detected with 2H MRS, which is a requirement to measure this neurotransmitter cycling.
RA-R2.C8: Thank you for this comment. To avoid this misinterpretation, we have now rephrased "glutamate-glutamine recycling" to “recycling of the glutamate-glutamine pool” in all the sentences, namely: page 2 (lines 14-15); page 15 (line 8); page 15 (line 8); page 19 (line 1); page 21 (line 10).
Reviewer #3 (Recommendations For The Authors):
(1) One major issue is the lack of underlying genetics, and therefore it is hard for readers to put the observed difference between GL261 and CT2A into context. The authors might consider perturbing the genetic and regulatory pathways on glycolysis and glutamine metabolism, repeating DGE DMI measure, in order to enhance the robustness of their findings.
RA-R3.C1: We thank the reviewer for the helpful revision and comments. The point made here is aligned with Reviewer 1’s, addressed in RA-R1.C3; and also with our previous reply to the Reviewer, PR-R3.C1. Thus, we agree that conditional spatio-temporal manipulation of metabolic pathway fluxes would be relevant to further demonstrate the robustness of DGE-DMI, particularly for treatment-response monitoring. While such models were not available to us, our previous findings seem compelling enough to demonstrate this point. Thus, we previously showed a significantly higher respiration buffer capacity and more efficient metabolic plasticity between glycolysis and mitochondrial oxidation in GL261 cells compared to CT2A (Simões et al. NIMG:Clin 2022), which could enhance lactate recycling through mitochondrial metabolism in GL261 cells and thus explain our observations of increased glucose-derived lactate consumption rate in those tumors compared to CT2A. We have now included this in the discussion (page 17, lines 8-12): “our results suggest increased lactate consumption rate (active recycling) in GL261 tumors with higher vascular permeability, e.g. as a metabolic substrate for oxidative metabolism [44] promoting GBM cell survival and invasion [45], aligned with the higher respiration buffer capacity and more efficient metabolic plasticity of GL261 cells than CT2A [31].” Moreover, we previously showed evidence of DGE-2H MRS’ ability to detect decreased glucose oxidation fraction (Vglx/Vlac) in GL261 tumors under acute hypoxia (FiO2=12 %) compared to regular anesthesia conditions (FiO2=31 %), consistent with the inhibition of OXPHOS due to lower oxygen tensions (Simões et al. NIMG:Clin 2022).
(2) Is increased resolution possible for DGE DMI to correlate with histological findings?
RA-R3.C2: The resolution achieved with DGE DMI, or any other MRI method, is limited by the signal-to-noise ratio (SNR), which in turn depends on the equipment (magnetic field strength and radiofrequency coil), the pulse sequence used, and post-processing steps such as noise removal. Thus, increased resolution could be achieved with higher magnetic field strengths, more sensitive RF coils, more advanced DMI pulse sequences, and improved methods for spectral denoising if available. We have used the best configuration available to us and discussed such limitations in the manuscript, including now a few modifications to address the Reviewer’s point more clearly – page 19 (lines 6-10): “Therefore, further DGE-DMI preclinical studies aimed at detecting and quantifying relatively weak signals, such as tumor glutamate-glutamine, and/or increase the nominal spatial resolution to better correlate those metabolic results with histology findings (e.g. in the tumor margin), should improve basal SNR with higher magnetic field strengths, more sensitive RF coils, and advanced DMI pulse sequences [55])”.
(3) The authors might consider measuring the contribution of stromal cells and infiltrative immune cells in the analysis of DGE DMI data, to construct a more comprehensive picture of the microenvironment.
RA-R3.C3: Thank you for this important point. We have now added Iba-1 stainings of infiltrating microglia/macrophages for each tumor, as suggested by the Reviewer; stromal cells would be more difficult to detect and we did not have access to a validated staining method for doing so. Our new data and results - now included in Fig 2B – indicate significantly higher levels of Iba-1 positive cells in CT2A tumors compared to GL261, which are particularly noticeable in the periphery of CT2A tumors and consistent with their better-defined margins and lower infiltration in the brain parenchyma. This has been explained more extensively in PR-R1.1.
(4) Additional GBM models with improved understanding of the genetic markers would serve as an optimal validation cohort to support the potential clinical translation.
RA-R3.C4: We agree with the Reviewer and direct again to RA-R1.3, where we already addressed this suggestion in detail and introduced modifications to the manuscript accordingly.
www.biorxiv.org
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
In this report, the authors investigated the effects of reproductive secretions on sperm function in mice. The authors attempt to weave together an interesting mechanism whereby a testosterone-dependent shift in metabolic flux patterns in the seminal vesicle epithelium supports fatty acid synthesis, which they suggest is an essential component of seminal plasma that modulates sperm function by supporting linear motility patterns.
Strengths:
The topic is interesting and of general interest to the field. The study employs an impressive array of approaches to explore the relationship between mouse endocrine physiology and sperm function mediated by seminal components from various glandular secretions of the male reproductive tract.
Thank you for your positive evaluation of our study's topic and approach. We are pleased that you found our investigation into the effects of reproductive secretions on sperm function to be of general interest to the field. We appreciate your positive feedback on the diverse methods we employed to explore this complex relationship.
Weaknesses:
Unfortunately, the proposed mechanism is not convincingly supported by the data, the experimental design and methodology need more rigor and detail, and the presence of numerous (uncontrolled) confounding variables in almost every experimental group significantly reduces confidence in the overall conclusions of the study.
The methodological detail as described is insufficient to support replication of the work. Many of the statistical analyses are not appropriate for the apparent designs (e.g. t-tests without corrections for multiple comparisons). This is important because the notion that different seminal secretions will affect sperm function would likely have a different conclusion if the correct controls were selected for post hoc comparison. In addition, the HTF condition was not adjusted to match the protein concentrations of the secretion-containing media, likely resulting in viscosity differences as a major confounding factor on sperm motility patterns.
We appreciate you highlighting concerns regarding our weak points and apologize for our unclear description. We revised the manuscript to be as rigorous and detailed as possible. In addition, some experimental designs were changed to simpler direct comparisons, and additional experiments were conducted (New Figure 1A-F, lines 103-113). We have made our explanations more consistent with the provided data, which includes further experimentation with additional controls and larger sample sizes to increase the reliability of the findings.
To address the multiple testing problem, we applied a multiple testing correction, making the statistical tests more stringent (please see Statistical analysis in the Methods section and the Figure legends). With the revised statistical methods, the results did not require significant revisions of the previous conclusions.
Because the experiments on mixing extracts from the seminal vesicles were exploratory, we had originally planned not to correct for multiple comparisons. Repeating the t-test could lead to a Type I error in some results, so we apologize for not interpreting and annotating them accordingly. In the revised version, we removed the dataset for experiments on mixing extracts from the seminal vesicles and prostate, and we changed the description to refer to the clearer dataset mentioned above.
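To illustrate the Type I error concern raised here, the following minimal sketch applies a Holm-Bonferroni step-down correction to a family of p-values. The response does not state which correction was adopted in the revised manuscript, so Holm is used purely as an example, and the p-values are invented for illustration.

```python
# Sketch: Holm-Bonferroni step-down correction for a family of p-values.
# The p-values below are made up for illustration; they are not from the study.

def holm_adjust(pvalues: list[float]) -> list[float]:
    """Return Holm-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Multiply the k-th smallest p-value by (m - k + 1), cap at 1,
        # and enforce monotonicity with a running maximum.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

raw = [0.010, 0.040, 0.030, 0.005]   # hypothetical unadjusted t-test p-values
adj = holm_adjust(raw)
significant = [p < 0.05 for p in adj]
print(adj, significant)
```

In this toy family all four raw p-values fall below 0.05, but only two remain below that threshold after adjustment, which is the kind of change in conclusions that uncorrected repeated t-tests can hide.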
The viscosity of the secretion-containing medium was measured with a viscometer, confirming that secretions did not significantly affect the viscosity of the solution. In addition, as the reviewer pointed out, we addressed the issue that the HTF condition could not be used as a control because of the heterogeneity in protein concentration (New Fig.1G, lines 110-111).
Overall, we concluded that seminal vesicle secretion improves the linear motility of sperm more than prostate secretion.
There is ambiguity in many of the measurements due to the lack of normalization (e.g. all Seahorse Analyzer measurements are unnormalized, making cell mass and uniformity a major confounder in these measurements). This would be less of a concern if basal respiration rates were consistently similar across conditions and there were sufficient independent samples, but this was not the case in most of the experiments.
We apologize for the many ambiguities in the first manuscript. Cell culture experiments in the paper, including the flux analysis, were performed under conditions normalized or fixed by the number of viable cells. The description has also been revised to emphasize that the measurement values are standardized by cell count (lines 183-185, 189-190, 194-197). We emphasize that testosterone affects metabolism under the same number of viable cells (New Fig.4). This change in basal respiration is thought to be due to the shift in the metabolic pathway of seminal vesicle epithelial cells to a “non-normal TCA cycle” in which testosterone suppresses mitochondrial oxygen consumption, even under aerobic conditions (New Figs.3, 4, 5).
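The normalization described here can be sketched as follows. The numbers, the function name, and the choice of reporting per 10,000 viable cells are all hypothetical; they only show why a raw difference between wells can vanish once cell number is taken into account.

```python
# Sketch: normalizing extracellular flux readings by viable cell count.
# All numbers and names are hypothetical; they only illustrate the idea.

def normalize_per_cells(rate: float, viable_cells: int, per: int = 10_000) -> float:
    """Express a raw rate (e.g. OCR in pmol/min) per `per` viable cells."""
    if viable_cells <= 0:
        raise ValueError("viable_cells must be positive")
    return rate * per / viable_cells

# Two hypothetical wells with different raw OCR but also different cell numbers.
control = normalize_per_cells(rate=120.0, viable_cells=20_000)
treated = normalize_per_cells(rate=90.0, viable_cells=15_000)
print(control, treated)
```

Here the raw readings differ by 25%, yet the per-cell values are identical, so any difference that survives normalization is more plausibly a metabolic effect than a cell-mass effect.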
The observation that oleic acid is physiologically relevant to sperm function is not strongly supported. The cellular uptake of 10-100uM labeled oleic acid is presumably due to the detergent effects of the oleic acid, and the authors only show functional data for nM concentrations of exogenous oleic acid. In addition, the effect sizes in the supporting data were not large enough to provide a high degree of confidence given the small sample sizes and ambiguity of the design regarding the number of biological and technical replicates in the extracellular flux analysis experiments.
Thank you for your important critique. As you noted, the too-high oleic acid concentration did not reflect physiological conditions. Therefore, we changed the experimental design of an oleic acid uptake study and started again. We added an in vitro fertilization experiment corresponding to the functional data of exogenous oleic acid at nM concentrations (New Fig.7J,K, Lines 274-282).
For the flux data to determine the effect of oleic acid on sperm metabolism, we have indicated in the text that the data were obtained based on eight male mice and two technical replicates. Pooled sperm isolated and cultured from multiple mice were placed in one well. The measurements were taken in three different wells, and each experiment was repeated four times. We did not use the XFe24 or XFe96 extracellular flux analyzers. Because the XF HS Mini uses an 8-well plate (at most 6 samples per run, since 2 wells are used for calibration), the measurements were repeated across runs.
Overall, the most confident conclusion of the study was that testosterone affects the distribution of metabolic fluxes in a cultured human seminal vesicle epithelial cell line, although the physiological relevance of this observation is not clear.
We thank the reviewer for noting that this finding is one of the more robust conclusions of our study. Below we describe our thoughts on the physiological relevance of the observations and our proposed revisions. In the mouse experiments, when the action of androgens was inhibited by flutamide, oleic acid was no longer synthesized in the seminal vesicles. The results of the experiments using cultured seminal vesicle epithelial cells showed that oleic acid was not being synthesized because of a change in metabolism dependent on testosterone. We have also added IVF data on the effects of oleic acid on sperm function (New Fig.7 and Supplementary Fig. 5, lines 274-282).
Thus, we have obtained consistent data in vitro and in vivo in mice. Our data also showed that the effects of testosterone on metabolic fluxes in vitro are similar in mouse and human seminal vesicle epithelial cells (New Fig.9). Therefore, it can be assumed that a decrease in testosterone levels causes abnormalities in the components of human semen. However, the conclusion was overstated in the original manuscript, so we changed the wording as follows: It could be assumed that a decrease in testosterone levels causes abnormalities in the components of human semen. (lines 422-423)
In the introduction, the authors suggest that their analyses "reveal the pathways by which seminal vesicles synthesize seminal plasma, ensure sperm fertility, and provide new therapeutic and preventive strategies for male infertility." These conclusions need stronger or more complete data to support them.
We appreciate your comments about the suggestion presented in the introduction.
We also removed our conclusions regarding treatment and prevention strategies for male infertility (lines 96-98). We wanted to discuss our findings not conclusively but as future applications that could result from further research based on our initial findings.
The last sentence of the introduction has been revised to tone down these assertions as follows: These analyses revealed that testosterone promotes the synthesis of oleic acid in seminal vesicle epithelial cells and its secretion into seminal plasma, and the oleic acid ensures the linear motility and fertilization ability of sperm.
We are grateful for your suggestions, which have prompted us to refine our manuscript.
Reviewer #2 (Public Review):
Summary:
Using a combination of in vivo studies with testosterone-inhibited and aged mice with lower testosterone levels, as well as isolated mouse and human seminal vesicle epithelial cells, the authors show that testosterone induces an increase in glucose uptake. They find that testosterone induces differential gene expression with a focus on metabolic enzymes. Specifically, they identify increased expression of enzymes that regulate cholesterol and fatty acid synthesis, leading to increased production of 18:1 oleic acid.
Strength:
Oleic acid is secreted by seminal vesicle epithelial cells and taken up by sperm, inducing an increase in mitochondrial respiration. The difference in sperm motility and in vivo fertilization in the presence of 18:1 oleic acid and the absence of testosterone is small but significant, suggesting that the authors have identified one of the fertilization-supporting factors in seminal plasma.
Thank you for your positive comments regarding our work on the role of testosterone in regulating metabolic enzymes and the subsequent production of 18:1 oleic acid in seminal vesicle epithelial cells. We are pleased that the strength of our findings, particularly identifying oleic acid as a factor influencing sperm motility and mitochondrial respiration, has been recognized.
Weaknesses:
Further studies are required to investigate the effect of other seminal vesicle components on sperm capacitation to support the authors' conclusions. The authors' experiments focused on potential testosterone-induced changes in the rate of seminal vesicle epithelial cell glycolysis and oxphos; however, they provide conflicting results, and a potential correlation with seminal vesicle epithelial cell proliferation should be confirmed by additional experiments.
Thank you very much for your valuable criticism. Although we fully agree with your comment, conducting experiments to investigate the effects of other seminal vesicle components on the fertilization potential of sperm would be a great challenge for us, as it has taken us the last three years to identify oleic acid as a key factor in seminal plasma. We are considering a follow-up study to explore the effect of other seminal vesicle components on sperm capacitation. Therefore, we have revised the Introduction and conclusions to tone down our assertions.
The revised manuscript also includes additional data showing a correlation between changes in metabolic flux and the proliferation of seminal vesicle epithelial cells using shRNA. As a result, it was shown that cell proliferation is promoted when mitochondrial oxidative phosphorylation is promoted by ACLY knockdown (New Fig.8D, lines 303-305). This shows a close relationship between the metabolic shift in seminal vesicle epithelial cells and cell proliferation. The revised manuscript includes an interpretation and discussion of these results (lines 369-379).
We are grateful for your suggestions, which have prompted us to refine our manuscript.
Reviewer #3 (Public Review):
Summary:
Male fertility depends on both sperm and seminal plasma, but the functional effect of seminal plasma on sperm has been relatively understudied. The authors investigate the testosterone-dependent synthesis of seminal plasma and identify oleic acid as a key factor in enhancing sperm fertility.
Strengths:
The evidence for changes in cell proliferation and metabolism of seminal vesicle epithelial cells and the identification of oleic acid as a key factor in seminal plasma is solid.
Weaknesses:
The evidence that oleic acids enhance sperm fertility in vivo needs more experimental support, as the main phenotypic effect in vitro provided by the authors remains simply as an increase in the linearity of sperm motility, which does not necessarily correlate with enhanced sperm fertility.
We appreciate the positive feedback on the solid evidence of cell proliferation and metabolic changes in seminal vesicle epithelial cells and the identification of oleic acid as an important factor in seminal plasma. We fully agree with the assessment that the evidence linking oleic acid and increased sperm fertility in vivo needs further experimental support. To address this concern, we redesigned the oleic acid experiments to better reflect physiological conditions, increased the number of artificial insemination replicates, and added in vitro fertilization assessments (New Fig.7 and supplementary Fig.5, lines 274-282). The revised manuscript describes these experiments and discusses the association between oleic acid and fertility.
We are grateful for your suggestions, which have prompted us to refine our manuscript.
Recommendations for the authors:
Reviewing Editor's note:
As you can see from the three reviewers' comments, the reviewers agree that this study can be potentially important if major concerns are adequately addressed. The major concern common to all the reviewers is the incomplete mechanistic link between the physiological androgen effect on the production of oleic acid and its effect on sperm function. Statistical analyses need more rigor and consideration of other important capacitation parameters are needed to address these concerns and to improve the manuscript to support the current conclusions.
Thank you for summarizing the reviewers' feedback and for your insights regarding the major concerns raised. We appreciate the reviewers' understanding of the potential importance of our work and have addressed the issues highlighted to strengthen the manuscript. We believe these changes will improve the quality of the manuscript and provide a clearer and more complete understanding of the role of androgens and oleic acid in sperm function.
Reviewer #1 (Recommendations For The Authors):
The following comments are provided with the hope of aiding the authors in improving the alignment between the data and their interpretations.
Thank you for allowing us to strengthen our manuscript with your valuable comments and queries. We have made our best efforts to reflect your feedback.
Major Comments:
(1) The methodological detail is not sufficient to reproduce the work. For example:
a. Manufacturer protocols are referred to extensively. These protocols are neither curated nor version-controlled. Please consider describing the underlying components of the assays. If information is not available, please consider providing catalog numbers and lot numbers in the methods (if appropriate for journal style requirements).
We appreciate this suggestion, which we believe is important to ensure reproducibility. We described the catalog number in our Methodology and included as much information as possible.
b. Please consider describing the analyses in full, with consideration given to whether blinding was part of the design. For example- line 492: "apoptotic cells were quantified using ImageJ". How was this quantified? How were images pre-processed? Etc.
Although blinding was not performed, experiments and analyses based on Fisher's three principles were conducted to eliminate bias (lines 549-552). To avoid false-positive or false-negative results, we now state clearly that tissue sections treated with DNase were used as positive controls, and tissue sections without TdT were used as negative controls for apoptosis. We have added detailed quantification methods (lines 544-546).
c. Please consider providing versions of all acquisition and analysis software used.
We have added software version information in Materials and Methods.
(2) Please consider revisiting the statistical analyses. Many of the analyses don't seem appropriate for the design. For example, the use of a t-test with multiple comparisons for repeated measures design in Figure 2 and the use of t-test for two-factor design in Figure 8. etc.
To address the multiple testing issues, the statistical methodology was changed to a more rigorous one. Details are given in the Statistical analysis in the Methods section and the Figure legends.
(3) The increase in % LIN in Figure 1 may be confounded by differences in viscosity between HTF and the fluid secretion mixtures. For this reason, HTF may not be an appropriate control for the ANOVA post hoc analysis. HTF protein was not adjusted to the same concentration as the secretion mixtures, correct? Ultimately, it does not appear that there would be a significant statistical effect of the different fluid mixtures if appropriate statistical comparisons were made. This detracts from the notion that the secretions impact sperm function.
(4) Figure 1, the statistical analysis in the legend suggests that the experiments were analyzed with a t-test. Were corrections made for multiple comparisons in B-D? An ANOVA would probably be more appropriate.
We used a viscometer to measure the viscosity of a solution of prostate and seminal vesicle secretions adjusted to a protein concentration of 10 mg/mL. The results showed that the secretions did not cause any significant viscosity changes (New Fig.1G, Lines 110-111).
As you pointed out, the protein levels in the HTF medium and the secretion mixture are not adjusted to the same concentration. In addition, the original manuscript was not a controlled experiment because the two factors, seminal vesicle and prostate extracts, were modified. Therefore, to investigate the effect of prostate and seminal vesicle secretions on sperm motility, we modified the experimental design to directly compare the effects of the two groups: seminal vesicle and prostate extracts (New Fig.1A-G, lines 101-113). To show the sperm quality used in this study, motility data from sperm cultured in the HTF medium are presented independently in New Supplemental Fig.1A.
(5) Additionally in Figure 1, there is no baseline quality control data to show that there are no intrinsic differences between sperm sampled from the two treatment groups. So baseline differences in sperm quality/viability remain a potential confounder.
We thank you for this important point. Epididymal sperm were collected from healthy mice. We recovered only the seminal vesicle secretions from the flutamide-treated mice to pursue its role in the accessory reproductive glands, since testosterone targets the testes and accessory reproductive organs. So, there was no qualitative difference between the epididymal sperm before treatment. Nevertheless, incubation with seminal vesicle secretion for one hour altered the sperm motility pattern and in vivo fertilization results. Sperm function was altered by seminal vesicle secretion in a short period of culture time. We apologize for the confusion, and we have revised the text and figure to carry a clearer message (lines 128-132).
(6) Figure 1E, did the authors confirm that flutamide-treated mice had decreased serum androgens? How often were mice treated with flutamide? This is important because flutamide has a relatively short half-life and is rapidly metabolized to inert hydroxyflutamide.
Serum testosterone levels were unchanged. Flutamide was administered every 24 hours for 7 consecutive days. Although there was no change in blood testosterone levels (New Supplemental Fig.1B), a decrease in the weight of the seminal vesicles, prostate, and epididymis was confirmed. This is thought to be due to the pharmacological activity of flutamide.
(7) Figure 1H, the meaning of 'relative activity of mitochondria' isn't clear. JC-1 does not measure 'activity'. A decreased average voltage potential across the inner mitochondrial membrane may indicate that more of the sperm from the flutamide group were dead. Additionally, J-aggregates are slow to form, generally requiring long incubation periods of at least 90 minutes or more. Additional positive and negative controls for predictable mitochondrial transmembrane voltage potential polarization states would have improved the quality of this experiment.
Thank you for pointing this out. We have replaced "relative activity of mitochondria" with "high mitochondrial membrane potential" (New Fig.1M, lines 125-128). Indeed, we think that the sperm cultured in seminal vesicle secretions from flutamide-treated mice had died, because their motility was also significantly reduced. Since antimycin reduces mitochondrial membrane potential, we have added an experiment in which sperm treated with 10 µM antimycin were used as a control to confirm that the JC-1 reaction is sensitive to changes in membrane potential.
(8) Figure 4, the extracellular flux data appear to be unnormalized. The Seahorse instruments are extremely sensitive to the mass and uniformity of the cells at the bottom of the well. This may be a significant confounder in these results. For example, all of the observed differences between groups could simply be a product of differential cell mass, which is in line with the reduced growth potential of testosterone-treated cells indicated by the authors in the results section.
We thank you for this important point. After correcting for cell viability, we seeded the same number of viable cultured cells into wells between experimental groups before measuring them in the flux analyzer. There were no significant differences in survival rates in all experiments. As a result, an increase in glucose-induced ECAR and a suppression of mitochondrial respiration were observed. We would like to emphasize that this difference based on metabolic data does not imply a reduction in the growth potential of the cells due to testosterone treatment.
We now state that these measurements were normalized based on cell count and viability (lines 184, 190, 195).
(9) How did the authors know that the isolated mouse primary cells were epithelial cells? Was this confirmed? What was the relative sample purity?
The cells were labeled with multiple epithelial cell markers (cytokeratin) and confirmed using immunostaining and flow cytometry. The percentage of cells positive for epithelial cell markers was approximately 80%. A stromal cell marker (vimentin) was also used to confirm purity, but only a few percent of cells were positive. The contaminating cell type was considered to be mainly muscle cells because the gene expression levels of muscle cell markers verified by RNA-seq were relatively high.
(10) It is misleading to include the lactate/pyruvate media measurements in the middle of the figure in Figure 4 D and E because it seems at first glance like these measurements were made in the seahorse media but they are completely unrelated. Additionally, these measures are not normalized and are sensitive to confounding differences in cell viability, seeding density, mass, etc.
Thank you for pointing this out. We have placed the lactate and pyruvate measurement graphs after the flux data of ECAR. We noted that these measurements are normalized based on cell count and viability (lines 189-190). The doubling time of seminal vesicle epithelial cells was approximately 3 days, and testosterone inhibited cell proliferation. Therefore, the seeding concentration of cells was increased 4-fold in the testosterone-treated group compared to the control, and experiments were conducted to ensure that the confluency at the time of measurement after 7 days of culture was comparable between groups.
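The seeding adjustment described above can be put in rough numerical context. The sketch below assumes simple exponential growth with a constant doubling time, which is an idealization and not something stated in the response; the function name `fold_expansion` is hypothetical.

```python
def fold_expansion(days, doubling_time_days):
    """Fold increase in cell number under idealized exponential growth."""
    return 2 ** (days / doubling_time_days)


# Values taken from the response: ~3-day doubling time, 7 days of culture.
control_fold = fold_expansion(days=7, doubling_time_days=3)
print(f"Expected expansion of untreated cells: {control_fold:.2f}-fold")
```

Under this assumption untreated cells would expand roughly 5-fold over the culture period, which is the same order as the 4-fold higher seeding density used for the testosterone-treated group.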
(11) The flux analyzer assays sold by Agilent have many ambiguities and problems of interpretation. Unfortunately, Agilent's interest in marketing/sales has outpaced their interest in scientific rigor. Please consider revising some of the language regarding the measurements. For example, 'ATP production rate' is not directly measured. Rather, oligomycin-sensitive respiration rate is measured. The conversion of OCR to ATP production rate is an estimation that depends on complex assumptions often requiring additional testing and validation. The same is true for other ambiguous terms such as 'maximal respiration' referring to FCCP uncoupled respiration, and glycolytic rate- which is also not measured directly. If the authors are interested in a more detailed description of the problems with Agilent's interpretation of these assays please see the following reference (PMID: 34461088).
Thank you for your critical comments and thoughtful advice, as well as for sharing the excellent reference. We agree with you on the flux analyzer ambiguities and data interpretation problems. The description of the measured values has been revised as follows.
We have replaced the “ATP production rate” with the “oligomycin-sensitive respiratory rate.” Similarly, we have replaced “maximal respiration” with “FCCP-induced unbound respiration.” (lines 197-202) We chose not to deal with the conversion of OCR to ATP production rate because it is outside the scope of interest in our study.
We now avoid the term "glycolytic capacity" and use “Oligomycin-sensitive ECAR” instead (line 186). We recognize that the ECARs measured in this study reflect experimental conditions and may not fully represent physiological glycolytic flux in vivo. Therefore, the main section includes a data set of glucose uptake studies to emphasize the significance of the changes obtained with the flux analyzer assay (New Fig.6, lines 230-254).
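To make the renamed quantities concrete, here is a minimal sketch of how they are commonly derived from an oxygen consumption rate (OCR) trace and then normalized to viable cell number. The formulas follow the usual convention for this type of assay and are an assumption, not the authors' confirmed calculation; all function names and numbers are hypothetical.

```python
from statistics import mean


def oligomycin_sensitive_ocr(basal, post_oligomycin):
    """Mean basal OCR minus mean OCR measured after oligomycin injection."""
    return mean(basal) - mean(post_oligomycin)


def fccp_uncoupled_ocr(post_fccp, post_rot_aa):
    """Peak OCR after FCCP minus mean OCR after rotenone/antimycin A."""
    return max(post_fccp) - mean(post_rot_aa)


def normalize(rate, viable_cells, per=1000):
    """Express a rate per `per` viable cells seeded in the well."""
    return rate * per / viable_cells


# Hypothetical OCR readings (pmol O2/min) for one well.
basal = [100, 102, 98]
post_oligomycin = [40, 38, 42]
post_fccp = [150, 160, 155]
post_rot_aa = [20, 22, 18]

oligo = oligomycin_sensitive_ocr(basal, post_oligomycin)
fccp = fccp_uncoupled_ocr(post_fccp, post_rot_aa)
oligo_per_1000_cells = normalize(oligo, viable_cells=20000)
```

Reporting the subtraction itself, rather than a converted "ATP production rate", keeps the reported value tied to what the instrument measures.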
Figure 6, it's not surprising to see the accumulation of labeled oleic acid in the cells, however, this does not mean that oleic acid is participating in normal metabolic processes. Oleic acid will have detergent effects at high (uM) concentrations. The observation that sperm 'take up' OA at 10-100 uM concentrations should also be validated against sperm function, as the health of the cells is very likely to be negatively impacted. Additionally, no apparent accumulation is noted in the fluorescence imaging at 1uM, but the authors insinuate that uptake occurs at low nM concentrations. The effects in Figure 6D-F are nominal at best and are likely a result of the small sample sizes.
Thank you for your good suggestion. We agree with the reviewer that high concentrations of oleic acid have a detergent effect. To improve the consistency of functional data and observations, oleic acid uptake tests were performed under the same concentration range as the sperm motility tests (New Fig.7A-C). The oleic acid concentrations were chosen with reference to the oleic acid concentration in seminal fluid recovered from mice, as detected by GC-MS, to reflect in vivo conditions.
Epididymal sperm were incubated with fluorescently labeled oleic acid and observed after quenching of extracellular fluorescence. Fluorescent signals were detected selectively in the midpiece of the sperm. The fluorescence intensity of sperm quantified by flow cytometry increased significantly in a dose-dependent manner (New Fig.7A-C, lines 261-264).
Furthermore, increasing the sample size did not change the trend of the sperm motility data. Although the effect size of oleic acid on sperm motility was small (New Fig.7D-G, lines 265-268), an improvement in fertilization ability was observed both in vitro (IVF) and in vivo (AI) (New Fig.7J-L, lines 274-282, 286-291). We conclude that the effect of oleic acid on sperm is of substantial significance. These data and interpretations have been revised in the text in the Results section.
(12) Figure 6H, I applaud the authors for attempting intrauterine insemination experiments to test their previous findings. That said, there is no supporting data included to show that the sperm from the treatment groups had comparable starting viability/quality. Additionally, it is difficult to tell if the results are due to the small sample sizes and particularly the apparent outlier in the flutamide-only group.
Thank you for the praise and for the suggestions for improvement. As we answered in response to your comment #5 above, the epididymal sperm were collected from healthy mice. Therefore, there is no qualitative difference in the epididymal sperm before treatment. This is described in the figure legend (lines 1130-1131). We apologize again for the confusion. We also more than doubled the number of replications of the experiment, so the impact of the outlier should now be minimal.
(13) One final question related to Figure 6H: how did the authors know they were retrieving all of the possible 2-cell embryos from the uterus? Perhaps the authors could provide the raw counts of unfertilized eggs and 2-cell embryos so we can see if there were differences between the mice.
We retrieved the pronuclear stage embryos from the fallopian tubes. It is not certain whether all embryos were recovered. Therefore, we added the number of embryos in the graph and in the supplementary data.
(14) Figure 7 has the same seahorse assay normalization problem as mentioned earlier. Without normalization, it is difficult to tell if the effects are simply due to differences in cell mass. Were the replicates indicated in the graphs run on the same plate? If so, it would be much more convincing to see a nested design, with technical replicates within plates, and additional replicates run on separate plates.
As we answered in response to your comment #8 above, these measurements were normalized based on sperm count. This is now noted in the text and the figure legend (lines 1123-1124).
Pooled sperm isolated and cultured from multiple mice were placed in one well. The measurements were taken in three different wells, and each experiment was repeated four times. We did not use the XFe24 or XFe96 extracellular flux analyzers. Because we used the XF HS Mini, which takes an 8-well plate (at most 6 samples per run, since 2 wells were used for calibration), the measurements had to be repeated across separate runs.
(15) The statistical test in Figures 8E and F described in the legend is inappropriate (t-test), this appears to be a two-factor design.
Thank you for pointing this out. Differences between groups were assessed using a two-way analysis of variance (ANOVA). When the two-way ANOVA was significant, differences among values were analyzed using Tukey's honest significant difference test for multiple comparisons.
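As an illustration of the two-factor analysis described above, the following sketch computes the F statistics of a balanced two-way ANOVA from first principles. It is not the authors' analysis code: the factor names and data are hypothetical, a balanced design is assumed, and the p-values and Tukey post hoc comparisons are left to a statistics package.

```python
from statistics import mean


def two_way_anova(cells):
    """F statistics for a balanced two-factor design.

    `cells` maps (level_a, level_b) to a list of replicate values.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))
    if any(len(v) != n for v in cells.values()):
        raise ValueError("balanced design required")

    grand = mean(x for v in cells.values() for x in v)
    a_mean = {a: mean(x for b in b_levels for x in cells[(a, b)]) for a in a_levels}
    b_mean = {b: mean(x for a in a_levels for x in cells[(a, b)]) for b in b_levels}
    cell_mean = {k: mean(v) for k, v in cells.items()}

    # Sums of squares for main effects, interaction, and within-cell error.
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum(
        (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a in a_levels
        for b in b_levels
    )
    ss_w = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_ab = df_a * df_b
    df_w = len(a_levels) * len(b_levels) * (n - 1)
    ms_w = ss_w / df_w
    return {
        "F_A": (ss_a / df_a) / ms_w,
        "F_B": (ss_b / df_b) / ms_w,
        "F_AxB": (ss_ab / df_ab) / ms_w,
        "df_within": df_w,
    }


# Hypothetical data: factor A = treatment, factor B = genotype.
data = {
    ("vehicle", "control"): [1, 2, 3],
    ("vehicle", "knockdown"): [2, 3, 4],
    ("testosterone", "control"): [4, 5, 6],
    ("testosterone", "knockdown"): [8, 9, 10],
}
result = two_way_anova(data)
```

Unlike pairwise t-tests, this decomposition reports the interaction term separately, which is the quantity of interest when one treatment is expected to modify the effect of the other.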
(16) The data in Figure 8 are interesting, and the effects appear to be a little more consistent compared with the mouse primary cells, potentially due to cell uniformity. However, the data are unnormalized, causing significant ambiguity, and there are no measures of cell viability to determine if the effects are due to cell death (or at least relative cell mass).
As we answered in response to your comments #8 and #14 above, these measurements were normalized based on cell count and viability. This is now noted in the figure legend (lines 1185-1186).
Minor Comments:
(1) The section title indicating the beginning of the results section is missing.
A section title has been added to indicate the beginning of the results section.
(2) There were several typos and confusingly worded statements throughout. Please consider additional editing.
We used a proofreading service and corrected as much as possible.
(3) In the introduction, a brief description of seminal fluid physiology is provided, but the reference is directed toward human physiology. Given that the research is performed solely in the mouse, a brief comparative description of mouse physiology would be helpful. For example, what is the role of mouse seminal fluid in the formation of the mating plug? What are the implications of the relative size disparity in seminal vesicles in mice versus humans? Etc.
The third paragraph of the introduction has been revised (lines 57-60).
Reviewer #2 (Recommendations For The Authors):
Thank you for allowing us to strengthen our manuscript with your valuable comments and queries. We have made our best efforts to reflect your feedback.
(1) The abstract is confusing and partly misleading and should be revised to more clearly and accurately summarize the study.
The abstract was revised to be clearer and more accurate (lines 20-34).
(2) The introduction should be revised to more accurately describe the sperm life cycle. Spermatogenesis, per definition, for example, exclusively takes place in the testis, sperm do not gain fertilization competence in the epididymis, sperm isolated from the epididymis cannot fertilize an oocyte unless in vitro capacitated, etc. In the last paragraph the connection between changes in fructose and citrate concentration, sperm metabolism and testicular-derived testosterone and AR remain unclear.
The introduction was revised to be clearer and more accurate (lines 44-45).
Citric acid and fructose are chemical components that are the subject of biochemical testing and are commonly used as semen testing items for humans and livestock. This is because the secretory function of the prostate and seminal vesicles is dependent on androgens. The measurement of citric acid and fructose concentrations in semen is routinely used to indicate testicular androgen production function (ISBN: 978-1-4471-1300-3, 978 92 4 0030787).
(3) Throughout the manuscript the concept of (in vitro) capacitation is missing. Mixing sperm with seminal plasma is not the only way to achieve sperm that can fertilize the oocyte. Since media containing bicarbonate and albumin is the standard procedure in the field to capacitate epididymal mouse sperm in vitro, the manuscript would gain value from a comparison between the effect of seminal plasma and in vitro capacitating media. Interesting readouts in addition to motility would i.e. be sAC activation, PKA-substrate phosphorylation, and acrosomal exocytosis.
Thank you for raising this important point. As the reviewer points out, fertilization can be achieved in artificial insemination and in vitro fertilization using epididymal sperm which have not been exposed to seminal plasma. This has historically led to an underestimation of the role of accessory reproductive glands, such as the prostate and seminal vesicles. However, it has been reported that the removal of seminal vesicles in rodents decreases the fertilization rate after natural mating. This has been shown to be due to multiple factors affecting sperm motility rather than factors involved in plug formation (PMID: 3397934), but details of these factors and the whole picture of the role of the accessory glands were not known. This led us to become interested in the effects of seminal plasma on sperm beyond fertilization, and to begin research on the role of the accessory glands that synthesize seminal plasma.
Early in our study, we found that simply exposing sperm to seminal vesicle extracts for 1 hour before IVF dramatically reduced fertilization rates, even in HTF medium containing bicarbonate and albumin. The experiment was designed on the assumption that seminal plasma contains factors that inhibit sperm from acquiring fertilizing ability. Therefore, we conducted experiments using modified HTF without albumin to avoid unintended motility patterns.
However, we also respect the reviewer's opinion, and we have added our preliminary data related to IVF (New supplementary Fig.5).
(4) In the introduction and throughout the manuscript it is unclear what the authors mean by "linear motility". An increase in VSL doesn't mean that the sperm swim in a more linear or straight way, or even that the sperm are 'straightened', it means that they swim faster from point A to point B. Do the authors mean progressive or hyperactivated motility? Please clarify.
For all conditions tested the authors should follow the standard in the field and include the % of motile, progressively motile, and hyperactivated sperm.
Thank you for pointing this out. We appreciate your feedback regarding the terminology. In our manuscript, "linear motility" refers to the degree to which sperm move in a straight line. We have clarified this by explaining that VSL (Straight-Line Velocity) and LIN (Linearity) are used to quantify and describe linear motility in sperm analysis: Higher VSL values indicate more direct, linear movement. A higher LIN value indicates a straighter path, thus representing greater linear motility. These terms have been standardized, and explanations have been added to the main text (lines 111-113).
In response to your suggestion, we have included the percentage of motility and progressive motility for all conditions tested. However, since the experiment was performed using modified HTF without albumin, we have decided not to report the percentage of hyperactivation to avoid confusion.
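The relationship between the kinematic terms discussed above can be sketched as follows, using the standard CASA definitions: VCL is the velocity along the actual point-to-point path, VSL is the velocity along the straight line from first to last position, and LIN = VSL / VCL. The function name, track coordinates, and frame rate are hypothetical, and this is not the analyzer's own implementation.

```python
import math


def kinematics(track, frame_rate_hz):
    """Compute VCL, VSL and LIN from (x, y) head positions in µm, one per frame."""
    duration_s = (len(track) - 1) / frame_rate_hz
    path_length = sum(math.dist(p, q) for p, q in zip(track, track[1:]))
    straight_length = math.dist(track[0], track[-1])
    vcl = path_length / duration_s      # curvilinear velocity
    vsl = straight_length / duration_s  # straight-line velocity
    lin = vsl / vcl if vcl else 0.0     # linearity, between 0 and 1
    return {"VCL": vcl, "VSL": vsl, "LIN": lin}


# Two hypothetical tracks sampled at 2 frames per second.
straight_track = [(0, 0), (3, 4), (6, 8)]
zigzag_track = [(0, 0), (3, 4), (6, 0)]

straight = kinematics(straight_track, frame_rate_hz=2)
zigzag = kinematics(zigzag_track, frame_rate_hz=2)
```

In this toy example both tracks have the same VCL, but the zigzag track has a lower VSL and a LIN of 0.6 compared with 1.0 for the straight track. Because VSL alone also scales with swimming speed, reporting it together with the LIN ratio separates path straightness from speed.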
(5) Did the authors confirm that the injection of flutamide decreases androgen levels? That control needs to be included in the experiment to validate the conclusion.
Injection of flutamide did not reduce androgen levels (see reviewer #1, comment 6). This is because flutamide's mechanism of action is based on antagonizing androgen and inhibiting its binding to the androgen receptor (New Fig.2A).
(6) The role of mitochondrial activity in sperm progressive motility is still under investigation. PMID: 37440924 i.e. showed that inhibition of the ETC does not affect progressive but hyperactivated motility. The authors should either include additional experiments to confirm the correlation between mitochondrial activity and sperm progressive motility or tone down that conclusion.
We have previously shown that treatment with D-chloramphenicol, an inhibitor of mitochondrial translation, significantly reduced sperm mitochondrial membrane potential, ATP levels, and linear motility (PMID: 31212063). Also, in the previous manuscript, we did not address progressive motility or hyperactivated motility in our analysis. We have chosen to discuss the effect of mitochondrial activity on linear motility rather than on progressive motility and hyperactivation of sperm.
Was mitochondrial activity also altered in epididymal sperm incubated with and without seminal plasma or in aged mice?
The mitochondrial membrane potential of epididymal sperm cultured with seminal vesicle extract (SV) was higher than that of epididymal sperm cultured without seminal vesicle extract (without SV: 67.3 ± 0.8%, with SV: 83.4 ± 1.8%). On the other hand, the mitochondrial membrane potential of epididymal sperm cultured with seminal vesicle extract recovered from aged mice was decreased (SV from aged: 60.3 ± 2.7%). It should be noted that the epididymal spermatozoa used in these experiments were obtained from healthy individuals, different from those from which the seminal vesicle extracts were collected. (See also the response to reviewer 1's comment #5.)
(7) The quality of the provided images showing AR, Ki67, and TUNEL staining should be improved or additional images should be included. Especially the AR staining is hard to detect in the provided images. The authors should also include a co-staining between AR and vesicle epithelial cells. That epithelial cells are multilayered does not come across in the pictures provided.
We apologize for any inconvenience caused. The image has been replaced with one of higher resolution, in which the multilayered structure of the epithelial cells can also be seen.
For the 12-month-old mice, an age-matched control should be included to support the authors' conclusion.
To clarify the seminal vesicle changes associated with aging, we included images of 3-month-old mice as controls (New Supplementary Fig.2D).
Overall, the rationale for the experiment does not become clear. How are the amount of seminal vesicle epithelial cells, testosterone, and AR expression connected to seminal plasma secretions? Why is it a disadvantage to have proliferating seminal vesicle epithelial cells? How is proliferation connected to the proposed switch in metabolic pathway activity?
We have added some explanations and supporting data to the manuscript (New Fig.8D, lines 303-305, 315-319, 369-379). Cell proliferation stopped when the metabolic shift occurred, redirecting glucose toward fatty acid synthesis. Fatty acid synthesis is an important function of the seminal vesicle, and in the presence of testosterone, fatty acid synthesis enhancement and arrest of proliferation occur simultaneously. The connection between metabolism and cell proliferation was further demonstrated when ACLY was knocked down by shRNA, which stopped fatty acid synthesis and released the proliferative arrest induced by testosterone, allowing the cells to proliferate again. However, we do not know what effects occur when cell proliferation is stopped.
(8) The experiments provided for glycolysis and oxphos are inconsistent and insufficient to support the authors' conclusion that testosterone shifts glycolytic and oxphos activity of seminal vesicle epithelial cells. Multiple groups (PMID 37440924, 37655160, 32823893) have shown that the increased flux through central carbon metabolism during capacitation is accompanied by an accumulation of intracellular lactate and increased secretion of lactate into the surrounding media. How do the authors explain that they see an increase in glucose uptake and ECAR but not in lactate and a decrease in pyruvate? Did the authors additionally quantify intracellular pyruvate and lactate? Since pyruvate and lactate are in constant equilibrium, it is odd that one metabolite is changing and the other one is not.
Thank you for pointing this out. Since ECAR is often used as an alternative to lactate production but does not directly measure lactate levels, we measured changes in lactate and pyruvate concentrations in the culture medium. Under our experimental conditions, glucose appeared to be directed primarily towards anabolic processes, such as fatty acid synthesis, rather than the OXPHOS pathway, which may explain the lack of lactate production. The observed decrease in pyruvate might indicate its conversion to acetyl-CoA in the mitochondria, supporting both fatty acid synthesis and the TCA cycle. This shift would be consistent with the metabolic reprogramming toward anabolic activity.
What do the authors mean by "the glycolytic pathway was not enhanced despite the activation of glycolysis"? Seahorse, especially using a series of pathway inhibitors, only provides an indirect measurement of glycolysis and oxphos since the instrument does not provide a distinction from which pathways the detected protons are originating. The authors should consider a more optimized experimental design, i.e. the authors could monitor ECAR and OCR in the presence of glucose over time with and without the addition of testosterone. That would be less invasive since the sperm are not starved at the beginning of the experiment and would provide a more direct read-out. Did the authors normalize cell numbers in their experiment? Alternatively, the authors could consider performing metabolomics experiments.
We agree with the reviewer. Buzzwords such as “glycolytic capacity” simply do not make sense, so we have removed them from the phrases noted by the reviewer. Please refer to our responses to reviewer 1's points regarding the ambiguity of the data measured by the flux analyzer. Nevertheless, the flux analysis assay can serve as a good “starting point” and provides information on the glycolytic system and respiratory control. Therefore, the interpretation of the flux analysis is supported by subsequent data sets.
(9) The authors would strengthen their results by confirming their gene expression data by quantifying the expression of the respective proteins.
Does testosterone treatment increase GLUT4 protein levels in isolated seminal vesicle epithelial cells? Or does it change the localization of the transporter? Are GLUT4 gene and protein levels altered in flutamide-treated cells? How do the authors explain that testosterone increases glucose uptake without changing Glut gene expression?
We performed Western blot analysis to measure GLUT4 protein levels in seminal vesicle epithelial cells after testosterone treatment. The results showed that testosterone does not alter the expression of GLUT4 protein but simply changes its subcellular localization (New Fig.6C,D, lines 238-244).
The discussion includes the interpretation of the observation that testosterone increases glucose uptake by altering localization without altering GLUT4 gene expression, a phenomenon commonly seen in other cells, such as cardiomyocytes (lines 362-364). The revised main figure also includes a data set of changes in GLUT4 localization, including flutamide-treated data. See also Reviewer 3's main comment #1.
(10) Considering that the authors claim that SV secretions are crucial for sperm fertilization capacity, how do they explain that fertilization rates are still at 40 % when sperm are treated with flutamide?
The fertilization rate is actually about 50% with HTF alone, because fertilization can occur without SV secretions. Considering this baseline, we found that seminal vesicle secretions positively affect sperm in vivo fertilization. On the other hand, seminal plasma from flutamide-treated mice reduced the fertilization ability of healthy sperm. These points are described in the text (lines 283-294).
(11) It would be beneficial for the reader to include a schematic summarizing the results.
Thank you for your advice from the reader's point of view. We have visualized the summaries of this study and added them to the manuscript (New Fig.10).
Minor comments:
Line 38: Male fertility, no article, please revise.
We have changed “The male fertility” to “Male fertility” and added some references (lines 42-43).
Line 55: Seminal plasma or TGFb? Please clarify.
Corrected as follows. “TGFβ, a component of seminal plasma, increases antigen-specific Treg cells in the uterus of mice and humans, which induces immune tolerance, resulting in pregnancy.” (lines 60-62)
Line 63: Why do the authors find it surprising that blood and seminal plasma have different compositions?
This is because seminal plasma contains unique biochemical components that are not normally found in blood or only in small quantities. The intention was to emphasize the unique function of seminal plasma in supporting the physiological functions of sperm and to highlight its complex role by comparing it to blood. We clarified these intentions and reflected them in the revised text (lines 62-67).
Line 94: The headline causes confusion. Seminal plasma does not induce sperm motility, it increases progressive sperm motility.
Corrected as follows. “The effect of androgen-dependent changes in mouse seminal vesicle secretions on the linear motility of sperm” (lines 101-102)
Reviewer #3 (Recommendations For The Authors):
Thank you for allowing us to strengthen our manuscript with your valuable comments and queries. We have made our best efforts to reflect your feedback.
Major:
Figure 4 and Figure 5: The trend shows that GLUT3 is up-regulated and GLUT4 is downregulated although both of them are not statistically significant. However, GLUT4 is picked for all the following experiments based on protein localization. Providing other evidence/discussion why not to further consider other GLUTs will help to justify. Also, this reviewer suggests including GLUT4 localization data in the main figure as it is important data for the logical flow to link the following figures.
We focused on GLUT4 because it was known that testosterone increases glucose uptake by changing the localization of GLUT4 without changing its expression (lines 230-231). In the revised manuscript, the increasing trend in Glut3 gene expression was also mentioned in the discussion, in addition to GLUT4 (lines 360-362). In any case, the results showed that testosterone increased glucose uptake by regulating the function of glucose transporters.
Immunostaining of GLUT1-4 was performed to compare seminal vesicles from flutamide-treated mice with controls, and localization changes were observed only in GLUT4. Therefore, we hypothesized that GLUT4 is regulated by testosterone and performed the experiment. Fortunately, we were able to obtain a GLUT4-specific inhibitor, which dramatically inhibited the testosterone-dependent glucose uptake and subsequent lipid synthesis in seminal vesicle epithelial cells, leading us to believe that GLUT4 is a major glucose transporter.
Increasing sperm linearity by oleic acid is observed and interpreted as enhanced sperm fertilizing potential. It is not clear why and how sperm linearity can be a determinant factor for enhancing sperm fertility in vivo. Providing an explanation of the effect of oleic acid on another key motility parameter more proven to be directly correlated with fertility (i.e., hyperactivation), and more direct evidence of oleic acid on enhancing sperm linearity indeed increasing sperm fertilization using IVF, is strongly recommended to support the author's main conclusion.
Thank you for pointing this out. It is known that proteins derived from the seminal vesicles inhibit the hyperactivation of sperm and the acrosome reaction. Therefore, we conducted an experiment to add oleic acid, focusing on fatty acid synthesis caused by the metabolic shift of the seminal vesicles, which had not been known until now.
Sperm were pretreated with an oleic acid-containing medium before IVF, and oleic acid enhanced sperm linearity. When the sperm number was sufficient, there was no change in the cleavage rate after in vitro fertilization, but when the sperm count was reduced to one-tenth of the normal, the cleavage rate increased compared to the control (lines 274-282). In other words, the physiological role of oleic acid is to increase the probability of fertilization by keeping the sperm motility pattern linear or progressive. This increases the likelihood of the sperm passing through the female reproductive tract and environments that are unfavorable to sperm survival. Our research has uncovered significant insights into the role of seminal vesicle fluid and oleic acid in sperm fertilization. Due to the strong effect of the decapacitation factor, we found that seminal vesicle fluid reduces the fertilization rate in IVF. However, it does not interfere with the fertilization rate in vivo during artificial insemination. This emphasizes the importance of oleic acid, along with other protein components of seminal plasma, in ensuring the in vivo fertilization ability of sperm.
Minor:
Please correct a typo in Line 173: sifts to shifts
All typographical errors have been corrected.
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Cystinosis is a rare hereditary disease caused by biallelic loss of the CTNS gene, encoding two cystinosin protein isoforms; the main isoform is expressed in lysosomal membranes where it mediates cystine efflux whereas the minor isoform is expressed at the plasma membrane and in other subcellular organelles. Sur et al proceed from the assumption that the pathways driving the cystinosis phenotype in the kidney might be identified by comparing the transcriptome profiles of normal vs CTNS-mutant proximal tubular cell lines. They argue that key transcriptional disturbances in mutant kidney cells might not be present in non-renal cells such as CTNS-mutant fibroblasts.
Using cluster analysis of the transcriptomes, the authors selected a single vacuolar H+ATPase (ATP6V0A1) for further study, asserting that it was the "most significantly downregulated" vacuolar H+ATPase (about 58% of control) among a group of similarly downregulated H+ATPases. They then showed that exogenous ATP6V0A1 improved CTNS(-/-) RPTEC mitochondrial respiratory chain function and decreased autophagosome LC3-II accumulation, characteristic of cystinosis. The authors then treated mutant RPTECs with 3 "antioxidant" drugs, cysteamine, vitamin E, and astaxanthin (ATX). ATX (but not the other two antioxidant drugs) appeared to improve ATP6V0A1 expression, LC3-II accumulation, and mitochondrial membrane potential. Respiratory chain function was not studied. RPTEC cystine accumulation was not studied.
As an initial step, we assessed respiratory chain function with the Seahorse Mito Stress Test, demonstrating that the genetic manipulations (knocking out the CTNS gene and plasmid-mediated correction of ATP6V0A1 expression) affect mitochondrial energetics. We did not perform the respirometry-based assays that can localize deficiencies in electron transport; we plan to address these in a follow-up paper.
We would like to draw attention to Figure 3D, where cystine accumulation has been studied. This figure demonstrates an increased intracellular accumulation of cystine.
The major strengths of this manuscript reside in its two primary findings.
(1) Plasmid expression of exogenous ATP6V0A1 improves mitochondrial integrity and reduces aberrant autophagosome accumulation.
(2) Astaxanthin partially restores suboptimal endogenous ATP6V0A1 expression.
Taken together, these observations suggest that astaxanthin might constitute a novel therapeutic strategy to ameliorate defective mitochondrial function and lysosomal clearance of autophagosomes in the cystinotic kidney. This might act synergistically with the current therapy (oral cysteamine) which facilitates defective cystine efflux from the lysosome.
There are, however, several weaknesses in the manuscript.
(1) The reductive approach that led from transcriptional profiling to focus on ATP6V0A1 is not transparent and weakens the argument that potential therapies should focus on correction of this one molecule vs the other H+ ATPase transcripts that were equally reduced - or transcripts among the 1925 belonging to at least 11 pathways disturbed in mutant RPTECs.
The transcriptional profiling studies on ATP6V0A1 have been fully discussed and publicly shared. Table 2 lists the v-ATPase transcripts that are significantly downregulated in cystinosis RPTECs. We have also clarified and justified the choice of further studies on ATP6V0A1, where we state the following: "The most significantly perturbed member of the V-ATPase gene family found to be downregulated in cystinosis RPTECs is ATP6V0A1 (Table 2). Therefore, further attention was focused on characterizing the role of this particular gene in a human in vitro model of cystinosis."
(2) A precise description of primary results is missing -- the Results section is preceded by or mixed with extensive speculation. This makes it difficult to dissect valid conclusions from those derived from less informative experiments (eg data on CDME loading, data on whole-cell pH instead of lysosomal pH, etc).
We appreciate the reviewer highlighting areas for further improving the manuscript's readership. In our resubmission, we have revised the results section to provide a more precise description of the primary findings and restrict the inferences to the discussion section only.
(3) Data on experimental approaches that turned out to be uninformative (eg CDME loading, or data on whole-cell pH assessment with BCECF).
We have reported our data whether or not they proved informative. Although measuring lysosome-specific pH would be important, it was not possible in our cells: they were very sick and the assay did not work. We therefore report pH assessment with BCECF, which measures whole-cell pH, that is, the combined pH of the cytoplasm and the organelles, and which is still informative.
(4) The rationale for the study of ATX is unclear and the mechanism by which it improves mitochondrial integrity and autophagosome accumulation is not explored (but does not appear to depend on its anti-oxidant properties).
We have provided the rationale for the study of ATX in the Introduction and Results sections, where we state the following: “correction of ATP6V0A1 in CTNS-/- RPTECs and treatment with antioxidants specifically, astaxanthin (ATX) increased the production of cellular ATP6V0A1, identified from a custom FDA-drug database generated by our group, partially rescued the nephropathic RPTEC phenotype. ATX is a xanthophyll carotenoid occurring in a wide variety of organisms. ATX is reported to have the highest known antioxidant activity and has proven to have various anti-inflammatory, anti-tumoral, immunomodulatory, anti-cancer, and cytoprotective activities both in vivo and in vitro”.
We are still investigating the mechanism by which ATX improves mitochondrial integrity, and this will be the focus of a follow-on manuscript.
(5) Thoughtful discussion on the lack of effect of ATP6V0A1 correction on cystine efflux from the lysosome is warranted, since this is presumably sensitive to intralysosomal pH.
In the revised manuscript, we have included a detailed discussion on the plausible reasons why ATP6V0A1 correction has no effect on cystine efflux from the lysosome. We have now added to the Discussion – “However, correcting ATP6V0A1 had no effect on cellular cystine levels, likely because cystinosin is known to have multiple roles beyond cystine transport. Cystinosin is demonstrated to be crucial for activating mTORC1 signaling by directly interacting with v-ATPases and other mTORC1 activators. Cystine depletion using cysteamine does not affect mTORC1 signaling. Our data, along with these observations, further supports that cystinosin has multiple functions and that its cystine transport activity is not mediated by ATP6V0A1.”
(6) Comparisons between RPTECs and fibroblasts cannot take into account the effects of immortalization on cell phenotype (not performed in fibroblasts).
The purpose of examining different tissue sources of primary cells in nephropathic cystinosis was to assess if any of the changes in these cells were tissue source specific. We used primary cells isolated from patients with nephropathic cystinosis—RPTECs from patients' urine and fibroblasts from patients' skin—these cells are not immortalized and can therefore be compared. This is noted in the results section - “Specific transcriptional signatures are observed in cystinotic skin-fibroblasts and RPTECs obtained from the same individual with cystinosis versus their healthy counterparts”.
We next utilized the immortalized RPTEC cell line to create CRISPR-mediated CTNS knockout RPTECs as a resource for studying the pathophysiology of cystinosis. These cells were not compared to the primary fibroblasts.
(7) This work will be of interest to the research community but is self-described as a pilot study. It remains to be clarified whether transient transfection of RPTECs with other H+ATPases could achieve results comparable to ATP6V0A1. Some insight into the mechanism by which ATX exerts its effects on RPTECs is needed to understand its potential for the treatment of cystinosis.
In future studies, we will further investigate the effect of ATX on RPTECs for the treatment of cystinosis. This will require Phase 1 and Phase 2 clinical studies, which are beyond the scope of the current manuscript.
Reviewer #2 (Public Review):
Sur and colleagues investigate the role of ATP6V0A1 in mitochondrial function in cystinotic proximal tubule cells. They propose that loss of cystinosin downregulates ATP6V0A1 resulting in acidic lysosomal pH loss, and adversely modulates mitochondrial function and lifespan in cystinotic RPTECs. They further investigate the use of a novel therapeutic Astaxanthin (ATX) to upregulate ATP6V0A1 that may improve mitochondrial function in cystinotic proximal tubules.
The new information regarding the specific proximal tubular injuries in cystinosis identifies potential molecular targets for treatment. As such, the authors are advancing the field in an experimental model for potential translational application to humans.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
(1) There is a lack of care with precise wording and punctuation, which negatively affects the text. Importantly, the manuscript lacks a clear description of experimental Results. This section begins with speculation, then wanders through experimentation that didn't work (could be deleted). Figure 1A and lines 94-102 could be deleted. Data from CDME loading was found to be a "poor surrogate" for cystinosis and could be deleted from the manuscript or mentioned as a minor point in the discussion. The number of individual patient cell lines used for experimentation is unclear - 8 patients are mentioned on line 109, Figure 2B shows 6 normal fibroblasts, 3 CDME-loaded fibroblasts, and an indeterminate number of normal vs CDME-loaded cells (both colored red). Cluster analysis refers to two large gene clusters - data supporting this key conclusion is not shown. It is unclear why ATP6V0A1 was selected as the most significantly reduced H+ATPase from Table II. Thus, the focus on this particular gene appears to be largely "a hunch".
In this study, we aim to establish a new concept by using multiple cell types and various assays tailored to each affected organelle, which might be confusing. Therefore, we believe Figure 1a provides a roadmap and helps clarify what to expect from this paper.
This study began a decade ago, when CDME-mediated lysosomal loading was regularly used as a surrogate in vitro model to study cystinosis tissue injury. That is why CDME was included in the study design. Because we already had the CDME-treated data, and because this article presents a superior in vitro model of cystinosis, we would like to retain these data.
In the Results and Methods sections, we mention “8 patients” with nephropathic cystinosis from whom we collected the RPTECs and fibroblasts. These cystinotic cells are shown as blue and purple dots, respectively, in Figure 2B. Normal RPTECs and fibroblasts were purchased commercially, and these cells were then treated with CDME to artificially load lysosomes with cystine. Details on the cell types and their procurement can be found in the Methods section under “Study design and Samples”. Normal and CDME-loaded RPTECs are shown as red and orange dots, whereas normal and CDME-loaded fibroblasts are shown as green and yellow dots, respectively, in Figure 2B.
We removed this figure from the manuscript because the data is already detailed in Tables 1 and 2. As a sub-figure, the STRING pathway analysis output was illegible and did not add any new information. However, for your reference, we have now provided this data below.
Author response image 1.
STRING pathway analysis using the microarray transcriptomic data from normal vs. cystinotic RPTECs. Using k-means clustering on the genes in these significantly enriched pathways, we identified 2 distinct clusters, shown as red and green nodes. Red nodes are enriched in nucleus-encoded mitochondrial genes and the v-ATPase family, which are crucial for lysosomes and kidney tubular acid secretion. ATP6V0A1, the topmost v-ATPase in our cystinotic transcriptome dataset, is highlighted in cyan. Green nodes are enriched in genes needed for DNA replication.
(2) It was decided to use transcriptional profiling of CTNS mutant vs wildtype renal proximal tubular cells (RPTECs) as a way to uncover defective secondary molecular pathways that might be upstream drivers of the cystinosis phenotype. Since the kidneys are the first organs to deteriorate in cystinosis, it is postulated that transcriptome differences might be more obvious in kidney cells than in non-renal tissues, such as fibroblasts. A potential pitfall is that the RPTECs were transformed cell lines whereas fibroblasts were not.
Transcriptional profiling was done on primary cells isolated from patients with nephropathic cystinosis—RPTECs from patients' urine and fibroblasts from patients' skin—these cells are not immortalized and can therefore be compared. This is noted in the results section - “Specific transcriptional signatures are observed in cystinotic skin-fibroblasts and RPTECs obtained from the same individual with cystinosis versus their healthy counterparts”.
We utilized the immortalized RPTEC cell line to create CRISPR-mediated CTNS knockout RPTECs as a resource for studying the pathophysiology of cystinosis. These cells were not compared to the primary fibroblasts.
(3) The authors wanted to study intralysosomal pH but could not, so used a pH-sensitive dye that reflects whole cell pH. It would be incorrect to take this measurement as support for their hypothesis that intralysosomal pH is increased. Since these experiments cannot be interpreted, they should be deleted from the manuscript.
We have now corrected the term to "intracellular pH." Although measuring lysosome-specific pH would be important, it was not feasible in our cells: knocking out the cystinosin gene made them fragile, and the assay did not work. Therefore, we provide data on pH assessment using BCECF, which measures the overall pH of the cytoplasm and organelles. This information is still valuable for understanding whole-cell pH, which encompasses both organelle and cytoplasmic pH. We have mentioned this as one of our limitations in the Discussion section.
(4) The choice of ATX as a potential therapy is puzzling. Its antioxidant properties seem to be irrelevant since two other antioxidants had no effect. The mechanism by which it appears to correct some aspects of the cystinosis phenotype remains unknown and this should be pointed out. A key experiment to assess whether ATX reduces lysosomal cystine accumulation is missing. While the impact of ATX on cystinosis is interesting, the mechanism is unexplored.
A detailed study on the mechanism by which ATX corrects certain aspects of the cystinosis phenotype is currently underway and will be presented in a follow-up paper. We have measured the effect of ATX and cysteamine, both individually and combined, on cystine accumulation using HPLC, as shown in the figure below. Our results indicate a significant increase in cystine levels with ATX treatment alone, while the combined ATX and cysteamine treatment significantly reduced cystine accumulation to the normal level. This suggests that ATX addresses specific aspects of the cystinosis phenotype through a different mechanism, not by reducing the accumulated cystine levels. When co-administered with cysteamine, they have the potential to complement each other's shortcomings. We believe that the increase in cystine with ATX alone may be due to interactions between ATX's ketone or hydroxyl groups and cystine's amine or carboxylic groups. Further research on this interaction is ongoing.
We have now added to the Discussion – “We noticed a significant increase in cystine levels with ATX treatment alone (data not shown in the manuscript), while the combined ATX and cysteamine treatment significantly reduced cystine accumulation to the normal level. This may suggest that when co-administered with cysteamine, they have the potential to complement each other's shortcomings. We believe that the increase in cystine with ATX alone could be due to interactions between ATX's ketone or hydroxyl groups and cystine's amine or carboxylic groups. Further research on this interaction is ongoing.”
Author response image 2.
(5) The effects of exogenous ATP6V0A1 are interesting but had no effect on lysosomal cystine efflux, a hallmark of the cystinosis cellular phenotype. A discussion of this issue would be important.
In the revised manuscript, we have included a detailed discussion on the plausible reasons why ATP6V0A1 correction has no effect on cystine efflux from the lysosome. We have added to the Discussion – “However, correcting ATP6V0A1 had no effect on cellular cystine levels (Figure 7C), likely because cystinosin is known to have multiple roles beyond cystine transport. Cystinosin is demonstrated to be crucial for activating mTORC1 signaling by directly interacting with v-ATPases and other mTORC1 activators. Cystine depletion using cysteamine does not affect mTORC1 signaling (47). Our data, along with these observations, further supports that cystinosin has multiple functions and that its cystine transport activity is not mediated by ATP6V0A1.”
(6) The arguments on lines 260-273 are not comprehensible. The authors confirm that RPTEC LC3-II levels are increased, a marker of active processing of autophagosome cargo, prior to delivery to lysosomes. Discussion of bafilomycin (not used), mTORC activity, and endocytosis are not directly relevant and wander from interpretation of the LC3-II observation. One possibility is that the 50% decrease in ATP6V0A1 transcript is sufficient to slow the transfer of LC3-II-tagged cargo from autophagosome to lysosome - however, it would be important to offer a plausible explanation for why decreased ATP6V0A1 expression alone does not appear to be the key limitation on lysosomal cystine efflux.
We have now rephrased our explanation in the Discussion section – “Cystinotic cells are known to have an increased autophagy or reduced autophagosome turnover rate. Autophagic flux in a cell is typically assessed by examining the accumulation of the autophagosome or autophagy-lysosome marker LC3B-II. This accumulation can be artificially induced using bafilomycin, which targets the V-ATPase, thereby inhibiting lysosomal acidification and degradation of its contents. Taken together, the observed innate increase in LC3B-II in cystinotic RPTECs (Figure 5A) without bafilomycin treatment suggests dysfunctional lysosomal acidification and thus could be linked to inhibited v-ATPase activity”.
-
-
pmc.ncbi.nlm.nih.gov pmc.ncbi.nlm.nih.gov
-
Author response:
We plan to submit a revised version of our manuscript eLife-RP-RA-2024-105013, in which we address all comments raised by the two expert reviewers.
Below we describe what we would like to address in this revision. We understand that the provisional response is not meant to be a point-by-point reply. Therefore, our revision plan summarizes the reviewers' comments more generally and describes how we plan to address them.
Reviewer #1:
This reviewer is overall very positive and states that our ‘work is likely to become the go-to resource for quantification in this field’. This reviewer raises a few weaknesses of the manuscript that are explicitly described as minor.
Microscopic resolution sufficient to support quantitative spine assessments?
In the detailed revision, we will provide quantification of microscopic resolution and will relate this to the spine comparisons offered. Where needed, we will add caveats discussing measurement limits.
Age of the human tissue.
Most of the analysis is based on the study of three brains from elderly individuals. For the analysis of dendritic spines, we added measures from a younger brain (37 years old). We will make clearer which datasets contained these measures and what the results of our comparative analysis were.
Genetic diversity contributing to species differences?
We will provide an updated discussion on this interesting topic.
Reviewer #2:
This reviewer also expresses a largely positive view of the manuscript, noting that ‘..the data will be of widespread interest to the cerebellar field…’.
Microscopic resolution:
see above.
Figure panels / Fig. 3:
We will make sure that the figures are readable and will provide a clarification of gray scales used in Fig. 3.
Vertical vs horizontal dendrite orientation:
This is a point that requires clarification. Per our definition, all dendrites fall either into the vertical or horizontal category. We will make sure that this is defined sufficiently well.
-
-
-
Author response:
Response to Referee 1
We agree that convex walls increase the time that consortia remain trapped in pores at high magnetic fields. Since the non-monotonic behavior of the drift velocity with the Scattering number arises largely due to these long trapping times, we agree that experiments using concave pores are likely to show a peak drift velocity that is diminished or erased.
However, we disagree that a random packing of spheres or similar particles provides an appropriate model for natural sediment, which is not composed exclusively of hard particles in a pure fluid. Pore geometry is also influenced by clogging. Biofilms growing within a network of convex pillars in two-dimensional microfluidic devices have been observed to connect neighboring pillars, thereby forming convex pores. Similar pore structures appear in simulations of biofilm growth between spherical particles in three dimensions. Moreover, the salt marsh sediment in which MMB live is more complex than simple sand grains, as cohesive organic particles are abundant. Experiments in microfluidic channels show that cohesive particles clog narrow passageways and form pores similar to those analyzed here. Thus, we expect convex pores to be present and even common in natural sediment where clogging plays a role.
The concentration of convex pores in the experiments presented here is almost certainly much higher than in nature. Nonetheless, since magnetotactic bacteria continuously swim through the pore space, they are likely to regularly encounter such convexities. Efficient navigation of the pore space thus requires that magnetotactic bacteria be able to escape these traps. In the original version of this manuscript, this reasoning was reduced to only one or two sentences. That was a mistake, and we thank the reviewer for prompting us to expand on this point. As the reviewer notes, this reasoning is central to the analysis and should have been featured more prominently. In the final version, we will devote considerable space to this hypothesis and provide references to support the claims made above.
The reviewer suggests that the generality of this work depends on our finding a "positive correlation between the swimming speed and alignment [rate] based on parameters derived from literature." We wish to emphasize that, in addition to predicting this correlation, our theory also predicts the function that describes it. The black line in Figure 3 is not fitted to the parameters found in the literature review; it is a pure prediction.
Response to Referee 2
In the "Recommendations for the Authors," this reviewer drew our attention to a manuscript that absolutely should have been prominently cited. As the reviewer notes, our manuscript meaningfully expands upon this work. We are pleased to learn that the phenomena discussed here are more general than we initially understood. It was an oversight not to have found this paper earlier. The final version will better contextualize our work and give due credit to the authors. We sincerely thank the reviewer for bringing this work to our attention.
We disagree that the use of non-culturable organisms and our unrealistic array should be considered serious weaknesses. While any methodological choice comes with trade-offs, we believe these choices best advance our aims. First, the goal of our research, both within and beyond this manuscript, is to understand the phenotypes of magnetotactic bacteria in nature. While using pure cultures enables many useful techniques, phenotypic traits may drift as strains undergo domestication. We therefore prioritize studying environmental enrichments.
Clearly, an array of obstacles does not fully represent natural heterogeneity. However, using regular pore shapes allows us to average over enough consortium-wall collisions to enable a parameter-free comparison between theory and experiment. Conducting an analysis like this with randomly arranged obstacles would require averaging over an ensemble of random environments, which is practically challenging given the experimental constraints. Since we find good agreement between theory and experiment in simple geometries, we are now in a position to justify extending our theory to more realistic geometries. Additionally, we note that a microfluidic device composed of a random arrangement of obstacles would also be a poor representation of environmental heterogeneity, as pore shape and network topology differ between two and three dimensions.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
Cruz-González and colleagues draw on DNA methylation and paired genetic data from 621 participants (n=308 controls; n=313 participants with Alzheimer's Disease). The authors generate a panel of epigenetic biomarkers of aging with a primary focus on the Horvath multi-tissue clock. The authors find weaker correlations between predicted epigenetic age and chronological age in subgroups with higher African ancestry than within a subgroup identified as White. The authors then examine genetic variation as a potential source for between-group differences in epigenetic clock performance. The authors draw on a large collection of publicly available methylation quantitative trait loci datasets and find evidence for substantial overlap between clock CpGs located within the Horvath clock and methQTLs. Going further, the authors show that methQTLs that overlap with Horvath clock CpGs show greater allelic variation in African ancestral groups pointing to a potential explanation for poorer clock performance within this group.
Thank you for this summary.
Strengths:
This is an interesting dataset and an important research question. The authors cite issues of portability regarding polygenic risk scores as a motivation to examine between-group differences in the performance of a panel of epigenetic clocks. The authors benefit from a diverse cohort of individuals with paired genetic data and focus on a clinical phenotype, Alzheimer's disease, of clear relevance for studies evaluating age-related biomarkers.
Weaknesses:
While the authors tackle an important question using a diverse cohort the current manuscript is lacking some detail that may diminish the potential impact of this paper. For example:
(1) Information on chronological ages across groups should be reported to ensure there are no systematic differences in ages or age ranges between groups (see point below).
Thank you for pointing out this omission. The age ranges are similar across cohorts. No individuals under 60 were considered, and the average ages per cohort ranged from 72 to 76. Neither average age nor age range was consistently higher or lower in the admixed cohorts for which the clocks had lower performance compared to the White cohort. We will report the age distributions in supplementary material in the revision.
(2) The authors compare correlations between chronological age and epigenetic age in sub-groups within this study to correlations reported by Horvath (2013). Attempting to draw comparisons between these two datasets is problematic. The current study has a much smaller N (particularly for sub-group analyses) and has a more restricted age range (60-90 yrs versus 0-100 yrs). Thus, is an alternative explanation simply that any weaker correlations observed in this study are driven by sample size and a restricted age range? Reporting the chronological ages (and ranges) across subgroups in the current study would help in this regard. Similarly, given the lack of association between AD status and epigenetic age (and very small effect in the white group), it may be of interest to examine the correlation between chronological age and epigenetic age in each group including the AD participants: would the between-group differences in correlations between chronological age and epigenetic age be altered by increasing the sample size?
Our conclusions about the reduced accuracy of the clocks in admixed individuals are based on comparisons within the MAGENTA cohorts, not on the comparisons to previous reports. We show significantly reduced accuracy on African American and Puerto Rican cohorts in MAGENTA compared to the White MAGENTA cohort. The reviewer is correct that the lower correlation in each of the cohorts compared to those in the Horvath study is due to the older age range of our cohort. Indeed, other studies applying the Horvath clock have seen similar correlations to those observed on the White MAGENTA cohort (Marioni et al., 2015, Horvath 2013, and Shireby et al., 2020). Following the suggestion to increase sample size, we conducted the chronological age vs. epigenetic age correlation analysis with the inclusion of AD cases. The significantly lower performance of the clock on Puerto Ricans and African Americans relative to White individuals remains after including all individuals in each cohort. We will include these results on the full cohorts in MAGENTA in the revision.
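The effect of a restricted age range on the correlation can be illustrated with a small simulation. This is only a sketch under stated assumptions, not an analysis of the MAGENTA data: the uniform age distributions, the 5-year prediction noise, and the sample size are all hypothetical, and the clock is modelled simply as chronological age plus Gaussian noise.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_clock_correlation(age_min, age_max, n=5000, noise_sd=5.0, seed=0):
    """Correlation between age and a simulated 'epigenetic age'.

    Hypothetical model: predicted age = chronological age + Gaussian noise.
    The clock error (noise_sd) is identical for every age range.
    """
    rng = random.Random(seed)
    ages = [rng.uniform(age_min, age_max) for _ in range(n)]
    predicted = [age + rng.gauss(0.0, noise_sd) for age in ages]
    return pearson(ages, predicted)


# Same clock error, different age ranges.
r_full = simulate_clock_correlation(0, 100)
r_restricted = simulate_clock_correlation(60, 90)
print(f"r over ages 0-100: {r_full:.3f}")
print(f"r over ages 60-90: {r_restricted:.3f}")
```

Because the prediction error is the same in both runs, any drop in the correlation for the narrower range reflects the reduced variance in chronological age alone, which is why comparisons are most informative between cohorts with similar age distributions.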
(3) The correlation between chronological age and epigenetic age, while helpful is not the most informative estimate of accuracy. Median absolute error (and an analysis of MAE across subgroups) would be a helpful addition.
We used correlation because this is commonly used to evaluate the performance of epigenetic age clocks, but we agree that direct error quantification provides a complementary perspective. We confirm that the African American and Puerto Rican cohorts have higher error than the White cohort, and we will report these comparisons in the revision.
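To make the distinction between the two metrics concrete, the sketch below computes both for a toy dataset. The ages are invented for illustration and are not drawn from MAGENTA; the example assumes a clock that overestimates every individual's age by the same amount.

```python
from statistics import median


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def median_absolute_error(chronological, epigenetic):
    """Median of |epigenetic age - chronological age| across individuals."""
    return median([abs(e - c) for c, e in zip(chronological, epigenetic)])


# Hypothetical ages (years); the simulated clock is biased upward by 6 years.
chronological = [62, 70, 75, 81, 88]
epigenetic = [age + 6 for age in chronological]

r = pearson(chronological, epigenetic)
mae = median_absolute_error(chronological, epigenetic)
print(f"correlation = {r:.3f}, median absolute error = {mae} years")
```

A systematic offset leaves the correlation untouched while the median absolute error reports it directly, which is why the two measures complement each other.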
(4) More information should be provided about how DNAm data were generated. Were samples from each ancestral group randomized across plates/slides to ensure ancestry and batch are not associated? How were batch effects considered? Given the relatively small sample sizes, it would be important to consider the impact of technical variation on measures of epigenetic age used in the current study. The use of principal Component-based versions of these clocks (Higgins Chen et al., 2023; Nature Aging https://doi.org/10.1038/s43587-022-00248-2) may help address such concerns.
Thank you for pointing out the need for additional context on data generation. All omics data from the MAGENTA study were generated using protocols that aim to minimize technical artifacts and batch effects. We will add detailed protocol information in the revision. We also thank the reviewer for their suggestion on applying the principal component clock to account for potential technical variation. We are planning to perform these analyses and include them in the revision.
(5) Marioni et al., (2015) found a very weak cross-sectional association between DNAm Age and cognitive function (r~0.07) in a cohort of >900 participants. Given these effect sizes, I would not interpret the absence of an effect in the current study to reflect issues of portability of epigenetic biomarkers.
We agree that previous links between DNAm Age and AD/cognitive function have been small in magnitude. For example, the PhenoAge paper (Levine et al., 2018) and a study using the Horvath clock (Levine et al., 2015) found age acceleration of less than a year in AD patients relative to non-demented individuals. These effects have been detected in studies with relatively small sample sizes (e.g., 700 for Levine et al. 2015 and 604 for Levine et al. 2018). Our study is of similar size, but the cohort-specific analyses have lower power. Nonetheless, we replicate the modest, but significant association with AD in the white MAGENTA cohort. We have performed power calculations and find that we have 26% power to detect an effect of this size in the Cubans, 46% for the Peruvians, 66% for the Whites, 74% for the Puerto Ricans, and 84% for the African Americans. Given the relatively high power in the Puerto Rican and African American cohorts, we suggest that the reduced accuracy of the clocks contributes to the lack of association. We will also add caveats about power and the small sample size in the revision.
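For readers unfamiliar with how such power figures arise, the following is a minimal sketch of a two-sample power calculation using a normal approximation. It is an assumption for illustration, not a description of the procedure used in the study: the response does not state the method, the function name `two_sample_power` is hypothetical, and the effect size and group sizes shown are invented.

```python
from statistics import NormalDist


def two_sample_power(effect_size_d, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size_d is the standardized mean difference (Cohen's d), for
    example the age-acceleration difference divided by its standard deviation.
    """
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the z statistic under the alternative hypothesis.
    shift = abs(effect_size_d) / (1 / n_cases + 1 / n_controls) ** 0.5
    # Probability that the statistic falls in either rejection region.
    return standard_normal.cdf(shift - z_crit) + standard_normal.cdf(-shift - z_crit)


# Hypothetical cohorts: the same effect size is harder to detect in a smaller cohort.
for label, n_cases, n_controls in [("small cohort", 30, 30), ("large cohort", 100, 100)]:
    power = two_sample_power(0.3, n_cases, n_controls)
    print(f"{label}: power = {power:.2f}")
```

Under this approximation, power depends only on the standardized effect size and the group sizes, so cohorts with fewer individuals have less power to detect the same effect.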
(6) The methQTL analyses presented are suggestive of potential genetic influence on DNAm at some Horvath CpGs. Do the authors see differences in DNAm across ancestral groups at these potentially affected CpGs? This seems to be a missing piece that would tie the analyses together (e.g., estimating the likely impact of methQTL on clock CpG DNAm).
Thank you for this excellent suggestion. We will add this analysis in the revision. This will enable us to test for further evidence for our hypothesis about the role of ancestry-specific meQTL on clock accuracy.
Reviewer #2 (Public review):
Summary:
This paper seeks to characterize the portability of methylation clocks across groups. Methylation clocks are trained to predict biological aging from DNA methylation but have largely been developed in datasets of individuals with primarily European ancestries. Given that genetic variation can influence DNA methylation, the authors hypothesize that methylation clocks might have reduced accuracy in non-European ancestries.
Strengths:
The authors evaluate five methylation clocks in 621 individuals from the MAGENTA study. This includes approximately 280 individuals sampled in Puerto Rico, Cuba, and Peru, as well as approximately 200 self-identified African American individuals sampled in the US. To understand how methylation clock accuracy varies with proportion of non-European ancestry, the authors inferred local ancestry for the Puerto Rican, Cuban, Peruvian, and African American cohorts. Overall, this paper presents solid evidence that methylation clocks have reduced accuracy in individuals with non-European ancestries, relative to individuals with primarily European ancestries. This should be of great interest to those researchers who seek to use methylation clocks as predictors of age-related, late-onset diseases and other health outcomes.
Thank you for this summary.
Weaknesses:
One clear strength of this paper is the ability to do more sophisticated analyses using the local ancestry calls for the MAGENTA study. It would be valuable to capitalize on this strength and assess portability across the genetic ancestry spectrum, as was recently advocated by Ding et al. in Nature (2023). For example, the authors could regress non-European local ancestry fraction on measures of prediction accuracy. This could paint a clearer picture of the relationship between genetic ancestry and clock accuracy, compared to looking at overall correlations within each cohort.
Thank you for this excellent suggestion. We agree that modeling portability across genetic ancestry as a spectrum would help support our conclusions. We will add this to the revision.
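A minimal sketch of what treating ancestry as a spectrum could look like is given below: the absolute clock error per individual is regressed on the non-European ancestry fraction with ordinary least squares. The helper `ols_slope_intercept`, the variable names, and all values are hypothetical toy inputs, not MAGENTA data, and the planned analysis may well use a richer model (for example, with covariates).

```python
from statistics import fmean


def ols_slope_intercept(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical toy values, for illustration only.
non_european_fraction = [0.05, 0.20, 0.40, 0.60, 0.85]
chronological_age = [70.0, 65.0, 72.0, 68.0, 74.0]
predicted_age = [71.0, 63.0, 68.0, 74.0, 66.0]

# Prediction accuracy summarized as absolute error in years.
abs_error = [abs(p - a) for p, a in zip(predicted_age, chronological_age)]
slope, intercept = ols_slope_intercept(non_european_fraction, abs_error)
print(f"change in absolute error per unit ancestry fraction: {slope:.2f} years")
```

A positive slope in such a fit would indicate that prediction error grows with non-European ancestry fraction, which is a more direct statement than comparing correlations between cohorts.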
The authors present two possible reasons that methylation clocks might have reduced accuracy in individuals with non-European ancestries: genetic variants disrupting methylation sites (i.e., "disruptive variants") and genetic variants influencing methylation sites (i.e., meQTLs). The authors conclude disruptive variants do not contribute to poor methylation clock portability, but the evidence in support of this conclusion is incomplete. The site frequency spectrum of disruptive variants in Figure 4 is estimated from all gnomAD individuals, and gnomAD is comprised of primarily European individuals. Thus, the observation that disruptive variants are generally rare in gnomAD does not rule them out as a source of poor clock portability in admixed individuals with non-European ancestries.
Thank you for this question. The allele frequencies were so low that even if they all occurred in individuals of non-European ancestries, they would still be incredibly rare. Nonetheless, in the revision, we will make this clear by reporting ancestry-specific allele frequencies.
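The worst-case reasoning in this response can be made concrete with a small calculation: if every observed copy of a variant is assigned to a single ancestry group, the allele count divided by the number of chromosomes in that group bounds its frequency there. The function name `worst_case_group_frequency` and the counts below are hypothetical, and diploid autosomal sites are assumed.

```python
def worst_case_group_frequency(allele_count, group_size):
    """Upper bound on a variant's frequency within one group.

    Assumes every observed copy of the allele is carried by members of
    that group and that each individual contributes two chromosomes.
    """
    return min(1.0, allele_count / (2 * group_size))


# Hypothetical counts, for illustration only.
bound = worst_case_group_frequency(allele_count=10, group_size=5000)
print(f"worst-case frequency in the group: {bound:.4f}")
```

Reporting ancestry-specific allele frequencies directly, as proposed, removes the need for this bound.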
It is also unclear to what extent meQTLs impact methylation clock portability. The authors find that the frequency of meQTLs is higher in African ancestry populations, but this could reflect the fact that some of the analyzed meQTLs were ascertained in African Americans. The number of meQTL-affected methylation sites also varies widely between clocks, ranging from 6 to 271; thus, meQTLs likely impact the portability of different clocks in different ways. Overall, the paper would benefit from a more quantitative assessment of the extent to which meQTLs influence clock portability.
We agree that the meQTL likely influence the clocks in different ways and that the ascertainment of the meQTLs in different populations makes direct comparisons challenging. To provide mechanistic insights into the ways that meQTL influence the methylation clocks, we plan to leverage the individual-level genetic data generated for the MAGENTA individuals. This will allow us to explore whether the individuals who have the specified clock-influencing meQTL receive less accurate predictions from the methylation clocks. In addition, the new analysis of whether individuals from different cohorts have different methylation levels at clock CpGs with ancestry-variable meQTLs will help establish the differences between groups (see response to Reviewer #1 point 6). Finally, to resolve potential bias due to ascertaining some of the meQTL in African Americans, we will conduct the same analyses from the manuscript, holding out the set of meQTL from African Americans. These results will be included in the revision.
The paper implies that methylation clocks have an inferior ability to predict AD risk in admixed populations relative to white individuals, but the difference between white AD patients and controls is not significant when correcting for multiple testing. This nuance should be made more explicit.
We agree that the signal is not particularly strong in the white cohort, but the effect size is in line with previous studies. We will add power calculations and discussion to help the interpretation of these results (see response to Reviewer #1 point 5).
Finally, this paper overlooks the possibility that environmental exposures co-vary with genetic ancestry and play a role in decreasing the accuracy of methylation clocks in genetically admixed individuals. Quantifying the impact of environmental factors is almost certainly outside of the scope of this paper. However, it is worth acknowledging the role of environmental factors to provide the field with a more comprehensive overview of factors influencing methylation clock portability. It is also essential to avoid the assumption that correlations with genetic ancestry necessarily arise from genetic causes.
We entirely agree about the importance of discussing environmental exposures. We did not intend to discount them in our manuscript. We will clarify their potential role and the scope of our analyses in the revision. We expect that environmental factors certainly contribute to differences between groups. The revisions outlined above may help us better quantify the genetic contribution.
Reviewer #3 (Public review):
This manuscript examines the accuracy of DNA methylation-based epigenetic clocks across multiple cohorts of varying genetic ancestry. The authors find that clocks were generally less accurate at predicting age in cohorts with large proportions of non-European (especially African) ancestry, compared to cohorts with high European ancestry proportions. They suggest that some of this effect might be explained by meQTLs that occur near CpG sites included in clocks, because these variants may be at higher frequencies (or at least different frequencies) in cohorts with high proportions of non-European ancestry relative to the training set. They also provide discussions of potential paths forward to alleviate bias and improve portability for future clock algorithms.
The topic is timely due to the increasing popularity of DNA methylation-based clocks and the acknowledgment that many algorithms (e.g., polygenic risk scores) lack portability when applied to cohorts that substantially differ in ancestry or other characteristics from the training set. This has been discussed to some degree for DNA methylation-based clocks, but could of course use more discussion and empirical attention which the authors nicely provide using an impressive and diverse collection of data.
The manuscript is clear and well-written; however, some key background was missing (e.g., what we know already about the ancestry composition of clock training sets) and, most importantly, several analyses would benefit from being taken one step further. For example, the main argument of the paper is that ancestry impacts clock predictions, but this is determined by subsetting the data by recruitment cohort rather than analyzing ancestry as a continuous variable. Extending some of the analyses could really help the authors nail down their hypothesized sources of lack of portability, which is critical for making recommendations to the community and understanding the best paths forward.
Thank you for these suggestions. As noted in our response to reviewer #2, we will analyze ancestry as a continuous variable in the revision. We will also add details on the training of previous clocks and previous work on clock accuracy.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We thank the reviewers for the careful review of our manuscript. Overall, they were positive about our use of cutting-edge methods to identify six inversions segregating in Lake Malawi. Their distribution across ~100 Lake Malawi species demonstrated that they were differentially segregating in different ecogroups/habitats and could potentially play a role in local adaptation, speciation, and sex determination. Reviewers were positive about our finding that the chromosome 10 inversion was associated with sex determination in a deep benthic species and its potential role in regulating traits under sexual selection. They agree that this work is an important starting point in understanding the role of these inversions in the amazing phenotypic diversity found in the Lake Malawi cichlid flock.
There were two main criticisms that were made which we summarize:
(1) Lack of clarity. It was noted that the writing could be improved to make many technical points clearer. Additionally, certain discussion topics were not included that should be.
We will rewrite the text and add figures and tables to address the issues that were brought up, and we will provide a point-by-point response. Specifically, we will (1) improve the nomenclature used to describe the inversions in different lineages, (2) improve the descriptions of the various genomic approaches, (3) add a figure documenting the samples and technologies used for each ecogroup, and (4) integrate LR sequences to identify inversion breakpoints at the finest resolution possible.
(2) We overstate the role that selection plays in the spread of these inversions and neglect other evolutionary processes that could be responsible for their spread.
We agree with the overarching point. We did not show that selection is involved in the spread of these inversions, and other forces can be at play. Additionally, there were concerns with our model that the inversions introgressed from a Diplotaxodon ancestor into benthic ancestors, and that incomplete lineage sorting or balancing selection (via sex determination) could instead be at play. Overall, we agree with the reviewers, with the following caveats. 1. Our analysis of the genetic distance between Diplotaxodons and benthic species in the inverted regions is more consistent with their spread through introgression than with incomplete lineage sorting or balancing selection. 2. This question of selection is much more complicated in the context of the Lake Malawi cichlid radiation with ~800 different species. We believe the role of these inversions must be considered in a species- and time-specific way. In other words, the evolutionary forces acting on these inversions at the time of their formation are likely different from those acting now. Further, the role of these inversions is likely different in different species. For example, the inversions on chromosomes 10 and 11 play a role in sex determination in some species but not others, and the potential pressures acting on the inverted and non-inverted haplotypes will be very different. These are very interesting and important questions both for understanding the adaptive radiations in Lake Malawi and in general, and we are actively studying crosses to understand the role of these inversions in phenotypic variation between two species. We will modify the text to make all of these points clearer.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Reviewer #1:
Weaknesses:
(1) The crystal structure of HsIFT172c reveals a single globular domain formed by the last three TPR repeats and C-terminal residues of IFT172. However, the authors subdivide this globular domain into TPR, linker, and U-box-like regions that they treat as separate entities throughout the manuscript. This is potentially misleading as the U-box surface that is proposed to bind ubiquitin or E2 is not surface accessible but instead interacts with the TPR motifs. They justify this approach by speculating that the presented IFT172c structure represents an autoinhibited state and that the U-box-like domain can become accessible following phosphorylation. However, additional evidence supporting the proposed autoinhibited state and the potential accessibility of the U-box surface following phosphorylation is needed, as it is not tested or supported by the current data.
We thank the reviewer for this comment. IFT172C contains a TPR region and a U-box-like region, which are admittedly tightly bound to each other. While it is possible that this region exists and functions as a single domain, we chose to classify these regions as two different domains for the following reasons:
(1) The TPR and U-box-like regions belong to two different structural classes.
(2) The TPR region is connected to the U-box-like region via a long linker, which seems poised to regulate the relative movement between these regions.
(3) Many ciliopathy mutations map to the interface between the TPR region and the U-box-like region, hinting at a regulatory mechanism governed by this interface.
(2) While in vitro ubiquitination of IFT172 has been demonstrated, in vivo evidence of this process is necessary to support its physiological relevance.
We thank the reviewer for this comment. We are currently working on identifying the substrates of IFT172 to reveal the physiological relevance of its ubiquitination activity.
(3) The authors describe IFT172 as being autoubiquitinated. However, the identified E2 enzymes UBCH5A and UBCH5B can both function in E3-independent ubiquitination (as pointed out by the authors) and mediate ubiquitin chain formation in an E3-independent manner in vitro (see ubiquitin chain ladder formation in Figure 3A). In addition, point mutation of known E3-binding sites in UBCH5A or TPR/U-box interface residues in IFT172 has no effect on the mono-ubiquitination of IFT172c1. Together, these data suggest that IFT172 is an E3-independent substrate of UBCH5A in vitro. The authors should state this possibility more clearly and avoid terminology such as "autoubiquitination" as it implies that IFT172 is an E3 ligase, which is misleading. Similarly, statements on page 10 and elsewhere are not supported by the data (e.g. "the low in vitro ubiquitination activity exhibited by IFT172" and "ubiquitin conjugation occurring on HsIFT172C1 in the presence of UBCH5A, possibly in coordination with the IFT172 U-box domain").
We now consider this possibility and tone down our statements about the autoubiquitination activity of IFT172 in a revised version of the manuscript.
(4) Related to the above point, the conclusion on page 11, that mono-ubiquitination of IFT172 is U-box-independent while polyubiquitination of IFT172 is U-box-dependent appears implausible. The authors should consider that UBCH5A is known to form free ubiquitin chains in vitro and structural rearrangements in F1715A/C1725R variants could render additional ubiquitination sites or the monoubiquitinated form of IFT172 inaccessible/unfavorable for further processing by UBCH5A.
We now consider this possibility and tone down our statements about the autoubiquitination activity of IFT172 in the conclusion on pg. 11.
(5) Identification of the specific ubiquitination site(s) within IFT172 would be valuable as it would allow targeted mutation to determine whether the ubiquitination of IFT172 is physiologically relevant. Ubiquitination of the C1 but not the C2 or C3 constructs suggests that the ubiquitination site is located in TPRs ranging from residues 969-1470. Could this region of TPR repeats (lacking the IFT172C3 part) suffice as a substrate for UBCH5A in ubiquitination assays?
We thank the reviewer for raising this important point about ubiquitination site identification. While not included in our manuscript, we did perform mass spectrometry analysis of ubiquitination sites using wild-type IFT172 and several mutants (P1725A, C1727R, and F1715A). As shown in the figure below, we detected multiple ubiquitination sites across these constructs. The wild-type protein showed ubiquitination at positions K1022, K1237, K1271, and K1551, while the mutants displayed slightly different patterns of modification. However, we should note that the MS intensity signals for these ubiquitinated peptides were relatively low compared to unmodified peptides, making it difficult to draw strong conclusions about site specificity or physiological relevance.
Author response image 1.
These results align with the reviewer's suggestion that ubiquitination occurs within the TPR-containing region. However, given the technical limitations of the MS analysis and the potential for E3-independent ubiquitination by UBCH5A, we have taken a conservative approach in interpreting these findings.
(6) The discrepancy between the molecular weight shifts observed in anti-ubiquitin Western blots and Coomassie-stained gels is noteworthy. The authors show the appearance of a mono-ubiquitinated protein of ~108 kDa in anti-ubiquitin Western blots. However, this molecular weight shift is not observed for total IFT172 in the corresponding Coomassie-stained gels (Figures 3B, D, F). Surprisingly, this MW shift is visible in an anti-His Western blot of a ubiquitination assay (Fig 3C). Together, this raises the concern that only a small fraction of IFT172 is being modified with ubiquitin. Quantification of the percentage of ubiquitinated IFT172 in the in vitro experiments could provide helpful context.
We acknowledge in the manuscript that the conjugation of ubiquitin to IFT172C is weak (Page 16). Future experiments identifying potential substrates and their implications for ciliary regulation will provide further context for our in vitro ubiquitination experiments.
(7) The authors propose that IFT172 binds ubiquitin and demonstrate that GST-tagged HsIFT172C2 or HsIFT172C3 can pull down tetra-ubiquitin chains. However, ubiquitin is known to be "sticky" and to have a tendency for weak, nonspecific interactions with exposed hydrophobic surfaces. Given that only a small proportion of the ubiquitin chains bind in the pull-down, specific point mutations that identify the ubiquitin-binding site are required to convincingly show the ubiquitin binding of IFT172.
(8) The authors generated structure-guided mutations based on the predicted Ub-interface and on the TPR/U-box interface and used these for the ubiquitination assays in Fig 3. These same mutations could provide valuable insights into ubiquitin binding assays as they may disrupt or enhance ubiquitin binding (by relieving "autoinhibition"), respectively. Surprisingly, two of these sites are highlighted in the predicted ubiquitin-binding interface (F1715, I1688; Figure 4E) but not analyzed in the accompanying ubiquitin-binding assays in Figure 4.
We agree that these mutations could provide insights into ubiquitin binding by IFT172. We are currently pursuing further mutagenesis studies on the IFT172-Ub interface based on the AF model. However, we have evaluated the ubiquitin-binding activity of the F1715A mutant using similar pulldowns, which showed no significant impact of the mutation on the ubiquitin-binding activity of IFT172. We have yet to evaluate the impact of alternative amino acid substitutions at these positions. The I1688 mutants we cloned could not be expressed in soluble form and thus could not be tested in ubiquitination activity or ubiquitin-binding assays.
(9) If IFT172 is a ubiquitin-binding protein, it might be expected that the pull-down experiments in Figure S1 would identify ubiquitin, ubiquitinated proteins, or E2 enzymes. These were not observed, raising doubt that IFT172 is a ubiquitin-binding protein.
It is likely that IFT172 binds ubiquitin only with low affinity, as indicated by our in vitro pulldowns and the AF interface. In our pulldown experiment using Chlamydomonas flagella extracts, we used extensive washes to remove non-specific interactors. This might also have removed weak but bona fide interactors of IFT172. Additionally, we did not use any ubiquitination-preserving reagents such as NEM in our pulldown buffers, exposing the cellular ubiquitinated proteins to DUB-mediated proteolysis and further preventing their identification in our pulldown/MS experiment.
(10) The cell-based experiments demonstrate that the U-box-like region is important for the stability of IFT172 but does not demonstrate that the effect on the TGFb pathway is due to the loss of ubiquitin-binding or ubiquitination activity of IFT172.
We acknowledge that our current data cannot distinguish whether the TGFβ pathway defects arise from general protein instability or from specific loss of ubiquitin-related functions. Our experiments demonstrate that the U-box-like region is required for both IFT172 stability and proper TGFβ signaling, but we agree that establishing a direct mechanistic link between these phenomena would require additional evidence. We will revise our discussion to more clearly acknowledge this limitation in our current understanding of the relationship between IFT172's U-box region and TGFβ pathway regulation.
(11) The challenges in experimentally validating the interaction between IFT172 and the UBX-domain-containing protein are understandable. Alternative approaches, such as using single domains from the UBX protein, implementing solubilizing tags, or disrupting the predicted binding interface in Chlamydomonas flagella pull-downs, could be considered. In this context, the conclusion on page 7 that "The uncharacterized UBX-domain-containing protein was validated by AF-M as a direct IFT172 interactor" is incorrect as a prediction of an interaction interface with AF-M does not validate a direct interaction per se.
We agree with the reviewer that our AlphaFold-Multimer (AF-M) predictions alone do not constitute experimental validation of a direct interaction. We appreciate the reviewer's understanding of the technical challenges in validating this interaction experimentally. We will revise our text to more precisely state that "The uncharacterized UBX-domain-containing protein was validated by AF-M as a potential direct IFT172 interactor" and will discuss the AF-M predictions as computational evidence that suggests, but does not prove, a direct interaction. This more accurately reflects the current state of our understanding of this potential interaction.
Reviewer #3:
Weaknesses:
(1) Interaction studies were carried out by pulldown experiments, which identified more IFT172 interaction partners. Whether these interactions can be seen in living cells remains to be elucidated in subsequent studies.
We agree with the reviewer that validation of protein-protein interactions in living cells provides important physiological context. While our pulldown experiments have identified several promising interaction partners and the AF-M predictions provide computational support for these interactions, we acknowledge that demonstrating these interactions in vivo would strengthen our findings. However, we believe our current biochemical and structural analyses provide valuable insights into the molecular basis of IFT172's interactions, laying important groundwork for future cell-based studies.
(2) The cell culture-based experiments in the IFT172 mutants are exciting and show that the U-box domain is important for protein stability and point towards involvement of the U-box domain in cellular signaling processes. However, the characterization of the generated cell lines falls behind the very rigorous analysis of other aspects of this work.
We thank the reviewer for noting that the characterization of our cell lines could be more rigorous. In the revised manuscript, we will provide additional characterization of the cell lines, including detailed sequencing information and validation data for the IFT172 mutants. This will bring the documentation of our cell-based experiments up to the same standard as other aspects of our work.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
The authors tested whether learning to suppress (ignore) salient distractors (e.g., a lone colored nontarget item) via statistical regularities (e.g., the distractor is more likely to appear in one location than any other) was proactive (prior to paying attention to the distractor) or reactive (only after first attending the distractor) in nature. To test between proactive and reactive suppression the authors relied on a recently developed and novel technique designed to "ping" the brain's hidden priority map using EEG inverted encoding models. Essentially, a neutral stimulus is presented to stimulate the brain, resulting in activity on a priority map which can be decoded and used to argue when this stimulation occurred (prior to or after attending to a distracting item). The authors found evidence that despite learning to suppress the high probability distractor location, the suppression was reactive, not proactive in nature.
Overall, the manuscript is well-written, tests a timely question, and provides novel insight into a long-standing debate concerning distractor suppression.
Strengths (in no particular order):
(1) The manuscript is well-written, clear, and concise (especially given the complexities of the method and analyses).
(2) The presentation of the logic and results is mostly clear and relatively easy to digest.
(3) This question concerning whether location-based distractor suppression is proactive or reactive in nature is a timely question.
(4) The use of the novel "pinging" technique is interesting and provides new insight into this particularly thorny debate over the mechanisms of distractor suppression.
Weaknesses (in no particular order):
(1) The authors tend to make overly bold claims without either A) mentioning the opposing claim(s) or B) citing the opposing theoretical positions. Further, the authors have neglected relevant findings regarding this specific debate between proactive and reactive suppression.
(2) The authors should be more careful in setting up the debate by clearly defining the terms, especially proactive and reactive suppression which have recently been defined and were more ambiguously defined here.
(3) There were some methodological choices that should be further justified, such as the choice of stimuli (e.g., sizes, colors, etc.).
(4) The figures are often difficult to process. For example, the time courses are so far zoomed out (i.e., 0, 500, 100 ms with no other tick marks) that it makes it difficult to assess the timing of many of the patterns of data. Also, there is a lot of baseline period noise which complicates the interpretations of the data of interest.
(5) Sometimes the authors fail to connect to the extant literature (e.g., by connecting to the ERP components, such as the N2pc and PD components, used to argue for or against proactive suppression) or when they do, overreach with claims (e.g., arguing suppression is reactive or feature-blind more generally).
We thank the reviewer for their insightful feedback and have made several adjustments to address the concerns raised. To provide a balanced discussion, we tempered our claims about suppression mechanisms and incorporated additional references to opposing theoretical positions, including the signal suppression hypothesis, while clarifying the definitions of proactive and reactive suppression based on recent terminology (Liesefeld et al., 2024). We justified methodological choices, such as the slight size differences between stimuli to achieve perceptual equivalence and the randomization of target and distractor colors to mitigate potential luminance biases. We have revised our figures to improve their clarity. Lastly, while our counterbalanced design precluded reliable ERP assessments (e.g., N2pc, PD), we discussed their potential relevance for future research and ensured consistency with the broader literature on suppression mechanisms.
Reviewer #2 (Public Review):
Summary:
The authors investigate the mechanisms supporting learning to suppress distractors at predictable locations, focusing on proactive suppression mechanisms manifesting before the onset of a distractor. They used EEG and inverted encoding models (IEM). The experimental paradigm alternates between a visual search task and a spatial memory task, followed by a placeholder screen acting as a 'ping' stimulus, i.e., a stimulus to reveal how learned distractor suppression affects hidden priority maps. Behaviorally, their results align with the effects of statistical learning on distractor suppression. Contrary to the proactive suppression hypothesis, which predicts reduced memory-specific tuning of neural representations at the expected distractor location, their IEM results indicate increased tuning at the high-probability distractor location following the placeholder and prior to the onset of the search display.
Strengths:
Overall, the manuscript is well-written and clear, and the research question is relevant and timely, given the ongoing debate on the roles of proactive and reactive components in distractor processing. The use of a secondary task and EEG/IEM to provide a direct assessment of hidden priority maps in anticipation of a distractor is, in principle, a clever approach. The study also provides behavioral results supporting prior literature on distractor suppression at high-probability locations.
Weaknesses:
(1) At a conceptual level, I understand the debate and opposing views, but I wonder whether it might be more comprehensive to present also the possibility that both proactive and reactive stages contribute to distractor suppression. For instance, anticipatory mechanisms (proactive) may involve expectations and signals that anticipate the expected distractor features, whereas reactive mechanisms contribute to the suppression and disengagement of attention.
This is an excellent point. Indeed, while many studies, including our own, have tried to dissociate between proactive and reactive mechanisms, as if it is one or the other, the overall picture is arguably more nuanced. We have added a paragraph to the discussion on page 19 to address this. At the same time (for more details, see our responses to your comments 3 and 5), we have added a paragraph in which we provide an alternative explanation of the current data in light of the dual-task nature of our experiment.
(2) The authors focus on hidden priority maps in pre-distractor time windows, arguing that the results challenge a simple proactive view of distractor suppression. However, they do not provide evidence that reactive mechanisms are at play or related to the pinging effects found in the present paradigm. Is there a relationship between the tuning strength of CTF at the high-probability distractor location and the actual ability to suppress the distractor (e.g., behavioral performance)? Is there a relationship between CTF tuning and post-distractor ERP measures of distractor processing? While these may not be the original research questions, they emerge naturally and I believe should be discussed or noted as limitations.
Thank you for raising these important points. While CTF slopes have been shown to provide spatially and temporally resolved tracking of covert spatial attention and memory representations at the group level, to the best of our knowledge, no study to date has found a reliable correlation between CTFs and behavior. Moreover, the predictive value of the learned suppression effect, while also highly reliable at the group level, has proven to be limited when it comes to individual-level performance (Ivanov et al., 2024; Hedge et al., 2018). Nevertheless, based on your suggestion, we explored whether there was a correlation between the averaged gradient slope within the time window where the placeholder revived the memory representation and the average distance slope in reaction times for the learned suppression effect. This correlation was not significant (r = .236, p = .267), which, considering our sample size and the reasons mentioned earlier, is not particularly surprising. Given that our sample size was chosen to measure group-level effects, we decided not to include this individual-differences analysis in the manuscript.
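As an illustration only, a correlation of this kind between two per-participant measures could be sketched as follows. The data are synthetic, the sample size is a hypothetical placeholder, and the permutation-based p-value is a swapped-in choice for the sketch; this section does not state how the reported p-value was obtained.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between two 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def permutation_p(x, y, n_perm=5000, seed=0):
    """Two-sided p-value: share of shuffles with |r| at least as large as observed."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    observed = abs(pearson_r(x, y))
    hits = sum(abs(pearson_r(x, rng.permutation(y))) >= observed
               for _ in range(n_perm))
    return (hits + 1) / (n_perm + 1)

# Synthetic stand-ins for the two per-participant measures (NOT the study's data).
rng = np.random.default_rng(1)
n_participants = 20  # hypothetical sample size
ctf_gradient_slope = rng.normal(size=n_participants)
rt_distance_slope = rng.normal(size=n_participants)

r = pearson_r(ctf_gradient_slope, rt_distance_slope)
p = permutation_p(ctf_gradient_slope, rt_distance_slope)
```

With samples of this size, only fairly large correlations separate reliably from the shuffled distribution, which is consistent with the caution expressed above about individual-differences analyses.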
Regarding the potential link between the CTF tuning profile and post-distractor ERP measures like N2pc and Pd, our experimental design presented a specific challenge. To reliably assess lateralized ERP components like N2pc or Pd the high probability location must be restricted to static lateralized positions (e.g., on the horizontal midline). Our counterbalanced design (see also our response to comment 9 by reviewer 1), which was crucial to avoid bias in spatial encoding models, precluded such a targeted ERP analysis.
(3) How do the authors ensure that the increased tuning (which appears more as a half-split or hemifield effect rather than gradual fine-grained tuning, as shown in Figure 5) is not a byproduct of the dual-task paradigm used, rather than a general characteristic of learned attentional suppression? For example, the additional memory task and the repeated experience with the high-probability distractor at the specific location might have led to longer-lasting and more finely-tuned traces for memory items at that location compared to others.
Thank you for raising these important points. Indeed, a unique aspect of our study that sets it apart from other studies is that the effects of learned suppression were not measured directly via an index of distractor processing, but rather inferred indirectly via tuning towards a location in memory. The critical assumption here, which we now make explicit on page 18, is that various sources of attentional control jointly determine the priority landscape, and that this priority landscape can be read out by neutral ping displays. An alternative, however, as suggested by the reviewer, is that memory representations may have been sharper when the remembered location was at the high-probability distractor location. We believe this is unlikely for two reasons. First, at the behavioral level there was no evidence that memory performance differed for positions overlapping high- and low-probability distractor locations (also see our response to reviewer 3 minor comment 4). Second, there was no hint whatsoever that the memory representation already differed during encoding or maintenance (this is now explicitly indicated in the revised manuscript on page 14), which would have been expected if the spatial distractor imbalance modulated the spatial memory representations.
Nevertheless, as discussed in more detail in response to comment 5, there is an alternative explanation for the observed gradient modulation that may be specific to the dual nature of our experiment.
(4) It is unclear how IEM was performed on total vs. evoked power, compared to typical approaches of running it on single trials or pseudo-trials.
Thank you for pointing out that our methods were not clear. We did not run our analysis on single trials because we were interested in separately examining the spatial selectivity of both evoked alpha power (phase-locked activity aligned with stimulus onset) and total alpha power (all activity regardless of signal phase). It is only possible to calculate evoked and total power when averaging across trials. Thus, when we partitioned the data into sets for the IEM analysis, we averaged trials for each condition/stimulus location to obtain a measurement of evoked and total power for each condition in each set. This is the same approach used in previous work (e.g., Foster et al., 2016; van Moorselaar et al., 2018).
We reviewed our Methods section and can see why this was unclear. In places, we had incorrectly described the dimensions of the training and test data as electrodes x trials. To address this, we have rewritten the “Time frequency analysis” and “Inverted encoding model” sections and added a new “Training and test data” section. We hope that these sections are easier to follow.
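The distinction between evoked and total power drawn in this response can be sketched as follows. This is a minimal illustration on synthetic complex (analytic) signals, not the analysis pipeline itself; the array shape, the 250 Hz sampling rate and the function name `evoked_and_total_power` are assumptions made for the example.

```python
import numpy as np

def evoked_and_total_power(analytic):
    """analytic: complex array, shape (n_trials, n_electrodes, n_times).

    Evoked power: average the complex signal over trials first, then square
    the magnitude, so only phase-locked activity survives.
    Total power: square the magnitude per trial, then average, so all
    activity counts regardless of phase.
    """
    evoked = np.abs(analytic.mean(axis=0)) ** 2
    total = (np.abs(analytic) ** 2).mean(axis=0)
    return evoked, total

rng = np.random.default_rng(0)
n_trials, n_electrodes, n_times = 200, 4, 50
t = np.arange(n_times) / 250.0            # hypothetical 250 Hz sampling rate
carrier = np.exp(2j * np.pi * 10.0 * t)   # 10 Hz oscillation (alpha band)

# Phase-locked case: every trial has the same phase.
locked = np.broadcast_to(carrier, (n_trials, n_electrodes, n_times))
# Non-phase-locked case: each trial gets its own random phase offset.
offsets = np.exp(1j * rng.uniform(0, 2 * np.pi, size=(n_trials, 1, 1)))
jittered = locked * offsets

ev_locked, tot_locked = evoked_and_total_power(locked)
ev_jit, tot_jit = evoked_and_total_power(jittered)
```

In the phase-locked case the two measures coincide, whereas random phase offsets leave total power unchanged and drive evoked power towards zero. Both quantities are defined over a set of trials, which is why neither can be computed from a single trial.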
(5) Following on point 1. What is the rationale for relating decreased (but not increased) tuning of CTF to proactive suppression? Could it be that proactive suppression requires anticipatory tuning towards the expected feature to implement suppression? In other terms, better 'tuning' does not necessarily imply a higher signal amplitude and could be observable even under signal suppression. The authors should comment on this and clarify.
We appreciate your highlighting of these highly relevant alternative explanations. In response, we have revised a paragraph in the General Discussion on page 18 to explicitly outline our rationale for associating decreased tuning with proactive suppression. However, in doing so, we now also consider the alternative perspective that proactive suppression might actually require enhanced tuning towards the expected feature to implement suppression effectively.
It's important to note that both of these interpretations – decreased tuning as a sign of suppression and increased tuning as a preparatory mechanism for suppression – diverge significantly from the commonly held model (including our own initial assumptions) wherein weights at the to-be-suppressed location are simply downregulated.
Minor:
(1) In the Word file I reviewed, there are minor formatting issues, such as missing spaces, which should be double-checked.
Thank you! We have now reviewed the text thoroughly and tried our best to avoid formatting issues.
(2) Would the authors predict that proactive mechanisms are not involved in other forms of attention learning involving distractor suppression, such as habituation?
Habituation is a form of non-associative learning in which the response to a repetitive stimulus decreases over time. As such, we would not characterize these changes as “proactive”, because they occur only after (repeated) exposure to the stimulus.
(3) A clear description in the Methods section of how individual CTFs for each location were derived would help in understanding the procedure.
Thank you. We have now added several sentences on page 27 to clarify how individual CTFs in Figure 3 and distance CTFs in Figure 5 are calculated.
“The derived channel responses (8 channels × 8 location bins) were then used for the following analyses: (a) calculating individual Channel Tuning Functions (CTFs) based on each of the eight physical location bins (e.g., Figure 3C and 3D); (b) grouping responses according to the distance between each physical location and the high-probability distractor location to calculate distance CTFs (e.g., Figure 5); and (c) averaging across location bins to represent the general strength of spatial selectivity in tracking the memory cue, irrespective of its specific location (e.g., Figure 3A and 3B).”
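The three uses of the channel responses listed in this passage can be sketched as follows. This is a hedged illustration on synthetic data: the helper names, the Gaussian-shaped responses, the choice of `hpl=2`, and the fold-and-fit slope are assumptions for the example and are not taken from the manuscript.

```python
import numpy as np

N_BINS = 8             # location bins, and equally many channels
CENTER = N_BINS // 2

def circ_dist(a, b, n=N_BINS):
    """Shortest distance between two bins on a ring of n bins."""
    d = abs(a - b) % n
    return min(d, n - d)

def center_ctfs(responses):
    """responses[ch, loc] -> centered[loc, offset], with the channel tuned
    to each location moved to index CENTER."""
    return np.stack([np.roll(responses[:, loc], CENTER - loc)
                     for loc in range(N_BINS)])

def distance_ctfs(centered, hpl):
    """Average the centered CTFs of all locations at the same ring distance
    from the high-probability distractor location (hpl)."""
    groups = {}
    for loc in range(N_BINS):
        groups.setdefault(circ_dist(loc, hpl), []).append(centered[loc])
    return {d: np.mean(rows, axis=0) for d, rows in sorted(groups.items())}

def ctf_slope(ctf):
    """Fold the CTF around its centre and fit a line from far to near."""
    folded = [np.mean([ctf[i] for i in range(N_BINS)
                       if circ_dist(i, CENTER) == d])
              for d in range(CENTER, -1, -1)]   # distances 4, 3, 2, 1, 0
    return float(np.polyfit(np.arange(len(folded)), folded, 1)[0])

# Synthetic channel responses (NOT the study's data): each location bin
# drives the matching channel most strongly.
rng = np.random.default_rng(0)
responses = np.array([[np.exp(-circ_dist(ch, loc) ** 2 / 2.0)
                       for loc in range(N_BINS)] for ch in range(N_BINS)])
responses += rng.normal(scale=0.01, size=responses.shape)

centered = center_ctfs(responses)             # (a) one CTF per location bin
by_distance = distance_ctfs(centered, hpl=2)  # (b) distance CTFs, keys 0..4
mean_ctf = centered.mean(axis=0)              # (c) average over location bins
```

A positive value from `ctf_slope` indicates that channel responses rise towards the channel tuned to the stimulus location, that is, spatially selective tuning.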
(4) Why specifically 1024 resampling iterations?
Thank you for your question. The statistical analysis was conducted using the permutation_cluster_1samp_test function within the MNE package in Python. We have clarified this on page 25. The choice of 1024 permutations reflects the default setting of the function, which is generally considered sufficient for robust non-parametric statistical testing. This number provides a balance between computational efficiency and the precision of p-value estimation in the context of our analyses.
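To make the role of the permutation count concrete, the sketch below implements a plain sign-flip permutation test on synthetic values. It is a deliberately simplified stand-in: it is not the cluster-based procedure of `permutation_cluster_1samp_test`, it does not call MNE, and the function name `sign_flip_permutation_test` and the data are hypothetical.

```python
import numpy as np

def sign_flip_permutation_test(values, n_permutations=1024, seed=0):
    """One-sample, two-sided permutation test against zero.

    Under the null hypothesis each participant's value is equally likely
    to be positive or negative, so signs are flipped at random.
    """
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    observed = abs(values.mean())
    signs = rng.choice([-1.0, 1.0], size=(n_permutations, values.size))
    null = np.abs((signs * values).mean(axis=1))
    # Count the observed statistic as one of the permutations.
    return (np.sum(null >= observed) + 1) / (n_permutations + 1)

# Synthetic per-participant values (NOT the study's data).
rng = np.random.default_rng(42)
slopes_effect = rng.normal(loc=1.0, scale=0.5, size=20)
slopes_null = rng.normal(loc=0.0, scale=0.5, size=20)

p_effect = sign_flip_permutation_test(slopes_effect)
p_null = sign_flip_permutation_test(slopes_null)
smallest_p = 1 / (1024 + 1)   # resolution limit set by the permutation count
```

With 1024 permutations, the smallest p-value this scheme can return is roughly 0.001, which illustrates the trade-off between computation time and p-value precision mentioned above.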
Reviewer #3 (Public Review):
Summary:
In this experiment, the authors use a probe method along with time-frequency analyses to ascertain the attentional priority map prior to a visual search display in which one location is more likely to contain a salient distractor. The main finding is that neural responses to the probe indicate that the high probability location is attended, rather than suppressed, prior to the search display onset. The authors conclude that suppression of distractors at high-probability locations is a result of reactive, rather than proactive, suppression.
Strengths:
This was a creative approach to a difficult and important question about attention. The use of this "pinging" method to assess the attentional priority map has a lot of potential value for a number of questions related to attention and visual search. Here as well, the authors have used it to address a question about distractor suppression that has been the subject of competing theories for many years in the field. The paper is well-written, and the authors have done a good job placing their data in the larger context of recent findings in the field.
Weaknesses:
The link between the memory task and the search task could be explored in greater detail. For example, how might attentional priority maps change because of the need to hold a location in working memory? This might limit the generalizability of these findings. There could be more analysis of behavioral data to address this question. In addition, the authors could explore the role that intertrial repetition plays in the attentional priority map as these factors necessarily differ between conditions in the current design. Finally, the explanation of the CTF analyses in the results could be written more clearly for readers who are less familiar with this specific approach (which has not been used in this field much previously).
We appreciate the reviewer's valuable feedback and have made significant revisions to address the concerns raised. To clarify the connection between the memory and search tasks, we conducted additional analyses to explore the effects of spatial distance between the memory cue location and the high-probability distractor location on behavioral performance. We also investigated the potential influence of intertrial repetition effects on the observed results by removing trials with location repetitions. To enhance clarity, we revised the explanation of the CTF analyses in the Results section and improved figure annotations to ensure accessibility for readers unfamiliar with this approach. Collectively, these updates further discuss how the pattern of CTF slopes reflects the interplay between memory and search tasks while addressing key methodological and interpretative considerations.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Suggestions/Critiques (in no particular order)
(1) The authors discuss the tripartite model (bottom-up, top-down, and selection history) but neglect recent and important discussions of why this trichotomy might be unnecessarily complicated (e.g., Anderson, 2024: Trichotomy revisited: A monolithic theory of attentional control). Simply put, one of the 3 pillars (i.e., selection history) likely does not fall into a unitary construct or "box"; instead, it likely contains many subcomponents (e.g., reward associations, stimulus-response habit learning, statistical learning, etc.). Since the focus of the current study is learned distractor suppression based on the statistical regularities of the distractor, the authors should comment on which aspects of selection history are relevant, perhaps by using this monolithic framework.
We appreciate the reviewer's insightful suggestion regarding theoretical frameworks of attentional control. While Anderson (2024) proposes a monolithic theory that challenges the traditional tripartite model, our study deliberately maintains a pragmatic approach. The main purpose of our experiment is to investigate empirically the mechanisms of learned distractor suppression, rather than to adjudicate between competing theoretical models.
We agree that selection history is not a unitary construct but comprises multiple subcomponents, including reward associations, stimulus-response habit learning, and statistical learning. In this context, our study specifically focuses on statistical learning as a key mechanism of distractor suppression. By explicitly acknowledging the multifaceted nature of selection history and referencing Anderson's monolithic perspective, we invite readers to consider the theoretical implications while maintaining our research's primary focus on empirical investigation. To this end, we have modified the manuscript to read (see page 3):
"The present study investigates the mechanisms underlying statistical learning, specifically learned distractor suppression, which represents one critical subcomponent of selection history. While theoretical models like the tripartite framework and the recent monolithic theory (Anderson, 2024) offer complementary perspectives on attentional control, our investigation focuses on empirically characterizing the statistical learning mechanisms underlying learned distractor suppression."
(2) The authors discuss previous demonstrations of location-based and feature-based learned distractor suppression. The authors admit that there have been a large number of studies but seem to mainly cite those that were conducted by the authors themselves (with the exception being Vatterott & Vecera, 2012). For example, there are other studies investigating location-based suppression (Feldmann-Wüstefeld et al., 2021; Sauter et al., 2021), feature-based suppression (Gaspelin & Luck, 2018a; Stilwell et al., 2022; Stilwell & Gaspelin, 2021; Vatterott et al., 2018), or both (Stilwell et al., 2019). The authors do not cite Gaspelin and colleagues at all in the manuscript, despite claiming that singleton-based suppression is not proactive.
We appreciate your pointing out the need for a more comprehensive citation of the literature on learned distractor suppression, particularly with respect to location-based and feature-based suppression. In response to your comment, we have now expanded the reference list on page 4 to include relevant studies that further support our discussion of both location-based and feature-based suppression mechanisms.
(3) The authors use the terms "proactive" and "reactive" suppression without taking into consideration the recent terminology paper, which one of the current authors, Theeuwes, helped to write (Liesefeld et al., 2024, see Figure 8). The terms proactive and reactive suppression need to be defined relative to a time point. The authors need to be careful in defining proactive suppression as prior to the first shift of attention, but after the stimuli appear and reactive suppression as after the first shift of attention and after the stimuli appear. Thus, the critical time point is the first shift of attention. Does suppression occur before or after the first shift of attention? The authors could alleviate this by using the term "stimulus-triggered suppression" to refer to "suppression that occurs after the distractor appears and before it captures attention" (Liesefeld et al., 2024).
Thank you for pointing out that this was insufficiently clear in the previous version. In the revised version we specifically refer to the recent terminology paper on page 5 to make clear that suppression could theoretically occur at three distinct moments in time, and that the present paper was designed to dissociate between suppression before or after the first shift of attention.
(4) Could the authors justify why the circle stimulus (2° in diameter) was smaller than the diamonds (2.3° x 2.3°)? Are the stimuli equated for the area? Or, for width and height? Doesn't this create a size singleton target on half of all trials (whenever the target is a circle) in addition to the lone circle being a shape singleton? Along these lines, could the authors justify why the colors were used and not equiluminant? This version of red is much brighter than this version of green if assessed by a spectrophotometer. Thus, there are sensory imbalances between the colors. Further, the grey used as the ping is likely not equiluminant to both colors. Thus, the grey "ping" is likely dimmer for red items but brighter for green items. Is this a fair "ping"?
Thank you for raising these important points. We chose, as is customary in this experimental paradigm (e.g., Huang et al., 2023; Duncan et al., 2023), to make the diamond slightly larger (2.3° x 2.3°) than the circle (2° in diameter) to ensure a better visual match in overall size appearance. If the circle and diamond stimuli were equated strictly in terms of size (both at 2°), the diamond would appear visually smaller due to the differences in geometric shape. By adjusting the dimensions slightly, we aimed to minimize any unintentional differences in perceptual salience.
As for the colors used in the experiment, the reviewer is right that there might be sensory imbalances between the red and green stimuli, with red appearing brighter than green based on measurements such as spectrophotometry. To ensure that any effects couldn’t be explained by sensory imbalance in the displays, we randomized target and distractor colors across trials, meaning that roughly half the trials had a red distractor and half had a green distractor. This randomization should have mitigated any systematic biases caused by color differences.
We appreciate your feedback and have clarified these points in the Methods section of the revised manuscript on page 22:
"Please note that although the colors were not equiluminant, the target and distractor colors were randomized across trials such that roughly half the trials had a red distractor, and half had a green distractor. This randomization process should help mitigate any systematic biases this may cause."
(5) For the eye movement artifact rejection, the authors use a relatively liberal rejection routine (i.e., allowing for eye movements up to 1.2° visual angle and a threshold of 15 μV). Given that every 3.2 μV deviation in HEOG corresponds to ~ ± 0.1° of visual angle (Lins, et al., 1993), the current oculomotor rejection allows for eye movements between 0.5° and 1.2° visual angle to remain which might allow for microsaccades (e.g., Poletti, 2023) to contaminate the EEG signal (e.g., Woodman & Luck, 2003).
The reviewer correctly points out that our eye rejection procedure, which is the same as in our previous work (e.g., Duncan et al., 2023), still allows for small but systematic biases in eye position towards the remembered location and potentially towards or away from the high-probability distractor location. While we cannot definitively exclude this possibility, we believe it is unlikely for the following reasons. First, although there is a link between microsaccades and covert attention, it has been demonstrated that subtle biases in eye position cannot explain the link between alpha activity and the content of spatial WM (Foster et al., 2016, 2017). Specifically, Foster et al. (2017) found no evidence for a gaze-position-related CTF, while an analysis of the same data yielded clear target-related CTFs. Similarly, within the present data set there was no evidence that the observed revival induced by the ping display could be attributed to systematic changes in gaze position, as a multivariate cross-session decoding analysis with x,y positions from the tracker did not yield reliable above-chance decoding of the location in memory.
Author response image 1.
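The conversion quoted in comment (5), 3.2 μV of HEOG per roughly 0.1° of visual angle, can be worked through as a short sketch. It assumes the relation is linear over the range of interest; the constant and function names are hypothetical.

```python
# µV of HEOG per 0.1° of gaze shift (Lins et al., 1993, as cited in comment 5)
UV_PER_TENTH_DEGREE = 3.2

def heog_uv_to_degrees(microvolts):
    """Convert an HEOG deflection in µV to an approximate gaze shift in degrees."""
    return microvolts / UV_PER_TENTH_DEGREE * 0.1

# The 15 µV rejection threshold expressed in degrees of visual angle.
threshold_deg = heog_uv_to_degrees(15.0)
```

Under this linear approximation the 15 μV threshold corresponds to a little under 0.5°, in line with the lower bound quoted in the comment.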
(6) The authors claim that "If the statistically learned suppression was spatial-based and feature-blind, one would also expect impaired target processing at the high-probability location." (p. 7, lines 194-195). Why is it important that suppression is feature-blind here? Further, is this a fair test of whether suppression is feature-blind? What about inter-trial priming of the previous trial? If the previous trial's singleton color repeated RTs might be faster than if it switched. In other words, the more catastrophic the interference (the target shape, target color, distractor shape, distractor color) change between trials, the more RTs might slow (compared with consistencies between trials, such that the target and distractor shapes repeat and the target and distractor colors repeat). Lastly, given the variability across both the shape and color dimensions, the claim that this type of suppression is feature-blind might be an artifact of the design promoting location-based instead of feature-based suppression.
Thank you for raising this point. In the past we have used the finding that learned suppression was not specific to distractors, but also generalized to targets, to argue in favor of proactive (or stimulus-triggered) suppression. However, we agree that given the current experimental parameters it may be an oversimplification to conclude that the effect was feature-blind based on the impaired target processing observed here. As this argument is also not relevant to our main findings, we have removed this interpretation and simply report that the effect was observed for both distractors and targets. Nevertheless, we would like to point out that while inter-trial priming could influence reaction times, the features of both targets and distractors (shape and color) were randomly assigned on each trial. This should mitigate consistent feature-repetition effects. Additionally, previous research has demonstrated that suppression effects persist even when immediate feature repetitions are controlled for or statistically accounted for (e.g., Wang & Theeuwes 2018 JEP:HPP; Huang et al., 2021 PB&R).
(7) The authors should temper claims such as "suppression occurs only following attentional enhancement, indicating a reactive suppression mechanism rather than proactive suppression." (p. 15, lines 353-353). Perhaps this claim may be true in the current context, but this claim is too generalized and not supported, at least yet. Further, "Within the realm of learned distractor suppression, an ongoing debate centers around the question of whether, and precisely when, visual distractors can be proactively suppressed. As noted, the idea that learned spatial distractor suppression is applied proactively is largely based on the finding that the behavioral benefit observed when distractors appear with a higher probability at a given location is accompanied by a probe detection cost (measured via dot offset detection) at the high probability distractor location (Huang et al., 2022, 2023; Huang, Vilotijević, et al., 2021)." (p. 15, lines 355-361). Again, the authors should either cite more of the opposing side of the debate (e.g., the signal suppression hypothesis, Gaspelin & Luck, 2019, or Luck et al., 2021, and the many lines of converging evidence of proactive suppression) or temper the claims.
Thank you for your constructive feedback regarding our statements on suppression mechanisms. We acknowledge that our original claim was intended to reflect our specific findings within the context of this study and was not meant to generalize across all research in the field. To prevent any misunderstanding, we have tempered our claims to avoid overgeneralization by clarifying that our findings suggest a tendency toward reactive suppression within the specific experimental conditions we investigated (see page 17).
Furthermore, learned distractor suppression is multifaceted, encompassing both feature-based suppression (as proposed by the signal suppression hypothesis) and spatial-based suppression (as examined in the current study). The signal suppression hypothesis provides proactive evidence related to the suppression of specific feature values (Gaspelin et al., 2019; Gaspelin & Luck, 2018b; Stilwell et al., 2019). We have incorporated references to these studies to offer a more comprehensive perspective on the ongoing debate at a broader level (see page 17).
(8) "These studies however, mainly failed to find evidence in support of active preparatory inhibition (van Moorselaar et al., 2020, 2021; van Moorselaar & Slagter, 2019), with only one study observing increased preparatory alpha contralateral to the high probability distractor location (Wang et al., 2019)." (p. 15, lines 367-370). This is an odd phrasing to say "many studies" have shown one pattern (citing 3 studies) and "only" one showing the opposite, especially given these were all from the current authors' labs.
Agreed. We have rewritten this text on page 17.
“These studies however, failed to find evidence in support of active preparatory inhibition as indexed via increased alpha power contralateral to the high probability distractor location (van Moorselaar et al., 2020, 2021; van Moorselaar & Slagter, 2019; but see Wang et al., 2019).”
(9) Could the authors comment on why total power was significantly above baseline immediately (without clearer timing marks, ~10-50 ms) after the onset of the cue (Figure 3)? Is this an artifact of smearing? Further, it appears that there is significant activity (as strong as the evoked power of interest) in the baseline period of the evoked power when the memory item is presented on the vertical midline in the upper visual field (this is also true, albeit weaker, for the memory cue item presented on the horizontal midline to the right). This concern again appears in Figure 4 where the Alpha CTF slope was significantly below or above the baseline prior to the onset of the memory cue. Evoked Alpha was already significantly higher than baseline in the baseline period. In Figure 5, evoked power is already higher and different for the hpl than the lpls even at the memory cue (and before the memory cue onsets). There are often periods of differential overlap during the baseline period, or significant activity in the baseline period or at the onset of the critical, time-locked stimulus array. The authors should explain why this might be (e.g., smearing).
Thank you for pointing this out. As suggested by the reviewer, this ‘unexpected’ pre-stimulus decoding is indeed the result of temporal smearing induced by our 5th order Butterworth filter. The immediate onset of reliable tuning (sometimes even before stimulus onset) is then also a typical aspect of studies that track tuning profiles across time in the lower frequency bands such as alpha (van Moorselaar & Slagter 2019; van Moorselaar et al., 2020; Foster et al., 2016).
Indeed, visual inspection also suggests that evoked activity tracked items at the top of the screen, an effect that is unlikely to result from temporal smearing as it is temporally interrupted around display onset. However, it is important to note that CTFs by location are based on far fewer trials, making them inherently noisier. The by-location plots primarily serve to show that the observed pattern is generally consistent across locations. In any case, given that the high probability distractor location was counterbalanced across participants it did not systematically influence our results.
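The temporal smearing invoked in this response can be illustrated with a toy example. The sketch assumes that filtering acts symmetrically in time (without phase delay), which this section does not state explicitly, and it uses a simple boxcar kernel as a stand-in rather than the 5th-order Butterworth filter itself; all parameter values are hypothetical.

```python
import numpy as np

n_times = 200
onset = 100                        # sample index of stimulus onset
signal = np.zeros(n_times)
signal[onset:] = 1.0               # activity exists only from onset onward

# Symmetric (acausal) boxcar kernel: a stand-in for zero-phase filtering,
# NOT the 5th-order Butterworth filter used in the study.
kernel_len = 21
kernel = np.ones(kernel_len) / kernel_len
smoothed = np.convolve(signal, kernel, mode="same")

first_nonzero = int(np.argmax(smoothed > 0))   # first sample with activity
leak_samples = onset - first_nonzero           # how far activity leaks backwards
```

Because each output sample averages over neighbours on both sides, post-onset activity appears a few samples before onset. In real data, the extent of such leakage depends on the filter's effective temporal width.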
(10) Given that EEG was measured, perhaps the authors could show data to connect with the extant literature. For example, by showing the ERP N2pc and PD components. A strong prediction here is that there should be an N2pc component followed by a PD component if there is the first selection of the singleton before it is suppressed.
Thank you for your great suggestion regarding the analysis of ERP components such as N2pc and Pd. To reliably assess lateralized ERP components like N2pc or Pd, the high-probability location must be restricted to static lateralized positions (e.g., on the horizontal midline, as in Wang et al., 2019). In contrast, our study was designed to utilize an inverted encoding model to investigate the mechanisms underlying spatial suppression. To avoid biasing the training of the spatial model toward specific spatial locations (see also the previous comment), we counterbalanced the high-probability location across participants, ensuring an equal distribution of high-probability locations within the sample. Given this counterbalanced design, it was not feasible to reliably assess these components within the scope of the current study. Yet, we agree with the reviewer that it would be of theoretical interest to examine the Pd and N2pc evoked by the search display, particularly in this scenario where suppression has been triggered prior to search onset.
(11) Figure 2 (behavioral results) is difficult to see (especially the light grey and white bars). A simple fix might be to outline all the bars in black.
Thank you! We have incorporated your suggestion by outlining all the bars on page 10.
Reviewer #3 (Recommendations For The Authors):
(1) I'm wondering about the link between the memory task and the search task. I think the interpretation of the data should include more discussion of the fact that much of the search literature doesn't involve simultaneously holding an unrelated location in memory. How might that change the results?
For example - what happens behaviorally on the subset of trials in which the location to be held in memory is near the high probability distractor location? All the behavioral data is more or less compartmentalized, but I think some behavioral analysis of this and related questions might be quite useful. I know there are comparisons of behavior in single vs. dual-task cases (for the memory task at least), but I think the analyses could go deeper.
Thank you for your great suggestion. To investigate the potential interactions between the spatial memory task and the visual search task, we conducted additional analyses on the behavioral data. First, we examined whether memory recall was influenced by the spatial distance (dist0 to dist4) between the memory cue location and the high-probability distractor location. As shown in the figure below, memory recall is not systematically biased either toward or away from the high-probability distractor location (p = .562, ηp² = .011).
We also assessed how the memory task might affect search performance. Specifically, we plotted reaction times as a function of the spatial overlap between the memory cue location and any of the search items, separating trials by distractor-present (match-target, match-distractor, match-neutral) and distractor-absent (match-target, match-neutral) conditions. Although visually the pattern of results seems to suggest that search performance was facilitated when the memory cue spatially overlapped with the target and impaired when it overlapped with the distractor, this pattern did not reach statistical significance (distractor-present: p = .249, ηp² = .002; distractor-absent: p = .335, ηp² = .002). We have now included these analyses in our supplemental material.
Beyond additional data analyses, there are also theoretical questions to be asked. For example, one could argue that in order to maintain a location near or at the high probability distractor location in working memory, the priority map would have to shift substantially. This doesn't necessarily mean that proactive suppression always occurs in search when there is a high probability location. Instead, one could argue that when you need to maintain a high probability location in memory but also know that this location might contain a distractor, the representation necessarily looks quite different than if there were no memory tasks. Maybe there are reasons against this kind of interpretation but more discussion could be devoted to it in the manuscript. I guess another way to think of this question is - how much is the ping showing us about attentional priority for search vs. attentional priority for memory, or is it simply a combination of those things, and if so, how might that change if we could ping the attentional priority map without a simultaneous memory task?
Thank you for this valuable suggestion. The aim of our study was to explore how the CTFs elicited by the memory cue were influenced by the search task. We employed a simultaneous memory task because directly measuring CTFs in relation to the search task was not feasible, as the HPL typically does not vary within individual participants. Consequently, CTFs locked to placeholder onsets could reflect arbitrary differences between (subgroups of) participants rather than true differences in the HPL. To address this, we combined the search task with a VWM task, leveraging the fact that location-specific CTFs can reliably be elicited by a memory cue and that the location of this cue relative to the HPL can be systematically varied within participants (Foster et al., 2016, 2017; van Moorselaar et al., 2018). This approach allowed us to examine the CTFs elicited by the memory cue and how these were modulated by their distance from the HPL.
While it is theoretically possible that the observed changes resulted from alterations in how the memory cue was maintained in memory only, this explanation seems unlikely, for memory performance (recall) did not vary as a function of the cue's distance from the HPL, suggesting that the distance-related changes in the CTFs are reflections of both tasks. Moreover, distractor learning typically occurs without awareness (Gao & Theeuwes 2022; Wang & Theeuwes 2018). It is difficult to understand how such unconscious processes could lead to anticipations in the memory task and subsequently modulate the representation of the consciously remembered memory cue only. We therefore believe that if we would have pinged the attentional priority map without a simultaneous memory task, the results would have been similar to those obtained in the present experiment, indicating stronger tuning at the HPL. Yet, this work still needs to be done.
To address this comment, we have added a paragraph on p. 18:
“However, two alternative explanations warrant consideration. First, one could argue that observed modulations in the revived CTFs do not provide insight into the mechanisms underlying distractor suppression but instead reflect changes in the memory representation itself, potentially triggered by the anticipation of the HPL in the search task. According to this view, the changes in the revived CTFs would be unrelated to how search performance (in particular distractor suppression) was achieved. While this is theoretically possible, we believe it to be unlikely. Memory performance (recall) did not vary as a function of the cue's distance from the HPL, whereas the revived CTFs did, indicating that these changes likely reflect contributions from both tasks. Additionally, distractor learning typically occurs without conscious awareness (Gao & Theeuwes 2022; Wang & Theeuwes 2018). It is difficult to conceive how such unconscious processes could produce anticipatory effects in the memory task and selectively modulate the representation of the consciously remembered memory cue. Second, the apparent lack of suppression and the presence of a pronounced tuning at the high-probability distractor location could actually reflect a proactive mechanism that manifests in a way that seems reactive due to the dual-task nature of our experiment.”
(2) When the distractor appears at a particular location with a high probability it necessarily means that intertrial effects differ between high and low probability distractor locations. Consecutive trials with a distractor at the same location are far more frequent in the high probability condition. You may not have enough power to look at this, and I know this group has analyzed this behaviorally in the past, but I do wonder how much that influences the EEG data reported here. Are CTFs also sensitive to distractors/targets from the most recent trial? And does that contribute to the overall patterns observed here?
Thank you for your thoughtful comment. Indeed, statistical distractor learning studies naturally involve a higher proportion of intertrial effects for high-probability distractors compared to low-probability ones. Previous research, including the present study, has demonstrated that while distractor location repetition improves performance, as shown by faster response times (t(23) = 6.32, p < .001, d = 0.33) and increased accuracy (t(23) = 4.21, p < .001, d = 0.86), intertrial effects alone cannot fully account for the learned suppression effects induced by spatial distractor imbalances. This analysis is now reflected in the revised manuscript on page 9.
However, as noted by the reviewer, this leaves uncertain to what extent the neural indices of statistical learning, in this case the modulation of channel tuning functions, capture the effects of interest beyond the contributions of intertrial priming. To address this issue, one possible approach is to rerun the CTF analysis after excluding trials with location repetitions. Since the distractor location is unknown to participants at the time the CTF is revived by the placeholder, we removed trials where the memory cue location repeated the distractor location from the preceding trial, rather than trials with distractor location repetitions between consecutive trials. Our analyses indicate that after trial removal (~ 9% of overall trials), the spatial gradient pattern in the CTF slopes remains similar. However, the cluster-based permutation analysis fails to reveal any significant findings, and a one-sample t-test on the slopes averaged within the 100 ms time window of interest yields a p-value of 0.106. While this could suggest that the current pattern is influenced by distractor-cue repetition, it is more likely that the trial removal resulted in an underpowered analysis. To investigate this, we randomly removed an equivalent number of trials (9%), which similarly resulted in non-significant findings, although the overall result pattern remained comparable (p = 0.066 for the one-sample t-test on the slopes averaged within the 100 ms time window of interest).
Author response image 2.
Also, in our previous pinging study we observed that, despite the trial imbalance, decoding was approximately equal between high probability trailing (i.e., location intertrial priming) and non-trailing trials, suggesting that the ping is able to retrieve the priority landscape that builds up across longer timescales.
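The random-removal control described in this response can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the helper names `one_sample_t` and `drop_random_trials`, the simulated per-trial values, and the trial count of 400 are hypothetical, the participant count of 24 is inferred from the t(23) statistics reported above, and averaging per-trial values stands in for the inverted encoding model that actually produces the CTF slopes.

```python
import math
import random
import statistics


def one_sample_t(values, popmean=0.0):
    """Return (t, df) for a one-sample t-test of `values` against `popmean`."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD, n - 1 in the denominator
    return (mean - popmean) / (sd / math.sqrt(n)), n - 1


def drop_random_trials(trials, fraction, rng):
    """Return `trials` with a randomly chosen `fraction` of entries removed."""
    n_drop = round(len(trials) * fraction)
    dropped = set(rng.sample(range(len(trials)), n_drop))
    return [t for i, t in enumerate(trials) if i not in dropped]


# Hypothetical data: 24 participants, 400 simulated per-trial values each.
rng = random.Random(0)
participants = [[rng.gauss(0.05, 1.0) for _ in range(400)] for _ in range(24)]

# Participant-level summaries for the full data and for a copy with
# 9% of trials removed at random (the control condition).
full = [statistics.fmean(p) for p in participants]
reduced = [statistics.fmean(drop_random_trials(p, 0.09, rng)) for p in participants]

t_full, df = one_sample_t(full)
t_reduced, _ = one_sample_t(reduced)
print(f"t({df}) full = {t_full:.2f}, after random removal = {t_reduced:.2f}")
```

Comparing the statistic after targeted removal with the statistic after random removal of the same number of trials is what separates a loss of power from a genuine contribution of the removed trials.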
(3) Maybe there is too much noise in the data for this, but one could look at individual differences in the magnitude of the high probability distractor suppression and the magnitude of the alpha CTF slope. If there were a correlation here it would bolster the argument about the relationship between priority to the distractor location and subsequent behavior reduction of interference from that distractor.
Thank you for this valuable suggestion. We investigated whether there was a correlation between the average gradient slope during the time window in which the placeholder revived the memory representation and the average distance slope in reaction times for the learned suppression effect. This correlation was not significant (r = .236, p = 0.267), which is perhaps expected given the potential noise levels, as noted by the reviewer. Furthermore, while the learned suppression effect is robust at the group level, its predictive value for individual-level performance has been shown to be limited (Ivanov et al., 2024; Hedge et al., 2018). Consequently, we chose not to include this analysis in the manuscript (see also our response to comment 2 by reviewer 2).
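As a rough check on the reported correlation, the conversion from a Pearson r to its t statistic can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, and the sample size of 24 is an assumption inferred from the t(23) statistics reported elsewhere in this response rather than a value stated for this analysis.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_to_t(r, n):
    """t statistic (df = n - 2) for testing a Pearson r against zero."""
    return r * math.sqrt((n - 2) / (1.0 - r ** 2))


# Reported value r = .236 with an assumed n = 24 participants.
t_value = r_to_t(0.236, 24)
print(f"t(22) = {t_value:.3f}")
```

Under that assumed sample size the statistic is small, which fits the non-significant result the authors report.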
(4) The results sections are a bit dense in places, especially starting at the bottom of page 11. For readers who are familiar with the general questions being asked but less so with the particular time-frequency analyses and CTF approaches being used (like myself), I think a bit more time could be spent setting up these analyses within the results section to make extra clear what's going on.
Thank you for your feedback regarding the clarity of our Results section. We have revised this section to make it more understandable and easier to follow, especially for readers who may be less familiar with the specific time-frequency analyses and modeling approaches used in our study. Specifically, we have provided additional interpretations alongside the reported results from page 10 to page 13 to aid comprehension and ensure that the methodology and findings are accessible to a broader audience. Additionally, we have revised the figure notes to further enhance clarity and understanding.
Other comments:
Abstract: "a neutral placeholder display was presented to probe how hidden priority map is reconfigured..." I think the word "the" is missing before "priority map"
Thank you. We have added the word “the” before “hidden priority map”.
p. 4, Müller's group also has a number of papers that demonstrate how learned distractor regularities impact search (From the ~2008-2012 range, probably others as well), it might be worth citing a few here.
Thank you for your suggestion. In the revised manuscript, we have added citations to several key papers from Müller’s group on page 4 as well as other research groups.
p.5 - Chang et al. (2023) seems highly relevant to the current study (and consistent with its results) - depending on word limits, it might make sense to expand the description of this in the introduction to make clear how the present study builds upon it
Thank you! We have expanded the discussion of Chang et al. (2023) on page 5 to provide more detailed elaboration of their study and its relevance to our work.
p. 7 - maybe not for the current study, but I do wonder whether the distortion of spatial memory by the presence of the search task occurs only when there is a relevant regularity in the search task. In other words, if the additional singleton task had completely unpredictable target and distractor locations, would there be memory distortions? Possibly for the current dataset, the authors could explore whether the behavioral distortion is systematically towards or away from the high probability distractor location.
Thank you for your insightful suggestion. Following your recommendation, we conducted an additional analysis to examine memory recall as a function of the distance between the memory cue location and the high-probability distractor location. Figure S1A illustrates the results, depicting memory recall deviation across various distances (dist0 to dist4) from the high-probability distractor location.
Our statistical analysis indicates that memory recall is not systematically biased either towards or away from the high-probability distractor location (p = .562, η<sub>p</sub><sup>2</sup> = .011). This finding suggests that spatial memory recall remains relatively stable and is not heavily influenced by the presence of regularities in the distractor locations.
p. 7 - in addition to stats it would be helpful to report descriptive statistics for the high probability vs. other distractor location comparisons
Thank you! We have added descriptive statistics on page 8 and page 9.
p. 19, "64%" repeated unnecessarily - also, shouldn't it be 65% if it's 5% at each of the other seven locations?
Thank you. This is now corrected in the revised manuscript.
p. 20 "This process continued until participants demonstrated a thorough understanding of the assigned tasks" Were there objective criteria to measure this?
Thank you for pointing out this issue. To clarify, objective criteria were indeed used to assess participants’ readiness to proceed. Specifically:
For the training phase practice trials, participants were required to achieve an average memory recall deviation of less than 13°.
For the test phase practice trials, participants needed to demonstrate a minimum of 65% accuracy in the search task. In addition, participants were asked to verbally confirm their understanding of the task goals with the experimenter before proceeding.
We have revised the manuscript to clearly indicate these criteria on p. 23.
p. 21 "P-values were Greenhouse-Geiser corrected in case where the..." I think "case" should be "cases"
Thank you. We have corrected this in the revised manuscript.
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
Koren et al. derive and analyse a spiking network model optimised to represent external signals using the minimum number of spikes. Unlike most prior work using a similar setup, the network includes separate populations of excitatory and inhibitory neurons. The authors show that the optimised connectivity has a like-to-like structure, which leads to the experimentally observed phenomenon of feature competition. The authors also examine how various (hyper)parameters-such as adaptation timescale, the excitatory-to-inhibitory cell ratio, regularization strength, and background current-affect the model. These findings add biological realism to a specific implementation of efficient coding. They show that efficient coding explains, or at least is consistent with, multiple experimentally observed properties of excitatory and inhibitory neurons.
As discussed in the first round of reviews, the model's ability to replicate biological observations such as the 4:1 ratio of excitatory vs. inhibitory neurons hinges on somewhat arbitrary hyperparameter choices. Although this may limit the model's explanatory power, the authors have made significant efforts to explore how these parameters influence their model. It is an empirical question whether the uncovered relationships between, e.g., metabolic cost and the fraction of excitatory neurons are biologically relevant.
The revised manuscript is also more transparent about the model's limitations, such as the lack of excitatory-excitatory connectivity. Further improvements could come from explicitly acknowledging additional discrepancies with biological data, such as the widely reported weak stimulus tuning of inhibitory neurons in the primary sensory cortex of untrained animals.
We thank the Reviewer for their insightful characterization of our paper and for further suggestions on how to improve it. We have now further improved the transparency about the model’s limitations and we have explicitly acknowledged the discrepancy with biological data about connection probability and about the selectivity of inhibitory neurons (pages 4 and 15).
Reviewer #2 (Public review):
Summary:
In this work, the authors present a biologically plausible, efficient E-I spiking network model and study various aspects of the model and its relation to experimental observations. This includes a derivation of the network into two (E-I) populations, the study of single-neuron perturbations and lateral-inhibition, the study of the effects of adaptation and metabolic cost, and considerations of optimal parameters. From this, they conclude that their work puts forth a plausible implementation of efficient coding that matches several experimental findings, including feature-specific inhibition, tight instantaneous balance, a 4 to 1 ratio of excitatory to inhibitory neurons, and a 3 to 1 ratio of I-I to E-I connectivity strength.
Strengths:
While many network implementations of efficient coding have been developed, such normative models are often abstract and lacking sufficient detail to compare directly to experiments. The intention of this work to produce a more plausible and efficient spiking model and compare it with experimental data is important and necessary in order to test these models. In rigorously deriving the model with real physical units, this work maps efficient spiking networks onto other more classical biophysical spiking neuron models. It also attempts to compare the model to recent single-neuron perturbation experiments, as well as some long-standing puzzles about neural circuits, such as the presence of separate excitatory and inhibitory neurons, the ratio of excitatory to inhibitory neurons, and E/I balance. One of the primary goals of this paper, to determine if these are merely biological constraints or come from some normative efficient coding objective, is also important. Lastly, though several of the observations have been reported and studied before, this work arguably studies them in more depth, which could be useful for comparing more directly to experiments.
Weaknesses:
This work is the latest among a line of research papers studying the properties of efficient spiking networks. Many of the characteristics and findings here have been discussed before, thereby limiting the new insights that this work can provide. Thus, the conclusions of this work should be considered and understood in the context of those previous works, as the authors state. Furthermore, the number of assumptions and free parameters in the model, though necessary to bring the model closer to biophysical reality, make it more difficult to understand and to draw clear conclusions from. As the authors state, many of the optimality claims depend on these free parameters, such as the dimensionality of the input signal (M=3), the relative weighting of encoding error and metabolic cost, and several others. This raises the possibility that it is not the case that the set of biophysical properties measured in the brain are accounted for by efficient coding, but rather that theories of efficient coding are flexible enough to be consistent with this regime. With this in mind, some of the conclusions made in the text may be overstated and should be considered in this light.
Conclusions, Impact, and additional context:
Notions of optimality are important for normative theories, but they are often studied in simple models with as few free parameters as possible. Biophysically detailed and mechanistic models, on the other hand, will often have many free parameters by their very nature, thereby muddying the connection to optimality. This tradeoff is an important concern in neuroscientific models. Previous efficient spiking models have often been criticized for their lack of biophysically-plausible characteristics, such as large synaptic weights, dense connectivity, and instantaneous communication. This work is an important contribution in showing that such networks can be modified to be much closer to biophysical reality without losing their essential properties. Though the model presented does suffer from complexity issues which raise questions about its connections to "optimal" efficient coding, the extensive study of various parameter dependencies offers a good characterization of the model and puts its conclusions in context.
We thank the Reviewer for their thorough and accurate assessment of our paper.
Reviewer #3 (Public review):
Summary:
In their paper the authors tackle three things at once in a theoretical model: how can spiking neural networks perform efficient coding, how can such networks limit the energy use at the same time, and how can this be done in a more biologically realistic way than previous work.
They start by working from a long-running theory on how networks operating in a precisely balanced state can perform efficient coding. First, they assume split networks of excitatory (E) and inhibitory (I) neurons. The E neurons have the task to represent some lower dimensional input signal, and the I neurons have the task to represent the signal represented by the E neurons. Additionally, the E and I populations should minimize an energy cost represented by the sum of all spikes. All this results in two loss functions for the E and I populations, and the networks are then derived by assuming E and I neurons should only spike if this improves their respective loss. This results in networks of spiking neurons that live in a balanced state, and can accurately represent the network inputs.
They then investigate in depth different aspects of the resulting networks, such as responses to perturbations, the effect of following Dale's law, spiking statistics, the excitation (E)/inhibition (I) balance, optimal E/I cell ratios, and others. Overall, they expand on previous work by taking a more biological angle on the theory and show the networks can operate in a biologically realistic regime.
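The spiking rule summarised above (a neuron fires only if the spike lowers its loss) can be illustrated with a deliberately simplified sketch. Everything here is an assumption made for illustration: a single population instead of separate E and I populations, a static signal without leak or synaptic dynamics, a linear cost `mu` per spike, and hypothetical names (`spike_margin`, `greedy_encode`, `decoders`). It is not the model derived in the paper.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def loss(x, x_hat, n_spikes, mu):
    """Squared encoding error plus a linear cost of `mu` per spike."""
    err = [a - b for a, b in zip(x, x_hat)]
    return dot(err, err) + mu * n_spikes


def spike_margin(x, x_hat, d_i, mu):
    """Half of the loss reduction obtained if the neuron with decoder d_i fires.

    A spike adds d_i to the estimate, so the loss drops when
    d_i . (x - x_hat) > (|d_i|^2 + mu) / 2, i.e. when the margin is positive.
    """
    err = [a - b for a, b in zip(x, x_hat)]
    return dot(d_i, err) - 0.5 * (dot(d_i, d_i) + mu)


def greedy_encode(x, decoders, mu, max_spikes=1000):
    """Repeatedly fire the neuron with the largest positive margin."""
    x_hat = [0.0] * len(x)
    counts = [0] * len(decoders)
    for _ in range(max_spikes):
        margins = [spike_margin(x, x_hat, d, mu) for d in decoders]
        best = max(range(len(decoders)), key=lambda i: margins[i])
        if margins[best] <= 0:
            break  # no spike would improve the loss
        x_hat = [a + b for a, b in zip(x_hat, decoders[best])]
        counts[best] += 1
    return x_hat, counts


# Hypothetical example: four neurons with opposing decoders in two dimensions.
decoders = [[0.1, 0.0], [-0.1, 0.0], [0.0, 0.1], [0.0, -0.1]]
x_hat, counts = greedy_encode([1.0, 0.0], decoders, mu=0.001)
print(x_hat, counts)
```

In this toy setting the metabolic term `mu` raises the effective firing threshold, so a larger `mu` trades encoding accuracy for fewer spikes, which is the trade-off the reviewers discuss.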
Strengths:
* The authors take a much more biological angle on the efficient spiking networks theory than previous work, which is an essential contribution to the field
* They make a very extensive investigation of many aspects of the network in this context, and do so thoroughly
* They put sensible constraints on their networks, while still maintaining the good properties these networks should have
Weaknesses:
* One of the core goals of the paper is to make a more biophysically realistic network than previous work using similar optimization principles. One of the important things they consider is a split into E and I neurons. While this works fine, and they consider the coding consequences of this, it is not clear from an optimization perspective why the split into E and I neurons and following Dale's law would be beneficial. This would be out of scope for the current paper however.
* The theoretical advances in the paper are not all novel by themselves, as most of them (in particular the split into E and I neurons and the use of biophysical constants) had been achieved in previous models. However, the authors discuss these links thoroughly and do more in-depth follow-up experiments with the resulting model.
Assessment and context:
Overall, although much of the underlying theory is not necessarily new, the work provides an important addition to the field. The authors succeeded well in their goal of making the networks more biologically realistic, and incorporate aspects of energy efficiency. For computational neuroscientists this paper is a good example of how to build models that link well to experimental knowledge and constraints, while still being computationally and mathematically tractable. For experimental readers the model provides a clearer link of efficient coding spiking networks to known experimental constraints and provides a few predictions.
We thank the Reviewer for a positive assessment and for pointing out the merits of our work.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
The authors have addressed my previous concerns, and I agree that the manuscript has improved. However, I believe they could still do more to acknowledge two notable mismatches between the model and experimental data.
(1) Stimulus selectivity of excitatory and inhibitory neurons
In the model, excitatory and inhibitory neurons exhibit similar stimulus selectivity, which appears inconsistent with most experimental findings. The authors argue that whether inhibitory neurons are less selective remains an open question, citing three studies in support. However, only one of these studies (Ranyan) was conducted in primary sensory cortex and it is, to my knowledge, one of the few papers showing this (indeed, it's often cited as an exception). The other two studies (Kuan and Najafi) recorded from the parietal cortex of mice trained on decision making tasks, and therefore seem less relevant to the model.
In contrast to the cited studies, the overwhelming majority of the work has found that inhibitory neurons in sensory cortex, in particular those expressing Parvalbumin, are less stimulus selective than excitatory cells. And this is indeed the prevailing view, as summarized by the review from Hu et al. (Science, 2014): "PV+ interneurons exhibit broader orientation tuning and weaker contrast specificity than pyramidal neurons." This view emerged from numerous classical studies, including Sohya et al. (J. Neurosci., 2007), Cardin (J. Neurosci., 2007), Nowak (Cereb. Cortex, 2008), Niell et al. (J. Neurosci., 2008), Liu (J. Neurosci., 2009), Kerlin (Neuron, 2010), Ma et al. (J. Neurosci., 2010), Hofer et al. (Nature Neurosci. 2011), and Atallah et al. (Neuron 2012). Weak inhibitory tuning has been confirmed by recent studies, such as Sanghavi & Kar (biorxiv 2023), Znamenskiy et al. (Neuron 2024), and Hong et al. (Nature, 2024).
The authors should acknowledge this consensus and cite the conflicting evidence. Failing to do so is cherry picking from the literature. Since training can increase the stimulus selectivity of PV+ neurons to that of Pyr levels, also in primary visual cortex (Khan et al. Neuron 2018), a favourable interpretation of the model is that it represents a highly optimized, if not overtrained, state.
We have carefully considered the literature cited by the Reviewer. We agree with the interpretation that stimulus selectivity of inhibitory neurons in our model is higher than the stimulus selectivity of Parvalbumin-positive inhibitory neurons in the primary sensory cortex of naïve animals. We have edited the text in Discussion (page 14).
(2) Connection probability
The manuscript claims that "rectification sets the overall connection probability to 0.5, consistent with experimental results (Pala & Petersen; Campagnola et al.)." However, the cited studies, and others, report significantly lower probabilities, except for Pyr-PV (E-I connections in the model). For example, Campagnola et al. measured PV-Pyr connectivity at 34% in L2/3 and 20% in L5.
It's perfectly acceptable that the model cannot replicate every detail of biological circuits. But it's important to be cautious when claiming consistency with experimental data.
Here as well, we agree with the Reviewer that the connection probability of 0.5 is consistent with reported connectivity of Pyr-PV neurons, but less so with reported connectivity of PV-Pyr neurons. We have now made our claim about the compatibility of the connection probability in our model with empirical observations more precise (page 4).
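One way to see why rectification can yield a connection probability near 0.5 is sketched below. The mechanism is an assumption made for illustration: each neuron is given a hypothetical tuning vector drawn from a zero-mean Gaussian, the unrectified weight between two neurons is taken to be the dot product of their tuning vectors (a like-to-like rule), and negative weights are set to zero. The population sizes and the three-dimensional feature space echo the 4:1 ratio and M = 3 mentioned in the reviews, but the function and its parameters are not taken from the authors' code.

```python
import random


def rectified_connection_probability(n_pre, n_post, n_features, rng):
    """Fraction of pre/post pairs whose tuning similarity is positive.

    Tuning vectors are hypothetical i.i.d. standard-normal draws; a pair is
    counted as connected when its dot product survives rectification.
    """
    pre = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_pre)]
    post = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_post)]
    connected = 0
    for v_pre in pre:
        for v_post in post:
            similarity = sum(a * b for a, b in zip(v_pre, v_post))
            if similarity > 0:  # rectification keeps only positive weights
                connected += 1
    return connected / (n_pre * n_post)


p_connect = rectified_connection_probability(400, 100, 3, random.Random(0))
print(f"estimated connection probability: {p_connect:.3f}")
```

Because the assumed tuning distribution is symmetric about zero, about half of the similarities are negative and are removed, regardless of the population sizes. Measured probabilities below 0.5, such as the PV-Pyr values cited by the Reviewer, would therefore need an additional source of sparsity in a model of this kind.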
Reviewer #2 (Recommendations for the authors):
I commend the authors for an extremely thorough and detailed rebuttal, and for all of the additional work put in to address the reviewer concerns. For the most part, I am satisfied with the current state of the manuscript.
We thank the Reviewer for recognizing our effort to address the first round of Reviews to our best ability.
Here are some small points still remaining that I think the authors should address:
(1) Pg. 8, "We verified the robustness of the model to small deviations from the optimal synaptic weights" - while the authors now cite Calaim et al. 2022 in the discussion, its relevance to several of the results justify its inclusion in other places. Here is one place where the authors test something that was also studied in this previous paper.
The Reviewer is correct that Calaim et al. (eLife 2022) addressed the robustness of synaptic weights, and we now cite this study when describing our results on jittering of synaptic connections (page 8).
(2) Pg. 9, "In our optimal E-I network we indeed found that optimal coding efficiency is achieved in absence of within-neuron feedback or with weak adaptation in both cell types" Pg. 10, "the absence of within-neuron feedback or the presence of weak and short-lasting spike-triggered adaptation in both E and I neurons are optimally efficient solutions" The authors seem to state that both weak adaptation and no adaptation at all are optimal. In contrast to the rest of the results presented, this is very vague and does not give a particular level of adaptation as being optimal. The authors should make this more clear.
We agree that the text about the optimal level of adaptation was unclear. The optimal solution is no adaptation, while weak and short-lasting adaptation defines a slightly suboptimal, yet still efficient, network state, as now stated on page 10.
(3) Pg. 13, "In summary our analysis suggests that optimal coding efficiency is achieved with four times more E neurons than I neurons and with mean I-I synaptic efficacy about 3 times stronger..." --- claims such as these are still too strong, in my opinion. It is rather the case that the particular ratio of E to I neurons and connections strengths can be made consistent with an optimally efficient regime.
We agree here as well. We have revised the text (page 13) to better explain our results.
(4) Pg. 14, "firing rates in the 1CT model were highly sensitive to variations in the metabolic constant" (Fig. 8I, as compared to Fig. 6C). This difference between the 1CT and E-I networks is striking, and I would suspect it is due to some idiosyncrasies in the difference between the two models (e.g., the relative amount of delay that it takes for lateral inhibition to take effect, or the fact that E-E connections have not been removed in this model). The authors should ideally back up this result with some justified explanation.
We agree with the Reviewer that the delay for lateral inhibition in the E-I model is twice that of the 1CT model and that the E-I model gains stability from the lack of E-E connectivity. Furthermore, the tuning is stronger in I compared to E neurons in the E-I model, which contributes to making the E-I network inhibition-dominated (Fig. 1H). In contrast, the average excitation and inhibition in the 1CT model are of exactly the same magnitude. The property of being inhibition-dominated makes the E-I model more stable. We report these observations in the revised text (pages 14-15).
Reviewer #3 (Recommendations for the authors):
Overall my points were very well responded to and I removed most of my weaknesses.
I appreciate the authors implementing my suggested analysis change for Figure 8, and I find the result very clear. I would further suggest they add a bit of text for the reader as to why this is done. For a new reader without much knowledge of these networks at first it seems the inhibitory population is very good at representation in fig 8G: so why is it not further considered in fig 8H?
We thank the reviewer for providing further suggestions. We have now clarified in the text why only the excitatory population of the E-I model is considered in the comparison between the E-I model and the one-cell-type (1CT) model (page 14).
Thanks for sharing the code. From a quick browse through it looks very manageable to implement for follow up work, although some more guidance for how to navigate the quite complicated codebase and how to reproduce specific paper results would be helpful.
We have also updated the code repository, where we have included more complete instructions on how to reproduce the results of each figure. We renamed the folders with the computer code so that they point to a specific figure in the paper. The repository has been completed with the output of the numerical simulations we ran, which allows immediate replotting of all figures. We have deposited the repository at Zenodo to have the final version of the code associated with the DOI https://doi.org/10.5281/zenodo.14628524. This is mentioned in the section Code availability (page 17).
Author response:
The following is the authors’ response to the original reviews.
Reviewer 1 Public Review:
Summary
This very short paper shows a greater likelihood of C->U substitutions at sites predicted to be unpaired in the SARS-CoV-2 RNA genome, using previously published observational data on mutation frequencies in SARS-CoV-2 (Bloom and Neher, 2023).
General comments
A preference for unpaired bases as a target for APOBEC-induced mutations has been demonstrated previously in functional studies so the finding is not entirely surprising. This of course assumes that A3A or other APOBEC is actually the cause of the majority of C->U changes observed in SARS-CoV-2 sequences.
I'm not sure why the authors did not use the published mutation frequency data to investigate other potential influences on editing frequencies, such as 5' and 3' base contexts. The analysis did not contribute any insights into the potential mechanisms underlying the greater frequency of C->U (or G->U) substitutions in the SARS-CoV-2 genome.
I have added additional discussions of mechanisms focusing on the question of whether basepairing bias is primarily driven by secondary structure dependence of underlying mutation rates or by conservation of secondary structure (Discussion lines 178–192) and I added a brief analysis of the 5′ and 3′ contexts of the relationship between being basepaired in a secondary structure model and apparent mutational fitness (Figures S1 and S2, Results lines 85–97). I found that the 5′ context of unpaired, but not paired basepairs influences apparent mutational fitness (preference for 5′ U), and that the is also . Additionally, there is a 3′ preference for G, indicating some CpG suppression. This contrasts to some degree with another analysis based on counting lineage frequencies that may have lacked power to detect relatively small effects (Simmonds mBio 2024).
Reviewer 1 Author recommendations:
There are at least 5 publications describing the mapping/prediction of SARS-CoV-2 RNA secondary structure from 2022-2023 and their predictions are not entirely consistent. Why did the authors only refer to the Lan et al. paper?
I have added comparisons when the Lan et al secondary structure model is replaced by one of two others derived from SHAPE data (Results lines 110–122). Unsurprisingly, similar secondary structure models give similar results and performance is modestly higher for the models from Lan et al. This is consistent with their observations that DMS reactivities performed better as classifiers of SL5 and ORF1 secondary structure (the reason I compared to this secondary structure model and reactivity data set rather than others), but I did not go into detail on this in the revision since there are many differences in methods beyond class of reactivity probe. For example, somewhat stronger correlation for the Vero than the Huh7 dataset in Lan et al could arise from combining data from two replicates, from cell type, or from differences in data analysis methods. It’s also a small difference and cannot be confidently distinguished from noise.
I conducted a preliminary comparison of the performance of DMS and SHAPE data for predicting mutations where DMS data is available, but I opted against including this analysis in the manuscript for the same reasons. Instead, I included in results and discussion comments on how, in general, reactivity data contains information that is predictive of substitution rates that is not captured by binary secondary structure models. I also discuss how multiple data sources can potentially be integrated to more accurately predict the impact of a substitution on fitness (Discussion lines 195–201).
Specific substitutions are referred to as C->T and C29085T, for example, but as the genome of SARS-CoV-2 is RNA, T should be U.
I agree and I have changed all “T” to “U” in the paper and analysis scripts. The choice of “T” was motivated by what seemed to appear most frequently in papers on SARS-CoV-2 mutational spectra, but “U” is nearly universal in papers on secondary structure and mutation mechanisms, so I agree it makes more sense in this paper.
The C29085T substitution is somewhat non-canonical as it is a single base bulge in a longer duplex section of dsRNA, very unlike the favoured sites for mutation in the Nakata et al paper.
I have added a discussion of Nakata et al (NAR 2023) (Introduction lines 29–32). I did not go into this depth in the revision, but the analysis of ~2M patient sequences in Nakata et al also noted a high rate of UUC→UUU substitution, so the UUUC context of C29095 (shared by 3 of the 10 positions highlighted in Nakata et al that had high mutation frequencies with exogenous APOBEC3A expression) could be interesting to investigate further.
High C29095U substitution frequency is indeed somewhat at odds with the results in that work, which found UC→UU substitutions to be elevated in longer single-stranded regions than the context of C29095U in SARS-CoV-2 secondary structure models (a single unpaired base opposing three unpaired bases in an asymmetric internal loop).
I'm not sure why DMS reactivity is considered a separate variable from pairing likelihood as one informs the other.
The intent here, which was not clear, was to show that a binary basepairing model that uses DMS reactivities as constraints does not capture all of the information available. I have clarified this as described above, in discussing the information in different reactivity datasets.
The C29095U substitution is also relevant to the consideration of DMS reactivity in addition to the resulting secondary structure model. These are not considered as separate predictors, and the reason for showing both is mentioned in the paper: “DMS reactivity was more strongly correlated with estimated mutational fitness than basepairing when analysis was limited to positions with detectable DMS reactivity.” I have clarified this in the revised manuscript; it is also relevant to the discussion of a potential model integrating all available datasets.
Reviewer 2 Public Review:
Hensel investigated the implications of SARS-CoV-2 RNA secondary structure in synonymous and nonsynonymous mutation frequency. The analysis integrated estimates of mutational fitness generated by Bloom and Neher (from publicly available patient sequences) and a population-averaged model of RNA basepairing from Lan et al (from DMS mutational profiling with sequencing, DMS-MaPseq).
The results show that base-pairing limits the frequency of some synonymous substitutions (including the most common C→T), but not all: G→A and A→G substitutions seem unaffected by base-pairing.
The author then addressed nonsynonymous C→T substitutions at base-paired positions. While there is still a generally higher estimated mutational fitness at unpaired positions, they propose a coarse adjustment to disentangle base-pairing from inherent mutational fitness at a given position. This adjustment reveals that nonsynonymous substitutions at base-paired positions, which define major variants, have higher mutational fitness.
Overall, this manuscript highlights the importance of considering RNA secondary structure in viral evolution studies.
The conclusions of this work are generally well supported by the data presented. Particularly, the author acknowledges most limitations of the analyses, and addresses them. Even though no new sequencing results were generated, the author used available data generated from the analysis of roughly seven million sequenced patient samples. Finally, the author discusses ways to improve the current available models.
There are a number of limitations of this work that should be highlighted, specifically in regard to the secondary structure data used in this paper. The Lan et al. dataset was generated using a multiplicity of infection (MOI) of 0.05, 24 hours post-infection (h.p.i.). At such a low MOI and late timepoint, viral replication is not synchronous and sequencing artifacts might be generated by cell debris and viral RNA degradation, therefore impacting the population-averaged results. In addition, the nonsynonymous base-paired positions in Figure 2 have relatively high population-averaged DMS reactivity, which suggests those positions are dynamic. Therefore, the proposed adjustment could result in an incorrect estimation of their inherent mutational fitness.
I would go further than this to say that the proposed adjustment will usually result in an incorrect estimate. My intent is to propose an improved, but still likely incorrect, estimate by utilizing in vitro data to refine baseline mutation rates in order to obtain improved, but only coarsely adjusted, estimates of mutational fitness. I added a note in the discussion that in vitro reactivities (and, consequently, secondary structure models) may not reflect secondary structures in vivo (Discussion lines 204–205). I did not go into detail regarding the specific technical considerations mentioned here because they are outside the scope of my expertise.
I am not sure that top-ranked non-synonymous C→U positions have particularly high DMS values after coarse adjustment for basepairing (labeled amino acid mutations in Figure 2). Of the six common mutations used as examples, three have minimum values in the dataset considered (which is processed normalized/filtered data rather than raw data) and three do not have very high DMS reactivity.
However, there is clearly information in base reactivity that is not captured by a binary basepairing model, which is indicated by residual positive correlation between DMS reactivity and mutational fitness after adjustment. I now include a figure demonstrating this for synonymous C→U substitutions as Figure S3, and I have tried to clarify the language throughout the manuscript to make it clear that a more accurate adjustment is possible.
Additionally, like all such RNA probing experiments within cells, it remains difficult to deconvolve DMS/SHAPE low reactivity with RNA accessibility (e.g. from protein binding).
I agree, and in revising this manuscript it was interesting to see that Nakata et al (discussed above) identified relatively large single-stranded regions with enhanced UC→UU substitution frequencies with exogenous APOBEC3A expression, while C29095U, for example, is a single unpaired base with high DMS reactivity and high empirical C→U substitution frequency (discussed briefly in the introduction of the revised manuscript). Future analyses could consider heterogeneity in secondary structure as well as secondary structures with low heterogeneity where strained conformations could have higher reactivity.
This work presents clear methods and an easy-to-access bioinformatic pipeline, which can be applied to other RNA viruses. Of note, it can be readily implemented in existing datasets. Finally, this study raises novel mechanistic questions on how mutational fitness is not correlated to secondary structure in the same way for every substitution.
Overall, this work highlights the importance of studying mutational fitness beyond an immune evasion perspective. On the other hand, it also adds to the viral intrinsic constraints to immune evasion.
Reviewer 2 Author recommendations:
Even though the experiment was not performed in this manuscript, it would be helpful for the readers if it was briefly explained how secondary structure is inferred from DMS reactivity, as this technique is not broadly used.
It is not objective to refer to the Lan et al. model of RNA structure as "high quality" given the limitations of their experimental approach (low MOI, asynchronous infection, DMS-only, no long-range interactions) and the lack of external validation of the structure of the genome they propose.
I removed “high-quality” from the abstract. Since a result of the paper is that secondary structure correlates with synonymous substitution rates, this is an observation that can be used to retrospectively compare the quality of secondary structure models in this respect. I updated the manuscript to include such a comparison, and did not find a large difference between secondary structure models (Results lines 110–122). I added a discussion of how multiple data sources can potentially be integrated to more accurately predict the impact of a substitution on viral fitness.
I have also added a brief discussion of constraints on how much we can confidently infer from these experiments given limitations of the experimental approach. I note that DMS and SHAPE data provide information that can be combined to make a stronger model, and that predictions can be rapidly tested given observations by Gout (Symonds?) et al that in vitro substitution rates correlate with those observed during the pandemic (Discussion lines 195–201).
Mutational fitness from Bloom & Neher was derived throughout the pandemic, much of which came from a period with the most active surveillance (Delta / Omicron waves). Consequently, these viruses differ from the WA1 strain used by Lan et al. far more than the 3 nt differences between lineage A and B that the author refers to. The following sentence should therefore be revised to avoid misleading the reader:
"Additionally, note that DMS data was obtained in experiments using the WA1 strain in Lineage A, which differs from the more common Lineage B at 3 positions and could have different secondary structure."
Revised:
“Additionally, note that DMS data was obtained in experiments using the WA1 strain in Lineage A, which differs from the more common Lineage B at 3 positions and could have different secondary structure. Furthermore, mutational fitness is estimated from the phylogenetic tree of published sequences (the public UShER tree (Turakhia et al., 2021) additionally curated to filter likely artifacts such as branches with numerous reversions) that are typically far more divergent and subsequently will have somewhat different secondary structures. Since the dataset used for mutational fitness aggregates data across viral clades, my analysis will not capture secondary structure variation between clades or indels and masked sites that were not considered in that analysis (Bloom and Neher, 2023).”
To determine the extent to which the results depend on the single RNA structure model, it would be informative to "turn the crank again" on the analysis with one of the other RNA structure datasets for SARS-CoV-2 (though most other datasets suffer from similar problems of asynchronicity of infection).
I have added comparisons when the Lan et al secondary structure model is replaced by one of two others derived from SHAPE data as described above. Also, I conducted preliminary comparisons of underlying DMS and SHAPE reactivity data as described above, but I opted not to include these in the revised manuscript given that the methods differ beyond the chemical probe used. I also discuss how multiple data sources can potentially be integrated to more accurately predict the impact of a substitution on viral fitness.
In Figure 1 it would be helpful to add the values of the unpaired/basepaired ratios in the plot for clarity.
Furthermore, a similar analysis using the substitution frequency, which strengthens the conclusions, is mentioned in the text, however, it is not shown. It could be shown as part of Figure 1, or as a supplementary figure.
This was a good suggestion since numbers around 1 are not perceived as being very significant. I added the ratio of median unpaired:paired rates to Figure 1, updated the corresponding manuscript text and the figure caption, and note that the numbers are somewhat changed from the first version of my manuscript because of updating to use the most up-to-date mutational fitness estimates.
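As a minimal sketch of the quantity added to Figure 1, the ratio of median unpaired to median paired rates can be computed as follows. The function name `median_ratio` and the toy values are hypothetical and are not taken from the manuscript or its analysis scripts.

```python
from statistics import median


def median_ratio(rates, paired):
    """Return median(rate at unpaired sites) / median(rate at paired sites).

    rates  : per-site substitution rates (hypothetical input format)
    paired : booleans, True where the site is basepaired in the structure model
    """
    unpaired_rates = [r for r, p in zip(rates, paired) if not p]
    paired_rates = [r for r, p in zip(rates, paired) if p]
    return median(unpaired_rates) / median(paired_rates)


# Toy data for illustration only (not values from the paper).
toy_rates = [4.0, 6.0, 8.0, 1.0, 2.0, 3.0]
toy_paired = [False, False, False, True, True, True]
toy_ratio = median_ratio(toy_rates, toy_paired)
print(toy_ratio)  # 3.0
```

Medians are used here, rather than means, because the text refers to median rates; a ratio well above 1 would indicate higher rates at unpaired sites.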
It is not clear how the two constants were calculated to obtain the "adjusted mutational fitness". It could be shown as part of Figure 2, or as a supplementary figure.
I added dashed lines and arrows to Figure 2 showing median paired/unpaired mutational fitnesses and the adjustment made to normalize to the overall median. I also added Figure S3 showing this for synonymous substitutions, where it is more clear given the lower fraction of mutations with substantial fitness impacts.
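One plausible reading of the adjustment described above is sketched below: each group (paired, unpaired) is shifted by a constant so that its median coincides with the overall median. This is an assumption about how the two constants are obtained, based only on the description in this response; the function name `adjust_fitness` and the toy values are hypothetical.

```python
from statistics import median


def adjust_fitness(fitness, paired):
    """Shift paired and unpaired sites so both group medians match the overall median.

    Assumed form of the coarse adjustment: one additive constant per group,
    equal to (overall median - group median).
    """
    overall = median(fitness)
    shift_paired = overall - median([f for f, p in zip(fitness, paired) if p])
    shift_unpaired = overall - median([f for f, p in zip(fitness, paired) if not p])
    return [f + (shift_paired if p else shift_unpaired)
            for f, p in zip(fitness, paired)]


# Toy data for illustration only (not values from the paper).
toy_fitness = [-2.0, -1.0, 0.0, 0.0, 1.0, 2.0]
toy_paired = [True, True, True, False, False, False]
toy_adjusted = adjust_fitness(toy_fitness, toy_paired)
print(toy_adjusted)  # [-1.0, 0.0, 1.0, -1.0, 0.0, 1.0]
```

After this shift, differences within each group are preserved, while the systematic offset between paired and unpaired sites is removed.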
Minor comments
Statements like "the current fast-growing lineage JN.1.7" never age well... please revise to state the period of time to which this refers.
Revised:
“…lineage JN.1.7, which had over 20% global prevalence in Spring 2024…”
Also, I checked the list of mutations and the examples given remain in the top 15 ranked basepaired, non-synonymous C→U mutations (BA.2-defining C26060U is added to the list, but I did not update to include this). It replaces C9246U, which was not mentioned in the first version of the manuscript.
Similarly, please provide context for the reader in the phrase: "This was one mutation that characterized the B.1.177 lineage" (e.g. add its early reference as "EU1" and that it predominated in Europe in autumn 2020, prior to the emergence of the Alpha variant).
Revised to add detail:
This was one of the mutations that characterized the B.1.177 lineage. This lineage, also known as EU1, characterized a majority of sequences in Spain in summer 2020 and eventually in several other countries in Europe prior to the emergence of the Alpha variant. However, it was unclear whether this lineage had higher fitness than other lineages or if A222V specifically conferred a fitness advantage.
"massive sequencing of SARS-CoV-2" - the meaning of the word "massive" is unclear. Revise.
Revised:
“…millions of patient SARS-CoV-2 sequences published during the pandemic…”
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
We were pleased that many of the critical comments of the reviewers have allowed us to improve our manuscript. In addition to revising the originally submitted figures, we performed new experiments (e.g. new Fig.2, Fig.3, Fig.4, and Fig.6) and revised the manuscript substantially following the reviewers’ comments and suggestions on our initial submission. A point-by-point response to the reviewers’ critiques is provided below, and new supportive data are provided in this revised manuscript. Following the Reviewers’ comments and our revisions, we changed the title to “Cold induces brain region-selective cell activity-dependent lipid metabolism”.
Reviewer #1:
Strengths:
A strength of the study is trying to better understand how metabolism in the brain is a dynamic process, much like how it has been viewed in other organs. The authors also use a creative approach to measuring in vivo lipid peroxidation via delivery of a BD-C11 sensor through a cannula to the region in conjunction with fiber photometry to measure fluorescence changes deep in the brain.
We thank the Reviewer so much for the positive comments on this interesting study on metabolism in the brain.
Weaknesses:
One weakness was many of the experiments were done in a manner that could not distinguish between the contributions of neurons and glial cells, limiting the extent of conclusions that could be made. While this is not easily doable for all experiments, it can be done for some. For example, the Fos experiments in Figure 3 would be more conclusive if done with the labeling of neuronal nuclei with NeuN, as glial cells can also express Fos. To similarly show more conclusively that neurons are being activated during cold exposure, the calcium imaging experiments in Figure S3 can be done with cold exposure.
We agree with the Reviewer’s comments. We revised the original Figure 3 (new Figure 6) and Figure S3 (new Figure S4). Our data show that cold increased Fos-positive cells in the PVH (Figure 6) and increased neuronal Ca2+ signals (new Figure S4). Because it is difficult to exclude the involvement of astrocytes in cold-induced lipid metabolism, and to address this reviewer’s questions, we revised the title and the text, replacing “neuronal” activity with “cell” activity, and we concluded that cold-induced lipid metabolism depends on “cell activity” instead of “neuronal activity”. Studying cell type-specific contributions to the cold-induced effects on lipid metabolism, to which we assume both neurons and glial cells contribute, will require considerable effort beyond the scope of this study.
Additionally, many experiments are only done with the minimal three animals required for statistics and could be more robust with additional animals included.
We thank this reviewer for the comments. We added the sample sizes accordingly in this revised manuscript.
Another weakness is that the authors do not address whether manipulating lipid droplet accumulation or lipid peroxidation has any effect on PVH function (e.g. does it change neuronal activity in the region?).
We thank this reviewer for bringing up this interesting point. The focus of this study was to examine how cold modulates lipid metabolism in the brain. Studying how brain lipid metabolism (e.g. manipulating LD accumulation or lipid peroxidation) modulates neuronal activity is another interesting project, which, however, will require considerable effort beyond the scope of this study. Manipulating LD or peroxidation would affect multiple cellular signaling pathways, and physiological experimental conditions need to be developed. However, to address this reviewer’s questions, we performed preliminary studies in which we treated brain slices with the lipid peroxidation inhibitor a-TP and recorded PVH neurons, but did not observe differences in firing rates between a-TP-treated brain slices and controls (data not shown).
Reviewer #2:
Strengths:
A set of relatively novel and interesting observations. Creative use of several in vivo sensors and techniques.
We thank the Reviewer so much for the positive comments on our studies in both concept and techniques.
Weaknesses:
(1) The physiological relevance of lipolysis and thermogenesis genes in the PVH. The authors need to provide quantitative and substantial characterizations of lipid metabolism in the brain beyond a panel of qPCRs, especially considering these genes are likely expressed at very low levels. mRNA and protein level quantification of genes in Fig 1, in direct comparison to BAT/iWAT, should be provided. Besides bulk mRNA/protein, IHC/ISH-based characterization should be added to confirm the cellular expression of these genes.
We agreed with the Reviewer’s comments and thank this reviewer for the constructive suggestions. To address this reviewer’s comments and suggestions, we performed additional experiments to verify cold-induced expressions of lipid lipolytic genes and proteins. For example, we stained ATGL and HSL in both neurons and astrocytes in the PVH. Matching with the increased gene expressions, cold increased protein expressions of ATGL (new Figure 2) and HSL (new Figure 3) in both neurons and astrocytes. We also performed western blots of p-HSL and HSL and observed that cold increased the expression level of p-HSL (new Figure 4). These new results support our conclusions and further demonstrate that cold increases lipid metabolism in the PVH.
(2) The fiber photometry work they cited (Chen 2022, Andersen 2023, Sun 2018) used well-established, genetically encoded neuropeptide sensors (e.g., GRABs). The authors need to first quantitatively demonstrate that adapting BD-C11 and EnzCheck for in vivo brain FP could effectively and accurately report peroxidation and lipolysis. For example, the sensitivity, dynamic range, and off-time should all be calibrated with mass spectrometry measurements before any conclusions can be made based on plots in Figures 4, 5, and 6. This is particularly important because the main hypothesis heavily relies on this unvalidated technique.
We thank this reviewer for the comments. Fiber photometry has been well demonstrated to detect fluorescent-labelled biomolecules in my laboratory and other labs, as indicated in the publications cited above. In this study, we combined photometry with commercially developed and validated fluorescent-labelled lipid metabolic biomarkers to monitor lipid metabolic dynamics in vivo. We verified this approach in both brain (this study) and peripheral adipose tissues (another project). In particular, our data in this study show that the lipid peroxidation inhibitor a-TP blocked the cold-induced lipid peroxidation signals (Fig. 7A-C) and the pan-lipase inhibitor DEUP blocked the cold-induced lipolytic signals (Fig. 8A-C). These results demonstrate that the signals detected by photometry indeed reflect lipid peroxidation and lipolysis, respectively, in the brain. We agree with the reviewer’s suggestion of mass spectrometry measurements, but it is not feasible for us to perform mass spectrometry in the brain in vivo at this moment.
(3) Generally, the histology data need significant improvement. It was not convincing, for example, in Figure 3, how the Fos+ neurons can be quantified based on the poor IF images where most red signals were not in the neurons.
We thank this reviewer for this comment. We performed additional experiments to add sample size and presented high quality images.
(4) The hypothesis regarding the direct role of brain temperature in cold-induced lipid metabolism is puzzling. From the introduction and discussion, the authors seem to suggest that there are direct brain temperature changes in responses to cold, which could be quite striking. However, this was not supported by any data or experiments. The authors should consolidate their ideas and update a coherent hypothesis based on the actual data presented in the manuscript.
We thank this reviewer for bringing up this comment and constructive suggestions. To make this study more concise on the cold-induced lipid metabolism, we removed the statements related to the brain temperature.
Reviewer #1 (Recommendations For The Authors):
An additional minor weakness is that the authors are redundant in their discussion, sometimes repeating sections from the introduction (e.g. this line in the discussion "Evidence shows that the brain's energy expenditure efficiency largely depends on the temperature (Yu et al., 2012), and temperature gradients between different brain regions exist (Anderson and Moser, 1995; Delgado and Hanai, 1966; Hayward and Baker, 1968; McElligott and Melzak, 1967; Moser and Mathiesen, 1996; Thornton, 2003)").
We thank the Reviewer for these comments. We revised the text following the suggestions accordingly and removed the statements and references related to brain temperatures.
-
-
www.medrxiv.org www.medrxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Summary:
IPF is a disease lacking regressive therapies which has a poor prognosis, and so new therapies are needed. This ambitious phase 1 study builds on the authors' 2024 experience in Sci Tran Med with positive results with autologous transplantation of P63 progenitor cells in patients with COPD. The current study suggests that P63+ progenitor cell therapy is safe in patients with ILD. The authors attribute this to the acquisition of cells from a healthy upper lobe site, removed from the lung fibrosis. There are currently no cell-based therapies for ILD and in this regard the study is novel with important potential for clinical impact if validated in Phase 2 and 3 clinical trials.
Strengths:
This study addresses the need for an effective therapy for interstitial lung disease. It offers good evidence that the cells used for therapy are safe. In so doing it addresses a concern that some P63+ progenitor cells may be proinflammatory and harmful, as has been raised in the literature (articles which suggested some P63+ cells can promote honeycombing fibrosis; references 26 & 35). The authors attribute the safety they observed (without proof) to the high HOPX expression of administered cells (a marker found in normal Type 1 AECs). The totality of the RNASeq suggests the cloned cells are not fibrogenic. They also offer exploratory data suggesting a relationship between clone roundness and PFT parameters (and a negative association between patient age and clone roundness).
We thank the reviewer for the important comments.
Weaknesses:
The authors can conclude they can isolate, clone, expand, and administer P63+ progenitor cells safely; but with the small sample size and lack of a placebo group, no efficacy should be implied.
We thank the reviewer for the suggestion and agree that we should be more cautious in discussing efficacy in the current study.
Specific points:
(1) The authors acknowledge most study weaknesses including the lack of a placebo group and the concurrent COVID-19 in half the subjects (the high-dose subjects). They indicate a phase 2 trial is underway to address these issues.
N/A
(2) The authors suggest an efficacy signal on pages 18 (improvement in 2 subjects' CT scans) and 21 (improvement in DLCO) but with such a small phase 1 study and such small increases in DLCO (+5.4%) the authors should refrain from this temptation (understandable as it is).
We believe that exploring potential efficacy signals is also one aim of this study. All of these efficacy endpoint analyses were planned prior to the start of the clinical trial (as registered at ClinicalTrials.gov), and the data need to be analyzed regardless.
(3) Likewise most CT scans were unchanged and those that improved were in the mid-dose group (albeit DLCO improved in the 2 patients whose CT scans improved).
Yes, it is.
(4) The authors note an impressive 58m increase in 6MWTD in the high-dose group but again there is no placebo group, and the low-dose group has no net change in 6MWTD at 24 weeks.
Yes.
(5) I also raise the question of the enrollment criteria in which 5 patients had essentially normal DLCO/VA values. In addition there is no discussion as to whether the transplanted stem cells are retained or exert benefit by a paracrine mechanism (which is the norm for cell-based therapies).
Thank you for your detailed feedback. The enrollment criteria are based on DLCO instead of DLCO/VA. And we would like to further discuss the possible benefit by paracrine mechanism in the revised manuscript.
Recommendations for the authors:
(1) Four of the enrolled subjects had normal DLCO/VA (% of predicted) (>90% of predicted). This raises questions about the severity of their illness see: Table 1: Subjects 103, 105, 112, and 204 have DLCO/VA % predicted >90% of predicted and would appear not to qualify for the study. While technically enrollment criteria for DLCO are satisfied, DLCO/VA is an equally valid measure of ILD severity, and these 4 cases seem very mild.
Thank you for your detailed feedback. Yes, the current inclusion criteria are based on DLCO but not DLCO/VA. We believe that improvements in both DLCO and DLCO/VA are meaningful. In future trials, we will consider DLCO/VA as an inclusion criterion as well.
(2) The authors state "Resolution of honeycomb lesion was also observed in patients of higher dose groups". This appears inaccurate as only 2 subjects in the study showed CT improvement and they were not in the highest dose group. This statement is an overreach for a Phase 1 study and should be removed from the abstract and more balance inserted in the text. The phase 2 study they are doing will answer these questions.
Thank you. We changed our statement about efficacy in the abstract part.
a) Under exclusion criteria: More detail is required as to what defines "subjects who cannot tolerate cell therapy".
Patients who could not tolerate previous cell therapy, for example mesenchymal stem cell transplantation, would not be included in the current trial.
b) Figure S6 is important and should be in the main manuscript. This Figure shows that many (6) subjects had COVID at some trial measurement time points. This is an unfortunate confounder for efficacy signals (but efficacy is not the point of this study). Second, Figure 6 (in my view) shows little efficacy signal, which is a reminder to the authors that efficacy should not be implied in a study that was not powered to detect efficacy.
We agreed that the efficacy should be discussed very carefully.
(3) Figure S3: It appears at some doses there is a significant rise in monocytes (1M cells) and neutrophils (3 M cells).
Thank you for your reasonable concerns regarding the safety of the treatment. The monocyte counts in the S3 patients, even after an increase, remain within the reference range, and we therefore consider this elevation to be clinically insignificant. One patient exhibited a significant increase in neutrophils at 24 weeks, which was attributed to a grade II adverse event, acute bronchitis, that was unrelated to cell therapy. The symptoms resolved within three days following treatment with appropriate medication.
(4) Figure 3: I wonder about the statistical significance of the 6MWD. Was this done by repeat measure ANOVA? The analysis suggests a p=0.08 but all error bars between low and high dose overlap and the biggest difference is at 24 weeks, and that appears to be labelled as not significant.
Thank you for the reminder. The 6MWD p-value of 0.008 comes from comparing 6MWD at the 24-week time point with baseline within the higher-dose group; a paired t-test was therefore used for this analysis. In the revised version, we label these results more clearly.
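For readers unfamiliar with the test named above, the following is a minimal sketch of how a paired t statistic is computed from per-subject baseline and follow-up values. The function name `paired_t_statistic` and the numbers are hypothetical and are not trial data; obtaining a p-value would additionally require the t distribution with n − 1 degrees of freedom, which is omitted here.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(baseline, followup):
    """Return (t, degrees of freedom) for a paired t-test.

    t = mean(d) / (sd(d) / sqrt(n)), where d are per-subject differences
    (followup - baseline) and sd is the sample standard deviation.
    """
    diffs = [after - before for before, after in zip(baseline, followup)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical walk distances in metres (not data from the study).
toy_baseline = [400.0, 420.0, 380.0, 450.0]
toy_week24 = [450.0, 480.0, 440.0, 520.0]
toy_t, toy_df = paired_t_statistic(toy_baseline, toy_week24)
print(round(toy_t, 3), toy_df)
```

Because each subject serves as their own control, this test addresses change within a group over time and does not compare the low-dose and high-dose groups with each other.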
Reviewer #2 (Public review):
Summary:
This manuscript describes a first-in-human clinical trial of autologous stem cells to address IPF. The significance of this study is underscored by the limited efficacy of standard-of-care anti-fibrotic therapies and increasing knowledge of the role of p63+ stem cells in lung regeneration in ARDS. While models of acute lung injury and p63+ stem cells have benefited from widespread and dynamic DAD and immune cell remodeling of damaged tissue, a key question in chronic lung disease is whether such cells could contribute to the remodeling of lung tissue that may be devoid of acute and dynamic injury. A second question is whether normal regions of the lung in an otherwise diseased organ can be identified as a source of "normal" p63+ stem cells, and how to assess these stem cells given recently identified p63+ stem cell variants emerging in chronic lung diseases including IPF. Lastly, questions of feasibility, safety, and efficacy need to be explored to set the foundation for autologous transplants to meet the huge need in chronic lung disease. The authors have addressed each of these questions to different extents in this initial study, which has yielded important if incomplete information for many of them.
Strengths:
As with a previous study from this group regarding autologous stem cell transplants for COPD (Ref. 24), they have shown that the stem cells they propagate do not form colonies in soft agar or cancers in these patients. While a full assessment of adverse events was confounded by a wave of Covid19 infections in the study participants, aside from brief fevers it appears these transplants are tolerated by these patients.
We thank the reviewer for the important comments.
Weaknesses:
The source of stem cells for these autologous transplants is generally bronchoscopic biopsies/brushings from 5th-generation bronchi. Although stem cells have been cloned and characterized from nasal, tracheal, and distal airway biopsies, the systematic cloning and analysis of p63+ stem cells across the bronchial generations is less clear. For instance, p63+ stem cells from the nasal and tracheal mucosa appear committed to upper airway epithelia marked by 90% ciliated cells and 10% goblet cells (Kumar et al., 2011. Ref. 14). In contrast, p63+ stem cells from distal lung differentiate to epithelia replete with Club, AT2, and AT1 markers. The spectrum of p63+ stem cells in the normal bronchi of any generation is less studied. In the present study, cells are obtained by bronchoscopy from 3-5 generation bronchi and expanded by in vitro propagation. Single-cell RNA-seq identifies three clusters they refer to as C1, C2, and C3, with the major C1 cluster said to have characteristics of airway basal cells and C2 possibly the same cells in states of proliferation. Perhaps the most immediate question raised by these data is the nature of the C1/C2 cells. Whereas they are clearly p63/Krt5+ cells as are other stem cells of the airways, do they display differentiation character of "upper airway" marked by ciliated/goblet cell differentiation or those of the lung marked by AT2 and AT1 fates? This could be readily determined by 3-D differentiation in so-called air-liquid interface cultures pioneered by cystic fibrosis investigators and should be done as it would directly address the validity of the sourcing protocol for autologous cells for these transplants. This would more clearly link the present study with a previous study from the same investigators (Shi et al., 2019, Ref. 9) whereby distal airway stem cells mitigated fibrosis in the murine bleomycin model.
The authors should also provide methods by which the autologous cells are propagated in vitro as these could impact the quality and fate of the progenitor cells prior to transplantation.
We fully agree that the sub-populations of the progenitor cells should be further analyzed, and we will attempt this in the revised manuscript. The methods used to expand P63+ lung progenitor cells have been described in full detail by the Frank McKeon/Wa Xian group (Rao et al., STAR Protocols, 2020) and have been adapted to the pharmaceutical-grade technology patented by Regend Therapeutics, Ltd.
The authors should also make a more concerted effort to compare Clusters 1, 2, and 3 with the variant stem cell identified in IPF (Wang et al., 2023, Ref. 27). While some of the markers are consistent with this variant stem cell population, others are not. A more detailed informatics analysis of normal stem cells of the airways and any variants reported could clarify whether the bronchial source of autologous stem cells is the best route to these transplants.
We thank the reviewer for the good suggestion and will make a more detailed comparison in the revised manuscript.
Other than these issues the authors should be commended for these first-in-human trials for this important condition.
Thank you so much for the kind compliment.
Recommendations for the authors:
Described in the review text but the authors need to be clear about how they propagated autologous stem cells in vitro.
(1) Perhaps the most immediate question raised by these data is the nature of the C1/C2 cells. Whereas they are clearly p63/Krt5+ cells as are other stem cells of the airways, do they display differentiation character of "upper airway" marked by ciliated/goblet cell differentiation or those of the lung marked by AT2 and AT1 fates?
The differentiation potential of P63+/KRT5+ basal progenitor cells has been analyzed in multiple previous studies, which are cited in the revised Introduction. In brief, human P63+ progenitor cells can differentiate into airway epithelial cells in the airway region, while giving rise to immature but functional AT1 cells in the alveolar region.
(2) The authors should also provide methods by which the autologous cells are propagated in vitro as these could impact the quality and fate of the progenitor cells prior to transplantation.
The methods used to expand P63+ lung progenitor cells have been described in full detail by the Frank McKeon/Wa Xian group (Rao et al., STAR Protocols, 2020) and have been adapted to the pharmaceutical-grade technology patented by Regend Therapeutics, Ltd.
(3) A more detailed informatics analysis of normal stem cells of the airways and any variants reported could clarify whether the bronchial source of autologous stem cells is the best route to these transplants.
We thank the reviewer for the kind suggestion and have included the comparative analysis in revised Figure S2.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
In this manuscript, authors have investigated the effects of JNK inhibition on sucrose-induced metabolic dysfunction in rats. They used multi-tissue network analysis to study the effects of the JNK inhibitor JNK-IN-5A on metabolic dysfunction associated with excessive sucrose consumption. Their results show that JNK inhibition reduces triglyceride accumulation and inflammation in the liver and adipose tissues while promoting metabolic adaptations in skeletal muscle. The study provides new insights into how JNK inhibition can potentially treat metabolic dysfunction-associated fatty liver disease (MAFLD) by modulating inter-tissue communication and metabolic processes.
Strengths:
The study has several notable strengths:
Comprehensive Multi-Tissue Analysis: The research provides a thorough multi-tissue evaluation, examining the effects of JNK inhibition across key metabolically active tissues, including the liver, visceral white adipose tissue, skeletal muscle, and brain. This comprehensive approach offers valuable insights into the systemic effects of JNK inhibition and its potential in treating MAFLD.
Robust Use of Systems Biology: The study employs advanced systems biology techniques, including transcriptomic analysis and genome-scale metabolic modeling, to uncover the molecular mechanisms underlying JNK inhibition. This integrative approach strengthens the evidence supporting the role of JNK inhibitors in modulating metabolic pathways linked to MAFLD.
Potential Therapeutic Insights: By demonstrating the effects of JNK inhibition on both hepatic and extrahepatic tissues, the study offers promising therapeutic insights into how JNK inhibitors could be used to mitigate metabolic dysfunction associated with excessive sucrose consumption.
Behavioral and Metabolic Correlation: The inclusion of behavioral tests alongside metabolic assessments provides a more holistic view of the treatment's effects, allowing for a better understanding of the broader physiological implications of JNK inhibition.
Weaknesses:
While the study provides a comprehensive evaluation of JNK inhibitors in mitigating MAFLD conditions, addressing the following points will enhance the manuscript's quality:
The authors should explicitly mention and provide a detailed list of metabolites affected by sucrose and JNK inhibition treatment that have been previously associated with MAFLD conditions. This will better contextualize the findings within the broader field of metabolic disease research.
We fully agree with this constructive suggestion, which improves our understanding of the metabolic effect of JNK inhibition under sucrose overconsumption. While technical limitations made it challenging to directly analyze metabolites in the current study, we employed genome-scale metabolic modeling, a robust approach for studying metabolism, to predict the metabolic pathways potentially impacted by the interventions (Fig. 7 and Data S8). Additionally, as part of this revision, we conducted an extensive literature review to identify metabolites previously reported to be affected by sucrose consumption in MAFLD rodent models and MASLD patients. A detailed summary of these metabolites is now presented in the attached Table 1, and several of these metabolites have been incorporated into the revised Results section (Lines 308-314) to support some of the predicted metabolic activities.
“Some of the predicted metabolic changes align with previous findings in rodents subjected to sucrose overconsumption. For example, Öztürk et al. reported altered tryptophan metabolism, including decreased serum levels of kynurenic acid and kynurenine, in rats consuming 10% sucrose in drinking water. Similarly, increased triglyceride-bound oleate, palmitate, and stearate were observed in the livers of rats fed a 10% sucrose solution, indicating JNK-IN-5A treatment may regulate lipid metabolism by modulating these metabolic activities.”
It is important to note, however, that data on metabolites specifically affected by JNK inhibition in MASLD contexts remains lacking in the literature. The predicted metabolites and associated metabolic pathways in the current study could provide a starting point for such exploration in future studies. We have emphasized this in the revised manuscript and highlighted the need for further studies to explore these mechanisms in greater detail.
Author response table 1.
Metabolites associated with sucrose overconsumption in MASLD.
The limitations of the study should be clearly stated, particularly the lack of evidence on the effects of chronic JNK inhibitor treatment and potential off-target effects. Addressing these concerns will offer a more balanced perspective on the therapeutic potential of JNK inhibition.
Thank you for this constructive comment. We have acknowledged the limitations of the current study in the Discussion section (Lines 397-406) of the revised manuscript:
“Nevertheless, several limitations warrant consideration. First, while we observed transcriptional adaptations in skeletal muscle tissue following treatment, the exact molecular mechanisms underlying these changes and their roles in skeletal muscle function and systemic metabolic homeostasis remain unclear. Further investigation is warranted to elucidate the muscle-specific effects of JNK inhibition. Second, our study did not investigate the dose-dependent or potential off-target effects of JNK-IN-5A, particularly its activity on other members of the kinase family and associated signaling pathways. Lastly, the long-term effects of JNK-IN-5A administration remain unexplored. Understanding its prolonged impact across different stages of MAFLD, including advanced MASH, is crucial for assessing the full therapeutic potential of JNK inhibition in the treatment of MAFLD.”
The potential risks of using JNK inhibitors in non-MAFLD conditions should be highlighted, with a clear distinction made between the preventive and curative effects of these therapies in mitigating MAFLD conditions. This will ensure the therapeutic implications are properly framed.
Thank you for this insightful suggestion. The potential risks of using JNK inhibitors in non-MAFLD conditions have been considered and are now highlighted in Lines 369-390 of the revised Discussion:
“Although overactivated JNK activity presents an attractive opportunity to combat MAFLD, inhibition of JNK presents substantial challenges and potential risks due to its broad and multifaceted roles in many cellular processes. One key challenge is the dual role of JNK signaling (Lamb et al., 2003). For instance, long-term JNK inhibition may disrupt liver regeneration, as JNK plays a critical role in liver repair by regulating hepatocyte proliferation and survival following injury or stress (Papa and Bubici, 2018). In HCC, it has been reported that JNK acts as both a tumor promoter, driving inflammation, fibrosis, and metabolic dysregulation, and a tumor suppressor, facilitating apoptosis and cell cycle arrest in damaged hepatocytes. Its inhibition, therefore, carries the risk of inadvertently promoting tumor progression under certain conditions (Seki et al., 2012). Furthermore, the differential roles of JNK isoforms (JNK1, JNK2, JNK3) and a lack of specificity of JNK inhibitors present another layer of complexity. Given these challenges, while our study demonstrated the potential of JNK-IN-5A in mitigating early metabolic dysfunction in the liver and adipose tissues, JNK targeting strategies should be carefully tailored to the disease stage under investigation. For curative approaches targeting advanced MAFLD, such as MASH, future studies are warranted to address considerations related to dosing, tissue specificity, and the long-term effects.”
The statistical analysis section could be strengthened by providing a justification for the chosen statistical tests and discussing the study's power. Additionally, a more detailed breakdown of the behavioral test results and their implications would be beneficial for the overall conclusions of the study.
We would like to thank you for this constructive suggestion. In this study, differences among more than two groups were tested using ANOVA or the Kruskal-Wallis test, depending on the outcome of normality testing (Shapiro–Wilk test) on the data (continuous variables from different measurements). Pairwise comparisons were performed using Tukey’s post hoc test following ANOVA or Dunn’s multiple comparisons post hoc test following the Kruskal-Wallis test, as appropriate.
The study used 11 animals per group, a group size widely used in preclinical animal research [13]. To evaluate the power of this study design to detect group differences, we conducted a power analysis using G*Power 3.1 software [14], with ANOVA used as an example. The power analysis revealed the following:
- For a small effect size (partial eta.sq = 0.01), the power was 7.5% at 𝑝<0.05.
- For a medium effect size (partial eta.sq = 0.06), the power was 23.7% at 𝑝<0.05.
- For a large effect size (partial eta.sq = 0.14), the power was 55.4% at 𝑝<0.05.
Bonapersona et al. reported that the median statistical power in animal studies is often between 15–22% [15]; the achieved power of the current study design is therefore within the range observed in most exploratory animal research. However, we acknowledge that the power for detecting smaller effects is limited, which is also a common challenge in animal research due to ethical considerations on increasing sample sizes.
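The power figures above can be approximated outside G*Power. The following is a minimal sketch, not the authors' calculation: it assumes a one-way fixed-effects ANOVA with four groups of 11 animals (the group count is an assumption inferred from the control, sucrose, and two JNK-IN-5A dose arms mentioned in this response), converts partial eta squared to Cohen's f², and uses the noncentrality convention λ = f²·N. The function names are hypothetical, and whether the output matches the reported percentages depends on those assumptions.

```python
import math


def _lentz_step(term, c, d, tiny=1e-300):
    """One update of the modified Lentz recurrence for a continued fraction."""
    d = 1.0 + term * d
    d = tiny if abs(d) < tiny else d
    c = 1.0 + term / c
    c = tiny if abs(c) < tiny else c
    return c, 1.0 / d


def _beta_cf(a, b, x, max_iter=500, eps=1e-14):
    """Continued fraction used by the regularized incomplete beta function."""
    c, d = 1.0, 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > 1e-300 else 1e-300)
    h = d
    for m in range(1, max_iter + 1):
        even = m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m))
        c, d = _lentz_step(even, c, d)
        h *= d * c
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0))
        c, d = _lentz_step(odd, c, d)
        h *= d * c
        if abs(d * c - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def anova_power(partial_eta_sq, n_groups, n_per_group, alpha=0.05):
    """Approximate power of a balanced one-way ANOVA (hypothetical helper)."""
    n_total = n_groups * n_per_group
    df1, df2 = n_groups - 1, n_total - n_groups
    f_sq = partial_eta_sq / (1.0 - partial_eta_sq)  # Cohen's f squared
    lam = f_sq * n_total                            # noncentrality parameter
    # Critical value on the beta scale x = df1*F / (df1*F + df2), by bisection.
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reg_inc_beta(df1 / 2.0, df2 / 2.0, mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    x_crit = 0.5 * (lo + hi)
    # Noncentral F CDF as a Poisson-weighted mixture of central beta CDFs.
    weight, cdf = math.exp(-lam / 2.0), 0.0
    for j in range(500):
        cdf += weight * reg_inc_beta(df1 / 2.0 + j, df2 / 2.0, x_crit)
        weight *= (lam / 2.0) / (j + 1)
        if weight < 1e-16 and j > lam / 2.0:
            break
    return 1.0 - cdf


# Assumed design: 4 groups x 11 animals, alpha = 0.05.
for eta_sq in (0.01, 0.06, 0.14):
    print(eta_sq, round(anova_power(eta_sq, n_groups=4, n_per_group=11), 3))
```

With an effect size of zero the function returns the significance level itself, which is a quick sanity check on the implementation.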
As suggested, we’ve revised the ‘Statistical Analysis’ and ‘Result’ sections to improve clarity:
“Statistical Analysis:
Data were shown as mean ± standard deviation (SD), unless stated otherwise. The assumption of normality for continuous variables from behavior tests, biometric measurements, and plasma biochemistry was assessed using the Shapiro–Wilk test. Differences among multiple groups were tested by ANOVA or, for data that were not normally distributed, the non-parametric Kruskal-Wallis test. Pairwise comparisons were performed using Tukey’s post hoc test following the ANOVA or Dunn’s multiple comparisons post hoc test following the Kruskal-Wallis test, as appropriate. The Jaccard index was used to evaluate the similarity and diversity of two gene sets, and a hypergeometric test was used to test the significance of their overlap. All results were considered statistically significant at p < 0.05, unless stated otherwise.”
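The gene-set comparison named in the quoted Statistical Analysis text can be illustrated as follows. This is a minimal sketch rather than the authors' pipeline: the gene symbols, the background size of 10 genes, and the function names are hypothetical, and the one-sided upper-tail form of the hypergeometric test is an assumption about how the overlap was tested.

```python
from math import comb


def jaccard_index(set_a, set_b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(set_a), set(set_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_pvalue(universe_size, size_a, size_b, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    X is the number of set-A genes found when size_b genes are drawn
    without replacement from a background of universe_size genes.
    """
    upper = min(size_a, size_b)
    total = comb(universe_size, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe_size - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical gene sets drawn from a hypothetical background of 10 genes.
genes_a = {"GeneA", "GeneB", "GeneC", "GeneD"}
genes_b = {"GeneC", "GeneD", "GeneE"}
shared = len(genes_a & genes_b)

print(jaccard_index(genes_a, genes_b))
print(overlap_pvalue(10, len(genes_a), len(genes_b), shared))
```

The choice of background (all expressed genes, all annotated genes, and so on) strongly affects the hypergeometric p-value, so it should be stated alongside any reported overlap.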
Behavior tests (Lines 150-157):
“We found no significant differences among groups in retention latencies, a measure of learning and memory abilities in the passive avoidance test (Data S3). Additionally, the locomotor activity test was used to analyze behaviors such as locomotion, anxiety, and depression in rats. No significant differences were observed among groups in stereotypical movements, ambulatory activity, rearing, resting percentage, and distance travelled (Data S4). Similarly, the elevated plus maze test (Walf and Frye, 2007), an assay for assessing anxiety-like behavior in rodents, showed that rats in all groups had comparable open-arm entries and durations (Data S5). Collectively, the behavior tests indicate that the JNK-IN-5A-treated rats exhibit no evidence of anxiety or behavior disorders.”
Reviewer #2 (Public review):
Summary:
Excessive sucrose is a possible initial factor for the development of metabolic dysfunction-associated fatty liver disease (MAFLD). To investigate the possibility that intervention with JNK inhibitor could lead to the treatment of metabolic dysfunction caused by excessive sucrose intake, the authors performed multi-organ transcriptomics analysis (liver, visceral fat (vWAT), skeletal muscle, and brain) in a rat model of MAFLD induced by sucrose overtake (+ a selective JNK2 and JNK3 inhibitor (JNK-IN-5A) treatment). Their data suggested that changes in gene expression in the vWAT as well as in the liver contribute to the pathogenesis of their MAFLD model and revealed that the JNK inhibitor has a cross-organ therapeutic effect on it.
Strengths:
(1) It has been previously reported that inhibition of JNK signaling can contribute to the prevention of hepatic steatosis (HS) and related metabolic syndrome in other models, but the role of JNK signaling in the metabolic disruption caused by excessive intake of sucrose, a possible initial factor for the development of MAFLD, has not been well understood, and the authors have addressed this point.
(2) This study is also important because pharmacological therapy for MAFLD has not yet been established.
(3) By obtaining transcriptomic data in multiple organs and comprehensively analyzing the data using gene co-expression network (GCN) analysis and genome-scale metabolic models (GEM), the authors showed the multi-organ interaction not only in the pathology of MAFLD caused by excessive sucrose intake but also in the treatment effects by JNK-IN-5A.
(4) Since JNK signaling has diverse physiological functions in many organs, the authors effectively assessed possible side effects with a view to the clinical application of JNK-IN-5A.
Weaknesses:
(1) The metabolic process activities were evaluated using RNA-seq results in Figure 7, but direct data such as metabolite measurements are lacking.
Thank you for these valuable insights. We fully agree that direct metabolite measurements would provide a deeper understanding of the metabolic impact of sucrose overconsumption and JNK-IN-5A administration. Unfortunately, due to technical limitations, we were unable to directly measure metabolites in this study. To address this, we supported our genome-scale metabolic modeling predictions with an extensive literature review, which is summarized in attached Table 1. This table highlights key metabolites and associated metabolic pathways that have been previously associated with sucrose overconsumption in MAFLD contexts. We incorporated some of these metabolites into the revised results section (Lines 308–314) to demonstrate the consistency between our predicted metabolic changes and experimental findings from the literature. For instance, studies have reported altered tryptophan metabolism, including decreased serum kynurenic acid and kynurenine levels, as well as increased triglyceride-bound oleate, palmitate, and stearate in sucrose-fed rodents. These findings align with our predictions of altered metabolic activities in fatty acid oxidation, fatty acid synthesis, and tryptophan metabolism.
(2) There is a lack of consistency in the data between JNK-IN-5A_D1 and _D2, and there is no sufficient data-based explanation for why the effects observed in D1 were inconsistent in the D2 samples.
Thank you for raising this important point regarding the differences between the two dosages. As this was not the primary focus of the current study, we do not have sufficient data to fully explain these observations. Our speculation is that the discrepancy may arise from pharmacokinetic differences associated with the dosing of this small-molecule inhibitor, including potential saturation of transport mechanisms, altered tissue distribution, or off-target effects.
(3) Although it is valuable that the authors were able to suggest the possibility of JNK inhibitor as a therapeutic strategy for MAFLD, the evaluation of the therapeutic effect was limited to the evaluation of plasma TG, LDH, and gene expression changes. As there was no evaluation of liver tissue images, it is unclear what changes were brought about in the liver by the excessive sucrose intake and the treatment with JNK-IN-5A.
We acknowledge that the lack of histological evaluations limits our ability to present a complete picture of the interventions' effects. However, as you noted, our transcriptional and systems-wide investigation across multiple tissues provides novel and significant insights into the molecular and systemic impacts of JNK-IN-5A treatment.
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
(1) It would be useful to explain why the authors conducted their research using female rats but not male rats.
Thank you for raising this insightful point. Our choice of female rats for the current study was based on several considerations. 1) Previous research has demonstrated that female rats exhibit metabolic dysfunction (e.g., hypertriglyceridemia, liver steatosis, insulin resistance) in response to dietary factors, such as high-sucrose feeding [16-19]. These metabolic characteristics made them an appropriate model for assessing the in vivo effects of JNK inhibition under high-sucrose conditions. 2) Because female rats are also reported to show resilience to high-sucrose-induced metabolic dysfunction due to the protective effects of estrogen [8], we aimed to determine whether JNK inhibition could provide therapeutic benefits in this context. This allows us to evaluate the effect of JNK inhibition even in metabolically advantaged groups. 3) Our results from the tolerance test (Fig. 2a) indicated that female rats displayed greater variation in response to JNK-IN-5A administration. This variation allowed us to evaluate how JNK inhibition influences metabolic outcomes in a sex that is more responsive to the intervention. Nonetheless, we emphasize the importance of future studies involving male rats to better understand sex-specific responses to JNK inhibition and to provide more comprehensive guidance for the development of JNK-targeting therapies in MAFLD treatment.
(2) Figure 2C shows that JNK-IN-5A administration reduces the mRNA levels of Mapk8 and Mapk9 in the liver and the SkM. It would be useful to provide the authors' insight into the data.
In the liver, the data in Fig. 2c of the original submission and the attached Fig. 1 show that sucrose feeding induces opposite alterations in the mRNA expression of Mapk8 (Jnk1; increased, log2FC [Sucrose vs. Control] = 0.02) and Mapk9 (Jnk2; decreased, log2FC [Sucrose vs. Control] = -0.43), though these changes do not reach statistical significance. JNK-IN-5A administration reverses these effects, significantly decreasing Mapk8 expression (log2FC [Sucrose+JNK_D1 vs. Sucrose] = -0.37) while increasing Mapk9 expression (log2FC [Sucrose+JNK_D1 vs. Sucrose] = 0.42). This suggests potential differential yet compensatory roles of these two isoforms in regulating JNK activity during these interventions in the liver, in line with the findings from Jnk1- and/or Jnk2-specific knockout studies [20, 21]. Additionally, emerging evidence indicates that Jnk1 plays a major role in diet-induced liver fibrosis and metabolic dysfunction [22-25]. Therefore, the reduced Mapk8 expression following JNK-IN-5A administration may contribute to the observed improvements in liver metabolism.
Author response image 1.
The Spearman correlation between expression levels of Mapk8
In skeletal muscle, the primary site for insulin-stimulated glucose uptake, insulin signaling is crucial for maintaining metabolic homeostasis [26]. Numerous studies have demonstrated that JNK activation promotes insulin resistance and that targeting JNK might be a promising therapeutic strategy for the treatment of metabolic diseases associated with insulin resistance, such as MAFLD [24]. In our study, while sucrose overconsumption did not significantly alter the mRNA levels of JNK isoforms in this tissue, JNK-IN-5A administration at a dosage of 30 mg/kg/day significantly reduced the expression of both Jnk1 and Jnk2 as well as genes involved in insulin signaling (Fig. 5). This suggests a potential interplay between JNK inhibition and insulin signaling pathways in the skeletal muscle, where inhibition of JNK activity may improve insulin sensitivity by modulating these pathways. However, it is also crucial to investigate the long-term effects of JNK-IN-5A administration and its broader impact on the many other physiological processes regulated by the JNK pathway. These aspects will be a focus of our future studies.
(3) The notations a and b in Figure S5 are missing.
Thank you for this constructive comment. We have corrected this in the revised figure S5.
(4) Data S13 described in the figure legend for Figure 7 (lines 630 and 632) seems a mistake and should be Data S8.
(5) The notations a, b, and c in Figure 7 are incorrect. The figure legend for Figure 7a doesn't seem to match the figure contents.
We appreciate your attention to details regarding Fig. 7. We have corrected the reference and the figure legend in revised Fig. 7.
Reference
(1) Fujii, A., et al., Sucrose Solution Ingestion Exacerbates Dinitrofluorobenzene-Induced Allergic Contact Dermatitis in Rats. Nutrients, 2024. 16(12).
(2) Sun, S., et al., High sucrose diet-induced dysbiosis of gut microbiota promotes fatty liver and hyperlipidemia in rats. J Nutr Biochem, 2021. 93: p. 108621.
(3) Qi, S., et al., Inositol and taurine ameliorate abnormal liver lipid metabolism induced by high sucrose intake. Food Bioscience, 2024. 60: p. 104368.
(4) Ramos-Romero, S., et al., The Buckwheat Iminosugar d-Fagomine Attenuates Sucrose-Induced Steatosis and Hypertension in Rats. Mol Nutr Food Res, 2020. 64(1): p. e1900564.
(5) Ortiz, S.R. and M.S. Field, Sucrose Intake Elevates Erythritol in Plasma and Urine in Male Mice. J Nutr, 2023. 153(7): p. 1889-1902.
(6) Beckmann, M., et al., Changes in the human plasma and urinary metabolome associated with acute dietary exposure to sucrose and the identification of potential biomarkers of sucrose intake. Mol Nutr Food Res, 2016. 60(2): p. 444-57.
(7) He, X., et al., High Fat Diet and High Sucrose Intake Divergently Induce Dysregulation of Glucose Homeostasis through Distinct Gut Microbiota-Derived Bile Acid Metabolism in Mice. J Agric Food Chem, 2024. 72(1): p. 230-244.
(8) Stephenson, E.J., et al., Chronic intake of high dietary sucrose induces sexually dimorphic metabolic adaptations in mouse liver and adipose tissue. Nat Commun, 2022. 13(1): p. 6062.
(9) Mock, K., et al., High-fructose corn syrup-55 consumption alters hepatic lipid metabolism and promotes triglyceride accumulation. J Nutr Biochem, 2017. 39: p. 32-39.
(10) Eryavuz Onmaz, D. and B. Ozturk, Altered Kynurenine Pathway Metabolism in Rats Fed Added Sugars. Genel Tıp Dergisi, 2022. 32(5): p. 525-529.
(11) Gariani, K., et al., Eliciting the mitochondrial unfolded protein response by nicotinamide adenine dinucleotide repletion reverses fatty liver disease in mice. Hepatology, 2016. 63(4): p. 1190-204.
(12) Togo, J., et al., Impact of dietary sucrose on adiposity and glucose homeostasis in C57BL/6J mice depends on mode of ingestion: liquid or solid. Mol Metab, 2019. 27: p. 22-32.
(13) Arifin, W.N. and W.M. Zahiruddin, Sample Size Calculation in Animal Studies Using Resource Equation Approach. Malays J Med Sci, 2017. 24(5): p. 101-105.
(14) Faul, F., et al., G*Power 3: a flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behav Res Methods, 2007. 39(2): p. 175-91.
(15) Bonapersona, V., et al., Increasing the statistical power of animal experiments with historical control data. Nat Neurosci, 2021. 24(4): p. 470-477.
(16) Kendig, M.D., et al., Metabolic Effects of Access to Sucrose Drink in Female Rats and Transmission of Some Effects to Their Offspring. PLoS One, 2015. 10(7): p. e0131107.
(17) Harris, R.B.S., Source of dietary sucrose influences development of leptin resistance in male and female rats. Am J Physiol Regul Integr Comp Physiol, 2018. 314(4): p. R598-R610.
(18) Velasco, M., et al., Sexual dimorphism in insulin resistance in a metabolic syndrome rat model. Endocr Connect, 2020. 9(9): p. 890-902.
(19) Maniam, J., C.P. Antoniadis, and M.J. Morris, The effect of early-life stress and chronic high-sucrose diet on metabolic outcomes in female rats. Stress, 2015. 18(5): p. 524-37.
(20) Singh, R., et al., Differential effects of JNK1 and JNK2 inhibition on murine steatohepatitis and insulin resistance. Hepatology, 2009. 49(1): p. 87-96.
(21) Sabapathy, K., et al., Distinct roles for JNK1 and JNK2 in regulating JNK activity and c-Jun-dependent cell proliferation. Mol Cell, 2004. 15(5): p. 713-25.
(22) Zhao, G., et al., Jnk1 in murine hepatic stellate cells is a crucial mediator of liver fibrogenesis. Gut, 2014. 63(7): p. 1159-72.
(23) Czaja, M.J., JNK regulation of hepatic manifestations of the metabolic syndrome. Trends Endocrinol Metab, 2010. 21(12): p. 707-13.
(24) Solinas, G. and B. Becattini, JNK at the crossroad of obesity, insulin resistance, and cell stress response. Mol Metab, 2017. 6(2): p. 174-184.
(25) Schattenberg, J.M., et al., JNK1 but not JNK2 promotes the development of steatohepatitis in mice. Hepatology, 2006. 43(1): p. 163-72.
(26) Sylow, L., et al., The many actions of insulin in skeletal muscle, the paramount tissue determining glycemia. Cell Metab, 2021. 33(4): p. 758-780.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews
Reviewer #1 (Public review)
Weaknesses:
The main weakness of the manuscript is that to a large degree, one of its main conclusions (MAP symmetry underlies differences in regenerative capacity) relies mainly on a correlation, without firmly establishing a causal link. However, this weakness is relatively minor because (1) it is partially addressed with the Spastin KO and (2) there isn't a trivial way to show a causal relationship in this case.
We thank Reviewer #1 for their positive assessment of our manuscript. To further strengthen the claim that MAP asymmetry underlies differences in regenerative capacity, we could investigate the effect of depleting other MAPs that lose asymmetry after conditioning lesion (CRMP5 and katanin). One would expect that similarly to spastin, this would disrupt the physiological asymmetry of DRG axons and impair axon regeneration. We further discussed this issue in the revised version of the manuscript (page 17, line 381).
Reviewer #2 (Public review)
Weaknesses:
In order for the method to be used it needs to be better described. For instance what proportion of neurons develop just two axonal branches, one of which is different? How selective are the researchers in finding appropriate neurons?
We thank Reviewer #2 for their positive assessment of our manuscript. As suggested, we included further methodological details on the in vitro system in the revised version of the manuscript. We have previously evaluated the percentage of DRG neurons exhibiting different morphologies in our cultures: multipolar (4±1%), bipolar (35±8%), bell-shaped (17±5%), and pseudo-unipolar neurons (43±3%). This was included in the revised manuscript in Figure 1B and on page 5, line 107. All the pseudo-unipolar neurons analysed had distinct axonal branches in terms of diameter and microtubule dynamics. For imaging purposes, we selected pseudo-unipolar neurons with axons unobstructed by other cells or neurites within a distance of at least 20–30 μm from the bifurcation point, to ensure optimal imaging. In the case of laser axotomy experiments, this distance was increased to 100–200 μm to ensure clear analysis of regeneration. These selection criteria are now detailed in the Methods (page 19, line 417, and page 21, line 474).
Reviewer #3 (Public review):
(1) Weaknesses:
While some of the data are compelling, experimental evidence only partially supports the main claims. In its current form, the study is primarily descriptive and lacks convincing mechanistic insights. It misses important controls and further validation using 3D in vitro models.
We recognize the importance of further exploring the contribution of other MAPs to microtubule asymmetry and regenerative capacity of DRG axons. In future work, we plan to investigate this issue using knockout mice for katanin and CRMP5. Regarding the mechanisms underlying the differential localization of proteins in DRG axons, we performed in-situ hybridization to evaluate the availability of axonal mRNA but no differences were found between central and peripheral DRG axons (Figure 4 – figure supplement 2). To address whether differences in protein transport exist, we attempted to transduce DRG neurons with GFP-tagged spastin both in vitro and in vivo. However, these experiments were inconclusive as very low levels of spastin-GFP were detected. We are actively optimizing these approaches and will address this challenge in future studies. These points were further discussed in the revised manuscript (page 15, line 330 and page 17, line 381).
(2) Given the heterogeneity of dorsal root ganglion (DRG) neurons, it is unclear whether the in vitro model described in this study can be applied to all major classes of DRG neurons.
We acknowledge the diversity of DRG neurons and agree that assessing the presence of different DRG subtypes in our culture system will enrich its future use. Despite this heterogeneity, we focused on DRG neuron features that are common to all subtypes, i.e., pseudo-unipolarization and higher regenerative capacity of peripheral branches. This point was addressed on page 14, line 309 of the revised manuscript.
(3) Also unclear is the inconsistency with embryonic DRG cultures with embryonic (E)16 from rats and E13 from mice (spastin knockout and wild-type controls).
Given our previous experience in establishing DRG neuron cultures from E16 Wistar rats and E13 C57BL/6 mice, these developmental stages are equivalent, yielding cultures of DRG neurons with similar percentages of different morphologies. Of note, in our colonies, gestation length is ~19 days in C57BL/6 mice (background of the spastin knockout line) and ~22 days in Wistar Han rats. This was further clarified in the Methods (page 18, line 404).
(4) Furthermore, the authors stated (line 393) that only a small subset of cultured DRG neurons exhibited a pseudo-unipolar morphology. The authors should include the percentage of the neurons that exhibit a pseudo-unipolar morphology.
We have previously evaluated the percentage of DRG neurons exhibiting different morphologies in our cultures: multipolar (4±1%), bipolar (35±8%), bell-shaped (17±5%), and pseudo-unipolar neurons (43±3%). This was included in the revised manuscript in Figure 1B and on page 5, line 107. In line 393, we referred specifically to an experimental setup where DRG neuron transduction was done, and 30 transduced neurons were randomly selected for longitudinal imaging. From these, the number of viable pseudo-unipolar DRG neurons was limited by both the random nature of viral transduction and light-induced toxicity throughout continuous imaging over seven consecutive days at hourly intervals. This was clarified in the revised manuscript (page 20, line 438).
(5) The significance of studying microtubule polymerization to DRG asymmetry in vitro is questionable, especially considering the model's validity. The authors might consider eliminating the in vitro data and instead focus on characterizing DRG asymmetry in vivo both before and after a conditioning lesion. If the authors choose to retain the in vitro data, classifying the central and peripheral-like branches in cultured DRG neurons will require further in-depth characterization. Additional validation should be performed in adult DRG neuron cultures not aged in vitro.
The in vitro system presented here reliably reproduces several key features of DRG neurons observed in vivo, including asymmetry in axon diameter, regenerative capacity, axonal transport, and microtubule dynamics. Of note, most studies in the field have been done using multipolar DRG neurons that do not recapitulate in vivo morphology and asymmetries. Thus, the current in vitro model serves as a versatile tool for advancing our understanding of DRG biology and associated diseases. This system is particularly suited to study axon regeneration asymmetries, and enables the investigation of mechanisms occurring at the stem axon bifurcation, such as asymmetric protein transport and microtubule dynamics, which are challenging to examine in vivo due to the length of the stem axon and the difficulty of locating the DRG T-junction. It will be important to optimize similar cultures using adult DRG neurons. However, this comes with challenges, such as lower cell viability. This is the case with multiple other neuron types, for which the vast majority of cultures are obtained from embryonic tissue. These concerns were addressed in the revised version of the manuscript (page 13, line 296 and page 14, line 302).
(6) The comparison of asymmetry associated with a regenerative response between in vitro and in vivo paradigms has significant limitations due to the nature of the in vitro culture system. When cultured in isolation, DRG neurons fail to form functional connections with appropriate postsynaptic target neurons (the central branch) or to differentiate the peripheral domains associated with the innervation of target organs. Rather than growing neurons on a flat, hard surface like glass, more physiologically relevant substrates and/or culturing conditions should be considered. This approach could help eliminate potential artifacts caused by plating adult DRG neurons on a flat surface. Additionally, the authors should consider replicating their findings in a 3D culture model or using dorsal root ganglia explants, where both centrally and peripherally projecting axons are present.
We agree that a more sophisticated system, such as a compartmentalized culture, holds great potential for future research. In this respect, we are currently engaged in developing such models. A compartmentalized system would enable the separation of three compartments: central nervous system neurons, DRG neurons, and peripheral targets. While previous efforts to create compartmentalized DRG cultures have been reported (e.g., PMID: 11275274 and PMID: 37578145), these systems have not demonstrated the development of pseudo-unipolar morphology. Incorporating non-neuronal DRG cells into the DRG neuron compartment may successfully support the development of a pseudo-unipolar morphology.
We also recognize the importance of dimensionality in fostering pseudo-unipolar morphology. Of note, our model provides a 3D-like environment, as DRG glial cells are continuously replicating over the 21 days in culture. In relation to DRG explants, we attempted their use but encountered limitations with confocal microscopy as the axial resolution was insufficient to resolve processes at the DRG T-junction or within individual branches. The above issues are now discussed in the revised manuscript (page 14, line 312).
(7) Panels 5H-J require additional processing with astrocyte markers to accurately define the lesion borders. Furthermore, including a lower magnification would facilitate a direct comparison of the lesion site.
In our study, we relied on the alignment of nuclei to delineate the lesion site because, in our accumulated experience, this provides an accurate definition of the lesion border. Outside the lesion, the nuclei are well-aligned, while at the lesion site, they become randomly distributed. Additionally, CTB staining further supports the identification of the rostral border of the lesion, as most injured central DRG axons stop their growth at the injury site. This was further detailed in the Methods of the revised manuscript (page 32, line 730).
(8) The use of cholera toxin subunit B (CTB) to trace dorsal column sensory axons is prone to misinterpretation, as the tracer accumulates at the axon's tip. This limitation makes it extremely challenging to distinguish between regenerating and degenerating axons.
While alternative methods to trace or label regenerating axons exist, CTB is a well-established and widely used tracer for central sensory projections, as shown in different studies (PMID: 22681683, PMID: 26831088 and PMID: 33349630). Regarding the concern of possible CTB labeling in degenerating axons, we believe this is unlikely to be the case in our system, as in spinal cord injury controls, CTB-positive axons are nearly absent. Also, as regeneration was investigated six weeks after injury, axon degeneration has most likely already occurred, as shown previously (PMID: 15821747 and PMID: 25937174).
Recommendations for the authors:
Reviewer #1:
(1) Figure 1 can be improved by adding a quantification of the fraction of neurons at each stage as a function of time.
We have updated Figure 1 to include the quantification of the percentages of different DRG neuron morphologies at DIV21 (Figure 1B), which corresponds to the stage at which all in vitro experiments were conducted.
(2) Figure 3A: why are retrograde transport events not shown?
Retrograde transport events are not displayed as results did not reach statistical significance.
(3) Figure 3 and 4: Combine the quantifications of with/without lesion, such that not only the differences between branches are apparent, but also the differences induced in each branch by the lesion.
As requested, only combined quantifications of microtubule dynamics for naive and conditioning lesion are provided in the revised version of Figure 3 (Figures 3H and 3K), to highlight both branch-specific differences and lesion-induced changes. However, for Figure 4, as the western blots for naive and conditioning lesion were performed on separate gels, it is unfeasible to combine their quantification.
(4) Figure 5: does spastin KO lead to a difference in the "MAP signature" of each branch? Also, if in addition to MAPs there are other known molecules (and an antibody is available) that show differential localization to peripheral/central branches, it would be nice to check if this asymmetry is also lost in spastin KO.
Evaluating the MAP signature in DRG axons from spastin KO mice will be important to explore in future experiments. Despite some scattered reports in the literature, our study is the first to identify a distinct protein signature of central and peripheral DRG axons. This is especially relevant in the case of Tau, as irrespective of the experimental conditions, its levels are always increased in the peripheral DRG axon.
Reviewer #2:
(1) Please provide a more complete description of the culture method. Do all neurons develop two asymmetric branches or just a few, and how are they selected? Does the timing of the events in vitro correspond with what is happening to the neurons in embryos?
We have included the percentages of the various DRG neuron morphologies at DIV21 in the revised manuscript (Figure 1B and on page 5, line 107). Additionally, a more detailed description of the culture method is now provided in the Methods, including the criteria used to select pseudo-unipolar neurons (page 19, line 417, and page 21, line 474).
Regarding the timing of events, upon DRG dissociation, neurons reinitiate polarization, taking 21 days to reach approximately 40% pseudo-unipolar morphology. A similar percentage is reached at E16.5 during rat development in vivo (PMID: 8729965).
(2) Are the neurons and their branches resting on the glia? Is there any relation to the presence of glia and the type of growth that is seen?
Yes, neurons and their branches rest on glia. This is required for DRG pseudo-unipolarization. In future studies, we plan to further investigate neuron-extrinsic mechanisms leading to pseudo-unipolarization, and to identify the specific glial cell type(s) needed throughout this process. This is now discussed in the revised manuscript (page 14, line 306).
(3) Is it possible to trace microtubules so as to see whether the microtubules of the two branches mix, or whether they remain separate all the way to the cell bodies?
We used DRG neurons transduced with EB3-GFP to examine microtubule polymerization at the T-junction through live imaging. This revealed a high continuum of polymerization from the stem axon to the central-like axon (Figure 4 – figure supplement 2D-G). To further determine whether microtubules from both branches mix or remain separate, alternative techniques such as FIB-SEM could be performed. This point is now further discussed in the revised manuscript (page 16, line 352).
(4) Using the term MAPs would lead readers to expect to see an analysis of different levels of MAP1, MAP2, etc. It would be interesting to see this if the authors have done it, but it is not necessary for the paper.
We assessed the expression of MAP2 via western blot in DRG peripheral and central axons and no differences were found. This is now referred to in the Discussion (page 15, line 327).
(5) The regeneration experiments on the spastin knockouts are complicated by the lesion being in CNS tissue, which introduces various issues. Is there a difference in regeneration after dorsal root crush?
We have not yet examined whether regeneration differs after dorsal root crush in the spastin knockout model. However, this presents an interesting question, as Schwann cells in the dorsal root may support regeneration of central DRG axons.
Reviewer #3:
The authors stated that the normality of the datasets was tested using the Shapiro-Wilk or D'Agostino-Pearson omnibus normality test. Given the low sample size (n=4) for some of the experiments presented (e.g., Figure 3B), it is not clear how normality was assessed which justifies the use of parametric tests.
We followed GraphPad’s recommendations for selecting the appropriate normality test (https://www.graphpad.com/support/faqid/959/). The D'Agostino-Pearson omnibus K2 test, recommended for its versatility, was used when sample size was 8 or more. For smaller sample sizes (n < 8), we used the Shapiro-Wilk test, which is also widely used in biological research and can be employed with datasets of at least 3 values. These tests guided our decision-making regarding the use of parametric or non-parametric statistical tests.
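As an illustration only, the sample-size rule described in this response can be sketched in Python. This is not the authors' analysis, which was done in GraphPad; the function names are hypothetical, and the second helper assumes SciPy is installed (`scipy.stats.shapiro` for Shapiro-Wilk, `scipy.stats.normaltest` for the D'Agostino-Pearson omnibus test).

```python
def choose_normality_test(n: int) -> str:
    """Pick a normality test from the sample size, following the rule stated above.

    n >= 8      -> D'Agostino-Pearson omnibus test
    3 <= n < 8  -> Shapiro-Wilk test
    n < 3       -> no test is possible
    """
    if n < 3:
        raise ValueError("a normality test needs at least 3 values")
    return "dagostino-pearson" if n >= 8 else "shapiro-wilk"


def normality_p_value(values):
    """Run the selected test and return its p-value (assumes SciPy is available)."""
    from scipy import stats  # imported lazily so the selection rule works without SciPy

    if choose_normality_test(len(values)) == "shapiro-wilk":
        return stats.shapiro(values).pvalue
    return stats.normaltest(values).pvalue
```

With n = 4, as in the example raised by the reviewer, `choose_normality_test(4)` selects Shapiro-Wilk. A test on so few values has little power to detect departures from normality, which is the substance of the reviewer's concern.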
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #2 (Public Review):
The manuscript by Zhang et al. explores the effect of autophagy regulator ATG6 on NPR1-mediated immunity. The authors propose that ATG6 directly interacts with NPR1 in the nucleus to increase its stability and promote NPR1-dependent immune gene expression and pathogen resistance. This novel role of ATG6 is proposed to be independent of its role in autophagy in the cytoplasm. The authors demonstrate through biochemical analysis that ATG6 interacts with NPR1 in yeast and very weakly in vitro. They further demonstrate using overexpression transgenic plants that in the presence of ATG6-mcherry the stability of NPR1-GFP and its nuclear pool is increased.
Comments on latest version:
The term "invasion" has to be replaced with infection, as it doesn't have much meaning to this particular study. I already explained this point in the first review, but authors did not address it throughout the manuscript.
Thank you for your constructive feedback. We have taken your suggestion into account and replaced "invasion" with "infection" in the revised manuscript (Lines 44,45,99,100,298,341,387,415,461,463,464,1002).
In fig. 1e there's no statistical analysis. How can one show measurements from multiple samples without statistical analysis? All the data points have to be shown in the graph and statistics performed. In the atg6-npr1 and snrk-npr1 pairs no nuclear marker is included. How can one know where the nucleus is, particularly in such poor quality low res. images? The nucleus marker has to be included in this analysis and shown. This is an important aspect of the study as nuclear localization of ATG6 is proposed to be essential for its new function.
Thank you for bringing this to our attention. We conducted the BiFC experiments again using nls-mCherry transgenic tobacco, which yielded clearer images. The results clearly demonstrate that ATG6 interacts with NPR1 in both the cytoplasm and nucleus. The YFP signal in the nucleus co-localizes with nls-mCherry (a nuclear localization marker). SnRK2.8 was employed as a positive control for NPR1 interaction. The relative fluorescence intensity of YFP was quantified using ImageJ software (n = 15 independent images). All data points are displayed in the graph, and we also conducted a Student's t-test analysis. We have incorporated these results into the revised manuscript (Fig 1d and e).
Co-localization provided in the fig. S2 cannot complement this analysis, particularly since no cytoplasmic fraction is present for NPR1-GFP in fig. S2.
Thank you for your observation. We repeated the experiment and confirmed that NPR1 and ATG6 co-localize in both the nucleus and cytoplasm. The image in Figure S2 has been updated accordingly.
In the alignment in fig 2c, it is not explained what are the species the atg6 is taken from. The predicted NLS has to be shown in the context of either the entire protein sequence alignment or at least individual domain alignment with the indication of conserved residues (consensus). They have to include more species in the analysis, instead of including 3 proteins from a single species. Also, the predicted NLS in atg6 doesn't really have the classical type architecture, which might be an indication that it is a weak NLS, consistent with the fact that the protein has significant cytoplasmic accumulation. They also need to provide the NLS prediction cut-off score, as this parameter is a measure of NLS strength.
Line 150: the NLS sequence "FLKEKKKKK" is a wrong sequence.
Thank you for your suggestion. In both plants and animals, proteins are transported to the nucleus via specific nuclear localization signals (NLSs), which are typically characterized by short stretches of basic amino acids (Dingwall and Laskey, 1991; Raikhel, 1992; Nigg, 1997). Following your recommendation, we re-predicted potential NLS sequences in the ATG6 protein using NLSExplorer (http://www.csbio.sjtu.edu.cn/bioinf/NLSExplorer). Although we did not identify a classical monopartite NLS, we discovered a bipartite NLS similar to the consensus bipartite sequence (KRX(10-12)K(KR)(KR)) (Kosugi et al., 2009) in the carboxy-terminal region (475-517 aa) of ATG6, with a cut-off score of 2.6. These findings are consistent with substantial accumulation of ATG6 in the cytoplasm and minimal accumulation in the nucleus. Additionally, a comparison of ATG6 C-terminal sequences across several species, including Microthlaspi erraticum, Capsella rubella, Brassica carinata, Camelina sativa, Theobroma cacao, Brassica rapa, Eutrema salsugineum, Raphanus sativus, Hirschfeldia incana and Brassica napus, indicates that this bipartite NLS is relatively conserved. We have incorporated these results into the revised manuscript (lines 450-160).
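For readers who want to check a sequence against the consensus quoted above, a minimal sketch follows. It is not NLSExplorer and does not reproduce its scoring. It assumes that "(KR)" in the consensus means "K or R" and that X(10-12) is a linker of 10 to 12 arbitrary residues; the function name and the test sequence are hypothetical, not ATG6.

```python
import re

# Consensus KRX(10-12)K(KR)(KR), read as: K, R, 10-12 residues of any kind,
# then K followed by two residues that are each K or R (an assumption).
# The lookahead lets matches that start at different positions overlap.
BIPARTITE_NLS = re.compile(r"(?=(KR.{10,12}K[KR][KR]))")


def find_bipartite_nls(sequence: str):
    """Return (0-based start, matched segment) for each consensus-like site."""
    return [(m.start(), m.group(1)) for m in BIPARTITE_NLS.finditer(sequence.upper())]


# Hypothetical peptide: KR, a 10-residue linker, then KRK.
example = "MAKR" + "ADEGHLNPQS" + "KRK" + "LL"
hits = find_bipartite_nls(example)
```

A pattern match of this kind only flags candidate sites; it says nothing about NLS strength, which is why a prediction score is still needed.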
In fig. 3d no explanation for the error bars is included, and what type of statistical analysis is performed is not explained.
Thank you for bringing this to our attention. In Figure 3d, a Student's t-test was conducted to analyze the data. The mean and standard deviation were calculated from three biological replicates, and the relevant description has been included in the figure notes.
Reference
Dingwall, C. and Laskey, R.A. (1991) Nuclear targeting sequences--a consensus? Trends Biochem Sci, 16, 478-481.
Kosugi, S., Hasebe, M., Matsumura, N., Takashima, H., Miyamoto-Sato, E., Tomita, M. and Yanagawa, H. (2009) Six classes of nuclear localization signals specific to different binding grooves of importin alpha. J Biol Chem, 284, 478-485.
Nigg, E.A. (1997) Nucleocytoplasmic transport: signals, mechanisms and regulation. Nature, 386, 779-787.
Raikhel, N. (1992) Nuclear targeting in plants. Plant Physiol, 100, 1627-1632.
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Weaknesses:
However, given that S1P is upstream of NF-κB signaling, it is unclear if it offers conceptual innovations as compared to previous studies from the same team (Palazzo et al. 2020; 2022; 2023).
We find distinct differences between the impacts of S1P- and NFκB-signaling on glial activation, neuronal differentiation of the progeny of MGPCs and neuronal survival in damaged retinas. In the current study we demonstrate that 2 consecutive daily intravitreal injections of S1P selectively activated mTor (pS6) and Jak/Stat3 (pStat3), but not MAPK (pERK1/2) signaling in Müller glia. Further, inhibition of S1P synthesis (SPHK1 inhibitor) decreased ATF3, mTor (pS6) and pSmad1/5/9 levels in activated Müller glia in damaged retinas. Inhibition of NFκB-signaling in damaged chick retinas did not impact the above-mentioned cell signaling pathways (Palazzo et al., 2020). Thus, S1P-signaling impacts cell signaling pathways in MG that are distinct from NFκB, but we cannot exclude the possibility of cross-talk between NFκB and these pathways. Further, inhibition of NFκB-signaling potently decreases numbers of dying cells and increases numbers of surviving ganglion cells (Palazzo et al., 2020). Consistent with these findings, a TNF orthologue, which presumably activates NFκB-signaling, exacerbates cell death in damaged retinas (Palazzo et al., 2020). By contrast, 5 different drugs targeting S1P-signaling had no effect on numbers of dying cells and only one S1PR1 inhibitor modestly decreased numbers of dying cells (current study). Although two different inhibitors of NFκB-signaling suppressed the proliferation of microglia in damaged retinas (Palazzo et al., 2020), all of the S1P-targeting drugs had no effect upon the proliferation of microglia (current study). In addition, inhibition of NFκB does not influence the neurogenic potential of MGPCs in damaged chick retinas (Palazzo et al., 2020), whereas inhibition of S1P receptors (S1PR1 and S1PR3) and inhibition of S1P synthesis (SPHK1) significantly increased the differentiation of amacrine-like neurons in damaged retinas (current study).
Collectively, in comparison to the effects of pro-inflammatory cytokines and NFκB-signaling, our current findings indicate that S1P-signaling through S1PR1 and S1PR3 in Müller glia has distinct effects upon cell signaling pathways, neuronal regeneration and cell survival in damaged retinas. We will revise text in the Discussion (pages 33-34) to better highlight these important distinctions between NFκB- and S1P-signaling.
Reviewer #2 (Public review):
Weaknesses:
The methodology is not very clean. A number of drugs (inhibitors/ antagonists/agonists signal modulators) are used to modulate S1P expression or signaling in the retina without evidence that these drugs are reaching the target cells. No alternative evaluation if the drugs, in fact, are effective. The drug solubility in the vehicle and in the vitreous is not provided, and how did they decide on using a single dose of each drug to have the optimal expected effect on the S1P pathway?
Müller glia are the predominant retinal cell type that expresses S1P receptors. Consistent with these patterns of expression, we report Müller glia-specific effects of different agonists and antagonists that increase or decrease S1P-signaling. Since we compare cell-level changes within contralateral eyes wherein one retina is exposed to vehicle and the other is exposed to vehicle plus drug, it seems highly probable that the drugs are eliciting effects upon the Müller glia. It is possible, but very unlikely, that the responses we observed could have resulted from drugs acting on extra-retinal tissues, which might secondarily release factors that elicit cellular responses in Müller glia. However, this seems unlikely given the distinct patterns of expression for different S1P receptors in Müller glia, and the outcomes of inhibiting Sphk1 or S1P lyase on retinal levels of S1P.
For example, we provide evidence that S1PR1 and S1PR3 expression is predominant in Müller glia in the chick retina using single-cell RNA sequencing and fluorescence in situ hybridization (FISH). Thus, we expect S1PR1/3-targeting small-molecule inhibitors to act directly on Müller glia, which is consistent with our read-outs of cell signaling with injections of S1P in undamaged retinas. We show that SPHK1 and SGPL1, which encode the enzymes that synthesize or degrade S1P, are expressed by different retinal cell types, including the Müller glia. The efficacy of the drugs that target SPHK1 and SGPL1 was assessed by measuring levels of S1P in the retina. By using liquid chromatography and tandem mass spectrometry (LC-MS/MS), we provide data that inhibition of S1P synthesis (inhibition of SPHK1) significantly decreased levels of S1P in normal retinas, whereas inhibition of S1P degradation (inhibition of SGPL1) increased levels of S1P in damaged retinas (Fig. 5). These data suggest that the SPHK1 inhibitor and the SGPL1 inhibitor specifically act at the intended target to influence retinal levels of S1P. Further, inhibition of SPHK1 (to decrease levels of S1P) results in decreased levels of ATF3, pS6 (mTor) and pSMAD1/5/9 in Müller glia, consistent with the notion that reduced levels of S1P in the retina impact signaling at Müller glia. Finally, we find similar cellular responses to chemically different agonists or antagonists, and we find opposite cellular responses to agonists and antagonists, which are expected to be complementary if the drugs are specifically acting at the intended targets in the retina. We will revise the Discussion to better address caveats and concerns regarding the actions and specificity of different drugs within the retina following intravitreal delivery.
We will provide the drug solubility specifications and estimates of the initial maximum dose per eye for each drug. For chick eyes between P7 and P14, these estimates will assume a volume of about 100 µl of liquid vitreous, 800 µl of gel vitreous and an average eye weight of 0.9 grams. We will revise Table 1 (pharmacological compounds) with ranges of reported in vivo ED50’s (mg/kg) for drugs and we will list the calculated initial maximum dose (mg/kg equivalent) per eye. Doses were chosen based on estimates of the initial maximum ocular dose that were within the range of reported ED50’s. However, as is the case for any in vivo model system, it is difficult to predict rates of drug diffusion out of the vitreous, how quickly the drugs are cleared from the entire eye, how much of the compound enters the retina, and how quickly the drug is cleared from the retina. Accordingly, we assessed drug specificity and sites of activation by relying upon readouts of cell signaling pathways that are parsed with patterns of expression of different S1P receptors and measurements of retinal levels of S1P following exposure to drugs that target the enzymes that synthesize or degrade S1P, as described above.
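The arithmetic behind such a per-eye estimate can be sketched as follows. This is one possible reading of "initial maximum dose (mg/kg equivalent) per eye", namely injected mass divided by eye weight, and is not taken from the manuscript; the function names are hypothetical and the injected amount in the example is an arbitrary illustrative value, not a dose used in the study. The eye weight and vitreous volumes are the assumptions stated in the response.

```python
EYE_WEIGHT_G = 0.9          # average eye weight assumed in the response
LIQUID_VITREOUS_UL = 100.0  # assumed liquid vitreous volume
GEL_VITREOUS_UL = 800.0     # assumed gel vitreous volume


def initial_max_dose_mg_per_kg(injected_ug: float, eye_weight_g: float = EYE_WEIGHT_G) -> float:
    """Injected mass per unit eye weight, in mg/kg equivalent.

    Ignores diffusion and clearance, hence "initial maximum".
    """
    injected_mg = injected_ug / 1000.0
    eye_weight_kg = eye_weight_g / 1000.0
    return injected_mg / eye_weight_kg


def initial_vitreous_conc_ug_per_ul(injected_ug: float) -> float:
    """Injected mass diluted in the whole vitreous (liquid plus gel), in µg/µl."""
    return injected_ug / (LIQUID_VITREOUS_UL + GEL_VITREOUS_UL)


# Arbitrary example amount: 0.9 µg injected into a 0.9 g eye.
dose = initial_max_dose_mg_per_kg(0.9)
conc = initial_vitreous_conc_ug_per_ul(0.9)
```

Comparing `dose` with a reported ED50 range in mg/kg gives only a rough upper bound, for the reasons the response itself lists: diffusion out of the vitreous and clearance from the eye and retina are not modelled.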
Reviewer #1 (Recommendations for the authors):
I am wondering if Muller glia can be considered as fully differentiated at early postnatal stages as those used in this study. Is this mechanism operative in adult retinas? Could the authors perform studies in older animals, just to have the proof of principle that the proposed mechanism is retained.
Chickens are considered to be adult at about 4 months of age, when the females start laying eggs. Unfortunately, housing, maintenance, handling and experimentation on large adult chickens have proven to be challenging. Nevertheless, there is evidence that Müller glia reprogramming remains robust in mature chick retinas from P1 through P30, but the zones of proliferation shift away from central retina and become increasingly confined to the retinal periphery (Fischer, 2005). MG “maturation” appears to occur in a central-to-peripheral gradient, much like the process of embryonic retinal differentiation, but a zone of regeneration-competent MG remains in the periphery during adolescent development (Fischer, 2005).
We have defined central vs peripheral retina in the Methods.
To partially address this question, we have generated a new supplemental Figure 6 showing (i) SPHK1 fluorescent in situ labeling of central and peripheral regions at P10, and (ii) analysis of EdU+Sox2+ MGPCs in central versus peripheral regions treated with NMDA +/- S1PR1 inhibitor or NMDA +/- SPHK1 inhibitor. We find that patterns of S1PR1 transcription in the central region are similar to the peripheral region (not shown), and S1PR1 inhibition modestly increased numbers of MGPCs in central regions. Unlike the peripheral regions of retina, SPHK1 FISH signal in the central region remains low at 48 hours post-injury (supplemental Fig. 6). Additionally, we found that the SPHK1 inhibitor had no effect on numbers of proliferating MGPCs in the central regions of retina, whereas SPHK1 inhibitors stimulated proliferation of MGPCs in the periphery (Fig. 4). It is likely that mature MG in central retinal regions are not responsive to SPHK1 inhibition due to low levels of expression.
We have previously shown that Notch-related genes show unique patterns of expression in the central and peripheral retinas, and expression levels significantly change at P0, P7, and P21 (Ghai et al., 2010). We found that Notch inhibition reduced cell death and numbers of MGPCs in central regions but not peripheral regions. Recent scRNA-sequencing analysis of murine macula and peripheral retinal regions has revealed interesting differences in NFKBIA/Z and NFIA expression, possibly indicating a difference in the early inflammatory transcriptional response to retinal damage (Zhang et al., 2024, bioRxiv). We believe that spatial sequencing of peripheral “immature” and central “mature” chick Müller glia will be a useful tool in the future to reveal key differences in signaling pathway-related gene expression which confer a competence for regeneration in the periphery.
We have added text to the Results (pages 20-21) and Discussion (page 32) to address the S1P-signaling in central (mature MG) vs peripheral (immature MG) regions of the retina.
Minor points.
The abstract is difficult to follow and consists of a list of what activates or represses the formation of MGPC. Please rewrite the abstract to integrate information and provide a clearer message. Also, please include the species of study in the abstract and mention it again at the beginning of the results, at least.
We have rewritten the abstract to simplify and clarify our main points (p 2).
Lines 65-69. The sentence is unclear, perhaps there are words either missing or in excess and there is a need to check the spelling.
We have simplified this sentence to improve clarity and referenced our recently published review to support.
Lines 112-113. Please explain why " retinas were treated with saline, NMDA, or 2 or 3 doses insulin+FGF2 and the combination of NMDA and insulin+FGF2". There is a reference but readers will appreciate understanding right away why.
We have added a sentence to clarify the purpose of comparing gene expression patterns in MG and MGPCs in NMDA-damaged retinas versus retinas treated with insulin+FGF2.
Lines 223-257. This list of experiments is difficult to follow and perhaps should be summarized better. Somehow lines 257-261 say it all.
We have revised this section to clarify differences in outcomes between S1PR1/3 activators and inhibitors. We also stated the enzymatic functions of SPHK1 and SGPL1 to improve clarity.
Lines 392-441. Comparative expression analysis should be summarized as the message is somehow simple but the description is rather lengthy.
We have revised our comparative expression analyses to be more concise.
Reviewer #2 (Recommendations for the authors):
(1) Only a single dose of the drugs (inhibitor/ antagonists/agonists signal modulators) is used for each drug, as shown in Table 1. How do they know this is an effective dose?
We estimated the appropriate dose from the initial maximum dose, which in turn was derived from the reported ED50 values for each drug. We have revised Table 1 to include this information.
(2) Most of the drugs appeared to be hydrophobic, but except for sphingosine and S1P, all are described to be injected with sterile saline. They must provide solubility characteristics of these drugs in solvents. For example, FTY720 is not water-soluble, which raises the question of all of their drugs' solubility, bioavailability to the cells of interest, and their effectivity in signal transduction in the retinal cells.
Some S1P-targeting compounds were delivered in 20% DMSO in saline to support the solubility of the different lipophilic small-molecule agonists/antagonists. We have added information to the Methods (p 5 and p 6) and to Table 1 to describe the use of DMSO to solubilize these drugs. We have also revised Table 1 to include ranges of reported ED50 values (mg/kg) for all drugs and to list the calculated initial maximum dose (mg/kg) per eye.
(3) Drugs were delivered to the vitreous chamber, but there was no information on how they would cross the inner limiting membrane to affect or modulate S1P metabolism in retinal MG or to bind the S1P receptors on MG or other retinal cell types.
All selected compounds are small-molecule drugs, many of which are structural analogues of sphingosine or S1P. These drugs would be classified as BDDCS Class II drugs, meaning they have low solubility but high cell permeability. Thus, it is highly probable that they diffuse across the ILM to act on S1P receptors on MG, but it is also likely that their bioavailability is more limited, requiring a higher dose, repeated doses, and the use of solubilizing agents. We have clarified our use of DMSO to solubilize these drugs (p 6) according to vendor recommendations (p 5). This information has been added to the Methods.
(4) Gene expression is a very dynamic process; without providing more evidence that the expression changes are the direct effect of the drug treatment, the conclusions made based on the gene expression profiles are not strong. Additional points:
We do not assert that changes in scRNA-seq expression profiles are the direct result of S1P-targeting drugs. We report significant changes in cellular expression profiles following NMDA-induced retinal damage or ablation of microglia. We feel that new experiments to assess the gene expression profiles of retinal cells that are directly downstream of the different S1P-targeting drugs are better suited for future studies.
(5) Please add in the introduction that there is only one sphingosine kinase in chicken, as no SPHK2 is known to be present.
We have added additional information regarding the expression of SPHK1 and SPHK2 genes in the chick genome (p 4).
(6) Fig 1d and in many other UMAP clusters, the low expressing genes are barely visible (Ex. 1d, S1PR2, and S1PR3); please extract them in separate UMAP clusters and provide them in supplements.
We have revised supplemental Figure 1 to include separate panels for each of the S1P-related genes.
(7) The Figure References for SPHK1 (Fig. 2e), SGPL1 (Fig. 2e), ASAH1 (Fig. 2f), CERS6 (Fig. 2f), and CERS5 (Fig. 2f) in the line # 124- 132 should belong to Figure 1, not Figure 2.
We have corrected these figure references (p 14).
(8) The description of the expression of zebrafish genes does not match the figures. For example, 'Similarly, sphk1 was detected in very few cells in the retina (Fig. 10j). By comparison, sphk2 was detected in a few bipolar cells and rod photoreceptors (Fig. 10j). Similar to patterns of expression seen in chick and human retinas, sgpl1 was detected in microglia and a few cells scattered among the different clusters of inner retinal neurons and rod photoreceptors (Fig. 10j)', the expression of these genes are not in very few or few scattered cells rather in many cells.
We have revised these statements to improve clarity and more accurately describe the data in Figure 10 (p 28).
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
The authors employed a combinatorial CRISPR-Cas9 knockout screen to uncover synthetically lethal kinase genes that could play a role in drug resistance to kinase inhibitors in triple-negative breast cancer. The study successfully reveals FYN as a mediator of resistance to depletion and inhibition of various tyrosine kinases, notably EGFR, IGF-1R, and ABL, in triple-negative breast cancer cells and xenografts. Mechanistically, they demonstrate that KDM4 contributes to the upregulation of FYN and thereby is an important mediator of drug resistance. All together, these findings suggest FYN and KDM4A as potential targets for combination therapy with kinase inhibitors in triple-negative breast cancer. Moreover, the study may also have important implications for other cancer types and other inhibitors, as the authors suggest that FYN could be a general feature of drug-tolerant persister cells.
Strengths:
(1) The authors used a large combination matrix of druggable tyrosine kinase gene knockouts, enabling studying of co-dependence of kinase genes. This approach mitigates off-target effects typically associated with kinase inhibitors, enhancing the precision of the findings.
(2) The authors demonstrate the importance of FYN in drug resistance in multiple ways. They demonstrate synergistic interactions using both knockouts and inhibitors, while also revealing its transcriptional upregulation upon treatment, strengthening the conclusion that FYN plays a role in the resistance.
(3) The study extends its impact by demonstrating the potent in vivo efficacy of certain combination treatments, underscoring the clinical relevance of the identified strategies.
Weaknesses:
(1) The methods and figure legends are incomplete, posing a barrier to the reproducibility of the study and hindering a comprehensive understanding and accurate interpretation of the results.
We thank the reviewer for pointing this out. We have added as much detail as possible to the Methods and figure legends to maximize reproducibility and support accurate interpretation of our results, as described in our responses to the Recommendations for the Authors.
(2) The authors make use of a large quantity of public data (Fig. 2D/E, Fig. 3F/L/M, Fig 4C, Fig 5B/H/I), whereas it would have strengthened the paper to perform these experiments themselves. While some of this data would be hard to generate (e.g. patient data) other data could have been generated by the authors. The disadvantage of the use of public data is that it merely comprises associations, but does not have causal/functional results (e.g. FYN inhibition in the different cancer models with various drugs). Moreover, by cherry-picking the data from public sources, the context of these sources is not clear to the reader, and thus harder to interpret correctly. For example, it is not directly clear whether the upregulation of FYN in these models is a very selective event or whether it is part of a very large epigenetic re-programming, where other genes may be more critical. While some of the used data are from well-known curated databases, others are from individual papers that the reader should assess critically in order to interpret the data. Sometimes the public data was redundant, as the authors did do the experiments themselves (e.g. lung cancer drug-tolerant persisters), in this case, the public data could also be left out.
More importantly, the original sources are not properly cited. While the GEO accession numbers are shown in a supplementary table, the articles corresponding to this data should be cited in the main text, and preferably also in the figure legend, to clarify that this data is from public sources, which is now not always the case (e.g. line 224-226). If these original papers do already mention the upregulation of FYN, and the findings from the authors are thus not original, these findings should be discussed in the Discussion section instead of shown in the Results.
We appreciate the reviewer’s concern. As the reviewer pointed out, our analysis of FYN expression levels across multiple studies of drug-tolerant cells may reflect association rather than a causal relationship. We have at least shown that FYN inhibition may reduce drug tolerance in TNBC cells and in EGFR inhibitor-treated lung cancer cells (figures 2H, 5E). The causal role of FYN in the emergence of drug tolerance in other cancers treated with different drugs (such as irinotecan-treated colon adenocarcinoma and gemcitabine-treated pancreatic adenocarcinoma) may be beyond the scope of this study. We added a brief discussion addressing this concern in lines 273-275.
We also added proper citations for the public data used in this study to the main text and figure legends (lines 267-269). The GEO accession numbers are listed in supplementary table S2. Importantly, none of the referenced studies identified FYN as a key factor in generating drug-tolerant cells.
(3) The claim in the abstract (and discussion) that the study "highlights FYN as broadly applicable mediator of therapy resistance and persistence", is not sufficiently supported by the results. The current study only shows functional evidence for this for an EGFR, IGF1R, and Abl inhibitor in TNBC cells. Further, it demonstrates (to a limited extent) the role of FYN in gefitinib and osimertinib resistance (also EGFR inhibitors) in lung cancer cells. Thus, the causal evidence provided is only limited to a select subset of tyrosine kinase inhibitors in two cancer types. While the authors show associations between FYN and drug resistance in other cancer types and after other treatments, these associations are not solid evidence for a causal connection as mentioned in this statement. Epigenetic reprogramming causing drug resistance can be accompanied by altered gene expression of many genes, and the upregulation of FYN may be a consequence, but not a cause of the drug resistance. Therefore, the authors should be more cautious in making such statements about the broad applicability of FYN as a mediator of therapy resistance.
We fully agree with the reviewer’s concern that FYN upregulation is an association and may not be the cause of drug tolerance and resistance. Therefore, to convey our findings accurately, we edited lines 34-36 of the abstract to “FYN expression is associated with therapy resistance and persistence by demonstrating its upregulation in various experimental models of drug-tolerant persisters and residual disease following targeted therapy, chemotherapy, and radiotherapy” and lines 288-290 of the discussion to “Upregulation of FYN is a general feature of drug tolerant cancer cells, suggesting the association of FYN expression with drug resistance and tumor recurrence after treatment.” We hope this satisfies the reviewer.
(4) The rationale for picking and validating FYN as the main candidate gene over other genes such as FGFR2, FRK2, and TEK is not clear.
a. While gene pairs containing FGFR2 knockouts seemed to be equally effective as FYN gene pairs in the primary screening, these could not be validated in the validation experiment. It is unclear whether multiple individual or a pool of gRNAs were used for this validation, or whether only 1 gRNA sequence was picked per gene for this validation. If only 1 gRNA per gene was used, this likely would have resulted in variable knockout efficiencies. Moreover, the T7 endonuclease assay may not have been the best method to check knockout efficiency, as it only implies endonuclease activity around a gene (but not to the extent of indels that can cause frameshifts, such as by TIDE analysis, or extent of reduction in protein levels by western blot).
b. Moreover, FRK2 and TEK, also demonstrated many synergistic gene pairs in the primary screen. However, many of these gene pairs were not included in the validation screening. The selection criteria of candidate gene pairs for validation screening is not clear. Still, TEK-ABL2 was also validated as a strong hit in the validation screen. The authors should better explain the choice of FYN over other hits, and/or mention that TEK and FRK2 may also be important targets for combination treatment that can be further elucidated.
We thank the reviewer for helping us improve our manuscript. We had concerns about the generalizability of FGFR2, FRK, and TEK in TNBC, because their expression is very low in MDA-MB-231 cells and they are not enriched in TNBC compared with cancer cell lines of other subtypes. We added a brief comment on this concern to the Results and Discussion sections (lines 150-154, figure S3). We acknowledge that the validation in figure 2B relies on only one guide RNA; however, together with the validation by pharmacological inhibition of FYN (figure 2F-I), we hope the reader and reviewer will be convinced of our key findings on synthetic lethality between FYN and other tyrosine kinases.
(5) On several occasions, the right controls (individual treatments, performed in parallel) are not included in the figures. The authors should include the responses to each of the single treatments, and/or better explain the normalization that might explain why the controls are not shown.
a. Figure 2G: The effect of PP2 treatment, without combined treatment, is not shown.
b. Figure 2H/3G: The effect of the knockouts on growth alone, compared to sgGFP, is not demonstrated. It is unclear whether the viability of knockouts is normalized to sgGFP, or to each untreated knockout.
c. Figure 2L: The effect of SB203580 as a single treatment is not shown.
We thank the reviewer for pointing this out. The data shown in all figures listed in these concerns were normalized to the change in viability caused by the pharmacological or genetic perturbation that synergized with the TKIs (NVP-ADW742, gefitinib, etc.) used in the figures of the original manuscript. As the reviewer suggested, we added the effects of SB203580 and PP2 treatment on cell viability in supplementary figures S4A and S4K. SB203580 had no significant effect on cell viability, whereas PP2 treatment caused a significant decrease in cell viability, which is expected because PP2 can inhibit multiple Src family kinases. Regardless of the single-agent effects of SB203580 and PP2 on cell viability, it is evident that TKI treatment synergistically decreased cell viability in cancer cell lines. The change in viability caused by FYN or histone lysine demethylase knockout is also provided in the newly added figures S4D and S6C. Notably, genetic ablation of FYN or histone lysine demethylases had modest, if any, influence on cell viability.
(6) The study examines the effects at a single, relatively late time point after treatment with inhibitors, without confirming the sequential impact on KDM4A and FYN. The proposed sequence of transcriptional upregulation of KDM4A followed by epigenetic modifications leading to FYN upregulation would be more compellingly supported by demonstrating a consecutive, rather than simultaneous, occurrence of these events. Furthermore, the protein level assessment at 48 hours (for RNA levels not clearly described), raises concerns about potential confounding factors. At this late time point, reduced cell viability due to the combination treatment could contribute to observed effects such as altered FYN expression and P38 MAPK phosphorylation, making it challenging to attribute these changes solely to the specific and selective reduction of FYN expression by KDM4A.
We thank the reviewer for pointing this out. We performed a time-course experiment of NVP-ADW742 treatment in MDA-MB-231 cells, shown in the newly added figure 3E. Surprisingly, NVP-ADW742 treatment increased the KDM4A protein level within two hours. FYN protein accumulation followed KDM4A accumulation after 24 hours. This observation, together with our chromatin immunoprecipitation data in figure 3O, provides evidence that FYN accumulation is a consequence of KDM4A accumulation and H3K9me3 demethylation upon TKI treatment. We discuss these data in the Results and Discussion sections in lines 214-216.
(7) The cut-off for considering interactions "synergistic" is quite low. The manual of the "SynergyFinder" tool itself recommends treating values above 10 as synergistic and values between -10 and 10 as additive (https://synergyfinder.fimm.fi/synergy/synfin_docs/). Here, values between 5 and 10 are also considered synergistic. Caution should be taken when discussing those results. Showing the actual dose response (including responses to each single treatment) may be required to enable the reader to critically assess the synergy, along with its standard deviation.
We thank the reviewer for the careful comments. We reanalyzed our data with the SynergyFinder Plus tool (Zheng, Genomics, Proteomics, and Bioinformatics 2022), which implements mathematical models distinct from those of SynergyFinder 3, for a more faithful implementation of the Bliss independence and Loewe additivity models and, more critically, calculates the statistical significance of the synergy. We provide updated synergy plots with statistics in figures 2F, 3J, and S4B. All drug combinations show statistically significant synergy (p<0.01). We have also added the raw data used to calculate synergy in figures 2F, 3J, and S4B as supplementary dataset S2.
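For readers less familiar with the Bliss independence model named above, the following is a minimal sketch of how an excess-over-Bliss value can be computed from fractional inhibition. It is not the SynergyFinder Plus implementation, it does not compute statistical significance, and the function name `bliss_excess` and the example numbers are hypothetical.

```python
def bliss_excess(inhib_a, inhib_b, inhib_ab):
    """Observed combination inhibition minus the Bliss-expected inhibition.

    All inputs are fractional inhibitions in [0, 1]
    (0 = no effect, 1 = complete loss of viability).
    A positive result means the combination did better than expected
    for two independently acting drugs.
    """
    for value in (inhib_a, inhib_b, inhib_ab):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inhibition values must lie in [0, 1]")
    # Bliss independence: expected effect of two independently acting agents.
    expected = inhib_a + inhib_b - inhib_a * inhib_b
    return inhib_ab - expected


# Hypothetical numbers: drug A alone inhibits 20%, drug B alone 30%,
# and the combination inhibits 60%.
excess = bliss_excess(0.20, 0.30, 0.60)
print(round(excess, 2))
```

A value near zero corresponds to an additive combination under this model, which is why the single-agent responses are needed to judge any claimed synergy.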
(8) As the effect size on Western blots is quite limited and sometimes accompanied by differences in loading control, these data should be further supported by quantifications of signal intensities of at least 3 biological replicates (e.g. especially Figure 3A/5A). The figure legends should also state how many independent experiments the blots are representative of.
We added quantifications for figures 3A and 5A to better depict our results. Figure legends were edited to indicate that the blots are representative of three independent experiments.
(9) While the article provides mechanistic insights into the likely upregulation of FYN by KDM4A, this constitutes only a fragment of the broader mechanism underlying drug resistance associated with FYN. The study falls short in investigating the causes of KDM4A upregulation and fails to explore the downstream effects (except for p38 MAPK phosphorylation, which may not be complete) of FYN upregulation that could potentially drive sustained cell proliferation and survival. These omissions limit the comprehensive understanding of the complete molecular pathway, and the discussion section does not address potential implications or pathways beyond the identified KDM4A-FYN axis. A more thorough exploration of these aspects would enhance the study's contribution to the field.
We appreciate the reviewer’s careful comment. We agree that our delineation of the mechanisms underlying TKI resistance in TNBC involving KDM4 and FYN is far from complete. Increases in the expression of histone demethylases have been observed in cancers treated with different drugs. The mechanisms governing the increase in histone demethylase expression are not known and are beyond the scope of this paper. We added this point to the Discussion section in lines 299-304.
(10) FYN has been implied in drug resistance previously, and other mechanisms of its upregulation, as well as downstream consequences, have been described previously. These were not evaluated in this paper, and are also not discussed in the discussion section. Moreover, the authors did not investigate whether any of the many other mechanisms of drug resistance to EGFR, IGF1R, and Abl inhibitors that have been described, could be related to FYN as well. A more comprehensive examination of existing literature and consideration of alternative or parallel mechanisms in the discussion would enhance the paper's contribution to understanding FYN's involvement in drug resistance.
FYN has been implicated in TKI resistance in CML cell lines (Irwin, Oncotarget, 2015). In that study, FYN was similarly transcriptionally upregulated in imatinib-resistant CML, and this upregulation depended on the EGR1 transcription factor. To address this concern, we generated EGR1 KO MDA-MB-231 cells and tested whether these cells retain the ability to accumulate FYN. Consistent with the previous study, imatinib treatment increased the EGR1 protein level. However, EGR1 knockout did not influence FYN accumulation in MDA-MB-231 cells. EGR1-mediated accumulation of FYN may be a context-specific phenomenon in CML (Figure S5B). We discuss this result in the Results section in lines 187-190. We also acknowledge that SRC family kinases are generally involved in drug resistance in many cancers. We discuss recent findings regarding SRC family kinases in drug resistance in the Results section in lines 145-147 and in the Discussion section in lines 315-317.
Reviewer #2 (Public Review):
Summary:
Kim et al. conducted a study in which they selected 76 tyrosine kinases and performed CRISPR/Cas9 combinatorial screening to target 3003 genes in Triple-negative breast cancer (TNBC) cells. Their investigation revealed a significant correlation between the FYN gene and the proliferation and death of breast cancer cells. The authors demonstrated that depleting FYN and using FYN inhibitors, in combination with TKIs, synergistically suppressed the growth of breast cancer tumor cells. They observed that TKIs upregulate the levels of FYN and the histone demethylase family, particularly KDM4, promoting FYN expression. The authors further showed that KDM4 weakens the H3K9me3 mark in the FYN enhancer region, and the inhibitor QC6352 effectively inhibits this process, leading to a synergistic induction of apoptosis in breast cancer cells along with TKIs. Additionally, the authors discovered that FYN is upregulated in various drug-resistant cancer cells, and inhibitors targeting FYN, such as PP2, sensitize drug-resistant cells to EGFR inhibitors.
Strengths:
This study provides new insights into the roles and mechanisms of FYN and KDM4 in tumor cell resistance.
Weaknesses:
It is important to note that previous studies have also implicated FYN as a potential key factor in drug resistance of tumor cells, including breast cancer cells. While the current study is comprehensive and provides a rich dataset, certain experiments could be refined, and the logical structure could be more rigorous. For instance, the rationale behind selecting FYN, KDM4, and KDM4A as the focus of the study could be more thoroughly justified.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
(1) The methods and figure legends are incomplete, posing a barrier to the reproducibility of the study and hindering a comprehensive understanding and accurate interpretation of the results. A critical revision of these aspects is needed, for example:
a. Catalogue numbers of certain products critical to reproduce the study (e.g. antibodies) and/or at what company they have been purchased (e.g. used compounds)
b. On several occasions the used concentrations of drugs or exposure time are not mentioned (e.g. Figure 2H, G (PP2), I, J, K, L, etc.)
c. Figure legend of figure panels E-I in Figure 5 seems to be completely incorrect and not consistent with the figure axis etc.
d. RT-qPCR methodology is not described in Methods.
e. Western blot methods are very limited: these should be described in more detail or cite an article that does.
f. Organoid culture: Information about the source of tumour cells (e.g. pre-treatment biopsy, material after surgery), isolation of tumor cells (e.g. methodology, characterization of material) and culture conditions (e.g. culture time before the experiment) is lacking.
g. Information about how gefitinib/osimertinib-resistant PC9 and HCC827 cells are generated (as well as culture conditions and where they are from) is missing.
We thank the reviewer for pointing these out. We have done our best to add the experimental details needed for reproducibility to the Methods section and figure legends in lines 343-348, 408-426, 431-432, 439-453, 648-650, 671-672, and 691-693.
(2) Figure 1B/C/D: it would be more meaningful if the most important hits (at least in one of these panels) were highlighted (e.g. line with gene-pair named), or visualized separately, so that the reader does not have to read the supplementary table to know what the most important hits were.
We thank the reviewer for this careful suggestion. We added labels for key synergistic gene pairs in figure 1D, as the reviewer suggested.
(3) qPCR data shown in Figure S4 is from 1 independent experiment. As these experiments (especially qPCR) can be rather variable and the effect size is not very large, I would highly recommend repeating these experiments, or excluding them, as conclusions from them are not solid.
We reasoned that performing qPCR with the many drugs that did not cause substantial synergistic cell death with NVP-ADW742 in figure S5C (figure S4A in the previous version of the manuscript) would not provide much additional insight. In addition, because we were more interested in finding direct regulators of FYN expression, we focused on drugs that inhibit epigenetic regulators that activate transcription. We therefore performed FYN qPCR with drug combinations involving GSK-J4 (KDM6 inhibitor) and pinometostat (DOT1L inhibitor). As shown in the newly added figure S5D, GSK-J4 inhibited FYN expression, whereas pinometostat failed to do so. We also confirmed that knockout of KDM5 or KDM6 reproducibly failed to decrease FYN expression upon TKI treatment (figures S5E and S5G). The new results are discussed in lines 193-198. We hope these additions satisfy the reviewer.
(4) For validation of synergistic knockouts, it would be helpful for the interpretation to also show the viability/growth of each knockout (or treatment), instead of mostly normalized scores. For example, the reader now has no insight into whether FYN knockout itself already affects cell viability, or not. If it (or EGFR/IGF1R/ABL knockout) would already substantially affect cell viability, a further reduction in cell viability may not be as relevant as when it would not affect cell viability at all.
We thank the reviewer for pointing this out. We revised figure 2A and now show the raw changes in cell viability for each single and double knockout in figure S2A. We hope this satisfies the reviewer.
(5) The curve fitting as in Figure 2G is somewhat misleading. While the curve seems to be forced to go from 1-0, the +PP2 dose-response curve does actually not seem to start at 1, but rather at 0.8, likely resulting from the effect of PP2 as a single treatment, thus, effects may be interpreted as more synergistic than that they truly are.
The results shown in figure 2G are normalized to cells treated or not treated with PP2, to better reflect the effect of NVP-ADW742, gefitinib, and imatinib in the presence of PP2. The viability value starting at 0.8 is therefore not due to the effect of PP2 as a single agent (because the data are normalized to PP2-treated cells); rather, it arises because a very small dose of the TKI, particularly NVP-ADW742, caused a modest decrease in viability. To depict our findings more accurately, we added a data point to figure 2G at a TKI dose of 0 µM with viability 1. We also added details of the viability normalization to the figure legends.
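The normalization described above can be illustrated with a minimal sketch, assuming each TKI dose series is divided by the signal of its own 0 µM TKI control (measured with or without PP2, matching the series). The function name `normalize_to_baseline` and the raw signal values are hypothetical and are not taken from the manuscript.

```python
def normalize_to_baseline(raw_signals, baseline_signal):
    """Divide each raw viability signal by the matching 0 µM TKI control.

    raw_signals: readings across increasing TKI doses for one condition
                 (for example, all wells containing PP2).
    baseline_signal: reading of the same condition at 0 µM TKI.
    The baseline itself therefore maps to a viability of 1.
    """
    if baseline_signal <= 0:
        raise ValueError("baseline signal must be positive")
    return [signal / baseline_signal for signal in raw_signals]


# Hypothetical readings for PP2-treated wells at 0, low, and high TKI dose.
# PP2 alone lowers the raw signal, but that effect cancels out because
# the series is normalized to the PP2-only well.
with_pp2 = normalize_to_baseline([800.0, 640.0, 400.0], baseline_signal=800.0)
print(with_pp2)
```

Under this scheme, a curve that drops to 0.8 at the lowest nonzero dose reflects the TKI effect at that dose, not the single-agent effect of the second compound.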
(6) The readability of the paper could be enhanced by higher-quality images (now the text is quite pixelated).
We had technical difficulties in converting file types. We have replaced all main and supplementary figures with higher-resolution versions.
(7) The discussion now contains one paragraph about the selectivity of kinase inhibitors, and that repurposing of inhibitors with more relaxed specificity or multi-kinase inhibitors can be beneficial. This does not seem to fall within the scope of the study, as there was no comparison between selective and non-selective inhibitors. It was also not clearly mentioned that the non-selective inhibitors worked better than the gene knockouts, or that for example, KDM3 and KDM4 knockout together worked better than only KDM4 knockout. It is recommended to either remove this paragraph, or rephrase it so that it better fits the actual results
We agree with the reviewer. We chose to remove this paragraph in lines 308-313.
(8) The entire paper does not discuss any known functions of FYN. Its function could be very briefly introduced in the results section when highlighting it as an important hit. More importantly, its known role in cancer and especially drug resistance should be discussed in the discussion (see also Public review).
We thank the reviewer for pointing this out. We added a brief description of the role of FYN in cancer malignancy and drug resistance in lines 145-147. In particular, FYN accumulation driven by the EGR1 transcription factor has been described in the context of imatinib-resistant chronic myeloid leukemia (Irwin, Oncotarget, 2015). To address this, we tested whether EGR1 knockout decreases the FYN level in MDA-MB-231 cells (Figure S5A). Notably, EGR1 knockout failed to decrease the FYN protein level. This result is discussed in lines 187-190.
(9) Textual changes including:
a. Line 29 (and others) "Massively parallel combinatorial CRISPR screens": I would rather choose a more descriptive term, such as "combinatorial tyrosine kinase knockout CRISPR screen", which already clarifies the screen used knockouts of (druggable) tyrosine kinases only. Using both "Parallel" and "combinatorial" is somewhat redundant, and "massively" is subjective, in my opinion.
Manuscript edited as suggested (lines 29, 63, 86, 283). The term “massively parallel” has been removed, as it does not affect our scientific findings.
b. Line 67 (and others): "to identify ... for elimination of TNBC": while this may be its potential implication, this study has identified genes in (mostly) TNBC cell lines and cell line xenografts. Please rephrase to something more within the scope of this research.
Manuscript edited as suggested (lines 68-69) as “we utilize CombiGEM-CRISPR technology to identify tyrosine kinase inhibitor combinations with synergistic effect in TNBC cell line and xenograft models for potential combinatorial therapy against TNBC.” We hope it satisfies the reviewer.
c. Line 31 (and others): Please check the capitals of words describing inhibitors, and make them consistent (e.g. Imatinib written with capital I, other inhibitors without capitals).
We thank the reviewer for catching this error. We changed all “imatinib” and “osimertinib” to lowercase.
d. Line 71: "... combining PP2, saracatinib (FYN inhibitor), ..": Here it is not clear that PP2 is a FYN inhibitor, and, as saracatinib is a well-known Src inhibitor, it is not correct to just say "FYN inhibitor". Better to rephrase to something such as: "combining PP2 (Lck/Fyn inhibitor), saracatinib (Src/FYN inhibitor)".
As the reviewer noted, most Src family kinase inhibitors are not selective for a specific member of the Src family. Therefore, we changed line 73 to “PP2, saracatinib (Src family kinase / FYN inhibitor).”
e. Line 81: "The resulting library enabled massively parallel screens of pairwise knockouts, .." To clarify this is for the selected kinases only: "The resulting library enabled screens of pairwise knockouts of the 76 tyrosine kinase genes, .."
Manuscript edited as suggested by the reviewer in line 86.
f. Line 88 (and others): "after infection" consider rephrasing to "after transduction" as this is more commonly used when using lentiviral vectors only.
We thank the reviewer for this. Every instance of “infection” that refers to lentiviral transduction was changed to “transduction”.
g. Line 97-99: While being described as "good" correlation, a correlation of the same sgRNA pair, yet in a different order, of r=0.5 does not seem to be very good, neither does a correlation of r=0.74 for biological replicates. Please consider describing in a less subjective way.
We removed the subjective terms and changed the manuscript as follows: “sgRNA pair (e.g., sgRNA-A + sgRNA-B and sgRNA-B + sgRNA-A) were positively correlated (r = 0.50) and were combined when calculating Z (Fig. S1D). The Z scores for three biological replicates were also correlated with r = 0.74 between replicates #2 and #3 (Fig. S1E).” in lines 97-101.
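The comparison described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names are hypothetical, and averaging the two orientations is an assumption, since the response only states that the orientations "were combined" without giving the exact rule.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def combine_orientations(z_by_ordered_pair):
    """Merge A+B and B+A into one unordered pair.

    ASSUMPTION: the two orientations are averaged; the source does not
    specify how they were combined.
    """
    grouped = {}
    for (first, second), z in z_by_ordered_pair.items():
        grouped.setdefault(frozenset((first, second)), []).append(z)
    return {pair: sum(values) / len(values) for pair, values in grouped.items()}
```

Keying the combined scores on a `frozenset` makes the pair order irrelevant, which is the property the orientation check is meant to justify.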
h. Lines 92-96 and lines 102-115: The results section here contains quite a lot of technical information. While some information may be directly needed to understand the described results (such as a very short and simple explanation of how to interpret gene interaction score), other information may be more appropriate for the Methods section, to enhance the readability of the paper. Consider simplifying here and giving a more detailed overview in the Methods section. Also, the text is not entirely clear. You seem to give two separate explanations of how the GI scores were calculated (Starting in lines 106 and 111): please rephrase and clearly indicate the connections between those two explanations (in the Methods section).
We thank the reviewer for the valuable suggestion. We moved significant portions of the technical descriptions to the Methods section. We also clarified the text regarding the procedures for calculating GI scores in lines 385-387.
i. Line 142: "These findings suggest that gene A could represent an attractive drug target.." "Gene A" should be "FYN"?
We thank the reviewer for catching this. Indeed, it is “FYN” and we changed it in line 154.
j. Line 149: Introduce Saracatinib, and make the reader aware that it actually mostly targets Src, and FYN with lower affinity.
We added text in lines 73 and 164 to indicate that saracatinib is an inhibitor of Src family kinases.
k. Line 469: "by the two sgRNA." "by the two sgRNAs".
Corrected
l. Throughout text/figures/figure legends, please check for consistency in the naming of cell lines, compounds, referring to figures etc. (E.g. MDA-MB-231/MDA MB 231/MDAMB-231 ; Fig. 1/Figure 1).
Corrected. Thank you for catching this error.
m. In Methods, frequently ug or uL are used instead of µg or µL
Corrected.
n. Legend Figure 5: Clarify what A, G, I, D, and P mean.
Corrected in lines 685-686 to: “A: NVP-ADW742, G: gefitinib, I: imatinib, D: doxorubicin, P: Paclitaxel.”
o. Line 303: What is meant by: "The six variable nucleotides were added in reverse primer for multiplexing". Could you clarify this in the text?
We apologize for the confusion. The six nucleotides are an index sequence for multiplexed NGS runs. The text in lines 373-374 has been edited to: “The six nucleotides described as “NNNNNN” in reverse primer above represents unique index to identify biological replicates in multiplexed NGS run.”
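How a six-nucleotide index separates replicates in a pooled run can be sketched as below. This is an illustrative sketch only: the function name, the input layout (index already extracted from each read), and any index sequences passed in are hypothetical and are not taken from the manuscript.

```python
def demultiplex(reads, index_to_replicate):
    """Sort reads into per-replicate bins by their 6-nt index.

    reads: iterable of (index_seq, insert_seq) tuples. ASSUMPTION: the
    index has already been extracted from each read.
    index_to_replicate: dict mapping uppercase 6-nt index -> replicate name.
    Returns (bins, unassigned).
    """
    bins = {name: [] for name in index_to_replicate.values()}
    unassigned = []
    for index_seq, insert_seq in reads:
        if len(index_seq) != 6:
            raise ValueError("index must be exactly six nucleotides")
        name = index_to_replicate.get(index_seq.upper())
        if name is None:
            # Index not in the sample sheet: keep the read aside.
            unassigned.append(insert_seq)
        else:
            bins[name].append(insert_seq)
    return bins, unassigned
```

Reads whose index matches no replicate are returned separately rather than silently dropped, so the fraction of unassigned reads can be inspected.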
Reviewer #2 (Recommendations For The Authors):
To enhance the robustness of the conclusions drawn from this study, certain concerns merit attention.
Concerns:
(1) Line 130 indicates that eight synergistic target gene combinations were validated. It would be helpful to clarify the criteria used to select these gene pairs and provide the rationale for studying these specific combinations of genes.
In fact, we selected the gene pairs for which sgRNAs were available when we performed the experiments, so we do not have a strong rationale for our selections. Instead, we added a brief discussion in lines 304-306 noting that further validation is required for the gene pairs not experimentally tested.
(2) According to Figure 2C, FYN was identified as crucial among the 30 gene pairs, and its upregulation in TNBC prompted further investigation. It would be informative to discuss the expression levels of TEK, FRK, and FGFR2 in TNBC and explain why these nodes were not studied. Is there existing evidence demonstrating the superiority of FYN over these other genes?
A similar concern was raised by reviewer #1. The expression levels of TEK, FRK and FGFR2 were relatively low in MDA-MB-231 and in TNBCs in general, and we were concerned about the generalizability of these targets for treating TNBC. While validating these genes for possible synthetic lethality may yield valuable insight, this may be beyond the scope of this paper. This concern is now discussed in the Results and Discussion sections in lines 150-154.
(3) The screening process employed only one cell line, and validation was conducted with only one cell line (Figure 2A). Consider supplementing the findings with more convincing evidence from other breast cancer cell lines to strengthen the conclusions.
Although the CRISPR screens and primary validations were done with only one cell line, further validations with drug combinations were done in independent cancer cell lines such as Hs578T (figures S4E-J). In addition, the possible association between FYN expression and drug-tolerant cells was demonstrated in lung cancer cells. We hope this satisfies the reviewer.
(4) The network analysis in Figure 2C lacks a description of the methodology used. It would be beneficial to provide a brief explanation of the methods employed for this analysis.
The network analysis was done manually, with the size of each node proportional to the number of gene pairs. We added text to the figure legend in line 638 to clarify this.
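The sizing rule can be expressed in a few lines. This is a sketch under stated assumptions: the network in the study was drawn manually, so the function below is hypothetical, as is the `scale` parameter; it only shows what "size proportional to the number of gene pairs" means computationally.

```python
from collections import Counter


def node_sizes(gene_pairs, scale=1.0):
    """Return a node size for each gene, proportional to the number of
    gene pairs in which that gene appears.

    gene_pairs: iterable of (gene_a, gene_b) tuples.
    scale: hypothetical drawing constant (size units per pair).
    """
    counts = Counter()
    for gene_a, gene_b in gene_pairs:
        counts[gene_a] += 1
        counts[gene_b] += 1
    return {gene: scale * n for gene, n in counts.items()}
```

A gene that participates in many synergistic pairs therefore appears as a larger hub, which is how a frequently recurring partner stands out in such a diagram.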
(5) The significance of gene A mentioned in line 142 is unclear. Please provide a clear explanation or context for the importance of this gene.
This mistake was also pointed out by reviewer #1. “Gene A” should have been “FYN”. We corrected this in line 154.
(6) In Figure 2J and Figure 2K, it would be more informative to measure the phosphorylation levels of FYN and SRC rather than just their baseline levels. Consider revising the figures accordingly.
We thank the reviewer for the careful comment. We now provide supplementary figure S5A to show that the phosphorylation level of FYN is increased; however, this increase was proportional to the increase in FYN protein level, so the pFYN/FYN ratio did not change significantly. We discuss this result in lines 187-190.
(7) Figure S4B lacks biological replicates, which could impact the reliability of the experimental results. Consider adding biological replicates to enhance the robustness of the findings.
This was also pointed out by reviewer #1. Instead of performing qPCR for all drugs, we focused on validating the decrease in FYN mRNA level for drug combinations that synergistically kill cancer cells. We also aimed to identify a direct mediator of FYN mRNA upregulation, so we focused on drug combinations that involve an inhibitor of an epigenetic regulator that promotes transcription. To this end, we tested the impact of GSK-J4 (KDM6 inhibitor) and pinometostat (DOT1L inhibitor), each in combination with a TKI, on FYN expression level. Notably, while GSK-J4 attenuated the FYN mRNA accumulation induced by NVP-ADW742 treatment, pinometostat failed to do so (figure S5C). We describe these results in lines 192-197 of the Results section.
(8) Line 186 indicates that KDM3 knockout was not tested in Figure S5A. It would be helpful to provide an explanation for this omission or consider including the data if available.
We thank the reviewer for pointing this out. The T7 endonuclease assay results for KDM3, KDM4 and PHF8 are added in figure S6B. All guide RNAs used in the study efficiently generated indel mutations.
(9) In line 206, KDM4A is introduced, but Figures 3J and 3M had already pointed to KDM4A. The authors did not analyze the ChIP results for other members of the KDM4 family at this point. Please address this inconsistency and provide a rationale for focusing on KDM4A. Additionally, in Figure 3M, consider adding peak labeling to the enriched portion for clarity.
We appreciate the reviewer’s careful comment. KDM4 family enzymes perform catalytically identical reactions and are thought to be redundant. Therefore, we judged that the most abundantly expressed gene in the KDM4 family should be the primary target. To this end, we analyzed the expression levels of KDM4 family genes in supplementary figure S6A. Indeed, KDM4A expression was the highest among the KDM4 family genes. We discuss this in the Results section in lines 218-220.
(10) The author only indicated the relationship between the H3K9me3 level in the enhancer region and FYN expression. It would be valuable to verify the activity of the enhancers and investigate additional markers such as H3K27ac and H3K4me1. Consider discussing these aspects to provide a more comprehensive understanding.
Since we and others had shown that histone demethylases are increased upon drug treatment, we focused on histone methylation marks that are associated with gene repression and whose removal by demethylases is associated with drug resistance. In this context, the KDM6 demethylases, which remove H3K27me3, may serve as an attractive alternative. In our newly added supplementary figure S6E, ADW742 treatment did not decrease the H3K27me3 level at the FYN promoter, indicating that H3K9me3 may be the dominant epigenetic change that modulates FYN expression upon drug treatment. This is briefly discussed in lines 233-235.
(11) In Figure 4A, the addition of the drug alone does not inhibit tumor growth. Please provide an explanation for this result and consider discussing potential reasons for the observed lack of inhibition.
The drug dose was carefully adjusted to minimize tumor shrinkage by each single drug so that the synergistic tumor shrinkage would be clearer.
(12) Line 208 indicates missing parentheses in the text describing Figure 4C. Please correct the text accordingly to ensure clarity.
Corrected. Thank you for catching this error.
(13) The figure legends for Figures 5E, F, G, and H contain errors. Please correct the figure legends to accurately describe the respective figures.
We thank the reviewer for catching this error. We have changed the figure legends in lines 691-697 to accurately describe the figures.
(14) It may be beneficial for the authors to divide the results section into several subsections and add headings to improve the overall understanding of the findings.
This is an excellent suggestion. We divided our results section into subsections and added headings in lines 80, 141, 181, 237 and 251 to help readers understand our findings.
(15) The authors should include the sgRNA sequences used for gene targeting, along with details of the target genes and negative/positive controls, in the Supplementary Materials to enhance reproducibility and transparency.
This is a critical point for improving reproducibility of our work. The sgRNA sequences used in the study are newly added in supplementary table S3.
(16) The resolution of the figures in the Supplementary Materials is too low, which may impede the authors' ability to interpret the data. Consider providing higher-resolution figures for better readability.
Reviewer #1 raised a similar concern. We have provided higher-resolution images for all main and supplementary figures.
www.biorxiv.org
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Summary:
The authors constructed a novel HSV-based therapeutic vaccine to cure SIV in a primate model. The novel HSV vector is deleted for ICP34.5. Evidence is given that this protein blocks HIV reactivation by interference with the NF-kB pathway. The deleted construct supposedly would reactivate SIV from latency. The SIV genes carried by the vector ought to elicit a strong immune response. Together the HSV vector would elicit a shock and kill effect. This is tested in a primate model.
Thank you for your kind comments and suggestions, which are very helpful in improving our manuscript. We have carefully revised our manuscript and performed additional experiments accordingly, and we now think this version has been substantially improved for your reconsideration.
Strengths and weaknesses:
(1) Deleting ICP34.5 from the HSV construct has a very strong effect on HIV reactivation. Why is no eGFP readout given in Figure 1C as for WT HSV? The mechanism underlying increased activation by deleting ICP34.5 is only partially explored. Overexpression of ICP34.5 has a much smaller effect (reduction in reactivation) than deletion of ICP34.5 (strong activation); so the story seems incomplete.
Thank you for your careful review and kind reminder.
(1) We apologize for the confusion regarding Figure 1C. In the experiment shown in Figure 1C, we used an HSV-1 17 strain containing GFP (HSV-GFP) and HSV-ΔICP34.5 (a recombinant HSV-1 17 strain with ICP34.5 deleted, based on HSV-GFP) to reactivate the HIV latency cell line (J-Lat 10.6 cells). Because GFP detection cannot distinguish between HSV infection and HIV reactivation, we assessed reactivation by measuring the mRNA levels of the HIV LTR upon stimulation with either HSV-GFP or HSV-ΔICP34.5. In Figure 1B, we had already verified the reactivation efficacy by infecting J-Lat 10.6 cells with HSV-GFP and found significant upregulation of the mRNA levels of HIV-1 LTR, Tat, Gag, Vif, and Vpr. We have adjusted the corresponding descriptions accordingly in the revised manuscript.
(2) We agree with your insightful comment that the mechanism underlying the increased activation by HSV-ΔICP34.5 is worth exploring further in future studies. In this study, we found that ICP34.5 plays an antagonistic role in the reactivation of HIV latency by HSV-1, mainly through the modulation of the host NF-κB and HSF1 pathways, while HSV-1 (especially HSV-ΔICP34.5) might reactivate HIV latency through NF-κB, HSF1, and other yet-to-be-determined mechanisms. Thus, ICP34.5 overexpression has only a partial effect on reducing the reactivation of HIV latency by HSV-1. We have addressed this issue in the revised Discussion section: “Intriguingly, these findings collectively indicated that ICP34.5 might play an antagonistic role in the reactivation of HIV by HSV-1, and thus our modified HSV-ΔICP34.5 constructs can effectively reactivate HIV/SIV latency through the release of imprisonment from ICP34.5. However, ICP34.5 overexpression had only a partial effect on the reduction of the HIV latency reactivation, indicating that HSV-ΔICP34.5-based constructs can also reactivate HIV latency through other yet-to-be-determined mechanisms.” (Lines 334 to 340).
(2) No toxicity data are given for deleting ICP34.5. How specific is the effect for HIV reactivation? An RNA seq analysis is required to show the effect on cellular genes.
Thank you for your questions and suggestions.
(1) It is well known that ICP34.5 is a neurotoxicity factor that can antagonize host immune responses, and previous studies (in gene therapy and oncolytic virotherapy) have shown that the safety of recombinant HSV-based vectors can be improved by deleting ICP34.5. In this study, we also found that HSV-ΔICP34.5 exhibited lower virulence and replication ability than its parental strain (HSV-GFP) (Figure 1D, Figure S1). In addition, HSV-ΔICP34.5 induced a lower level of inflammatory cytokines (including IL-6, IL-1β, and TNF-α) in primary CD4+ T cells from PLWH compared to HSV-GFP stimulation, likely due to its lower virulence and replication ability (Figure 1I-K). Furthermore, the CD4+/CD8+ T cell ratio (Figure 5I) and body weight (Figure S9) after treatment were effectively ameliorated in the SIV-infected macaques of the ART+HSV-ΔICP34.5-sPD1-SIVgag/SIVenv group. Our data also demonstrated that there was no significant effect on the cell composition of peripheral blood in the SIV-infected macaques of the ART+HSV-sPD1-SIVgag/SIVenv group (Figure S10). Thus, these data suggest that the safety profile of HSV-ΔICP34.5 in PLWH might be tolerable. We have added the corresponding description in the revised manuscript.
(2) In our study, we found that neither adenovirus nor vaccinia virus could reactivate HIV latency (Figure S3). In addition, deletion of the ICP0 gene from HSV-1 diminished the reactivation of HIV latency by HSV-1 (Figure S4). Thus, these data suggest that the reactivation of HIV latency by HSV-1 might be virus-specific. This will need to be investigated further in future studies. We have added the corresponding description in the revised manuscript.
(3) To explore the mechanism of reactivating viral latency by HSV-ΔICP34.5-based constructs, we performed RNA-seq analysis (Figure S5). We have added the corresponding description accordingly in the revised manuscript.
(3) The primate groups are too small and the results too variable to make averages. In Figure 5, the group with ART and saline has two slow rebounders. It is not correct to average those with a single quick rebounder. Here the interpretation is NOT supported by the data.
We agree with you that this is a pilot study with a limited number of rhesus macaques. Although the number of macaques was relatively limited, these nine macaques were distributed evenly based on the background levels of age, sex, weight, CD4 count, and viral load (VL) (Table S2). All SIV-infected macaques used in this study had a long history of SIV infection and had received several courses of ART, which mimics the treatment of chronic HIV-1 infection in humans. These macaques had been infected with SIVmac239 for more than 5 years, and highly pathogenic SIV-infected macaques have been well validated as a stringent model to recapitulate HIV-1 pathogenesis and persistence during ART in humans. Indeed, in our Chinese rhesus model, ART effectively suppressed SIV infection to undetectable levels in plasma, and upon ART discontinuation, the virus rapidly rebounded, which is very similar to what is observed in ART-treated HIV patients. We think the results of this pilot study are very promising for further studies, which will expand the number of animals and then proceed to preclinical and clinical studies in our next projects. Thank you for your understanding.
As for your question regarding “the two animals with low VL and slow rebound”, our explanation is as follows. As mentioned above, these macaques were distributed evenly based on the background levels of CD4 count and VL (Table S2), and the groups then showed different changes in viral load and viral rebound. Thus, we think these data support our interpretation. Moreover, our conclusion is supported by at least three lines of evidence.
(1) The VL in the ART+saline group promptly rebounded after ART discontinuation, with an average 8.63-fold increase in the rebounded peak VL compared with the pre-ART VL (Figure 5A, D and E). However, plasma VL in the ART+HSV-sPD1-SIVgag/SIVenv group exhibited a delayed rebound interval (Figure 5B-D).
(2) There was a lower rebounded peak VL than pre-ART VL in the ART+HSV-sPD1-SIVgag/SIVenv group (average 12.20-fold decrease), while a higher rebounded peak VL than pre-ART VL in the ART+HSV-empty group (average 2.74-fold increase) (Figure 5E).
(3) We found significant suppression of total SIV DNA and integrated SIV DNA provirus in the ART+HSV-sPD1-SIVgag/SIVenv group. However, the copies of the SIV DNA provirus were significantly increased in the ART+HSV-empty group and ART+saline group (Figure 5F-G).
Thank you for your understanding.
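The fold changes quoted in points (1) and (2) above compare each animal's rebounded peak VL with its pre-ART VL. A minimal sketch of that comparison follows; the function name is hypothetical, and reporting a ratio below 1 as a reciprocal "fold decrease" is an assumption about the reporting convention, not a description of the authors' calculation. How per-animal values were averaged within a group is not stated in the response and is not modeled here.

```python
def fold_change(pre_art_vl, rebound_peak_vl):
    """Compare rebounded peak viral load with pre-ART viral load.

    Returns (direction, fold), where fold >= 1.
    ASSUMPTION: a ratio below 1 is reported as a reciprocal fold decrease.
    """
    if pre_art_vl <= 0 or rebound_peak_vl <= 0:
        raise ValueError("viral loads must be positive")
    if rebound_peak_vl > pre_art_vl:
        return "increase", rebound_peak_vl / pre_art_vl
    if rebound_peak_vl < pre_art_vl:
        return "decrease", pre_art_vl / rebound_peak_vl
    return "unchanged", 1.0
```

Because the measure is a ratio against each animal's own baseline, it is less sensitive to differences in absolute VL between animals than a comparison of raw peak values would be.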
Discussion
HSV vectors are mainly used in cancer treatment partially due to induced inflammation. Whether these are suitable to cure PLWH without major symptoms is a bit questionable to me and should at least be argued for.
Thank you for your kind comment and question. We confirmed the enhanced reactivation of HIV latency by HSV-∆ICP34.5 in primary CD4+ T cells from people living with HIV (PLWH) (Figure S2). As mentioned above, previous studies have shown that the safety of recombinant HSV-based vectors can be improved by deleting ICP34.5. In this study, we also found that HSV-ΔICP34.5 exhibited lower virulence and replication ability than its parental strain (HSV-GFP) (Figure 1D, Figure S1). In addition, HSV-ΔICP34.5 induced a lower level of inflammatory cytokines (including IL-6, IL-1β, and TNF-α) in primary CD4+ T cells from PLWH compared to HSV-GFP stimulation, likely due to its lower virulence and replication ability (Figure 1I-K). Furthermore, the CD4+/CD8+ T cell ratio (Figure 5I) and body weight (Figure S9) after treatment were effectively ameliorated in the SIV-infected macaques of the ART+HSV-ΔICP34.5-sPD1-SIVgag/SIVenv group. Our data also demonstrated that there was no significant effect on the cell composition of peripheral blood in the SIV-infected macaques of the ART+HSV-sPD1-SIVgag/SIVenv group (Figure S10). Thus, these data suggest that the safety profile of HSV-ΔICP34.5 in PLWH might be tolerable. We have added the corresponding description in the revised manuscript.
Reviewer #2 (Public Review):
Summary:
In this article, Wen et al. describe the development of a 'proof-of-concept' bi-functional vector based on HSV-deltaICP-34.5's ability to purge latent HIV-1 and SIV genomes from cells. They show that co-infection of latent J-lat T-cell lines with an HSV-deltaICP-34.5 vector can reactivate HIV-1 from a latent state. Over- or stable expression of ICP 34.5 ORF in these cells can arrest latent HIV-1 genomes from transcription, even in the presence of latency reversal agents. ICP34.5 can co-IP with- and de-phosphorylate IKKa/b to block its interaction with NF-kB transcription factor. Additionally, ICP34.5 can interact with HSF1 which was identified by mass-spec. Thus, the authors propose that the latency reversal effect of HSV-deltaICP-34.5 in co-infected JLat cells is due to modulatory effects on the IKKa/b-NF-kB and PP1-HSF-1 pathway.
Next, the authors cleverly construct a bifunctional HSV-based vector with deleted ICP34.5 and 47 ORFs to purge latency and avoid immunological refluxes, and additionally, expand the application of this construct as a vaccine by introducing SIV genes. They use this 'vaccine' in mouse models and show the expected SIV-immune responses. Experiments in rhesus macaques (RM), further elicit the potential for their approach to reactivate SIV genomes and at the same time block their replication by antibodies. What was interesting in the SIV experiments is that the dual-functional vector vaccine containing sPD1- and SIV Gag/Env ORFs effectively delayed SIV rebound in RMs and in some cases almost neutralized viral DNA copy detection in serum. Very promising indeed, however, there are some questions I wish the authors had explored to get answers to, detailed below.
Overall, this is an elegant and timely work demonstrating the feasibility of reducing virus rebound in animals, with the potential to expand to clinical studies. The work was well-written, and sections were clearly discussed.
Strengths:
The work is well designed, rationale explained, and written very clearly for lay readers.
Claims are adequately supported by evidence and well-designed experiments including controls.
Thank you for your nice comments regarding our work.
Weaknesses:
(1) While the mechanism of ICP34.5 interaction and modulation of the NF-kB and HSF1 pathways are shown, this only proves ICP34.5 interactions but does not give away the mechanism of how the HSV-deltaICP-34.5 vector purges HIV-1 latency. What other components of the vector are required for latency reversal? Perhaps serial deletion experiments of the other ORFs in the HSV-deltaICP-34.5 vector might be revealing.
Thank you for your valuable suggestion. We are currently exploring additional HSV-1 viral genes that might play a role in the reactivation of HIV latency. We have found that deletion of the ICP0 gene from HSV-1 diminished the reactivation of HIV latency by HSV-1 (Figure S4), suggesting that ICP0 might play a vital role in the reactivation. This will need to be investigated further in future studies. We have added the corresponding description in the revised manuscript.
(2) The efficacy of the HSV vaccine vectors was evaluated in Rhesus Macaque model animals. Animals were chronically infected with SIV (a parent of HIV), treated with ART, challenged with bi-functional HSV vaccine or controls, and discontinued treatment, and the resulting virus burden and immune responses were monitored. The animals showed SIV Gag and Env-specific immune responses, and delayed virus rebound (however rebound is still there), and below-detection viral DNA copies. What would make a more convincing argument to this reviewer will be data to demonstrate that after the bi-functional vaccine, the animals show overall reduction in the number of circulating latent cells. The feasibility of obtaining such a result is not clearly demonstrated.
Thank you for your valuable comment. We have now provided more data on this issue. We found significant suppression of total SIV DNA and integrated SIV DNA provirus in the ART+HSV-sPD1-SIVgag/SIVenv group. However, the copies of the SIV DNA provirus were significantly increased in the ART+HSV-empty group and ART+saline group (Figure 5F-G). We have added the corresponding description in the revised manuscript.
(3) The authors state that the reduced virus rebound detected following bi-functional vaccine delivery is due to latent genomes becoming activated and steady-state neutralization of these viruses by antibody response. This needs to be demonstrated. Perhaps cell-culture experiments from specimens taken from animals might help address this issue. In lab cultures one could create environments without antibody responses, under these conditions one would expect a higher level of viral loads to be released in response to the vaccine in question.
Thank you for your kind comment and suggestion. We performed the following cell experiment to address this issue. Primary CD4+ T cells from people living with HIV (PLWH) were isolated and then infected with HSV or HSV-∆ICP34.5 constructs. As expected, we confirmed the enhanced reactivation of HIV latency by HSV-∆ICP34.5 (Figure S2). Thank you.
(4) How do the authors imagine neutralizing HIV-1 envelope epitopes by a similar strategy? A discussion of this point may also help.
Thank you for your kind comment. We have added the corresponding discussion in the revised manuscript. “The current consensus on HIV/AIDS vaccines emphasizes the importance of simultaneously inducing broadly neutralizing antibodies and cellular immune responses. Therefore, we believe that incorporating the induction of broadly neutralizing antibodies into our future optimizing approaches may lead to better therapeutic outcomes.” (Lines 384 to 388)
(5) I thought the empty HSV-vector control also elicited somewhat delayed kinetics in virus rebound and neutralization, can the authors comment on why this is the case?
Thank you for your careful review and comment. We agree with you that the HSV-1 empty vector does exhibit a somewhat delayed rebound. We think the possible reason is as follows: although the empty HSV vector cannot elicit SIV-specific CTL responses, it effectively activates the latent SIV reservoirs, and these activated virions can then be partially killed by ART drugs. Therefore, even without carrying HIV/SIV antigens, somewhat delayed kinetics in virus rebound may be observed. Thank you.
Reviewer #1 (Recommendations For The Authors):
(1) The authors should provide toxicity data for HSV transduction after deleting ICP34.5 and provide an explanation of why overexpression of ICP34.5 has such a small effect.
Thank you for your questions and suggestions. As mentioned above, we now provide data on the safety of HSV-ΔICP34.5-based constructs.
(1) It is well known that ICP34.5 is a neurotoxicity factor that can antagonize host immune responses, and previous studies (in gene therapy and oncolytic virotherapy) have shown that the safety of recombinant HSV-based vectors can be improved by deleting ICP34.5. In this study, we also found that HSV-ΔICP34.5 exhibited lower virulence and replication ability than its parental strain (HSV-GFP) (Figure 1D, Figure S1). In addition, HSV-ΔICP34.5 induced a lower level of inflammatory cytokines (including IL-6, IL-1β, and TNF-α) in primary CD4+ T cells from PLWH compared to HSV-GFP stimulation, likely due to its lower virulence and replication ability (Figure 1I-K). Furthermore, the CD4+/CD8+ T cell ratio (Figure 5I) and body weight (Figure S9) after treatment were effectively ameliorated in the SIV-infected macaques of the ART+HSV-ΔICP34.5-sPD1-SIVgag/SIVenv group. Our data also demonstrated that there was no significant effect on the cell composition of peripheral blood in the SIV-infected macaques of the ART+HSV-sPD1-SIVgag/SIVenv group (Figure S10). Thus, these data suggest that the safety profile of HSV-ΔICP34.5 in PLWH might be tolerable. We have added the corresponding description in the revised manuscript.
(2) We agree with your insightful comment that the mechanism underlying the increased activation by HSV-ΔICP34.5 is worth exploring further in future studies. In this study, we found that ICP34.5 plays an antagonistic role in the reactivation of HIV latency by HSV-1, mainly through the modulation of the host NF-κB and HSF1 pathways, while HSV-1 (especially HSV-ΔICP34.5) might reactivate HIV latency through NF-κB, HSF1, and other yet-to-be-determined mechanisms. Thus, ICP34.5 overexpression has only a partial effect on reducing the reactivation of HIV latency by HSV-1. We have addressed this issue in the revised Discussion section: “Intriguingly, these findings collectively indicated that ICP34.5 might play an antagonistic role in the reactivation of HIV by HSV-1, and thus our modified HSV-ΔICP34.5 constructs can effectively reactivate HIV/SIV latency through the release of imprisonment from ICP34.5. However, ICP34.5 overexpression had only a partial effect on the reduction of the HIV latency reactivation, indicating that HSV-ΔICP34.5-based constructs can also reactivate HIV latency through other yet-to-be-determined mechanisms.” (Lines 334 to 340).
(2) How specific is the effect for HIV reactivation? An RNA seq analysis is required to show the effect on cellular genes.
Thank you for your questions and suggestions.
(1) In our study, we found that neither adenovirus nor vaccinia virus could reactivate HIV latency (Figure S3). In addition, deletion of the ICP0 gene from HSV-1 diminished the reactivation of HIV latency by HSV-1 (Figure S4). Thus, these data suggest that the reactivation of HIV latency by HSV-1 might be virus-specific. This will need to be investigated further in future studies. We have added the corresponding description in the revised manuscript.
(2) To explore the mechanism of reactivating viral latency by HSV-ΔICP34.5-based constructs, we performed RNA-seq analysis (Figure S5). The results showed numerous differentially expressed genes (DEGs) in response to HSV-ΔICP34.5 infection: 2288 genes were upregulated and 611 genes were downregulated. GO analysis showed enrichment of these DEGs in the cell cycle, cellular development, and cellular proliferation, and KEGG enrichment analysis indicated enrichment in pathways such as the cell cycle and cytokine-cytokine receptor interaction. We have added the corresponding description accordingly in the revised manuscript.
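Counting upregulated and downregulated DEGs typically reduces to thresholding a differential-expression table. The sketch below illustrates that step only; the function name and the cutoffs (|log2 fold change| of at least 1, adjusted p-value below 0.05) are common conventions assumed here for illustration, and the response does not state which thresholds or software were used.

```python
def classify_degs(results, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Split genes into upregulated and downregulated DEGs.

    results: dict mapping gene -> (log2_fold_change, adjusted_p_value);
    adjusted_p_value may be None when a gene was not tested.
    ASSUMPTION: the default cutoffs are illustrative, not the study's.
    Returns (upregulated, downregulated) as sorted lists of gene names.
    """
    up, down = [], []
    for gene, (log2_fc, padj) in results.items():
        if padj is None or padj >= padj_cutoff:
            continue  # not significant, or not tested
        if log2_fc >= lfc_cutoff:
            up.append(gene)
        elif log2_fc <= -lfc_cutoff:
            down.append(gene)
    return sorted(up), sorted(down)
```

The lengths of the two returned lists correspond to the kind of up/down totals reported above, and the gene lists themselves are what would be passed to GO or KEGG enrichment tools.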
(3) A comparison in primates has to be given for constructs with or without ICP34.5 to validate cell culture data (what is an empty vector?)
Thank you for your reminder. In the revised manuscript, we performed the following cell experiment to address this issue. Primary CD4+ T cells from people living with HIV (PLWH) were isolated, and then infected with HSV or HSV-∆ICP34.5 constructs. As expected, we confirmed the enhanced reactivation of HIV latency by HSV-∆ICP34.5 (Figure S2). Thank you.
(4) Legends should be improved in writing and content.
Thank you for your kind comment. In the revised version, we have improved the manuscript content, and the legends of all figures have been carefully revised in both writing and content. Thank you.
(5) The primate groups should be enlarged before any reliable conclusions can be made. Inflammatory/tox data should be provided.
Thank you for your question.
(1) As mentioned above, we agree with you that this is a pilot study with limited numbers of rhesus macaques. Although the number of macaques was relatively limited, these nine macaques were distributed evenly based on the background level of age, sex, weight, CD4 count, and viral load (VL) (Table S2). All SIV-infected macaques used in this study had a long history of SIV infection and had several courses of ART therapy, which mimics treatment of chronic HIV-1 infection in humans. These macaques were infected with SIVmac239 for more than 5 years, and highly pathogenic SIV-infected macaques have been well-validated as a stringent model to recapitulate HIV-1 pathogenesis and persistence during ART therapy in humans. Indeed, in our Chinese rhesus model, ART treatment effectively suppressed SIV infection to undetectable levels in plasma, and upon ART discontinuation, virus rapidly rebounded, which is very similar to what is seen in ART-treated HIV patients. We think the results of this pilot study are very promising for further studies, which will expand the number of animals and then proceed to preclinical and clinical studies in our next projects. Thank you for your understanding.
(2) As is well known, ICP34.5 is a neurotoxicity factor that can antagonize host immune responses, and previous studies have shown that the safety of recombinant HSV-based vectors can be improved by deleting ICP34.5. In this study, we also found that HSV-ΔICP34.5 exhibited lower virulence and replication ability than its parental strain (HSV-GFP) (Figure 1D, Figure S1). In addition, HSV-ΔICP34.5 induced a lower level of inflammatory cytokines (including IL-6, IL-1β, and TNF-α) in primary CD4+ T cells from PLWH compared to HSV-GFP stimulation, likely due to its lower virulence and replication ability (Figure 1I-K). In addition, the CD4+/CD8+ T cell ratio (Figure 5I) and body weight (Figure S9) after treatment were effectively ameliorated in the SIV-infected macaques of the ART+HSV-ΔICP34.5-sPD1-SIVgag/SIVenv group. Our data also demonstrated that there was no significant effect on the cell composition of peripheral blood in the SIV-infected macaques of ART+HSV-sPD1-SIVgag/SIVenv group (Figure S10). Thus, these data suggest the safety of HSV-ΔICP34.5 in PLWH might be tolerable. We have added the corresponding description in the revised manuscript.
(6) Discuss the potential of inflammatory HSV vaccines to be used in PLWH without clinical symptoms.
Thank you for your mention. As discussed above, we found that HSV-ΔICP34.5 exhibited lower virulence and replication ability than its parental strain (Figure 1D, Figure S1), and we also found that HSV-ΔICP34.5 induced a lower level of inflammatory cytokines (including IL-6, IL-1β, and TNF-α) in primary CD4+ T cells from PLWH compared to HSV-GFP stimulation, likely due to its lower virulence and replication ability (Figure 1I-K). In addition, the CD4+/CD8+ T cell ratio (Figure 5I) and body weight (Figure S9) after treatment were effectively ameliorated in the SIV-infected macaques of the ART+HSV-ΔICP34.5-sPD1-SIVgag/SIVenv group. Our data also demonstrated that there was no significant effect on the cell composition of peripheral blood in the SIV-infected macaques of ART+HSV-sPD1-SIVgag/SIVenv group (Figure S10). Thus, these data suggest the safety of HSV-ΔICP34.5 in PLWH might be tolerable. We have added the corresponding description in the revised manuscript.
Reviewer #2 (Recommendations For The Authors):
I think the authors have done due diligence to the experimental system, and collected evidence to show the feasibility of delaying virus rebound in macaques. However, I would encourage the authors to perform experiments that can back up the claim that delayed virus rebound is due to neutralization effects, or perhaps due to a reduction in viral reservoir. I believe insights into this process will add rigor, and push the relevance of the study to the next level.
Thank you for your nice comment and valuable suggestion. We have now provided more data about this issue. We found significant suppression of total SIV DNA and integrated SIV DNA provirus in the ART+HSV-sPD1-SIVgag/SIVenv group. However, the copies of the SIV DNA provirus were significantly increased in the ART+HSV-empty group and ART+saline group (Figure 5F-G). In the revised Discussion section, we also discuss that incorporating the induction of broadly neutralizing antibodies into our future optimized approaches may lead to better therapeutic outcomes. We have added the corresponding description in the revised manuscript. Thank you.
Altogether, all of the above comments and suggestions are very helpful in improving our manuscript. We have taken these comments seriously and have tried our best to address these questions point by point. After making extensive revisions, we now submit this revised manuscript for your re-consideration. Thank you again for all of your comments and suggestions.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
This study uses single nucleus multiomics to profile the transcriptome and chromatin accessibility of mouse XX and XY primordial germ cells (PGCs) at three time-points spanning PGC sexual differentiation and entry of XX PGCs into meiosis (embryonic days 11.5-13.5). They find that PGCs can be clustered into sub-populations at each time point, with higher heterogeneity among XX PGCs and more switch-like developmental transitions evident in XY PGCs. In addition, they identify several transcription factors that appear to regulate sex-specific pathways as well as cell-cell communication pathways that may be involved in regulating XX vs XY PGC fate transitions. The findings are important and overall rigorous. The study could be further improved by a better connection to the biological system, including the addition of experiments to validate the 'omics-based findings in vivo and putting the transcriptional heterogeneity of XX PGCs in the context of findings that meiotic entry is spatially asynchronous in the fetal ovary. Overall, this study represents an advance in germ cell regulatory biology and will be a highly used resource in the field of germ cell development.
Strengths:
(1) The multiomics data is mostly rigorously collected and carefully interpreted.
(2) The dataset is extremely valuable and helps to answer many long-standing questions in the field.
(3) In general, the conclusions are well anchored in the biology of the germ line in mammals.
Weaknesses:
(1) The nature of replicates in the data and how they are used in the analysis are not clearly presented in the main text or methods. To interpret the results, it is important to know how replicates were designed and how they were used. Two "technical" replicates are cited but it is not clear what this means.
The two independent technical replicates comprised different pools of paired gonads. This sentence was added to the methods section of the revised manuscript.
(2) Transcriptional heterogeneity among XX PGCs is mentioned several times (e.g., lines 321-323) and is a major conclusion of the paper. It has been known for a long time that XX PGCs initiate meiosis in an anterior-to-posterior wave in the fetal ovary starting around E13.5. Some heterogeneity in the XX PGC populations could be explained by spatial position in the ovary without having to invoke novel subpopulations.
We thank the reviewer for pointing out this important biological phenomenon. We also recognize that transcriptional heterogeneity among XX PGCs is likely due to the anterior-to-posterior wave of meiotic initiation in E13.5 ovaries and highlight this possibility in our manuscript. However, since our study utilizes single-nucleus RNA-sequencing and not spatial transcriptomics, we are not able to capture the spatial location of the XX PGCs analyzed in our dataset. As such, our analysis applied clustering tools to classify the populations of XX PGCs captured in our dataset.
(3) There is essentially no validation of any of the conclusions. Heterogeneity in the expression of a given marker could be assessed by immunofluorescence or RNAscope.
In our revised manuscript, we included immunofluorescence staining of potential candidate factors involved in PGC sex determination, such as PORCN and TFAP2C. Testing and optimizing antibodies for the targets identified in this study are ongoing efforts in our lab and we look forward to sharing our results with the research community.
(4) The paper sometimes suffers from a problem common to large resource papers, which is that the discussion of specific genes or pathways seems incomplete. An example here is from the analysis of the regulation of the Bnc2 locus, which seems superficial. Relatedly, although many genes and pathways are nominated for important PGC functions, there is no strong major conclusion from the paper overall.
In this manuscript, we set out to identify candidate factors, some already known and many others unknown, involved in the developmental pathways of PGC sex determination using computational tools. Our goal, as a research group and with future collaborators, is to screen these interesting candidates and discover their function in the primordial germ cell. Our research, presented in this study, represents a launching pad from which to identify future projects that will investigate these factors in further detail.
Reviewer #2 (Public Review):
Summary:
This manuscript by Alexander et al describes a careful and rigorous application of multiomics to mouse primordial germ cells (PGCs) and their surrounding gonadal cells during the period of sex differentiation.
Strengths:
In thoughtfully designed figures, the authors identify both known and new candidate gene regulatory networks in differentiating XX and XY PGCs and sex-specific interactions of PGCs with supporting cells. In XY germ cells, novel findings include the predicted set of TFs regulating Bnc2, which is known to promote mitotic arrest, as well as the TFs POU6F1/2 and FOXK2 and their predicted targets that function in mitosis and signal transduction. In XX germ cells, the authors deconstruct the regulation of the premeiotic replication regulator Stra8, which reveals TFs involved in meiosis, retinoic acid signaling, pluripotency, and epigenetics among predictions; this finding, along with evidence supporting the regulatory potential of retinoic acid receptors in meiotic gene expression is an important addition to the debate over the necessity of retinoic acid in XX meiotic initiation. In addition, a self-regulatory network of other TFs is hypothesized in XX differentiating PGCs, including TFAP2c, TCF5, ZFX, MGA, and NR6A1, which is predicted to turn on meiotic and Wnt signaling targets. Finally, analysis of PGC-support cell interactions during sex differentiation reveals more interactions in XX, via WNTs and BMPs, as well as some new signaling pathways that predominate in XY PGCs including ephrins, CADM1, Desert Hedgehog, and matrix metalloproteases. This dataset will be an excellent resource for the community, motivating functional studies and serving as a discovery platform.
Weaknesses:
My one major concern is that the conclusion that PGC sex differentiation (as read out by transcription) involves chromatin priming is overstated. The evidence presented in the figures includes a select handful of genes including Porcn, Rimbp1, Stra8, and Bnc2 for which chromatin accessibility precedes expression. Given that the authors performed all of their comparisons between XX versus XY datasets at each timepoint, have they missed an important comparison that would be a more direct test of chromatin priming: between timepoints for each sex? Furthermore, it remains possible that common mechanisms of differentiation to XX and XY could be missing from this analysis that focused on sex-specific differences.
We thank the reviewer for their thoughtful assessment and suggestions, as stated here. We note that chromatin priming in PGCs prior to sex determination is a well-documented research finding (see references below), which is further supported by our single-nucleus multiomics data. To support these findings previously stated in the scientific literature, we included data demonstrating the asynchronous correlation between chromatin accessibility and gene expression during PGC sex determination. Specifically, we investigated the associations of differentially accessible chromatin peaks with differentially expressed genes for each PGC type (between sexes and across embryonic stages) using computational tools and methods that are well-established and applied by the research community. In our manuscript, we note that the patterns we identified support the potential role of chromatin priming in PGC sex determination. Nevertheless, we further highlight that a comprehensive profile of 3D chromatin structure and enhancer-promoter contacts in differentiating PGCs is needed to fully understand how changes to chromatin facilitate PGC sex determination.
References:
(1) Chen, M., et al. Integration of single-cell transcriptome and chromatin accessibility of early gonads development among goats, pigs, macaques, and humans. Cell Reports 41 (2022).
(2) Huang, T.-C. et al. Sex-specific chromatin remodelling safeguards transcription in germ cells. Nature 600, 737–742 (2021).
Reviewer #3 (Public Review):
Summary:
Alexander et al. reported the gene-regulatory networks underpinning sex determination of murine primordial germ cells (PGCs) through single-nucleus multiomics, offering a detailed chromatin accessibility and gene expression map across three embryonic stages in both male (XY) and female (XX) mice. It highlights how regulatory element accessibility may precede gene expression, pointing to chromatin accessibility as a primer for lineage commitment before differentiation. Sexual dimorphism in these elements and gene expression increases over time, and the study maps transcription factors regulating sexually dimorphic genes in PGCs, identifying sex-specific enrichment in various transcription factors.
Strengths:
The study includes step-wise multiomic analysis with some computational approach to identify candidate TFs regulating XX and XY PGC gene expression, providing a detailed timeline of chromatin accessibility and gene expression during PGC development, which identifies previously unknown PGC subpopulations and offers a multimodal reference atlas of differentiating PGC clusters. Furthermore, the study maps a complex network of transcription factors associated with sex determination in PGCs, adding depth to our understanding of these processes.
Weaknesses:
While the multiomics approach is powerful, it primarily offers correlational insights between chromatin accessibility, gene expression, and transcription factor activity, without direct functional validation of identified regulatory networks.
As stated in our response above to a similar concern, we note that our research study represents a launching pad from which to identify future projects that will investigate, in further detail, candidates that may be involved in PGC sex determination. With this rich dataset in hand, our goal in future research projects is to screen these candidates and discover their function in PGCs.
Response to Recommendations
Reviewer #1 (Recommendations For The Authors):
(1) Clarify at first introduction how combined ATAC-seq/RNA-seq multiomics libraries were prepared, including if ATAC and RNA-seq data are from the same cell.
This information was added to the introduction of the revised manuscript.
(2) Clarify what the two technical replicates represent. Are they two libraries from the same gonad or the same pool of gonads? Are they from 2 different gonads?
The two independent technical replicates comprised different pools of paired gonads. This sentence was added to the methods section of the revised manuscript.
(3) In Supplemental Figure 1, there is substantial variation in the number of unique snATAC-seq fragments between some conditions. Could this create a systematic bias that affects clustering?
We recognize the concern that substantial variation in the number of unique snATAC-seq fragments between conditions could potentially create a systematic bias that affects clustering. However, we analyzed our snATAC-seq dataset with Signac, which performs term frequency-inverse document frequency (TF-IDF) normalization. This is a process that normalizes across cells to correct for differences in cellular sequencing depth. Given that sequencing depth was taken into account in our normalization and clustering procedures, and that the unbiased clustering of PGCs also reflects the sex and embryonic stage of PGCs, we are confident that the clustering of the snATAC-seq datasets closely reflects the biological variability present in the PGCs collected.
References:
Signac Website: https://stuartlab.org/signac/articles/pbmc_vignette
Stuart, T., Srivastava, A., Madad, S., Lareau, C. A., & Satija, R. (2021). Single-cell chromatin state analysis with Signac. Nature methods, 18(11), 1333-1341.
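To illustrate why TF-IDF normalization reduces the influence of sequencing depth, the following is a minimal Python sketch of one common TF-IDF variant applied to a small cell-by-peak count matrix. It is not Signac's code (Signac is an R package) and is not taken from the manuscript; the function name `tfidf_normalize`, the `scale_factor` default, and the toy counts are assumptions made purely for illustration.

```python
import math


def tfidf_normalize(counts, scale_factor=1e4):
    """Apply a simple TF-IDF transform to a cell-by-peak count matrix.

    counts: list of cells, each a list of per-peak fragment counts.
    Returns a matrix of the same shape with log1p(TF * IDF * scale_factor).
    """
    n_cells = len(counts)
    n_peaks = len(counts[0])

    # Term frequency: divide each cell's counts by that cell's total depth,
    # so deeply and shallowly sequenced cells become comparable.
    tf = []
    for cell in counts:
        depth = sum(cell)
        tf.append([c / depth if depth else 0.0 for c in cell])

    # Inverse document frequency: number of cells divided by each peak's
    # total count, which down-weights peaks that are open almost everywhere.
    peak_totals = [sum(cell[j] for cell in counts) for j in range(n_peaks)]
    idf = [n_cells / total if total else 0.0 for total in peak_totals]

    return [
        [math.log1p(tf[i][j] * idf[j] * scale_factor) for j in range(n_peaks)]
        for i in range(n_cells)
    ]


# Hypothetical toy data: two nuclei with the same accessibility profile,
# one sequenced ten times deeper than the other.
shallow = [1, 2, 0, 1]
deep = [10, 20, 0, 10]
normalized = tfidf_normalize([shallow, deep])
```

In this toy example the two nuclei receive identical normalized values despite the tenfold difference in depth, which is the property the response relies on when arguing that fragment-count variation should not dominate clustering.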
(4) In Figures 2a, 2e, 3a, and 3e, the visualization scheme is very difficult to follow. It's very hard to see the colors corresponding to average expression for many genes because the circles are so small. In addition, the yellow color is hard to see and makes it hard to estimate the size of the circle since the boundaries can be indistinct. I recommend using a different visualization scheme and/or set of size scales.
In Figures 2a, 2e, 3a, and 3e, we chose this color palette to be inclusive of viewers who are colorblind. The chosen colors are visible on both a computer screen and on printed paper. We also included a legend of the color scale and dot size representing the average expression and percent of cells expressing the gene, respectively. If the color cannot be seen, it is because the cell population is not expressing the gene.
(5) Perform in vivo validation (immunofluorescence or RNAscope) of at least some targets implicated in PGC development by this study.
Such validations (immunofluorescence staining of PORCN and TFAP2C) are now included in Figure 4 and the supplement.
(6) In line 351, the authors state that "we observed a strong demarcation between XX and XY PGCs at E12.5-E13.5." But in Figure 1j it looks like a reasonably high fraction of both XX and XY E12.5 cells are in cluster 1, which should mean that there is some overlap.
While it is true that Figure 1j shows overlap of both XX and XY E12.5 cells in cluster 1, we were commenting on the separation of E12.5 XX (clusters 4 and 5) and E12.5 XY (clusters 8 and 9) PGCs. We have modified the sentence beginning at line 351 to state that the separation between XX and XY PGCs occurs at E13.5.
(7) In lines 404-405: "We first linked snATAC-seq peaks to XY PGC functional genes". It is important to know how the peaks were linked to genes.
We added the following sentence to address this comment: “Peak-to-gene linkages were determined using Signac functionalities and were derived from the correlation between peak accessibility and the intensity of gene expression.”
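The idea behind correlation-based peak-to-gene linkage can be sketched in a few lines of Python. This is a simplified illustration, not the Signac procedure used in the study: it computes only a Pearson correlation between a peak's accessibility and a gene's expression across the same cells and omits any background correction or significance testing that a full pipeline would apply. The function names, the `min_abs_r` threshold, and the toy values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant vector carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def link_peaks_to_gene(peak_accessibility, gene_expression, min_abs_r=0.5):
    """Return peaks whose accessibility correlates with the gene's expression.

    peak_accessibility: dict mapping peak name -> per-cell accessibility.
    gene_expression: per-cell expression of one gene, in the same cell order.
    min_abs_r: hypothetical cutoff on the absolute correlation.
    """
    links = {}
    for peak, accessibility in peak_accessibility.items():
        r = pearson(accessibility, gene_expression)
        if abs(r) >= min_abs_r:
            links[peak] = r
    return links


# Hypothetical toy data for five cells.
expression = [0, 1, 2, 3, 4]
peaks = {
    "peak_A": [0, 2, 4, 6, 8],  # rises together with expression
    "peak_B": [3, 1, 4, 1, 5],  # only weakly related to expression
}
links = link_peaks_to_gene(peaks, expression)
```

Here only `peak_A` passes the cutoff, so it alone would be reported as linked to the gene.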
(8) In Supplemental Figure 5c, the XX E11.5 condition has a substantially higher fraction of ATAC peaks at promoter regions compared to the others. Does this have statistical and biological significance?
This is an interesting observation beyond the scope of our manuscript. Many interesting questions arise from this study and it is our plan to investigate further in the future.
(9) Line 885: "The increased number of DA peaks at E13.5 may be the result of changes to chromatin structure as XX PGCs enter meiotic prophase I"; but in Figure 4b, there's only a modest increase in DAP number from E12.5 to E13.5 in XX PGCs, compared to a massive gain in XY PGCs.
In our manuscript, we comment on both phenomena: the doubling of differentially accessible peaks in XX PGCs from E12.5 to E13.5 and the massive increase in differentially accessible peaks in XY PGCs from E12.5 to E13.5. In our description of these results, we propose several hypotheses leading to these increases in differentially accessible peaks. As such, it cannot be ruled out that the changes to chromatin structure that occur during meiotic prophase I contribute to the gain in differentially accessible peaks in XX PGCs at E13.5, and we included this statement in the manuscript accordingly.
Reviewer #2 (Recommendations For The Authors):
(1) The methods state at line 141 that nuclei with mitochondrial reads of more than 25% were removed, however our understanding from the Bioconductor manual and companion manuscript (Amezquita, R.A., Lun, A.T.L., Becht, E. et al. Orchestrating single-cell analysis with Bioconductor. Nat Methods 17, 137-145 (2020). https://doi.org/10.1038/s41592-019-0654-x) is that snRNA-seq approaches remove mitochondrial transcripts entirely and datasets containing mitochondrial transcripts are thought to feature incompletely stripped nuclei. It is thought that mitochondrial transcripts participating in nuclear import may remain hanging on to the nuclear envelope and get encapsulated into GEMs. If the mitochondrial read cutoff of 25% was used intentionally to keep this potentially contaminating signal, please justify why this was done for this dataset.
We agree with the reviewer that the presence of mitochondrial transcripts may be a potentially contaminating signal. In our preprocessing steps, we removed the mitochondrial genes and transcripts from our datasets so that they would not influence or affect our analyses. The following sentence was added to the methods section on snRNA-seq data processing: “Mitochondrial genes and transcripts were removed from the snRNA-seq datasets to eliminate any potentially contaminating signal.”
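The two preprocessing steps discussed here, discarding nuclei above a mitochondrial-read threshold and then removing mitochondrial genes from the retained nuclei, can be sketched as follows. This is an illustrative Python example rather than the authors' pipeline: the function names, the toy counts, and the order of operations are assumptions. The 25% threshold comes from the reviewer's description of the methods, and the `mt-` prefix is the usual naming convention for mouse mitochondrial genes.

```python
def mito_fraction(gene_counts, prefix="mt-"):
    """Fraction of a nucleus's counts that come from mitochondrial genes."""
    total = sum(gene_counts.values())
    if total == 0:
        return 0.0
    mito = sum(
        count
        for gene, count in gene_counts.items()
        if gene.lower().startswith(prefix)
    )
    return mito / total


def filter_and_strip(nuclei, max_mito_fraction=0.25, prefix="mt-"):
    """Drop high-mitochondrial nuclei, then remove mitochondrial genes.

    nuclei: dict mapping barcode -> {gene name: count}.
    """
    kept = {}
    for barcode, gene_counts in nuclei.items():
        # Step 1: discard nuclei whose mitochondrial share exceeds the cutoff.
        if mito_fraction(gene_counts, prefix) > max_mito_fraction:
            continue
        # Step 2: remove mitochondrial genes from the nuclei that remain.
        kept[barcode] = {
            gene: count
            for gene, count in gene_counts.items()
            if not gene.lower().startswith(prefix)
        }
    return kept


# Hypothetical toy data: one clean nucleus and one dominated by mt reads.
nuclei = {
    "nucleus_1": {"Stra8": 8, "Dazl": 10, "mt-Co1": 2},
    "nucleus_2": {"Stra8": 1, "mt-Nd1": 9},
}
clean = filter_and_strip(nuclei)
```

In the toy data, `nucleus_1` (10% mitochondrial) is kept with its `mt-Co1` counts removed, while `nucleus_2` (90% mitochondrial) is discarded.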
(2) Methods line 227: please include log2fold change and p-adjusted value cutoffs for GO enrichment.
We used clusterProfiler for our GO enrichment analysis. Our GO enrichment analysis did not include a log2 fold change analysis, and the p-adjusted value cutoff is stated in the methods.
(3) Results line 310: the claim that "At E12.5-E13.5, XY PGCs converged onto a single distinct population (cluster 7), indicating less transcriptional diversity among E12.5-E13.5 XY PGCs when compared to E12.5-E13.5 XX PGCs (Fig1d)" would be strengthened if the authors quantified transcriptional distance with distance metrics such as euclidean or cosine distance.
We used a clustering approach to gain insights into the transcriptional diversity of PGC populations. Using an additional metric, such as Euclidean or cosine distance, would neither provide meaningful information beyond what clustering already offers nor change the conclusions presented in the manuscript.
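For readers unfamiliar with the metrics the reviewer mentions, the sketch below shows how Euclidean and cosine distance between two cluster centroids could be computed. It illustrates the reviewer's suggestion only and is not an analysis performed in the study; the function names and the toy embedding coordinates are hypothetical.

```python
import math


def centroid(cells):
    """Mean position of a group of cells in an embedding (e.g., PCA space)."""
    n = len(cells)
    return [sum(cell[d] for cell in cells) / n for d in range(len(cells[0]))]


def euclidean_distance(u, v):
    """Straight-line distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    """One minus the cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical two-dimensional embeddings for two clusters of cells.
cluster_a = [[1.0, 0.0], [3.0, 0.0]]
cluster_b = [[0.0, 2.0], [0.0, 4.0]]

center_a = centroid(cluster_a)
center_b = centroid(cluster_b)
d_euclid = euclidean_distance(center_a, center_b)
d_cosine = cosine_distance(center_a, center_b)
```

Euclidean distance is sensitive to the magnitude of the coordinates, whereas cosine distance depends only on direction, so the two can rank cluster pairs differently.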
(4) Results line 317: the authors allude to Lars2 defining clusters 2 & 3 as a marker gene, but it is not clear why this is highlighted until the reader reaches the discussion, which alludes to the published role of Lars2 in reproduction. Please consider moving this sentence to the results section for clarity and perhaps expanding the discussion on the meaning.
To provide clarity, we added the statement “genes with reported roles in reproduction” to the results section.
(5) In Figure 2a, why do the authors choose to focus on Zkscan5 in XY PGCs when it is expressed by such a small portion of cells (<25%)? Do they assume that this is due to dropouts?
We chose to focus on Zkscan5 as an example because of its enriched and differential expression in male PGCs, the lack of enrichment of the Zkscan5 motif in female PGCs, and the reported roles of Zkscan5 in regulating cellular proliferation and growth. Zkscan5 is an example of how candidate genes can be identified for further investigation.
(6) Line 461: "the population of E13.5 XX PGCs displaying the strongest Stra8 expression levels corresponded to the same population of XX PGCs with the highest module score of early meiotic prophase I genes (Figure 3c; Supplementary Fig. 3a-b)". However did the authors also consider examining the Stra8+ XX PGCs that do not robustly express meiotic genes to understand more about their differentiation potential?
We are thankful to the reviewer for this suggestion. However, this research question is beyond the scope of the manuscript. We plan to investigate further in future research studies.
(7) Line 505: "when we searched for the presence of RA receptor motifs in peaks linked to genes related to meiosis and female sex determination, we found that Stra8, Rec8, Rnf2, Sycp1, Sycp2, Ccnb3, and Zglp1 contain the RA receptor motifs in their regulatory sequences (Supplementary Figure 4g)." My read of the text is that the authors are not taking a side on the RA and meiosis controversy, but rather trying to reveal what the data can tell us, and the answer is that there is a strong signature linking RA to meiotic genes, which supports this as a valid biological pathway. But what is the strength of the RA>meiosis pathway compared to other mechanisms (which must be functioning in the triple receptor KO)? Perhaps the authors could take this analysis further with the following questions: (1) ask whether meiotic genes are more enriched in RA motifs compared to other expressed genes or other motifs (2) compare the strength of peak-gene correlations for all peaks containing RA receptor motifs vs. those with peaks for Zglp1, Rnf2, etc binding. The strengths of these correlations could provide clues to how much gene expression varies in response to RA exposure vs. modulation of these other factors and thus tell us something about how much RA is playing a role.
We agree with the reviewer that this is a very interesting and important question. We also thank the reviewer for their thoughtful suggestions on the types of bioinformatics analyses that could answer this question. However, the section on RA signaling during PGC sex determination is only a small part of the manuscript and would be better analyzed in greater detail in a future research study or publication.
(8) The shift from promoters in E11.5 XX PGCs to distal intergenic regions is fascinating. What can we learn about epigenetic reprogramming/methylation changes across gene bodies?
We agree with the reviewer that this is an interesting question about gene regulation in E11.5 XX PGCs. However, we prefer to analyze the epigenetic reprogramming changes across gene bodies in this cell population in additional research studies. Our purpose and goal for this section was to link differentially accessible chromatin peaks with differentially expressed genes to identify putative gene regulatory networks.
(9) Line 581: why did the authors choose to highlight and validate PORCN1 in PGCs? Please elaborate.
As stated in the manuscript, we chose to highlight and validate PORCN1 in PGCs because of its role in WNT signaling and because of the visibly strong correlation between chromatin accessibility at the XX-enriched DAP in Fig. 4c (dashed box) and gene expression of PORCN1.
(10) Figure 5f would be easier to interpret if presented as two columns rather than a circle; show one line of the proteins and the other line with the transcripts so that each is on the same line and there are connections between them.
This comment is related to stylistic preferences. The purpose of Fig. 5f is to demonstrate that the candidate transcription factors may regulate the expression of other enriched transcription factors. Figure 5f accomplishes this goal.
(11) Line 640: "The predicted target genes of TCFL5 totaled 74% (367/494) of all DEGs with peak-to-gene linkages in XX PGCs". This seems like a high number and a lot of work for just TCFL5; given the overlap between other TFs and target genes, how many of these 367 target genes overlap with other TFs?
We agree with the reviewer that this is an important declaration to make. We added the following sentence to the results section on TCFL5: “A large majority of the predicted target genes of TCFL5 were also predicted to be the target genes of the enriched TFs presented in Fig. 5e, e.g., the predicted target genes of these TFs overlapped with 4%-100% of the predicted target genes of TCFL5.”
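The percentages quoted in this exchange are simple set-overlap fractions. The short Python sketch below reproduces the 74% figure from the counts given in the reviewer's quotation (367 of 494) and shows how the overlap between the predicted targets of two TFs could be expressed as a percentage. The helper names and the placeholder gene identifiers are hypothetical and do not come from the manuscript.

```python
def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


def overlap_percent(reference_targets, other_targets):
    """Percentage of the reference TF's targets also targeted by another TF."""
    reference = set(reference_targets)
    if not reference:
        return 0.0
    shared = reference & set(other_targets)
    return percent(len(shared), len(reference))


# Counts quoted from the manuscript: 367 predicted TCFL5 targets out of
# 494 DEGs with peak-to-gene linkages in XX PGCs.
tcfl5_share = percent(367, 494)

# Placeholder gene sets, used only to show how an overlap is computed.
reference_tf_targets = {"gene_1", "gene_2", "gene_3", "gene_4"}
other_tf_targets = {"gene_1", "gene_9"}
shared_share = overlap_percent(reference_tf_targets, other_tf_targets)
```

`percent(367, 494)` evaluates to roughly 74.3, consistent with the 74% quoted, and the placeholder sets give an overlap of 25% of the reference TF's targets.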
(12) The presentation of TCFL5 in the results section would make more sense with the additional mention of reproductive phenotypes already known (currently in the discussion Lines 914-917). I would furthermore suggest that the discussion goes into more depth on the difference between the regulatory network of TCFL5 in XX meiosis vs XY.
We thank the reviewer for this comment; however, we already state in the results section that TCFL5 is known to influence XX PGC sex determination.
(13) In the Methods, please state more clearly for those not familiar that the genetic background of mice is mixed.
We described the mice with their official names, which provides the context of their genetic backgrounds.
(14) Please specify which morphologic criteria were used to verify the stage of embryos in the methods.
We added the following text to the methods section of the revised manuscript: “Plug date was used to determine the stage of embryos collected for single-nucleus RNA-seq and ATAC-seq. The stage of E11.5 embryos was confirmed by counting somites. The stage of embryos collected at E12.5 was confirmed by the morphological presence of the vessel and cords of the testes collected from XY embryos. Similarly, we confirmed the stage of embryos collected at E13.5 by the size of the gonads, the presence of more distinct cords in the testes of XY embryos, and the elongation of the ovaries of XX embryos.”
(15) The total number of cells and PGCs that passed QC and are included in UMAPS should be stated.
The requested information was added to the legend for Fig. 1 of the revised manuscript: “The number of PGCs per sex and embryonic stage are: 375 E11.5 XX PGCs; 1,106 E12.5 XX PGCs; 750 E13.5 XX PGCs; 110 E11.5 XY PGCs; 465 E12.5 XY PGCs; and 348 E13.5 XY PGCs.”
(16) The order of timepoints changes between figures, and this is not for any obvious reason. Please make it consistent. Figures 1 and 6 list XX 11.5, 12.5, 13.5, and the same for XY, but Figures 2, 3, and 4 use the reverse order: XY E13.5, E12.5, E11.5, and then XX.
We thank the reviewer for this comment. However, we chose this order for each of the figures to match the coordinates of the graphs and where we would expect the reader to begin reading the graph first. For example, in Figure 3a, XX E11.5 is closest to the x-axis and would be expected to be read first.
(17) In Figure S2 the colors of clusters are hard to distinguish, and it is suggested that the cluster numbers should be listed above each colored bar to avoid frustration.
We made the suggested correction to Figure S2.
(18) In Figures 2e and 3e: what do the dashed boxes indicate?
The dashed boxes are to guide the reader’s eyes to the fact that the order of transcription factors/genes under the Cistrome DB regulatory potential score and gene expression plots are the same.
(19) In Figure 5a: break panels into i-iv so that the in-text call-outs are not all the same.
We made the suggested correction to Figure 5a and modified the in-text call-outs.
(20) Please indicate XX in Figure 5e and XY in Figure 5l.
We made the suggested correction to Figure 5e and 5l.
(21) In Figure S5c: Please reorganize DA chromatin peak charts so that columns are XX and XY with rows at the same timepoint.
We made the suggested correction to Figure S5c.
(22) In Figure S7a: please make images larger so that the overlapping expression of PORCN and TRA98 is more visible, and consider adding a more magnified panel.
This image is now included in the main text, with expanded panels.
(23) Line 742-754: this seems like a long introduction for the results section; please consider tightening it up.
We believe this text is important and necessary to provide context to the bioinformatics analyses of cell signaling pathways in PGCs. Not all readers will be familiar with the ligand-receptor signals between gonadal support cells and PGCs, and this text provides details on which signaling pathways are known to direct sex determination of PGCs.
(24) For UMAP plots in Figures 2c, 3c, S3b, and S4b, the text overlaid with the timepoints and sexes onto the UMAP plots is misleading, as it allows the reader to presume that the entire group of cells for a given sex/timepoint is located in the location of the text overlay. However, from the UMAP plots in Figure 1i-j, it is clear that the cells from a given sex/timepoint are actually spread across multiple identified clusters. Thus, the overlaid text obscures the important heterogeneity detected. To better represent the actual locations on the UMAP plot of cells from each sex/timepoint, it would be better to show inset density plots alongside these UMAP plots so the reader can locate the cells for themselves.
We thank the reviewer for this comment. However, we chose this formatting to offer simplicity and ease of understanding to our UMAPs in addition to highlighting the general biological patterns of gene expression. If the reader is interested in discerning more of the heterogeneity of the UMAPs, they may refer back to Figure 1.
Reviewer #3 (recommendations for the authors):
There are some errors or places that need clarification or corrections:
(1) Figure 1f, according to the graph, it should be 8 clusters, not 9.
There are 9 clusters because the cluster numbering starts at ‘0’.
(2) Why did cluster 8 have so many different states of cells from both sexes?
Cluster 8 is likely a sequencing artifact, and determining why it contains cells in many different states from both sexes would require several additional analyses. Although this would resolve a technical issue with the dataset, it would not change any major conclusion of the study.
(3) Figure 1i, shouldn't that be ten instead of eleven?
There are 11 clusters because the cluster numbering starts at ‘0’.
(4) Figure 2a: the comparison of Zkscan5 expression levels was not obvious because the bubbles were small. What is the fold difference relative to XX PGCs?
There is a 1.5-fold increase in the expression of Zkscan5 between XY and XX PGCs at E13.5. We included this information in the revised manuscript.
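The cluster counts in points (1) and (3) above follow from zero-based numbering: when labels start at 0, the number of clusters is the highest label plus one. The short sketch below illustrates this with hypothetical label lists (the variable names and label values are assumptions for illustration, not the study's actual clustering output).

```python
def count_clusters(labels):
    """Return the number of distinct cluster labels."""
    return len(set(labels))


# Hypothetical zero-based labels; these are illustrative, not the study's data.
labels_fig_1f = list(range(0, 9))    # labels 0 through 8
labels_fig_1i = list(range(0, 11))   # labels 0 through 10

n_1f = count_clusters(labels_fig_1f)
n_1i = count_clusters(labels_fig_1i)

# The highest label is one less than the number of clusters.
print(f"highest label {max(labels_fig_1f)} -> {n_1f} clusters")
print(f"highest label {max(labels_fig_1i)} -> {n_1i} clusters")
```

A plot legend that ends at "8" therefore corresponds to nine clusters, and one that ends at "10" to eleven.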
-
-
www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The authors constructed a novel HSV-based therapeutic vaccine to cure SIV in a primate model. The novel HSV vector is deleted for ICP34.5. Evidence is given that this protein blocks HIV reactivation by interference with the NFkappaB pathway. The deleted construct supposedly would reactivate SIV from latency. The SIV genes carried by the vector ought to elicit a strong immune response. Together the HSV vector would elicit a shock and kill effect. This is tested in a primate model.
Strengths and weaknesses:
(1) Deleting ICP34.5 from the HSV construct has a very strong effect on HIV reactivation. The mechanism underlying increased activation by deleting ICP34.5 is only partially explored. Overexpression of ICP34.5 has a much smaller effect (reduction in reactivation) than deletion of ICP34.5 (strong activation); this is acknowledged by the authors that no full mechanistic explanation can be given at this moment.
Thank you for your comments. We agree that the mechanism underlying the increased reactivation after ICP34.5 deletion is only partially explored. As you pointed out, deleting ICP34.5 leads to significant reactivation, whereas overexpressing ICP34.5 has a relatively weak inhibitory effect on reactivation. This difference prompted us to reconsider the role of HSV-1 in regulating HIV latency and reactivation. Our data (Figure S4), along with previous literature (Mosca et al., 1987; Nabel et al., 1988), indicate that the ICP0 protein might play a crucial role in reactivating latent HIV. However, we found for the first time that ICP34.5 can antagonize this reactivation. This is an interesting topic for understanding the complicated interactions between host cells and different viruses. We will investigate the mechanism in more depth in future studies, and we have noted this limitation in the revised Discussion section. Thank you!
(2) No toxicity data are given for deleting ICP34.5. How specific is the effect for HIV reactivation? A RNA seq analysis is required to show the effect on cellular genes.
A RNA seq analysis was done in the revised manuscript comparing the effect of HSV-1 and deleted vector in J-LAT cells (Fig S5). More than 2000 genes are upregulated after transduction with the modified vector in comparison with the WT vector. Hence, the specificity of upregulation of SIV genes is questioned. Authors do NOT comment on these findings. In my view it questions the utility of this approach.
Thank you for your comments.
(1) As for the toxicity of HSV-ΔICP34.5, it is well known that ICP34.5 is a neurotoxicity factor that can antagonize host immune responses, so deleting ICP34.5 improves the safety of HSV-based constructs. As expected, we demonstrated experimentally that HSV-ΔICP34.5 exhibited lower virulence and replication ability than wild-type HSV-1 (Figure S1). Importantly, we also observed significantly lower expression of inflammatory factors in primary CD4+ T cells from PLWH than with wild-type HSV-1 (Figure 1I-K). These data suggest that HSV-ΔICP34.5 should be better tolerated than the wild-type HSV vector.
(2) The RNA-seq analysis was intended to explore the signaling pathways induced by HSV-ΔICP34.5; these data are not suitable for assessing the toxicity of HSV-ΔICP34.5 constructs. We think it is reasonable to observe many upregulated genes (involved in a variety of signaling pathways), since HSV-ΔICP34.5 constructs reactivated latent HIV more effectively than wild-type HSV by modulating the IKKα/β-NF-κB pathway and the PP1-HSF1 pathway.
(3) To further test whether HSV-ΔICP34.5 specifically activates the HIV latent reservoir, we conducted additional experiments using vaccinia virus and adenovirus as controls. Neither vaccinia virus nor adenovirus effectively reactivated latent HIV (Figure S3). Moreover, deleting the ICP0 gene from HSV-1 diminished the reactivation of latent HIV by HSV-1, and overexpressing ICP0 strongly reactivated latent HIV (Figure S4, Figure S5), implying that this reactivation is virus-specific and that ICP0 is an important factor in reversing HIV latency. Interestingly, we found that ICP34.5 acts as an antagonistic factor for this reactivation of HIV latency by HSV-1. Thus, after deletion of ICP34.5, the ability of HSV to reverse HIV latency was significantly enhanced. Our research group will investigate the underlying mechanism in future studies. Thank you for your insightful comment.
(3) The primate groups are too small and the results too variable to make averages. In Fig 5, the group with ART and saline has two slow rebounders. It is not correct to average those with the single quick rebounder. Here the interpretation is NOT supported by the data.
Although authors provided some promising SIV DNA data, no additional animals were added. Groups of 3 animals are too small to make any conclusion, especially since the huge variability in response. The average numbers out of 3 are still presented in the paper, which is not proper science.
No data are given of the effect of the deletion in primates. Now the deleted construct is compared with an empty vector containing no SIV genes. Authors provide new data in Fig S2 on the comparison of WT and modified vector in cells from PLWH, but data are not that convincing. A significant difference in reactivation is seen for LTR in only 2/4 donors and in Gag in 3/4 donors. (Additional question what is meaning of LTR mRNA, do authors relate to genomic RNA??)
Thank you for your careful review and kind reminder.
(1) We agree that it is not appropriate to use averages for this pilot study with a limited number of macaques. We are currently unable to conduct another experiment with a larger number of macaques, but we think the results of this pilot study are promising for further studies. Following your suggestion, we have removed the averages and now present the data for each monkey individually in the revised manuscript. We have also modified the corresponding description accordingly (Lines 254 to 262). Thank you for your understanding.
(2) Regarding your comment about the lack of data on the deletion of ICP34.5 from HSV-1, we apologize for the previously unclear description. In fact, the empty vector used in our animal experiments not only lacks SIV antigens but also carries the ICP34.5 deletion. We have revised the corresponding descriptions accordingly (for example, we now use HSV-ΔICP34.5ΔICP47-empty and HSV-ΔICP34.5ΔICP47-sPD1-SIVgag/SIVenv instead of HSV-empty and HSV-sPD1-SIVgag/SIVenv). We hope this revision addresses your question.
(3) As for the reactivation effects observed in PLWH samples, the data may not be perfect, but we think this result (a significant difference in reactivation for LTR in 2/4 donors and for Gag in 3/4 donors) supports our conclusion that HSV-ΔICP34.5 reactivates latent virus in primary CD4+ T cells more effectively than wild-type HSV. The purpose of detecting LTR RNA is to evaluate the level of virus replication. We recognize that more samples are needed in future studies to understand the reactivation effect across individuals. In addition, we corrected the description of LTR RNA (Lines 99-106 and 115-116). Thank you for the reminder!
Discussion
HSV vectors are mainly used in cancer treatment partially due to induced inflammation. Whether these are suitable to cure PLWH without major symptoms is a bit questionable to me and should at least be argued for.
The RNA seq data add on to this worry and should at least be discussed.
Thank you for your comment. As mentioned above, the RNA-seq analysis was intended to explore the signaling pathways induced by HSV-ΔICP34.5; these data are not suitable for assessing the toxicity of HSV-ΔICP34.5 constructs. ICP34.5 is a neurotoxicity factor that can antagonize innate immune responses, so deleting ICP34.5 improves the safety of HSV-based constructs. As expected, our data demonstrated experimentally that HSV-ΔICP34.5 exhibited lower virulence and replication ability than wild-type HSV-1 (Figure S1). Importantly, HSV-ΔICP34.5 induced lower levels of inflammatory cytokines (including IL-6, IL-1β, and TNF-α) in primary CD4+ T cells from PLWH than wild-type HSV stimulation, likely due to its lower virulence and replication ability (Figure 1I-K). In addition, the CD4+/CD8+ T cell ratio (Figure 5H) and body weight (Figure S10) after treatment improved in the SIV-infected macaques of the ART+HSV-ΔICP34.5ΔICP47-sPD1-SIVgag/SIVenv group. Our data also demonstrated no significant effect on the cell composition of peripheral blood in the SIV-infected macaques of the ART+HSV-ΔICP34.5ΔICP47-sPD1-SIVgag/SIVenv group (Figure S11). These data suggest that HSV-ΔICP34.5 should be better tolerated than the wild-type HSV vector. We have added a more comprehensive description in the revised Discussion (Lines 328-334). Thank you again for all of your kind comments and suggestions.
Reviewer #2 (Public review):
Summary:
In this article, Wen et al. describe the development of a 'proof-of-concept' bi-functional vector based on the ability of HSV-deltaICP-34.5 to purge latent HIV-1 and SIV genomes from cells. They show that co-infection of latent J-Lat T-cell lines with a HSV-deltaICP-34.5 vector can reactivate HIV-1 from a latent state. Over- or stable expression of the ICP34.5 ORF in these cells can arrest latent HIV-1 genomes from transcription, even in the presence of latency reversal agents. ICP34.5 can co-IP with and dephosphorylate IKKα/β to block its interaction with the NF-κB transcription factor. Additionally, ICP34.5 can interact with HSF1, which was identified by mass-spec. Thus, the authors propose that the latency reversal effect of HSV-deltaICP-34.5 in co-infected J-Lat cells is due to modulatory effects on the IKKα/β-NF-κB and PP1-HSF1 pathways.
Next the authors cleverly construct a bifunctional HSV based vector with deleted ICP34.5 and 47 ORFs to purge latency and avoid immunological refluxes, and additionally expand the application of this construct as a vaccine by introducing SIV genes. They use this 'vaccine' in mouse models and show the expected SIV-immune responses. Experiments in rhesus macaques (RM), further elicit potential for their approach to reactivate SIV genomes and at the same time block their replication by antibodies. What was interesting in the SIV experiments is that the dual-functional vector vaccine containing sPD1- and SIV Gag/Env ORFs effectively delayed SIV rebound in RMs and in some cases almost neutralized viral DNA copy detection in serum. Very promising indeed, however there are some questions I wish the authors explored to answer, detailed below.
Overall, this is an elegant and timely work demonstrating the feasibility of reducing virus rebound in animals, and potentially expand to clinical studies. The work was well written, and sections were clearly discussed.
Strengths:
The work is well designed, rationale explained and written very clearly for lay readers.
Claims are adequately supported by evidence and well designed experiments including controls.
We appreciate your positive comments on our work.
Weaknesses:
(1) It looks like ICP0 is also involved in latency reversal effects. More follow-up work will be required to test if this is in fact true.
Both our data (Figure S4, Figure S5) and previous literature (Nabel et al., 1988; Mosca et al., 1987) indicate that HSV ICP0 may play a role in reversing HIV latency. However, the exact mechanisms behind this effect have not yet been fully elucidated. Of note, we report here for the first time that ICP34.5 can act as an antagonistic factor for this reactivation of HIV latency by HSV-1. Thus, after deletion of ICP34.5, the ability of HSV to reverse HIV latency was significantly enhanced. Our research group will investigate the underlying mechanism in future studies. Thank you for your insightful comment.
(2) It is difficult to estimate the depletion of the latent viral reservoir. The authors have tried to address this issue. A more convincing argument to this reviewer will be data to demonstrate that after the bi-functional vaccine, the animals show overall reduction in the number of circulating latent cells. The feasibility to obtain such a result is not clearly demonstrated.
Thank you for your comment. As you mentioned, we have indeed measured both total DNA and integrated DNA (iDNA) in blood cells (see Figure 5E-F), which can provide support for the reduction of the latent viral reservoir. Thank you for your kind reminder.
(3) The authors state that the reduced virus rebound detected following bi-functional vaccine delivery is due to latent genomes becoming activated and steady-state neutralization of these viruses by antibody response. This needs to be demonstrated. Perhaps cell-culture experiments from specimen taken from animals might help address this issue. In lab cultures one could create environments without antibody responses, under these conditions one would expect higher level of viral loads being released in response to the vaccine in question.
Thank you for your valuable suggestion. We believe that the reduced virus rebound observed may be influenced by immune responses from T cells and antibodies induced by both ART and the vaccine. We appreciate your insight and agree that future studies should focus on investigating the activation effects of the vaccine under controlled conditions that simulate the absence of immune responses in primary animal cells. This will help us better understand the mechanisms involved and address your concerns more comprehensively.
Reviewer #2 (Recommendations for the authors):
The Authors have sufficiently addressed my comments. Below are a few minor changes that can help with clarity.
Lines 126-127: This sentence should be changed. Perhaps, "these data suggests that .... Safety of... in PLWH might be tolerable, at least in vitro."
Thanks for your suggestion. We have revised it accordingly (Line 130).
Lines 128-132: Would this not mean that reactivation is due to ICP0 gene? Have the authors tried to express ICP0-gene into J-Lat cells and see if that is the reason for reactivation? This seems somewhat incomplete. At the end of 132, please add ", in the presence of ICP0". Also a sentence describing this effect is warranted.
Thank you for your insightful suggestion. Yes, both our data and previous literature support that the ICP0 gene can play a significant role in the reactivation of HIV latency (Figure S4, Figure S5). Of note, we report here for the first time that ICP34.5 can act as an antagonistic factor for this reactivation of HIV latency by HSV-1. Thus, after deletion of ICP34.5, the ability of HSV to reverse HIV latency was significantly enhanced. We have described this effect in the revised version accordingly. Additionally, we have added the phrase “in the presence of ICP0” to the Results section (Line 137) to clarify this point.
MOSCA, J. D., BEDNARIK, D. P., RAJ, N. B., ROSEN, C. A., SODROSKI, J. G., HASELTINE, W. A., HAYWARD, G. S. & PITHA, P. M. 1987. Activation of human immunodeficiency virus by herpesvirus infection: identification of a region within the long terminal repeat that responds to a trans-acting factor encoded by herpes simplex virus 1. Proc Natl Acad Sci U S A 84: 7408. DOI: https://doi.org/10.1073/pnas.84.21.7408, PMID: 2823260
NABEL, G. J., RICE, S. A., KNIPE, D. M. & BALTIMORE, D. 1988. Alternative mechanisms for activation of human immunodeficiency virus enhancer in T cells. Science 239: 1299. DOI: https://doi.org/10.1126/science.2830675, PMID: 2830675
-
-
www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Summary:
Using biophysical chromosome stretching, the authors measured the stiffness of chromosomes of mouse oocytes in meiosis I (MI) and meiosis II (MII). This study is a follow-up to previous studies in spermatocytes (and oocytes) by the authors (Biggs et al. Commun. Biol. 2020; Hornick et al. J. Assist. Rep. and Genet. 2015). They showed that MI chromosomes are much stiffer (~10 fold) than mitotic chromosomes of mouse embryonic fibroblast (MEF) cells. MII chromosomes are also stiffer than the mitotic chromosomes. The authors also found that oocyte aging increases the stiffness of the chromosomes. Surprisingly, the stiffness of meiotic chromosomes is independent of the meiotic chromosome components Rec8, Stag3, and Rad21L.
Strengths:
This provides a new insight into the biophysical property of meiotic chromosomes, that is chromosome stiffness. The stiffness of chromosomes in meiosis prophase I is ~10-fold higher than that of mitotic chromosomes, which is independent of meiotic cohesin. The increased stiffness during oocyte aging is a novel finding.
Weaknesses:
A major weakness of this paper is that it does not provide any molecular mechanism underlying the difference between MI and MII chromosomes (and/or prophase I and mitotic chromosomes).
We acknowledge that our study does not provide a comprehensive explanation for the stage-related alterations in chromosome stiffness; however, we believe that the observation of these changes is itself of broad interest. Initially, we hypothesized that DNA damage or depletion of meiosis-specific cohesin might contribute to the observed increase in chromosome stiffness. However, our experimental findings did not support these hypotheses, indicating that neither DNA damage nor cohesin depletion is responsible for the stiffness increase. The molecular basis underlying the stage-related stiffness increase remains elusive and requires exploration in future studies. In the Discussion, we propose that factors such as condensin, nuclear proteins, and histone methylation may play a role in regulating meiotic chromosome stiffness. The involvement of these factors in stage-related chromosome stiffening requires future investigation.
Reviewer #2 (Public Review):
This paper reports investigations of chromosome stiffness in oocytes and spermatocytes. The paper shows that prophase I spermatocytes and MI/MII oocytes yield high Young Modulus values in the assay the authors applied. Deficiency in each one of three meiosis-specific cohesins they claim did not affect this result and increased stiffness was seen in aged oocytes but not in oocytes treated with the DNA-damaging agent etoposide.
The paper reports some interesting observations which are in line with a report by the same authors of 2020 where increased stiffness of spermatocyte chromosomes was already shown. In that sense, the current manuscript is an extension of that previous paper, and thus novelty is somewhat limited. The paper is also largely descriptive as it does neither propose a mechanism nor report factors that determine the chromosomal stiffness.
There are several points that need to be considered.
(1) Limitations of the study and the conclusions are not discussed in the "Discussion" section and that is a significant gap. Even more so as the authors rely on just one experimental system for all their data - there is no independent verification - and that in vitro system may be prone to artefacts.
Our experimental system has been used to study the stiffness of different types of chromosomes as well as nuclear stiffness. We compared our results with previously published data and found that the data are consistent across different experiments. To address the reviewer’s concern, we now describe the limitations of our in vitro experimental approach in the Discussion section.
(2) It is somewhat unfortunate that they jump between oocytes and spermatocytes to address the cohesin question. Prophase I (pachytene) spermatocytes chromosomes are not directly comparable to MI or MII oocyte chromosomes. In fact, the authors report Young Modulus values of 3700 for MI oocytes and only 2700 for spermatocyte prophase chromosomes, illustrating this difference. Why not use oocyte-specific cohesin deficiencies?
In this study, our goal was to investigate the mechanism underlying the increased chromosome stiffness observed during prophase I. Ideally, we would have compared wild-type and cohesin-deleted mouse oocytes at the metaphase I (MI) stage. However, experimental constraints made this approach unfeasible: spermatocytes and oocytes from Rec8<sup>-/-</sup> and Stag3<sup>-/-</sup> mutant mice cannot reach the MI stage, and Rad21l<sup>-/-</sup> mutant males are sterile and females subfertile, because cohesin proteins are crucial for germ cell development.
Additionally, collecting prophase I chromosomes from oocytes is exceptionally challenging and requires fetal mice as the source, because oocytes progress to the diplotene stage during fetal development. The process is further complicated by the difficulty of genotyping fetal mice, making the study of female prophase I impracticable. By contrast, spermatocytes are continuously generated in males throughout life, with meiotic stages readily identifiable, making them more accessible for analysis.
Our findings consistently showed increased chromosome stiffness in both prophase I spermatocytes and MI oocytes, suggesting that the phenomenon is not sex-specific. This observation implies that similar effects on chromosome stiffness may occur across meiotic stages, from prophase I to MI.
(3) It remains unclear whether the treatment of oocytes with the detergent TritonX-100 affects the spindle and thus the chromosomes isolated directly from the Triton-lysed oocytes. In fact, it is rather likely that the detergent affects chromatin-associated proteins and thus structural features of the chromosomes.
Regarding the use of Triton X-100, it is important to emphasize that the concentration used (0.05%) is very low and unlikely to significantly affect chromosome stiffness. To support this assertion, we have provided additional evidence in the revised manuscript demonstrating that this low concentration of Triton X-100 has a negligible effect on chromosome stiffness (Supplement Fig. 5, Right panel).
(4) Why did the authors use mouse strains of different genetic backgrounds, CD-1, and C57BL/6? That makes comparison difficult. Breeding of heterozygous cohesin mutants will yield the ideal controls, i.e. littermates.
The genetic mutant mice, all in a C57BL/6 background, were generously provided by Dr. Philip Jordan and delivered to our lab. As our lab does not currently maintain a C57BL/6 colony, and given that this strain typically produces small litters (which would have complicated the remainder of the study), we chose CD-1 mice as the control group and used C57BL/6 mice specifically for the cohesin study. To address potential concerns regarding genetic background differences, we compared our results with previously published data from C57BL/6 mice and found no significant difference (2710 ± 610 Pa versus 3670 ± 840 Pa, P = 0.4809) (Biggs et al., 2020). Furthermore, prophase I spermatocytes from CD-1 mice showed no significant difference compared to any of the three cohesin-deleted C57BL/6 mutant mice, suggesting that chromosome stiffness is not significantly influenced by genetic background.
(5) How did the authors capture chromosome axes from STAG3-deficient spermatocytes which feature very few if any axes? How representative are those chromosomes that could be captured?
We isolated chromosomes from prophase I mutant spermatocytes, which were identified by their large size, round shape, and thick chromosomal threads, characteristics indicative of advanced condensation and a zygotene-like stage during prophase I (Supplemental Fig. 3). The methodology for isolating these chromosomes has been described in detail in our previous publication (Biggs et al., 2020), which is referenced in the current manuscript.
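For readers less familiar with the quantity discussed in the points above, the Young's modulus values quoted in pascals are a ratio of stress to strain. The following is a minimal sketch of that textbook relation, assuming a uniform cylindrical chromosome stretched in the linear elastic regime. The function name and all numerical inputs are hypothetical and are not taken from the manuscript, and the authors' actual calibration procedure (Biggs et al., 2020) may differ.

```python
import math


def youngs_modulus(force_n, radius_m, rest_length_m, extension_m):
    """Young's modulus E = stress / strain, in pascals.

    stress = force / cross-sectional area (cylinder assumed)
    strain = extension / rest length
    """
    if min(force_n, radius_m, rest_length_m, extension_m) <= 0:
        raise ValueError("all inputs must be positive")
    area_m2 = math.pi * radius_m ** 2
    stress_pa = force_n / area_m2
    strain = extension_m / rest_length_m
    return stress_pa / strain


# Hypothetical inputs chosen only to illustrate the arithmetic:
# 1 nN of force, 0.5 um radius, 5 um rest length, 2 um extension.
modulus_pa = youngs_modulus(
    force_n=1.0e-9,
    radius_m=0.5e-6,
    rest_length_m=5.0e-6,
    extension_m=2.0e-6,
)
print(f"Young's modulus: {modulus_pa:.0f} Pa")  # about 3183 Pa
```

Because the modulus normalizes force by cross-sectional area and extension by rest length, it allows chromosomes of different sizes to be compared on one scale, which is why the responses report stiffness in Pa rather than as raw forces.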
Reviewer #3 (Public Review):
Summary:
Understanding the mechanical properties of chromosomes remains an important issue in cell biology. Measuring chromosome stiffness can provide valuable insights into chromosome organization and function. Using a sophisticated micromanipulation system, Liu et al. analyzed chromosome stiffness in MI and MII oocytes. The authors found that chromosomes in MI oocytes were ten-fold stiffer than mitotic ones. The stiffness of chromosomes in MI mouse oocytes was significantly higher than that in MII oocytes. Furthermore, the knockout of the meiosis-specific cohesin component (Rec8, Stag3, Rad21l) did not affect meiotic chromosome stiffness. Interestingly, the authors showed that chromosomes from old MI oocytes had higher stiffness than those from young MI oocytes. The authors claimed this effect was not due to the accumulated DNA damage during the aging process because induced DNA damage reduced chromosome stiffness in oocytes.
Strengths:
The technique used (isolating the chromosomes in meiosis and measuring their stiffness) is the authors' specialty. The results are intriguing and informative to the chromatin/chromosome and other related fields.
Weaknesses:
(1) How intact the measured chromosomes were is unclear.
Currently, a well-calibrated chromosome mechanics experiment requires the extracellular isolation of chromosomes. In experiments conducted parallel to those in our previous study (Biggs et al., 2020), we obtained quantitatively consistent results, including measurements of the Young modulus for prophase I spermatocyte chromosomes. Our isolation approach is significantly gentler than bulk methods that rely on hypotonic buffer-driven cell lysis and centrifugation. If substantial chromosomal damage had occurred during isolation, we would expect greater variation between experiments, as different amounts or types of damage could influence the results.
(2) Some control data needs to be included.
We used wild-type prophase I spermatocytes and metaphase I (MI) oocytes as controls. To validate our findings, we compared some of our results with those reported in a previous study and observed consistent outcomes (Biggs et al., 2020).
(3) The paper was not well-written, particularly the Introduction section.
We have revised the paper and improved the overall quality of the manuscript.
(4) How intact were the measured chromosomes? Although the structural preservation of the chromosomes is essential for this kind of measurement, the meiotic chromosomes were isolated in PBS with Triton X-100 and measured at room temperature. It is known that chromosomes are very sensitive to cation concentrations and macromolecular crowding in the environment (PMID: 29358072, 22540018, 37986866). It would be better to discuss this point.
As suggested, we investigated the impact of PBS and Triton X-100 on chromosome stiffness. Our findings indicate that neither PBS nor Triton X-100 caused significant changes in chromosome stiffness (Supplemental Fig. 5).
Recommendations For The Authors:
Major points of Reviewers that the Editor indicated should be addressed
(1) Reviewer's point 3, the effect of the high concentration of etoposide: It would be advisable to use lower concentrations of etoposide to observe the effect of DNA damage on chromosome stiffness more accurately.
The effect of etoposide on oocytes is dose-dependent (Collins et al., 2015). Oocytes are generally not highly sensitive to DNA damage, and even at relatively high concentrations, not all may exhibit a response. To ensure sufficient DNA damage in the oocytes we isolated, we used a relatively high concentration of etoposide for the experiment. This concentration (50 μg/ml) falls within the typical range reported in the literature (Marangos and Carroll, 2012; Cai et al., 2023; Lee et al., 2023). As the reviewer suggested, we tested two additional lower concentrations of etoposide (5 μg/ml and 25 μg/ml) (see Fig. 5C). We did not observe any significant difference in chromosome stiffness in oocytes treated with 5 μg/ml etoposide compared to the control. However, the higher concentration (25 μg/ml) significantly reduced oocyte chromosome stiffness compared to the control.
Revision to manuscript:
“Results at lower etoposide concentrations revealed that chromosome stiffness in untreated control oocytes was not significantly different from that in oocytes treated with 5 μg/ml etoposide (3780 ± 700 Pa versus 3930 ± 400 Pa, P = 0.8624). However, chromosome stiffness in untreated oocytes was significantly higher than that in oocytes treated with 25 μg/ml etoposide (3780 ± 700 Pa versus 1640 ± 340 Pa, P = 0.015) (Figure 5C).”
(2) Reviewer's point 3, the effect of Triton X-100: This is related to the concern of Reviewer #3. It is critical to check whether the detergent affects the stiffness indirectly.
To demonstrate that the low concentration of Triton X-100 does not influence chromosome stiffness, we conducted additional experiments. First, we isolated chromosomes and measured their stiffness. Then, we treated the chromosomes with 0.05% Triton X-100 via micro-spraying and remeasured the stiffness. The results showed no significant difference (see Supplement Fig. 5 right panel).
Revision to manuscript:
“In addition to past experiments indicating that mitotic chromosomes are stable for long periods after their isolation (Pope et al., 2006), we carried out control experiments on mouse oocyte chromosomes where we incubated them for 1 hour in PBS, or exposed them to a flow of Triton X-100 solution for 10 minutes; there was no change in chromosome stiffness in either case (Methods and Supplementary Fig. 5).”
(3) Reviewer's point 1, the effect of the buffer composition: Please describe how the composition affects the stiffness of the chromosomes.
PBS is an economical and effective buffer solution that closely mimics the osmotic conditions of the cytoplasm, which helps maintain chromosomal structural integrity. Appropriate ion concentrations are important because imbalances, whether too high or too low, can alter chromosome morphology (Poirier and Marko, 2002). When chromosomes are stored in PBS, their stiffness remains relatively stable, even with prolonged exposure. To confirm this, we isolated chromosomes and measured their stiffness. After a one-hour incubation in PBS, we remeasured the stiffness and observed no significant difference, demonstrating that chromosomes remain stable in PBS (see Supplement Fig. 5 left panel).
Revision to manuscript:
“In this study, we developed a new way to isolate meiotic chromosomes and measure their stiffness. However, one concern is that the measurements were conducted in PBS solution, which is different from the intracellular environment. To address this, we monitored chromosome stiffness over time in PBS solution and found that it remained stable over a period of one hour (Supplement Fig. 5 Left panel).”
Reviewer #1 (Recommendations For The Authors):
Major points:
(1) The role of condensin complexes in chromosome stiffness was shown previously (Sun et al. Chromosome Research, 2018). Thus, the authors should at least describe the condensin staining on MI and MII chromosomes.
We have added sentences in the discussion to elaborate on the role of condensin.
Revision to manuscript:
“Several factors, including condensin, have been found to affect chromosome stiffness (Sun et al., 2018). Condensin exists in two distinct complexes, condensin I and condensin II, and both are active during meiosis. Published studies indicate that condensin II is more sharply defined and more closely associated with the chromosome axis from anaphase I to metaphase II (Lee et al., 2011). Additionally, condensin II appears to play a more significant role in mitotic chromosome mechanics compared to condensin I (Sun et al., 2018). Thus, condensin II likely contributes more significantly to meiotic chromosome stiffness than condensin I.”
(2) Although the authors nicely showed the difference in the stiffness between MI and MII chromosomes (Figure 2), as known, MI chromosomes are bivalent (with four chromatids) while MII chromosomes are univalent (with two chromatids). The physical property of the chromosomes would be affected by the number of chromatids. It would be essential for the authors to measure the physical properties of a univalent of MI chromosomes from mice defective in meiotic recombination such as Spo11 and/or Mlh3 KO mice.
The reviewer correctly pointed out that the number of chromatids in chromosomes differs between the metaphase I (MI) and metaphase II (MII) stages. We have addressed this difference by calculating Young’s modulus (E), a mechanical property that describes the elasticity of a material independent of its geometry. Young’s modulus describes the intrinsic properties of the material itself, rather than the specific characteristics of the object being tested. It is calculated as E=(F/A)/(∆L/L0), where F is the force applied to stretch the chromosome, A is the cross-sectional area, ∆L is the length change of the chromosome, and L0 is the original length of the chromosome. An increase in chromosome or chromatid number results in a larger cross-sectional area and therefore a higher doubling force (F). However, because the force is normalized by the cross-sectional area, this variation does not affect the calculated chromosome stiffness (Young’s modulus, E). While a study of the mutants suggested by the referee would certainly be interesting, the absence of these key recombination factors would likely affect chromosome stiffness in a more complex way than just changing chromosome thickness; this type of study is beyond the scope of the present manuscript and is an exciting direction for future studies.
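The geometry argument can be checked numerically. The following is a minimal sketch, not the analysis code used in the study: the function names, the 0.5 μm radius, and the 1000 Pa modulus are hypothetical values chosen for illustration, and the material is assumed to be linearly elastic, so that the doubling force (the force at which ∆L = L0) equals E times A.

```python
import math


def youngs_modulus(force_n, area_m2, delta_l, l0):
    """Return E = (F/A) / (dL/L0) in pascals."""
    if area_m2 <= 0 or l0 <= 0 or delta_l <= 0:
        raise ValueError("area, original length and extension must be positive")
    stress = force_n / area_m2   # force per unit cross-sectional area
    strain = delta_l / l0        # fractional change in length
    return stress / strain


def doubling_force(modulus_pa, area_m2):
    """Force that doubles the length (dL = L0), assuming linear elasticity."""
    return modulus_pa * area_m2


# Hypothetical values for illustration only (not measurements from the study).
modulus = 1000.0                      # Pa
area_thin = math.pi * (0.5e-6) ** 2   # m^2, cylinder of radius 0.5 um
area_thick = 2 * area_thin            # same material, twice the cross-section

force_thin = doubling_force(modulus, area_thin)
force_thick = doubling_force(modulus, area_thick)

# Stretch each object to twice its length (dL = L0 = 10 um) and recover E.
e_thin = youngs_modulus(force_thin, area_thin, 10e-6, 10e-6)
e_thick = youngs_modulus(force_thick, area_thick, 10e-6, 10e-6)

print(f"doubling force ratio (thick/thin): {force_thick / force_thin:.2f}")
print(f"Young's modulus, thin vs thick: {e_thin:.1f} Pa vs {e_thick:.1f} Pa")
```

In this sketch the thicker object needs twice the force to double in length, while the modulus recovered from force, area, and strain is the same for both objects.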
(3) In Figure 5, the authors measure the stiffness of etoposide-treated MI chromosomes. The concentration of the drug was 50 ug/ml, which is very high. The authors should analyze the different concentrations of the drug to check the chromosome stiffness. Moreover, etoposide is an inhibitor of Topoisomerase II. The effect of the drug might be caused by the defective Top2 activity, rather than Top2-adducts, thus DNA damage. It is very important to check the other Top2 inhibitors or DNA-damaging agents to generalize the effect of DNA damage on chromosome stiffness. Moreover, DNA damage induces the DNA damage response. It is important to check the effect of DDR inhibitors on the damage-induced change of stiffness.
The reviewer is correct in noting that etoposide can both induce DNA damage and inhibit Top2 activity. Our previous DNase experiment helps to address this concern and supports the results of this study (Biggs et al., 2020). That experiment was conducted in vitro, where DNase treatment caused DNA damage on chromosomes without affecting Top2 activity or triggering a DNA damage response. The results demonstrated that DNase treatment led to reduced chromosome stiffness, which aligns with the findings presented in our manuscript.
(4) In the same line as the #3 point, the authors also need to check the effect of etoposide on the stiffness of mitotic chromosomes from MEF.
Experiments on MEF mitotic chromosomes were designed to serve as a reference for the meiotic chromosome studies. The etoposide experiments on meiotic chromosomes specifically aimed to investigate how DNA damage affects meiotic chromosome structure. While it would be interesting to explore the effects of etoposide-induced DNA damage on mitotic chromosomes, it represents a distinct research question that falls outside the scope of the current study.
Minor points:
(1) Line 141-142: Previous studies by the author analyzed the stiffness of mitotic chromosomes from pro-metaphase. Which stage of cell cycles did the authors analyze here?
To ensure consistency in our experiments, we also measured the stiffness of mitotic chromosomes at the prometaphase stage. The precise stage used is very near to metaphase, at the very end of the prometaphase stage. We have modified the manuscript to clarify this point.
Revision to manuscript:
“For comparison with the meiotic case, we measured the chromosome stiffness of Mouse Embryonic Fibroblasts (MEFs) at late pro-metaphase (just slightly before their attachment to the mitotic spindle) and found that the average Young’s modulus was 340 ± 80 Pa (Figure 2B). The value is consistent with our previously published data, where the modulus for MEFs was measured to be 370 ± 70 Pa (Biggs et al., 2020).”
(2) Line 157: Here, the doubling force of MI (and MII) oocytes should be described in addition to those of spermatocytes.
The purpose of this paragraph is to demonstrate the reproducibility and consistency of our experiments. In this section, we compared our data with previously published findings. Published data do not include chromosome stiffness measurements from MI mouse oocytes; our experiment is the first to assess this. Therefore, we did not include MI mouse oocytes in that comparison. To clarify this, we have added sentences to highlight the comparison of doubling force.
Revision to manuscript:
“Here, we found that the doubling forces of chromosomes from MI and MII oocytes are 3770 ± 940 pN and 510 ± 50 pN, respectively. We conclude that chromosomes from MI oocytes are much stiffer than those from both mitotic cells and MII oocytes (Supplement Fig. 2), in terms of either Young’s modulus or doubling force.”
(3) Line 202: What stage of prophase I do the authors mean by the spermatocyte stage here? Diakinesis, Metaphase I or prometaphase I? I am not sure how the authors can determine a specific stage of prophase I by only looking at the thickness of the chromosomes. Please show the thickness distribution of WT and Rec8<sup>-/-</sup> chromosomes.
We have reworded the sentence and clarified that the spermatocytes are at the prophase I stage. Since Rec8<sup>-/-</sup> spermatocytes cannot progress beyond the pachytene stage of prophase I, the isolated chromosomes must be in prophase I rather than diakinesis, metaphase I, prometaphase I, or any later stage (Xu et al., 2005). Based on the cell size and degree of chromosome condensation (Biggs et al., 2020), it is most likely that the measured chromosomes are at the zygotene-like stage. However, as we cannot definitively determine the exact substage of prophase I, we have referred to them simply as prophase I.
Revision to manuscript:
“We isolated chromosomes from Rec8<sup>-/-</sup> prophase I spermatocytes, which displayed large and round cell size and thick chromosomal threads, indicative of advanced chromosome compaction after stalling at a zygotene-like prophase I stage (Supplement Fig. 3). The combination of large cell size and degree of chromosome compaction allowed us to reliably identify Rec8<sup>-/-</sup> prophase I chromosomes. Using micromanipulation, we measured chromosome stiffness by stretching the chromosomes (Supplement Fig. 3) (Biggs et al., 2019).”
Reviewer #2 (Recommendations For The Authors):
(1) Line 135: that statement is not substantiated; better to show retraction data and full reversibility.
We added a figure showing oocyte chromosome stretching, which showed that the oocyte chromosome is elastic, and that the stretching process is reversible (Supplement Fig.1).
(2) Line 144: the authors claim that the Young Modulus of MII oocytes is "slightly" higher than that of mitotic cells (MEFs). Well, "slightly" means it is rather similar, and therefore the commonly used statement that MII is similar to mitosis is OK - contrary to the authors' claim.
We have removed the word “slightly” in the manuscript. The difference is statistically significant.
Revision to manuscript:
“Surprisingly, despite this reduction, the stiffness of MII oocyte chromosomes was still significantly higher than that for mitotic cells (Figure 2B).”
(3) There are a lot of awkward sentences in this text. Some sentences lack words, are not sufficiently precise in wording and/or logic, and there are numerous typos. Some examples can be found in lines 89 (grammar), 94, 95 ("looked"), 98, 101 ("difference" - between what?), and some are commonplaces or superficial (lines 92/93, 120..., ). Occasionally the present and past tense are mixed (e.g. in M&M). Thus the manuscript is quite poorly written.
We thank the reviewer for these comments. We have revised all the sentences highlighted by the reviewer and polished the entire manuscript.
Reviewer #3 (Recommendations For The Authors):
(1) Line 48. "We then investigated the contribution of meiosis-specific cohesin complexes to chromosome stiffness in MI and MII oocytes." There is no data on oocytes with meiosis-specific cohesin KO. This part should be corrected.
We have corrected this error.
Revision to manuscript:
“We examined the role of meiosis-specific cohesin complexes in regulating chromosome stiffness.”
(2) Lines 155-157. The result of MI mouse oocyte chromosomes should also be mentioned here (Supplementary Figure 1).
Please see our response to Reviewer 1 – Minor Point 2.
(3) Line 163. "The stiffness of chromosomes in MI mouse oocytes is significantly higher compared to MII oocytes." Is this because two homologs are paired in MI chromosomes (but not in MII chromosomes)? The authors may want to discuss the possible mechanism.
Please see our response to Reviewer 1 – Major Point 2.
(4) Line 188: "We hypothesized that MI oocytes... would have higher chromosome stiffness than MII oocytes." Why did the authors measure chromosomes from spermatocytes but not MI oocytes?
Both spermatocytes and oocytes from Rec8<sup>-/-</sup>, Stag3<sup>-/-</sup>, and Rad21l<sup>-/-</sup> mutant mice cannot reach the MI stage because cohesin proteins are crucial for germline-cell development. We chose to use spermatocytes in our study because collecting fetal meiotic oocytes is extremely difficult, and genotyping fetal mice adds another layer of complexity to the experiments. In females, all oocytes complete prophase I and progress to the dictyotene stage during the fetal stage, so obtaining individual oocytes at this stage is challenging. In contrast, spermatocytes are continuously generated at all stages in males.
(5) To support the authors' conclusion, verifying the KO of REC8, STAG3, and RAD21L by immunostaining or other methods is essential.
These mice were provided by one of the authors, Dr. Philip Jordan, who has published several papers using these knockout mice (Hopkins et al., 2014; Ward et al., 2016). The immunostaining of these models has already been well characterized in those previous studies. In addition to performing double genotyping, we also used the size of the collected testes as an additional verification of the mutant genotype. These knockout mice have significantly smaller testes than their wild-type counterparts, providing a clear physical indicator of the mutation.
(6) Some of the cited papers and descriptions in the Introduction are not appropriate and confusing. This part should be improved:
Line 79. Recent studies have revealed that the 30-nm fiber is not considered the basic structure of chromatin (e.g., review, PMID: 30908980; original papers, PMID: 19064912, 22343941, 28751582). This point should be included.
We have corrected the references as needed. Additionally, thank you for the updated information regarding the 30-nm fiber. We have removed all the descriptions about the 30-nm fiber to ensure the information is accurate and up to date.
(7) Line 83. Reviews on mitotic chromosomes, rather than Ref. 9, should be cited here. For instance, PMID: 33836947, 31230958.
We have corrected it and added references according to the reviewer’s suggestion.
(8) Line 85. Refs. 10 and 11 are not on the "Scaffold/Radial-Loop" model. For instance, PMID: 922894, 277351, 12689587. The other popular model is the hierarchical helical folding model (PMID: 98280, 15353545).
We have corrected it and added appropriate references according to the reviewer’s suggestion. Regarding the hierarchical helical folding model, our experiments do not provide data that either support or refute this model. Thus, we have opted not to include any discussion of this model in our manuscript.
(9) Figure legends. There is no description of the statistical test.
We have added the description of the statistical test at the end of the figure legends for clarity.
(10) Line 156. The authors should mention which stages in spermatocyte prophase I (pachytene?) were used for their measurement.
We cannot precisely determine the substage of prophase I in the spermatocytes although it is most likely in the pachytene stage.
(11) Line 241. "DNA damage reduces chromosome stiffness in oocytes." It would be better to show how much damage was induced in aged and etoposide-treated chromosomes, for example, by gamma-H2AX immunostaining. In addition, there are some papers that show DNA damage makes chromatin/chromosomes softer (e.g., PMID: 33330932). The authors need to cite these papers.
The effects of etoposide and age on meiotic oocytes have been published (Collins et al., 2015; Marangos et al., 2015; Winship et al., 2018).
We are grateful for the citation information provided by the reviewer and have added it to our manuscript.
Revision to manuscript:
“Overall, these findings suggest that DNA damage reduces chromosome stiffness in oocytes instead of increasing it, which aligns with other studies showing that DNA damage can make chromosomes softer (Dos Santos et al., 2021). These results suggest that the increased chromosome stiffness observed in aged oocytes is not due to DNA damage.”
(12) Line 328. Senescence?
This error is corrected in the revised manuscript.
Revision to manuscript:
“Defective chromosome organization is often related to various diseases, such as cancer, infertility, and senescence (Thompson and Compton, 2011; Harton and Tempest, 2012; He et al., 2018).”
References:
Biggs, R., P.Z. Liu, A.D. Stephens, and J.F. Marko. 2019. Effects of altering histone posttranslational modifications on mitotic chromosome structure and mechanics. Mol. Biol. Cell. 30:820–827. doi:10.1091/mbc.E18-09-0592.
Biggs, R.J., N. Liu, Y. Peng, J.F. Marko, and H. Qiao. 2020. Micromanipulation of prophase I chromosomes from mouse spermatocytes reveals high stiffness and gel-like chromatin organization. Commun. Biol. 3:1–7. doi:10.1038/s42003-020-01265-w.
Cai, X., J.M. Stringer, N. Zerafa, J. Carroll, and K.J. Hutt. 2023. Xrcc5/Ku80 is required for the repair of DNA damage in fully grown meiotically arrested mammalian oocytes. Cell Death Dis. 14:1–9. doi:10.1038/s41419-023-05886-x.
Collins, J.K., S.I.R. Lane, J.A. Merriman, and K.T. Jones. 2015. DNA damage induces a meiotic arrest in mouse oocytes mediated by the spindle assembly checkpoint. Nat. Commun. 6. doi:10.1038/ncomms9553.
Harton, G.L., and H.G. Tempest. 2012. Chromosomal disorders and male infertility. Asian J. Androl. 14:32–39. doi:10.1038/aja.2011.66.
He, Q., B. Au, M. Kulkarni, Y. Shen, K.J. Lim, J. Maimaiti, C.K. Wong, M.N.H. Luijten, H.C. Chong, E.H. Lim, G. Rancati, I. Sinha, Z. Fu, X. Wang, J.E. Connolly, and K.C. Crasta. 2018. Chromosomal instability-induced senescence potentiates cell non-autonomous tumourigenic effects. Oncogenesis. 7. doi:10.1038/s41389-018-0072-4.
Hopkins, J., G. Hwang, J. Jacob, N. Sapp, R. Bedigian, K. Oka, P. Overbeek, S. Murray, and P.W. Jordan. 2014. Meiosis-Specific Cohesin Component, Stag3 Is Essential for Maintaining Centromere Chromatid Cohesion, and Required for DNA Repair and Synapsis between Homologous Chromosomes. PLoS Genet. 10:e1004413. doi:10.1371/journal.pgen.1004413.
Lee, C., J. Leem, and J.S. Oh. 2023. Selective utilization of non-homologous end-joining and homologous recombination for DNA repair during meiotic maturation in mouse oocytes. Cell Prolif. 56:1–12. doi:10.1111/cpr.13384.
Lee, J., S. Ogushi, M. Saitou, and T. Hirano. 2011. Condensins I and II are essential for construction of bivalent chromosomes in mouse oocytes. Mol. Biol. Cell. 22:3465–3477. doi:10.1091/mbc.E11-05-0423.
Marangos, P., and J. Carroll. 2012. Oocytes progress beyond prophase in the presence of DNA damage. Curr. Biol. 22:989–994. doi:10.1016/j.cub.2012.03.063.
Marangos, P., M. Stevense, K. Niaka, M. Lagoudaki, I. Nabti, R. Jessberger, and J. Carroll. 2015. DNA damage-induced metaphase I arrest is mediated by the spindle assembly checkpoint and maternal age. Nat. Commun. 6:1–10. doi:10.1038/ncomms9706.
Poirier, M.G., and J.F. Marko. 2002. Mitotic chromosomes are chromatin networks without a mechanically contiguous protein scaffold. Proc. Natl. Acad. Sci. U. S. A. 99:15393–15397. doi:10.1073/pnas.232442599.
Pope, L.H., C. Xiong, and J.F. Marko. 2006. Proteolysis of Mitotic Chromosomes Induces Gradual and Anisotropic Decondensation Correlated with a Reduction of Elastic Modulus and Structural Sensitivity to Rarely Cutting Restriction Enzymes. Mol. Biol. Cell. 17:104. doi:10.1091/MBC.E05-04-0321.
Dos Santos, Á., A.W. Cook, R.E. Gough, M. Schilling, N.A. Olszok, I. Brown, L. Wang, J. Aaron, M.L. Martin-Fernandez, F. Rehfeldt, and C.P. Toseland. 2021. DNA damage alters nuclear mechanics through chromatin reorganization. Nucleic Acids Res. 49:340–353. doi:10.1093/nar/gkaa1202.
Sun, M., R. Biggs, J. Hornick, and J.F. Marko. 2018. Condensin controls mitotic chromosome stiffness and stability without forming a structurally contiguous scaffold. Chromosom. Res. 26:277–295. doi:10.1007/s10577-018-9584-1.
Thompson, S.L., and D.A. Compton. 2011. Chromosomes and cancer cells. Chromosom. Res. 19:433–444. doi:10.1007/s10577-010-9179-y.
Ward, A., J. Hopkins, M. Mckay, S. Murray, and P.W. Jordan. 2016. Genetic Interactions Between the Meiosis-Specific Cohesin Components, STAG3, REC8, and RAD21L. G3 (Bethesda). 6:1713–24. doi:10.1534/g3.116.029462.
Winship, A.L., J.M. Stringer, S.H. Liew, and K.J. Hutt. 2018. The importance of DNA repair for maintaining oocyte quality in response to anti-cancer treatments, environmental toxins and maternal ageing. Hum. Reprod. Update. 24:119–134. doi:10.1093/humupd/dmy002.
Xu, H., M.D. Beasley, W.D. Warren, G.T.J. van der Horst, and M.J. McKay. 2005. Absence of Mouse REC8 Cohesin Promotes Synapsis of Sister Chromatids in Meiosis. Dev. Cell. 8:949–961. doi:10.1016/j.devcel.2005.03.018.
Author response:
The following is the authors’ response to the previous reviews.
Reviewer #1:
(1) I still think that the authors need to set the importance of the differences in aggregation in the context of toxicity arising from protein misfolding/aggregation. While the authors state the limitation in the response, and I agree that a single manuscript cannot complete a field of investigation I still think that this is an important point missing from this manuscript.
We thank the reviewer for the comments. We are working to address this issue and will elucidate it in our future studies.
(2) I retain my reservations about the fluorescence intensity data shown for Rho123, DCF, JC1, and MitoSox. The errors are much lower than what we typically achieve in biological experiments in our own lab and in our collaborator's lab. A glimpse at the published literature would also support this statement. Specifically, Rho123 shows a large difference in errors between Figure 5 and Figure 5 Supplement 2. The point to note is that the absolute intensities do not vary between these figures, but the errors are an order of magnitude lower in the main figures. I, therefore, accept these figures in good faith without further interrogation.
We value these comments from the reviewer and do not want to cause any misleading interpretation of the data. We have therefore asked a more experienced author to repeat all the experiments on the physiological indicators (Rho123, JC1 and MitoSox) that directly reflect mitochondrial function, and have left out the DCF data. The new experimental data are in line with our previous results. We have clearly described these changes in the Results, Materials and Methods and Figure legends sections.
The new data from the repeated experiments are: Rho123 fluorescence intensity data in Figure 5A, B and C; Figure 6B; JC1 staining in Figure 6E; JC1 staining in Figure 7A, B and D.
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
This paper introduces a new approach to modeling human behavioral responses using image-computable models. They create a model (VAM) that is a combination of a standard CNN coupled with a standard evidence accumulation model (EAM). The combined model is then trained directly on image-level data using human behavioral responses. This approach is original and can have wide applicability. However, many of the specific findings reported are less compelling.
Strengths:
(1) The manuscript presents an original approach to fitting an image-computable model to human behavioral data. This type of approach is sorely needed in the field.
(2) The analyses are very technically sophisticated.
(3) The behavioral data are large both in terms of sample size (N=75) and in terms of trials per subject.
Weaknesses:
Major
(1) The manuscript appears to suggest that it is the first to combine CNNs with evidence accumulation models (EAMs). However, this was done in a 2022 preprint (https://www.biorxiv.org/content/10.1101/2022.08.23.505015v1) that introduced a network called RTNet. This preprint is cited here, but never really discussed. Further, the two unique features of the current approach discussed in lines 55-60 are both present to some extent in RTNet. Given the strong conceptual similarity in approach, it seems that a detailed discussion of similarities and differences (of which there are many) should feature in the Introduction.
Thanks for pointing this out—we agree that the novel contributions of our model (the VAM) with respect to prior related models (including RTNet) should be clarified, and have revised the Introduction accordingly. We include the following clarifications in the Introduction:
“The key feature of the VAM that distinguishes it from prior models is that the CNN and EAM parameters are jointly fitted to the RT, choice, and visual stimulus data from individual participants in a unified Bayesian framework. Thus, both the visual representations learned by the CNN and the EAM parameters are directly constrained by behavioral data. In contrast, prior models first optimize the CNN to perform the behavioral task, then separately fit a minimal set of high-level CNN parameters [RTNet, Rafiei et al., 2024] and/or the EAM parameters to behavioral data [Annis et al., 2021; Holmes et al., 2020; Trueblood et al., 2021]. As we will show, fitting the CNN with human data—rather than optimizing the model to perform a task—has significant consequences for the representations learned by the model.”
For example, in the case of RTNet, the variability of the Bayesian CNN weight distribution, the decision threshold, and the magnitude of the noise added to the images are adjusted to match the average human accuracy (separately for each task condition). RTNet is an interesting and useful model that we believe has complementary strengths to our own work.
Since there are several other existing models in addition to the VAM and RTNet that use CNNs to generate RTs or RT proxies (by our count, at least six that we cite earlier in the Introduction), we felt it was inappropriate to preferentially include a detailed comparison of the VAM and RTNet beyond the passage quoted above.
(2) In the approach here, a given stimulus is always processed in the same way through the core CNN to produce activations v_k. These v_k's are then corrupted by Gaussian noise to produce drift rates d_k, which can differ from trial to trial even for the same stimulus. In other words, the assumption built into VAM appears to be that the drift rate variability stems entirely from post-sensory (decisional) noise. In contrast, the typical interpretation of EAMs is that the variability in drift rates is sensory. This is also the assumption built into RTNet where the core CNN produces noisy evidence. Can the authors comment on the plausibility of VAM's assumption that the noise is post-sensory?
In our view, the VAM is compatible with a model in which the drift rate variability for a given stimulus is due to sensory noise, since we do not specify the origin of the Gaussian noise added to the drift rates. As the reviewer notes, the CNN component of the VAM processes a given stimulus deterministically, yielding the mean drift rates. This does not preclude us from imagining an additional (unmodeled) sensory process that adds variability to the drift rates. The VAM simply represents this and other hypothetical sources of variability as additive Gaussian noise. We agree however that it is worthwhile to think about the origin of the drift rate variability, though it is not a focus of our work.
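To make the noise assumption concrete, here is a minimal sketch of the generative step described above: a deterministic set of mean drift rates v_k, standing in for the CNN output, is perturbed on each trial by additive Gaussian noise to give d_k. The function name, the mean drift values, and the noise standard deviation are hypothetical, and the sketch is agnostic about whether the noise is sensory or decisional in origin.

```python
import random


def sample_drift_rates(mean_drifts, noise_sd, rng):
    """Add independent Gaussian noise to each mean drift rate v_k to obtain d_k."""
    return [rng.gauss(v, noise_sd) for v in mean_drifts]


rng = random.Random(0)  # seeded so the sketch is repeatable

# Hypothetical mean drift rates for one stimulus, one per response option.
# In the model described above these would come from the CNN.
mean_drifts = [1.2, 0.4]
noise_sd = 0.3  # hypothetical noise level

# The same stimulus yields different drift rates on different trials.
trial_a = sample_drift_rates(mean_drifts, noise_sd, rng)
trial_b = sample_drift_rates(mean_drifts, noise_sd, rng)
print("trial A drift rates:", trial_a)
print("trial B drift rates:", trial_b)

# Averaged over many trials, the drift rates approach the deterministic means.
n_trials = 10000
samples = [sample_drift_rates(mean_drifts, noise_sd, rng) for _ in range(n_trials)]
average_drifts = [sum(s[k] for s in samples) / n_trials for k in range(len(mean_drifts))]
print("average drift rates:", average_drifts)
```

Because the noise is added after the deterministic stage, the same code describes any source of trial-to-trial variability that is approximately Gaussian at the level of the drift rates.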
(3) Figure 2 plots how well VAM explains different behavioral features. It would be very useful if the authors could also fit simple EAMs to the data to clarify which of these features are explainable by EAMs only and which are not.
In our view, fitting simple EAMs to the data would not be especially informative and poses a number of challenges for the particular task we study (LIM) that are neatly avoided by using the VAM. In particular, as we show in Figure 2, the stimuli vary along several dimensions that all appear to influence behavior: horizontal position, vertical position, layout, target direction, and flanker direction. Since the VAM is stimulus-computable, fitting the VAM automatically discovers how all of these stimulus features influence behavior (via their effect on the drift rates outputted by the CNN). In contrast, fitting a simple EAM (e.g. the LBA model) necessitates choosing a particular parameterization that specifies the relationship between all of the stimulus features and the EAM model parameters. This raises a number of practical questions. For example, should we attempt to fit a separate EAM for each stimulus feature, or model all stimulus features simultaneously?
Moreover, while we could in principle navigate these issues and fit simple EAMs to the data, we do not intend to claim that simple EAMs fail to explain the relationship between stimulus features and behavior as well as the VAM. Rather, the key strength of the VAM relative to simple EAMs is that it includes a detailed and biologically plausible model of human vision. The majority of the paper capitalizes on this strength by showing how behavioral effects of interest (namely congruency effects) can be explained in terms of the VAM’s visual representations.
(4) VAM is tested in two different ways behaviorally. First, it is tested to what extent it captures individual differences (Figure 2B-E). Second, it is tested to what extent it captures average subject data (Figure 2F-J). It wasn't clear to me why for some metrics only individual differences are examined and for other metrics only average human data is examined. I think that it will be much more informative if separate figures examine average human data and individual difference data. I think that it's especially important to clarify whether VAM can capture individual differences for the quantities plotted in Figures 2F-J.
We would like to clarify that Fig. 2J in fact already shows how well the VAM captures individual differences for the average subject data shown in Fig. 2H (stimulus layout) and Fig. 2I (stimulus position). For a given participant and stimulus feature, we calculated the Pearson's r between model/participant mean RTs across each stimulus feature value. Fig. 2J shows the distribution of these Pearson’s r values across all participants for stimulus layout and horizontal/vertical position.
Fig. 2G also already shows how well the VAM captures individual differences in behavior. Specifically, this panel shows individual differences in mean RT attributable to differences in age. For Fig. 2F, which shows how the model drift rates differ on congruent vs. incongruent trials, there is no sensible way to compare the models to the participants at any level of analysis (since the participants do not have drift rates).
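The per-participant correlation described for Fig. 2J can be sketched as follows. This is an illustration under assumed data, not the analysis code: the `pearson_r` helper and the mean RT values per stimulus feature level are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical mean RTs (seconds) for one participant and for the model
# fitted to that participant, one value per level of a stimulus feature.
participant_mean_rt = [0.62, 0.66, 0.71, 0.69, 0.64]
model_mean_rt = [0.60, 0.65, 0.72, 0.68, 0.63]

r = pearson_r(participant_mean_rt, model_mean_rt)
print(f"Pearson's r for this participant: {r:.3f}")
```

Repeating this calculation for every participant and stimulus feature would give the kind of distribution of r values that the response describes.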
(5) The authors look inside VAM and perform many exploratory analyses. I found many of these difficult to follow since there was little guidance about why each analysis was conducted. This also made it difficult to assess the likelihood that any given result is robust and replicable. More importantly, it was unclear which results are hypothesized to depend on the VAM architecture and training, and which results would be expected in performance-optimized CNNs. The authors train and examine performance-optimized CNNs later, but it would be useful to compare those results to the VAM results immediately when each VAM result is first introduced.
Thanks for pointing this out—we apologize for any confusion caused by our presentation of the CNN analyses. We have added in additional motivating statements, methodological clarifications, and relevant references to our Results, particularly for Figure 3 in which we first introduce the analyses of the CNN representations/activity. In general, each analysis is prefaced by a guiding question or specific rationale, e.g. “How do the models' visual representations enable target selectivity for stimuli that vary along several irrelevant dimensions?” We also provide numerous references in which these analysis techniques have been used to address similar questions in CNNs or the primate visual cortex.
We chose to maintain the current organization of our results in which the comparison between the VAM and the task-optimized models are presented in a separate figure. We felt that including analyses of both the VAM and task-optimized models in the initial analyses of the CNN representations would be overwhelming for many readers. As the reviewer acknowledges, some readers may already find these results challenging to follow.
(6) The authors don't examine how the task-optimized models would produce RTs. They say in lines 371-2 that they "could not examine the RT congruency effect since the task-optimized models do not generate RTs." CNNs alone don't generate RTs, but RTs can easily be generated from them using the same EAM add-on that is part of VAM. Given that the CNNs are already trained, I can't see a reason why the authors can't train EAMs on top of the already trained CNNs and generate RTs, so these can provide a better comparison to VAM.
We appreciate this suggestion, but we judge the suggestion to “train EAMs on top of the already trained CNNs and generate RTs” to be a significant expansion of the scope of the paper with multiple possible roads forward. In particular, one must specify how the outputs of the task-optimized CNN (logits for each possible response) relate to drift rates, and there is no widely-accepted or standard way to do this. Previously proposed methods include transforming representation distances in the last layer to drift rates (https://doi.org/10.1037/xlm0000968), fitting additional subject-specific parameters that map the logits to drift rates (https://doi.org/10.1007/s42113-019-00042-1), or using the softmax-scored model outputs as drift rates directly (https://doi.org/10.1038/s41562-024-01914-8), though in the latter case the RTs are not on the same scale as human data. In our view, evaluating these different methods is beyond the scope of this paper. An advantage of the VAM is that one does not have to fit two separate models (a CNN and an EAM) to generate RTs.
Nonetheless, we agree that it would be informative to examine something like RTs in the task-optimized models. Our revised Results section now includes an analysis of the confidence of the task-optimized models’ decisions, which we use as a proxy for RTs:
“Since the task-optimized models do not generate RTs, it is not possible to directly measure RT congruency effects in these models without making additional assumptions about how the CNN's classification decisions relate to RTs. However, as a coarse proxy for RT, we can examine the confidence of the CNN's decisions, defined as the softmax-scored logit (probability) of the most probable direction in the final CNN layer. This choice of RT proxy is motivated by some prior studies that have combined CNNs with EAMs [Annis et al., 2021; Holmes et al., 2020; Trueblood et al., 2021]. These studies explicitly or implicitly derive a measure of decision confidence from the activity of the last CNN layer. The confidence measure is then mapped to the EAM drift rates, such that greater decision confidence generally corresponds to higher drift rates (and therefore shorter RTs).
We calculated the average confidence of each task-optimized CNN separately for congruent vs. incongruent trials. On average, the task-optimized models showed higher confidence on congruent vs. incongruent trials (W = 21.0, p < 1e-3, Wilcoxon signed-rank test; Cohen's d = 0.99; n = 75 models). These analyses therefore provide some evidence that task-optimized CNNs have the capacity to exhibit congruency effects, though an explicit comparison of the magnitude of these effects with human data requires additional modeling assumptions (e.g., fitting a separate EAM).”
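The confidence measure quoted above (the softmax probability of the most probable direction) can be sketched in a few lines. This is an illustrative sketch only: the four-element logit vectors and the `(logits, is_congruent)` trial format are assumptions, and the Wilcoxon test and effect size reported in the quoted passage are not reproduced here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence(logits):
    """Decision confidence: probability assigned to the most probable response."""
    return max(softmax(logits))


def mean_confidence_by_congruency(trials):
    """Average confidence separately for congruent and incongruent trials.

    `trials` is an iterable of (logits, is_congruent) pairs (hypothetical format).
    Returns a dict keyed by True (congruent) and False (incongruent).
    """
    sums = {True: [0.0, 0], False: [0.0, 0]}
    for logits, is_congruent in trials:
        acc = sums[bool(is_congruent)]
        acc[0] += confidence(logits)
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items() if count}
```

Computing this pair of means once per model gives the paired values that a signed-rank test, such as the one reported in the quoted passage, would compare.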
(7) The Discussion felt very long and mostly a summary of the Results. I also couldn't shake the feeling that it had many just-so stories related to the variety of findings reported. I think that the section should be condensed and the authors should be clearer about which explanations are speculations and which are air-tight arguments based on the data.
We have shortened the Discussion modestly and added language that indicates which arguments are more speculative and which are directly supported by our data.
Specifically, we added in the phrase “we speculate that…” for two suggestions in the Discussion (paragraphs 3 and 5), and we ensured that any other more speculative suggestions contain such clarifying language. We have also added in subheadings in the Discussion to help readers navigate this section.
(8) In one of the control analyses, the authors train different VAMs on each RT quantile. I don't understand how it can be claimed that this approach can serve as a model of an individual's sensory processing. Which of the 5 sets of weights (5 VAMs) captures a given subject's visual processing? Are the authors saying that the visual system of a given subject changes based on the expected RT for a stimulus? I feel like I'm missing something about how the authors think about these results.
We agree that these particular analyses may cause confusion and have removed them from our revised manuscript.
Reviewer #2 (Public Review):
In an image-computable model of speeded decision-making, the authors introduce and fit a combined CNN-EAM (a 'VAM') to flanker-task-like data. They show that the VAM can fit mean RTs and accuracies as well as the congruency effect that is present in the data, and subsequently analyze the VAM in terms of where in the network congruency effects arise.
Overall, combining DNNs and EAMs appears to be a promising avenue to seriously model the visual system in decision-making tasks compared to the current practice in EAMs. Some variants have been proposed or used before (e.g., doi.org/10.1016/j.neuroimage.2017.12.078 , doi.org/10.1007/s42113-019-00042-1), but always in the context of using task-trained models, rather than models trained on behavioral data. However, I was surprised to read that the authors developed their model in the context of a conflict task, rather than a simpler perceptual decision-making task. Conflict effects in human behavior are particularly complex, and thereby, the authors set a high goal for themselves in terms of the to-be-explained human behavior. Unfortunately, the proposed VAM does not appear to provide a great account of conflict effects that are considered fundamental features of human behavior, like the shape of response time distributions, and specifically, delta plots (doi.org/10.1037/0096-1523.20.4.731). The authors argue that it is beyond the scope of the presented paper to analyze delta plots, but as these are central to studies of human conflict behavior, models that aim to explain conflict behavior will need to be able to fit and explain delta plots.
Theories on conflict often suggest that negative/positive-trending delta plots arise through the relative timing of response activation related to relevant and irrelevant information.
Accumulation for relevant and irrelevant information would, as a result, either start at different points in time or the rates vary over time. The current VAM, as a feedforward neural network model, does not appear to be able to capture such effects, and perhaps fundamentally not so: accumulation for each choice option is forced to start at the same time, and rates are a static output of the CNN.
The proposed solution of fitting five separate VAMs (one for each of five RT quantiles) is not satisfactory: it does not explain how delta plots result from the model, for the same reason that fitting five evidence accumulation models (one per RT quantile) does not explain how response time distributions arise. If, for example, one would want to make a prediction about someone's response time and choice based on a given stimulus, one would first have to decide which of the five VAMs to use, which is circular. But more importantly, this way of fitting multiple models does not explain the latent mechanism that underlies the shape of the delta plots.
As such, the extensive analyses on the VAM layers and the resulting conclusions that conflict effects arise due to changing representations across layers (e.g., "the selection of task-relevant information occurs through the orthogonalization of relevant and irrelevant representations"), while inspiring, remain hard to weigh, as they are contingent on the assumption that the VAM can capture human behavior in the conflict task, which it struggles with. That said, the promise of combining CNNs and EAMs is clearly there. A way forward could be to either adjust the proposed model so that it can explain delta plots, which would potentially require temporal dynamics and time-varying evidence accumulation rates, or perhaps to start simpler and combine CNN-EAMs that are able to fit more standard perceptual decision-making tasks without conflict effects.
We thank the reviewer for their thoughtful comments on our work. However, we note that the VAM does in fact capture the positive-trending RT delta plot observed in the participant data (Fig. S4A), though the intercepts for models/participants differ somewhat. On the other hand, the conditional accuracy functions (Fig. S4B) reveal a more pronounced difference between model and participant behavior. As the reviewer points out, capturing these effects is likely to require a model that can produce time-varying drift rates, whereas our model produces a fixed drift rate for a given stimulus. We also agree that fitting a separate VAM to each RT quantile is not a satisfactory means of addressing this limitation and have removed these analyses from our revised manuscript.
However, while we agree that accurately capturing these dynamic effects is a laudable goal, it is in our view also worthwhile to consider explanations for the mean behavioral effect (i.e. the accuracy congruency effect), which can occur independently of any consideration of dynamics. One of our main findings is that across-model variability in accuracy congruency effects is better attributed to variation in representation geometry (target/flanker subspace alignment) vs. variation in the degree of flanker suppression. This finding does not require any consideration of dynamics to be valid at the level of explanation we pursue (across-user variability in congruency effects), but also does not preclude additional dynamic processes that could give rise to more specific error patterns. Our revised discussion now includes a section where we summarize and elaborate on these ideas:
“It is not difficult to imagine how the orthogonalization mechanism described above, which explains variability in accuracy congruency effects across individuals, could act in concert with other dynamic processes that explain variability in congruency effects within individuals (e.g., as a function of RT). In general, any process that dynamically gates the influence of irrelevant sensory information on behavioral outputs could accomplish this, for example ramping inhibition of incorrect response activation [https://doi.org/10.3389/fnhum.2010.00222], a shrinking attention spotlight [https://doi.org/10.1016/j.cogpsych.2011.08.001], or dynamics in neural population-level geometry [https://doi.org/10.1038/nn.3643]. To pursue these ideas, future work may aim to incorporate dynamics into the visual component and decision component of the VAM with recurrent CNNs [https://doi.org/10.48550/arXiv.1807.00053, https://doi.org/10.48550/arXiv.2306.11582] and the task-DyVA model [https://doi.org/10.1038/s41562-022-01510-8], respectively.”
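For readers less familiar with the RT delta plots discussed in this exchange, the sketch below shows one common way to compute them: the RT congruency effect (incongruent minus congruent) is evaluated within successive RT quantile bins and plotted against the mean RT of each bin. The equal-count binning scheme and the default of five bins are assumptions made for illustration; the exact procedure behind Fig. S4A is not given in this response.

```python
def delta_plot(congruent_rts, incongruent_rts, n_bins=5):
    """Return (mean RT, incongruent - congruent RT) for each RT quantile bin.

    Each condition is sorted and split into `n_bins` roughly equal-count bins
    (an assumed binning scheme); each condition needs at least `n_bins` trials.
    """
    def bin_means(rts):
        ordered = sorted(rts)
        n = len(ordered)
        edges = [round(i * n / n_bins) for i in range(n_bins + 1)]
        means = []
        for lo, hi in zip(edges[:-1], edges[1:]):
            chunk = ordered[lo:hi]
            means.append(sum(chunk) / len(chunk))
        return means

    congruent = bin_means(congruent_rts)
    incongruent = bin_means(incongruent_rts)
    return [((c + i) / 2, i - c) for c, i in zip(congruent, incongruent)]
```

A positive-trending delta plot corresponds to the second element of each pair growing across bins, that is, a larger congruency effect for slower responses.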
Reviewer #3 (Public Review):
Summary:
In this article, the authors combine a well-established choice-response time (RT) model (the Linear Ballistic Accumulator) with a CNN model of visual processing to model image-based decisions (referred to as the Visual Accumulator Model - VAM). While this is not the first effort to combine these modeling frameworks, it uses this combination of approaches uniquely.
Specifically, the authors attempt to better understand the structure of human information representations by fitting this model to behavioral (choice-RT) data from a classic flanker task. This objective is made possible by using a very large (by psychological modeling standards) industry data set to jointly fit both components of this VAM model to individual-level data. Using this approach, they illustrate (among other results) (1) how the interaction between target and flanker representations influence the presence and strength of congruency effects, (2) how the structure of representations changes (distributed versus more localized) with depth in the CNN model component, and (3) how different model training paradigms change the nature of information representations. This work contributes to the ML literature by demonstrating the value of training models with richer behavioral data. It also contributes to cognitive science by demonstrating how ML approaches can be integrated into cognitive modeling. Finally, it contributes to the literature on conflict modeling by illustrating how information representations may lead to some of the classic effects observed in this area of research.
Strengths:
(1) The data set used for this analysis is unique and is made publicly available as part of this article. Specifically, they have access to data for 75 participants with >25,000 trials per participant. This scale of data/individual is unusual and is the foundation on which this research rests.
(2) This is the first time, to my knowledge, that a model combining a CNN with a choice-RT model has been jointly fit to choice-RT data at the level of individual people. This type of model combination has been used before but in a more restricted context. This joint fitting, and in particular, learning a CNN through the choice-RT modeling framework, allows the authors to probe the structure of human information representations learned directly from behavioral data.
(3) The analysis approaches used in this article are state-of-the-art. The training of these models is straightforward given the data available. The interesting part of this article (opinion of course) is the way in which they probe what CNN has learned once trained. I find their analysis of how distractor and target information interfere with each other particularly compelling as well as their demonstration that training on behavioral data changes the structure of information representations when compared to training models on standard task-optimized data.
Weaknesses:
(1) Just as the data in this article is a major strength, it is also a weakness. This type of modeling would be difficult, if not impossible to do with standard laboratory data. I don't know what the data floor would be, but collecting tens of thousands of decisions for a single person is impractical in most contexts. Thus this type of work may live in the realm of industry. I do want to re-iterate that the data for this study was made publicly available though!
We suspect (but have not systematically tested) that the VAMs can be fitted with substantially less data. We use data augmentation techniques (various randomized image transformations) during training to improve the generalization capabilities of the VAMs, and these methods are likely to be particularly important when training on smaller datasets. One could consider increasing the amount of image data augmentation when working with smaller datasets, or pursuing other forms of data augmentation like resampling from estimated RT distributions (see https://doi.org/10.1038/s41562-022-01510-8 for an example of this). In general, we don’t think that prospective users of our approach should be discouraged if they have only a few hundred trials per subject (or less) - it’s worth trying!
(2) While this article uses choice-RT data it doesn't fully leverage the richness of the RT data itself. As the authors point out, this modeling framework, the LBA component in particular, does not account for some of the more nuanced but well-established RT effects in this data. This is not a big concern given the already nice contributions of this article and it leads to an opportunity for ongoing investigation.
We agree that fully capturing the more nuanced behavioral effects you mention (e.g. RT delta plots and conditional accuracy functions) is a worthwhile goal for future research; see our response to Reviewer #2 for a more detailed discussion.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
(1) The phrase in the Abstract "convolutional neural network models of visual processing and traditional EAMs are jointly fitted" made me initially believe that the two models were fitted independently. You may want to re-word to clarify.
We think that the phrase “jointly fitted” already makes it clear that both the CNN and EAM parameters are estimated simultaneously, in agreement with how this term is usually used. But we have nonetheless appended some additional clarifying language to that sentence (“in a unified Bayesian framework”).
(2) Lines 27-28: EAMs "are the most successful and widely-used computational models of decision-making." This is only true for the specific type of decision-making examined here, namely joint modeling of choice and response times. Signal detection theory is arguably more widely-used when response times are not modeled.
Thanks for pointing this out - we have revised the referenced sentence accordingly.
(3) Could the authors clarify what is plotted in Figure 2F?
Fig. 2F shows the drift rates for the target, flanker, and “other” (non-target/non-flanker) accumulators averaged over trials and models for congruent vs. incongruent trials. In case this was a source of confusion, we do not show the value of the flanker drift rates on congruent trials because the flanker and target accumulators are identical (i.e. the flanker/congruent drift rates are equivalent to the target/congruent drift rates).
(4) Lines 214-7: "The observation that single-unit information for target direction decreased between the fourth and final convolutional layers while population-level decoding remained high is especially noteworthy in that it implies a transition from representing target direction with specialized "target neurons" to a more distributed, ensemble-level code." Can the authors clarify why this is the only reasonable explanation for these results? It seems like many other explanations could be construed.
We have added additional clarification to this section and now use more tentative language:
“The observation that single-unit information for target direction decreased between the fourth and final convolutional layers indicates that the units become progressively less selective for particular target directions. Since population-level decoding remained high in these layers, this suggests a transition from representing target direction with specialized "target neurons" to a more distributed, ensemble-level code.”
(5) Lines 372-376: "Thus, simply training the model to perform the task is not sufficient to reproduce a behavioral phenomenon widely-observed in conflict tasks. This challenges a core (but often implicit) assumption of the task-optimized training paradigm, namely that training a model to do a task well will result in model representations that are similar to those employed by humans." While I agree with the general sentiment, I feel that its application here is strange. Unless I'm missing something, in the context of the preceding sentence, the authors seem to be saying that researchers in the field expect that CNNs can produce a behavioral phenomenon (RTs) that is completely outside of their design and training. I don't think that anyone actually expects that.
We moved the discussion/analyses of RTs to the next paragraph. It should now be clear that this statement refers specifically to the absence of an accuracy congruency effect in the task-optimized models.
(6) Lines 387-389: "As a result, the VAMs may learn richer representations of the stimuli, since a variety of stimulus features-layout, stimulus position, flanker direction-influence behavior (Figure 2)." That is certainly true of tasks like this one where an optimal model would only focus on a tiny part of the image, whereas humans are distracted by many features. I'm not sure that this distractibility is the same as "richer representations". When CNNs classify images based on the background, would the authors claim that they have richer representations than humans?
We agree that “richer” may not be the best way to characterize these representations, and have changed it to “more complex”.
(7) Is it possible that drift rate d_k for each response happens to be negative on a given trial? If so, how is the decision given on such trials (since presumably none of the accumulators will ever reach the boundary)?
It is indeed possible for all of the drift rates to be negative, though we found that this occurred for a vanishingly small number of trials (mean ± s.e.m. percent trials/model: 0.080 ± 0.011%, n = 75 models), as reported in the Methods. These trials were excluded from analyses.
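The edge case discussed in this exchange can be made concrete with a sketch of a single race between linear ballistic accumulators. This is a generic textbook-style LBA trial under stated assumptions, not the VAM implementation: the per-trial drift rates are taken as given, and the threshold, start-point range, and non-decision time are hypothetical values chosen for the example.

```python
import random


def lba_trial(drifts, threshold=1.0, max_start=0.5, t0=0.2, rng=random):
    """Simulate one LBA race and return (choice_index, rt), or None.

    `drifts` holds the per-trial drift rate of each accumulator. Accumulators
    with non-positive drift never reach the threshold, so they are skipped.
    If every drift is non-positive, no response is produced and None is
    returned; such trials would be excluded from analysis.
    All default parameter values are hypothetical.
    """
    finish_times = []
    for index, drift in enumerate(drifts):
        if drift <= 0:
            continue
        start = rng.uniform(0.0, max_start)
        finish_times.append(((threshold - start) / drift, index))
    if not finish_times:
        return None
    decision_time, choice = min(finish_times)
    return choice, t0 + decision_time
```

Counting the fraction of simulated trials that return `None` gives the analogue of the small percentage of excluded trials reported in the response.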
(8) Can the authors comment on how they chose the CNN architecture and whether they expect that different architectures will produce similar results?
Before establishing the seven-layer CNN architecture used throughout the paper, we conducted some preliminary experiments using other architectures that differed primarily in the number of CNN layers. We found that models with significantly fewer than seven layers typically failed to reach human-level accuracy on the task while larger models achieved human-level accuracy but (unsurprisingly) took longer to train.
Reviewer #3 (Recommendations For The Authors):
- In the introduction to this paper (particularly the paragraph beginning in line 33), the authors note that EAMs have typically been used in simplified settings and that they do not provide a means to account for how people extract information from naturalistic stimuli. While I agree with this, the idea of connecting CNNs of visual processing with EAMs for a joint modeling framework has been done. I recommend looking at and referencing these two articles as well as adjusting the tenor of this part of an introduction to better reflect the current state of the literature. For full disclosure, I am one of the authors on these articles. https://link.springer.com/article/10.1007/s42113-019-00042-1 https://www.sciencedirect.com/science/article/abs/pii/S0010027721001323
We agree—thanks for pointing this out. The revised Introduction now discusses prior related models in more detail (including those referenced above) and better clarifies the novel contributions of our model. We specifically highlight that a novel contribution of the VAM is that “the CNN and EAM parameters are jointly fitted to the RT, choice, and visual stimulus data from individual participants in a unified Bayesian framework.”
- The statement in lines 56-58 implies that this is the first article to glue CNNs together with EAMs. I would edit this accordingly based on the prior comment here and references provided. I will note that the second feature of the approach in this paper is still novel and really nice, namely the fact that the CNN and the EAM are jointly fitted. In the aforementioned references, the CNN is trained on the image set, and individual level Bayesian estimation was only applied to the EAM. Thus, it may be useful to highlight the joint estimation aspect of this investigation as well as how the uniqueness of the data available makes it possible.
Agreed—see above.
- Figure 3c and associated text. I understand the MI analysis you are performing here, however it is difficult to interpret as it stands. In the figure, what does a MI of 0.1 mean?? Can you give some context to that scale? I do find the interpretation of the hunchback shape in lines 210-222 to be somewhat of a stretch. The discussion that precedes (lines 199-209) this is clear and convincing. Can this discussion be strengthened more? And more interpretability of Figure 3c would be helpful; entropic scales can be hard to interpret without some context or scale associated.
The MI analyses in Fig. 3C (and also Figs. 4C and 6E) show normalized MI, in which the raw MI has been divided by the entropy of the stimulus feature distribution. This normalization facilitates comparing the MI for different stimulus features, which is relevant for Figs. 4C and 6E. The normalized MI has a possible range of [0, 1], where 1 indicates perfect correlation between the two variables and 0 indicates complete independence. We now note in the legend of these figures that the possible normalized MI range is [0, 1], which should help with interpreting these values. Our revised results section for Fig. 3C now also includes some additional remarks on our interpretation of the hunchback shape of the MI.
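The normalization described above (raw MI divided by the entropy of the stimulus feature distribution) can be sketched for discrete variables as follows. The sketch assumes that unit responses have already been discretized into bins; how the activations were discretized is not stated in this response, so that step and the function names are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mutual_information(x, y):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        p_joint = c / n
        mi += p_joint * math.log2(p_joint / ((count_x[a] / n) * (count_y[b] / n)))
    return mi


def normalized_mi(responses, feature):
    """MI between binned responses and a stimulus feature, divided by the
    entropy of the feature, so that values lie in [0, 1]."""
    h_feature = entropy(feature)
    if h_feature == 0:
        return 0.0
    return mutual_information(responses, feature) / h_feature
```

Under this definition a value of 1 means the binned response fully determines the stimulus feature, and a value of 0 means the two are independent, which gives a reference scale for reading a value such as 0.1.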
- Lines 244-248 and the analyses in Figure 3 suggest a change in the behavior of the CNN around layer 4. This is just a musing, but what would happen if you just used a 4 layer CNN, or even a 3 layer? This is not just a methods question. Your analysis suggests a transition from localized to distributed information representation. Right now, the EAM only sees the output of the distributed representation. What if it saw the results of the more local representations from early layers? Of course, a shallower network may just form the distributed representations earlier, but it would be interesting if there were a way to tease out not just the presence of distributed vs local representations, but the utility of those to the EAM.
Thanks for this interesting suggestion. We did do some preliminary experiments in models with fewer layers, though we only examined the outputs of these models and did not assess their representations. We found that models with 3–5 layers generally failed to achieve human-level accuracy on the task. In principle, one could relate this observation to the representations of these models as a means of assessing the relative utility of distributed/local representations. However, there are confounding factors that one would ideally control for in order to compare models with different numbers of layers in this fashion (namely, the number of parameters).
- Section Line 359 (Task optimized models) - It would be helpful to clarify here what these task-optimized models are being trained to do. As I understand it, they are being trained to directly predict the target direction. But are you asking them to learn to predict the true target direction? Or are you training them to predict what each individual responds? I think it is the second (since you have 75 of these), but it's not clear. I looked at the methods and still couldn't get a clear description of this. Also, are you just stripping the LBA off of the end of the CNN and then essentially putting a softmax in its place? If so, it would be helpful to say so.
The task-optimized models were actually trained to output the true target direction in each stimulus, rather than trained to match the decisions of the human participants. We trained 75 such models since we wanted to use exactly the same stimuli as were used to train each VAM. The task-optimized CNNs were identical to those used in the VAMs, except that the outputs of the last layer were converted to softmax-scored probabilities for each direction rather than drift rates. The Results and Methods sections now include additional commentary that clarifies these points.
- Line 373-376: This statement is pretty well established at this point in the similarity judgement literature. I recommend looking at and referencing https://onlinelibrary.wiley.com/doi/full/10.1111/cogs.13226 https://www.nature.com/articles/s41562-020-00951-3 https://link.springer.com/article/10.1007/s42113-020-00073-z
Thanks for pointing this out. For reference, the statement in question is “Thus, simply training the model to perform the task is not sufficient to reproduce a behavioral phenomenon widely-observed in conflict tasks. This challenges a core (but often implicit) assumption of the task-optimized training paradigm, namely that training a model to do a task well will result in model representations that are similar to those employed by humans.”
We agree that the first and third reference you mention are relevant, and we now cite them along with some other relevant work. In our view, the second reference you mention is not particularly relevant (that paper introduces a new computational model for similarity judgements that is fit to human data, but does not comment on training models to perform tasks vs. fitting to human data).
- Line 387-388: "VAMs may learn richer representations". This is a bit of a philosophical point, but I'll go ahead and mention it. The standard VAM does not necessarily learn "richer" feature representations. Rather, you are asking the VAM and task-optimized models to do different things. As a result, they learn different representations. "Better" or "richer" is in the eye of the beholder. In one view, you could view the VAM performance as sub-par since it exhibits strange artifacts (congruency effects) and the expansion of dimensionality in the VAM representations is merely a side-effect of poor performance. I'm not advocating this view, just playing devil's advocate and suggesting a more nuanced discussion of the difference between the VAM and task-optimized models.
We agree—this is a great point. We have changed this statement to read “the VAMs may learn more complex [rather than richer] representations of the stimuli”.
- Lines 567-570: Here you discuss how the LBA backend of the VAM can't account for shrinking spotlight-like RT effects but that fitting models to different RT quantiles helps overcome this. I find this to be one of the weakest points of the paper (the whole process of fitting RT quantiles separately to begin with). This is just a limitation of the RT component of the model. This is a great paper but this is just a limitation inherent in the model. I don't see a need to qualify this limitation and think it would be better to just point out that this is a limitation of the LBA itself (be more clear that it is the LBA that is the limiting factor here) and that this leaves room for future research. From your last sentence of this paragraph, I agree that recurrent CNNs would be interesting. I will note that RNN choice-RT models are out there (though not with CNNs as part of the model).
We agree and have revised this section of the Discussion accordingly (see our response to Reviewer #2 for more detail). We also removed the analyses of models trained on separate RT quantiles.
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the current reviews.
eLife Assessment
The study presents a potentially valuable approach to genetically modify cells to produce extracellular matrices with altered compositions, termed cell-laid, engineered extracellular matrices (eECM). The evidence supporting the authors' conclusions regarding the utility of eECM for endogenous repair is solid, although there are some disagreements on the chondrogenicity of lyophilized constructs which was viewed as lacking robust evidence for endochondral ossification.
We thank the reviewers for the assessment of our work. However, we strongly contest the claimed lack of evidence for chondrogenicity and endochondral ossification. This is robustly demonstrated and a clear strength of our study.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The authors aimed to modify the characteristics of the extracellular matrix (ECM) produced by immortalized mesenchymal stem cells (MSCs) by employing the CRISPR/Cas9 system to knock out specific genes. Initially, they established VEGF-KO cell lines, demonstrating that these cells retained chondrogenic and angiogenic properties. Additionally, lyophilized cartilage tissues produced by these cells exhibited retained osteogenic properties.
Subsequently, the authors established RUNX2-KO cell lines, which exhibited reduced COLX expression during chondrogenic differentiation and notably diminished osteogenic properties in vitro. Transplantation of lyophilized cartilage tissues produced by RUNX2-KO cell lines into osteochondral defects in rat knee joints resulted in the regeneration of articular cartilage tissues as well as bone tissues, a phenomenon not observed with tissues derived from parental cells. This suggests that gene-edited MSCs represent a valuable cell source for producing ECM with enhanced quality.
Strengths:
The enhanced cartilage regeneration observed with ECM derived from RUNX2-KO cells supports the authors' strategy of creating gene-edited MSCs capable of producing ECM with superior quality. Immortalized cell lines offer a limitless source of off-the-shelf material for tissue regeneration.
Weaknesses:
Most of the data align with anticipated outcomes, offering limited novelty to advance scientific understanding. Methodologically, the chondrogenic differentiation properties of immortalized MSCs appeared deficient, evidenced by Safranin-O staining of 3D tissues and histological findings lacking robust evidence for endochondral differentiation. This presents a critical limitation, particularly as authors propose the implantation of cartilage tissues for in vivo experiments. Instead, the bulk of data stemmed from type I collagen scaffold with factors produced by MSCs stimulated by TGFβ.
We thank the reviewer for the thorough evaluation. We appreciate the highlighted novelty but overall disagree with key points of the provided assessment, the most important being the contested in vitro cartilage formation and endochondral ossification by engineered ECMs, for which we have provided compelling evidence. Of note, the reviewer points to the “osteogenic” properties of our tissues; this wording is incorrect since cells are absent from the final grafts. Here, the term “osteoinductivity” should be employed, in line with the model of ectopic ossification used to demonstrate de novo bone formation.
In the revised version, the authors presented Safranin-O staining results of pellets prior to lyophilization. The inset of figures showing entire pellets revealed that Safranin-O-positive areas were limited, suggesting that cells in the negative regions had not differentiated into chondrocytes. In Figure 3F, DAPI staining showed devitalized cells in the outer layer but was negative in the central part, indicating the absence of cells in these areas and incomplete differentiation induction.
We strongly disagree with the reviewer on the lack of demonstrated chondrogenicity. We have provided evidence of Safranin-O positivity, GAG quantification, as well as collagen type 2 and collagen type X stainings (also quantified). Frankly, those are gold-standard assays in the field and we do not understand the reviewer's point of view. We do, however, agree that our grafts are not entirely composed of cartilage matrix. There are areas where cartilage is absent, in particular in the core of the tissues. This is expected from in vitro engineered cartilage pellets, even from primary BM-MSC donors. By selecting primary donors, it is possible to obtain superior cartilage formation. Our MSOD-B cells remain, to the best of our knowledge, the only human line capable of in vitro chondrogenesis, even if it is considered moderate.
We agree with the absence of cells in the core area of our tissues, as correctly pointed out by the reviewer. This has been reported in other studies whereby the lack of media diffusion can lead to necrotic core formation.
The rationale for establishing VEGF-KO cell lines remains unclear, and the authors' explanation in the revised manuscript is still equivocal. While they mention that VEGF is a late marker for endochondral ossification, the data in Figures 1D and 1E clearly show that VEGF-KO affects the early phase of endochondral ossification.
We feel that the rationale for a VEGF-KO is sufficiently conveyed. In our study, VEGF-KO affects GAGs content in the tissue, but not the efficiency of ossification.
Insufficient depth was given to elucidate the disparity in osteogenic properties between those observed in ectopic bone formation and those observed in transplantation into osteochondral defects.
We here agree with the reviewer on the limited depth of our osteochondral assessment. However, this was performed as a proof-of-concept, and we clearly conveyed both the limitations and the need for a follow-up study to demonstrate the repair efficacy of our tissue in such a defect context.
In the ectopic bone formation study, most of the collagenous matrix observed at 2 weeks was resorbed by 6 weeks, with only a small amount contributing to bone formation in MSOD-B cells (Figs. 2I and 4C). This finding does not align with the micro-CT data presented in Figures 2H and 4B. For the micro-CT experiments, it would be more appropriate to use a standard window for bone and present the data accordingly.
Stainings report the deposition of collagens and may be misleading, as they do not exclusively indicate frank bone formation. This is the reason why we provided microCT data, which offer a quantitative assessment of the full grafts and more reliably evaluate mineralized/bone tissue. We feel that our results match our conclusions.
While the regeneration of articular cartilage in RUNX2-KO ECM presents intriguing results, the study lacked an exploration into underlying mechanisms, such as histological analyses at earlier time points.
We do agree with the reviewer regarding this limitation. In addition to mechanisms and early timepoints, we are also interested in longer in vivo evaluation. This represents a significant amount of work which is beyond the scope of our present manuscript.
Reviewer #3 (Public review):
Summary:
In this study, the authors have started off using an immortalized human cell line and then gene edited it to decrease the levels of VEGF1 (in order to influence vascularization), and the levels of Runx2 (to decrease osteogenesis). They first transplanted these cells with a collagen scaffold. The modified cells showed a decrease in vascularization when VEGF1 was decreased, and suggested an increase in cartilage formation.
In another study, matrix generated by these cells subsequently remodeled into a bone marrow organ. When RUNX2 was decreased, the cells did not mineralize in vitro, and their matrices expressed types I and II collagen but not type X collagen in vitro, in comparison with unedited cells. In vivo, the author claims that remodeling of the matrices into bone was somewhat inhibited. Lastly, they utilized matrices generated by RUNX2-edited cells to regenerate chondro-osteal defects. They suggest that the edited cells regenerated cartilage in comparison with unedited cells.
Strengths:
- The notion that inducing changes in the ECM by genetically editing the cells is a novel one, as it has long been thought that ECM composition influences cell activity.
- If successful, it may be possible to make off the shelf ECMS to carry out different types of tissue repair.
Weaknesses:
- The authors have not demonstrated robust cartilage formation (quantitation would be useful).
- Measuring total GAG content does not prove the presence of cartilage
- There are numerous overstatements about forming and implanting cartilage.
- Although it is implied, RUNX2 deletion did not improve cartilage formation by the modified cells.
- In the control line, MSOD-B, there was variability in the amount of safranin O positive material in various histological panels in the figures; more quantitation is needed.
- In the in vivo articular defect experiments, an untreated injured joint is needed as a negative control.
- Statements about bone generation are often not reflective of the microCT data presented.
- The discussion over-interprets the results.
We thank the reviewer for the further assessment of our work. We respectfully disagree with most of the provided statements. The chondrogenicity of our graft is robustly demonstrated using multiple readouts, including quantitative ones. Beyond GAGs, we provided clear Safranin-O stainings, as well as collagen type 2 and X stainings indicating the presence of hypertrophic cartilage matrix. Those are the gold standards in the field and we thus do not understand the reviewer's scepticism. We do agree that our grafts are not fully composed of cartilage matrix, with areas (in the core) deprived of cartilage. This does not impact the core findings of our study and its conclusions, and we strongly feel our statements about forming in vitro cartilage fully stand.
We do not claim in the manuscript an increased cartilage formation following RUNX2 deletion. We report in vitro an impaired hypertrophy (collagen type X) and maintenance of collagen type 2 and GAGs content.
We are confident in our data regarding de novo bone formation by priming endochondral ossification, confirmed both by stainings and microCT. We feel that our claims are well-supported.
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
The authors aimed to modify the characteristics of the extracellular matrix (ECM) produced by immortalized mesenchymal stem cells (MSCs) by employing the CRISPR/Cas9 system to knock out specific genes. Initially, they established VEGF-KO cell lines, demonstrating that these cells retained chondrogenic and angiogenic properties. Additionally, lyophilized cartilage tissues produced by these cells exhibited retained osteogenic properties.
Subsequently, the authors established RUNX2-KO cell lines, which exhibited reduced COLX expression during chondrogenic differentiation and notably diminished osteogenic properties in vitro. Transplantation of lyophilized cartilage tissues produced by RUNX2-KO cell lines into osteochondral defects in rat knee joints resulted in the regeneration of articular cartilage tissues as well as bone tissues, a phenomenon not observed with tissues derived from parental cells. This suggests that gene-edited MSCs represent a valuable cell source for producing ECM with enhanced quality.
Strengths:
The enhanced cartilage regeneration observed with ECM derived from RUNX2-KO cells supports the authors' strategy of creating gene-edited MSCs capable of producing ECM with superior quality. Immortalized cell lines offer a limitless source of off-the-shelf material for tissue regeneration.
We thank the reviewer for the interest in our work. We however want to clarify that the present manuscript does not report the generation of ECM with “superior quality”, but rather of modulated composition and thus function.
Weaknesses:
Most data align with anticipated outcomes, offering limited novelty to advance scientific understanding. Methodologically, the chondrogenic differentiation properties of immortalized MSCs appeared deficient, evidenced by Safranin-O staining of 3D tissues and histological findings lacking robust evidence for endochondral differentiation. This presents a critical limitation, particularly as authors propose the implantation of cartilage tissues for in vivo experiments. Instead, the bulk of data stemmed from type I collagen scaffold with factors produced by MSCs stimulated by TGFβ.
The chondrogenic differentiation of our MSOD-B line and its capacity to undergo endochondral ossification have been robustly demonstrated in previous studies (Pigeot et al., Advanced Materials 2021 and Grigoryan et al., Science Translational Medicine 2022). In the present manuscript, we thus compare the chondrogenic capacity of the newly established VEGF-KO and RUNX2-KO lines to that of MSOD-B cells. We demonstrate by qualitative (Safranin-O staining, Collagen type 2 and Collagen type X immuno-stainings) and quantitative (glycosaminoglycans assay) assays that the generated tissues consist of cartilage grafts of similar quality to the MSOD-B counterpart. Of note, the Safranin-O stainings were performed on lyophilized tissues, which can alter the staining quality/intensity. We now provide additional stainings of generated tissues pre-lyophilization. This is implemented in Figure 1D and Figure 3D.
The rationale behind establishing VEGF-KO cell lines remains unclear. What specific outcomes did the authors anticipate from this modification?
VEGF is a known master regulator of angiogenesis and a key mediator of endochondral ossification. It has also been extensively used in bone tissue engineering studies as a supplemented factor – primarily in the form of VEGFα – to increase the vascularization and thus outcome of bone formation of engineered grafts (https://www.nature.com/articles/s42003-020-01606-9, https://www.sciencedirect.com/science/article/pii/S8756328216301752). In our study, it was thus identified as a natural candidate to demonstrate the possibility to generate VEGF-KO cartilage and subsequently assess the functional impact on both the angiogenic and osteogenic potential of resulting cartilage tissue. This is now clarified in the manuscript (page 3, paragraph 4).
Insufficient depth was given to elucidate the disparity in osteogenic properties between those observed in ectopic bone formation and those observed in transplantation into osteochondral defects. While the regeneration of articular cartilage in RUNX2-KO ECM presents intriguing results, the study lacked an exploration into underlying mechanisms, such as histological analyses at earlier time points.
Using RUNX2-KO ECM, we aimed at demonstrating the impact on cartilage remodeling and bone formation. This was performed ectopically but also in the rat osteochondral defect as a regenerative set-up of higher clinical relevance. We agree with the reviewer that additional experimental groups and time-points (not only earlier but also longer ones) would offer a better mechanistic understanding of the ECM contribution to the joint repair. However, as stated in our manuscript this is a proof-of-concept study that successfully demonstrated the influence of the cartilage ECM modification on the in vivo skeletal regeneration. A follow-up study would need to be performed to complement existing evidence and strengthen the relevance of our approach for cartilage repair. This is now further emphasized in the discussion (page 11, paragraph 3).
Reviewer #2 (Public Review):
The manuscript submitted by Sujeethkumar et al. describes an alternative approach to skeletal tissue repair using extracellular matrix (ECM) deposited by genetically modified mesenchymal stromal/stem cells. Here, they generate loss-of-function mutations in VEGF or RUNX2 in a BMP2-overexpressing MSC line and define the differences in the resulting tissue-engineered constructs following seeding onto a type I collagen matrix in vitro, and following lyophilization and subcutaneous and orthotopic implantation into mice and rats. Some strengths of this manuscript are the establishment of a platform by which modifications in cell-derived ECM can be evaluated both in vitro and in vivo, the demonstration that genetic modification of cells results in complexity of in vitro cell-derived ECM that elicits quantifiable results, and the admirable goal to improve endogenous cartilage repair. However, I recommend the authors clarify their conclusions and add more information regarding reproducibility, which was one limitation of primary-cell-derived ECMs.
We thank the reviewer for the positive evaluation of our work.
Overcoming the limitations of native/autologous/allogeneic ECMs such as complete decellularization and reduction of batch-to-batch variability was not specifically addressed in the data provided herein. For the maintenance of ECM organization and complexity following lyophilization, evidence of complete decellularization was not addressed, but could be easily evaluated using polarized light microscopy and quantification of human DNA for example in constructs pre and post-lyophilization.
We appreciate the reviewer comments and acknowledge the lack of information in the first version of our manuscript. In line with our previous study (Pigeot et al., Advanced Materials 2021), the ectopic evaluation of our cartilage pellets was strictly done with lyophilized tissues using immunocompromised animals. Lyophilized tissues are thus considered devitalized, and not decellularized. Instead, the osteochondral defect experiment was performed with decellularized tissues in order to be able to implant the grafts in the rat immuno-competent model. This is now specified consistently throughout the manuscript. The decellularization process is also now incorporated accordingly in the method section (page 14, paragraph 2). We also provide quantifications of GAGs and DNAs from tissue pre- and post-decellularization (Supplementary figure 6A and 6B), described in the result section of the manuscript (page 9, paragraph 1). The decellularization step led to 97-98% of DNA removal.
Importantly, we do not claim full maintenance of ECM integrity following lyophilization nor decellularization. This is now clarified in the discussion (page 12, paragraph 2). However, we report their capacity to instruct skeletal regeneration in multiple contexts despite extensive processing.
It would be ideal to see minimization of batch-to-batch variability using this approach, as mitigation of using a sole cell line is likely not sufficient (considering that the sole cell line-derived Matrigel does exhibit batch-to-batch and manufacturer-to-manufacturer variability). I recommend adding details regarding experimental design and outcomes not initially considered. Inter- and intraexperimental reproducibility was not adequately addressed. The size of in vitro-derived cartilage pellets was not quantified, and it is not clear that more than one independent 'differentiation' was performed from each gene-edited MSC line to generate in vitro replicates and constructs that were implanted in vivo.
We thank the Reviewer for the comment on the variability/reproducibility concern. Using a cell line does confer higher robustness but indeed does not grant unlimited consistency of batch production. We now temper our claims in the discussion and mention the need to regularly recharacterize cell line properties upon passaging (page 12, paragraph 2). Using our edited lines, we have generated multiple batches of cartilage grafts for their in vitro characterization or in vivo performance assessment. We have now compiled batch variations of GAG content and pellet volume, provided as Supplementary figure 5. This revealed that batches are indeed not identical (nor is each pellet), but the production remains consistent.
The use of descriptive language in describing conclusions may mislead the reader and should be modified accordingly throughout the manuscript. For example, although this reviewer agrees with the comparative statements made by the authors regarding parental and gene-edited MSC lines, non-quantifiable terms such as 'frank' 'superior' (example, line 242) are inappropriate and should rather be discussed in terms of significance. Another example is 'rich-collagenous matrix,' which was not substantiated by uniform immunostaining for type II collagen (line 189).
We thank the Reviewer for the constructive suggestions. We have revised the language accordingly throughout the manuscript.
I have similar recommendations regarding conclusive statements from the rat implantation model, which was appropriately used for the purpose of evaluating the response of native skeletal cells to the different cell-derived ECMs. Interpretations of these results should be described with more accuracy. For example, increased TRAP staining does not indicate reduced active bone formation (line 237). Many would not conclude that GAGs were retained in the RUNX2-KO line graft subchondral region based on the histology. Quantification of % chondral regeneration using histology is not accurate as it is greatly influenced by the location in the defect from which the section was taken. Chondral regeneration is usually semi-quantified from gross observations of the cartilage surface immediately following excision. The statements regarding integration (example line 290) are not founded by histological evidence, which should show high magnification of the periphery of the graft adjacent to the native tissue.
We have revised our language relative to the TRAP staining description (page 9, paragraph 2). We also agree with the reviewer on the semi-quantitative approach of our methodology, which we transparently disclosed both in the main text (page 9, paragraph 3) and method section (page 18, paragraph 2). The sectioning location does influence the analysis, but to mitigate this we performed the assessment at different depths (top, middle, bottom for each sample). This is now implemented in our method section (page 18, paragraph 3). On the tissue integration, we now provide higher magnification images of the implant/host tissue area (Figure 5F).
Reviewer #3 (Public Review):
Summary:
In this study, the authors have started off using an immortalized human cell line and then gene-edited it to decrease the levels of VEGF1 (in order to influence vascularization), and the levels of Runx2 (to decrease chondro/osteogenesis). They first transplanted these cells with a collagen scaffold. The modified cells showed a decrease in vascularization when VEGF1 was decreased, and suggested an increase in cartilage formation.
In another study, the matrix generated by these cells was subsequently remodeled into a bone marrow organ. When RUNX2 was decreased, the cells did not mineralize in vitro, and their matrices expressed types I and II collagen but not type X collagen in vitro, in comparison with unedited cells. In vivo, the author claims that remodeling of the matrices into bone was somewhat inhibited. Lastly, they utilized matrices generated by RUNX2 edited cells to regenerate chondro-osteal defects. They suggest that the edited cells regenerated cartilage in comparison with unedited cells.
Strengths:
- The notion that inducing changes in the ECM by genetically editing the cells is a novel one, as it has long been thought that ECM composition influences cell activity.
- If successful, it may be possible to make off-the-shelf ECMS to carry out different types of tissue repair.
We thank the Reviewer for the critical evaluation of our work and the highlighted novelty of it.
Weaknesses:
- The authors have not generated histologically identifiable cartilage or bone in their transplants of the cells with a type I scaffold.
The chondrogenic differentiation of our MSOD-B line and its capacity to undergo endochondral ossification have been robustly demonstrated in previous studies (Pigeot et al., Advanced Materials 2021 and Grigoryan et al., Science Translational Medicine 2022). In the present manuscript, we thus compare the chondrogenic capacity of the newly established VEGF-KO and RUNX2-KO lines to that of MSOD-B. We demonstrate by qualitative (Safranin-O staining, Collagen type 2 and Collagen type X immuno-stainings) and quantitative (glycosaminoglycans assay) assays that the generated tissues consist of cartilage tissue of similar quality to MSOD-B. Of note, the Safranin-O stainings were performed on lyophilized tissues, which can alter the staining quality/intensity. We now provide here additional stainings of generated tissues pre-lyophilization. This is implemented in Figure 1D and Figure 3D.
On the contested formation of bone in vivo by our ECM grafts, we have provided compelling qualitative evidence via Masson's Trichrome stainings and quantification of mineralized volume by µCT. Both cortical bone and trabecular structures were identified ectopically. Those are standard evaluation methods in the field; we would be happy to receive additional suggestions from the Reviewer.
- In many cases, they did not generate histologically identifiable cartilage with their cell-free-edited scaffold. They did generate small amounts of bone but this is most likely due to BMPs that were synthesized by the cells and trapped in the matrix.
We now appreciate that the Reviewer agrees on the successful formation of bone induced by our engineered grafts. We however still respectfully disagree with the “small amount of bone” statement since our MSOD-B and MSOD-B VEGF KO cartilage grafts led to the full generation of a mature ectopic bone organ (that is, also composed of extensive marrow). This has been assessed qualitatively and quantitatively.
We agree with the Reviewer on the key role of BMP-2 in the remodeling process into bone and bone marrow, which we have extensively described in our previous publication (Pigeot et al., Advanced Materials 2021). However, the low amount of BMP-2 (in the range of dozens of nanograms per tissue) embedded in the matrix is not sufficient per se to induce ectopic endochondral ossification. It is the combined presence of GAGs in the matrix, and thus of cartilage, that allows the success of bone formation.
- There is a great deal of missing detail in the manuscript.
We have incorporated additional methodological details describing the lyophilization/decellularization process of our tissues prior to evaluation (see Material and Methods section). We also have included a description of the MSOD-B line and implemented genetic elements (Supplementary Figure 1A).
- The in vivo study is underpowered, the results are not well documented pictorially, and are not convincing.
We believe our group size supports our conclusions confirmed by statistical assessment. We have provided additional stainings and images of higher magnifications (Figure 5) for both the ectopic and orthotopic in vivo evaluation.
- Given the fact that they have genetically modified cells, they could have done analyses of ECM components to determine what was different between the lines, both at the transcriptome and the protein level. Consequently, the study is purely descriptive and does not provide any mechanistic understanding of what mixture of matrix components and growth factors works best for cartilage or bone. But this presupposes that they actually induced the formation of bona fide cartilage, at least.
We thank the Reviewer for the suggestion. However, our study did not aim at understanding which ECM graft composition works best for cartilage or bone regeneration, respectively. Instead, we propose the exploitation of our cellular tools to interrogate the function of key ECM constituents and their impact on skeletal regeneration. We once more confirm that we generated cartilage grafts, which is now better supported by additional histological assessment before lyophilization.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
In their previous publication (Dong et al. Cell Reports 2024), the authors showed that citalopram treatment resulted in reduced tumor size by binding to the E380 site of GLUT1 and inhibiting the glycolytic metabolism of HCC cells, instead of the classical citalopram receptor. Given that C5aR1 was also identified as the potential receptor of citalopram in the previous report, the authors focused on exploring the potential of the immune-dependent anti-tumor effect of citalopram via C5aR1. C5aR1 was found to be expressed on tumor-associated macrophages (TAMs) and citalopram administration showed potential to improve the stability of C5aR1 in vitro. Through macrophage depletion and adoptive transfer approaches in HCC mouse models, the data demonstrated the potential importance of C5aR1-expressing macrophage in the anti-tumor effect of citalopram in vivo. Mechanistically, their in vitro data suggested that citalopram may regulate the phagocytosis potential and polarization of macrophages through C5aR1. Next, they tried to investigate the direct link between citalopram and CD8+T cells by including an additional MASH-associated HCC mouse model. Their data suggest that citalopram may upregulate the glycolytic metabolism of CD8+T cells, probably via GLUT3 but not GLUT1-mediated glucose uptake. Lastly, as the systemic 5-HT level is down-regulated by citalopram, the authors analyzed the association between a low 5-HT and a superior CD8+T cell function against a tumor. Although the data is informative, the rationale for working on additional mechanisms and logical links among different parts is not clear. In addition, some of the conclusions are also not fully supported by the current data.
Thanks very much for your insightful evaluation and the constructive suggestions. We have thoroughly studied the comments and a provisional point-by-point response is shown as follows.
Strengths:
The idea of repurposing clinical-in-used drugs showed great potential for immediate clinical translation. The data here suggested that the anti-depression drug, citalopram displayed an immune regulatory role on TAM via a new target C5aR1 in HCC.
Thank you for your constructive comments. We believe that further investigation into the mechanisms by which citalopram modulates TAM function could provide valuable insights into its potential role in HCC therapy.
Weaknesses:
(1) The authors concluded that citalopram had a 'potential immune-dependent effect' based on the tumor weight difference between Rag-/- and C57 mice in Figure 1. However, tumor weight differences may also be attributed to a non-immune regulatory pathway. In addition, how do the authors calculate relative tumor weight? What is the rationale for using relative one but not absolute tumor weight to reflect the anti-tumor effect?
We appreciate your insights into the potential contributions of non-immune regulatory pathways to the observed tumor weight differences between Rag-/- and C57 mice, and we will further address this issue in our discussion. The relative tumor weight was calculated by assigning an arbitrary value of 1 to the Rag1-/- mice in the DMSO treatment group, with all other tumor weights expressed relative to this baseline. As suggested, we will include absolute tumor weight data in our revised manuscript.
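As an illustration only, the normalization described above can be sketched as follows. This is a minimal sketch under stated assumptions: the baseline is taken to be the mean weight of the Rag1-/- DMSO group (the response does not state whether a mean or another summary was used), the function name `relative_weights` is hypothetical, and the numbers are made up for demonstration rather than taken from the study.

```python
from statistics import mean


def relative_weights(weights_by_group, reference_group):
    """Express every tumor weight relative to the mean of a reference group.

    Assumption: the reference group's mean is the value set to 1.
    """
    baseline = mean(weights_by_group[reference_group])
    if baseline <= 0:
        raise ValueError("reference group mean must be positive")
    return {
        group: [w / baseline for w in weights]
        for group, weights in weights_by_group.items()
    }


# Hypothetical tumor weights in grams, for illustration only (not study data).
example = {
    "Rag1-/- DMSO": [0.8, 1.0, 1.2],
    "Rag1-/- citalopram": [0.6, 0.8],
    "C57 DMSO": [0.5, 0.7],
}
relative = relative_weights(example, "Rag1-/- DMSO")
```

By construction the reference group averages to 1, so a relative value reports only the fold difference from that group and hides the absolute tumor burden, which is why reporting absolute weights alongside is informative.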
(2) The authors used shSlc6a4 tumor cell lines to demonstrate that citalopram's effects are independent of the conventional SERT receptor (Figure 1C-F). However, this does not entirely exclude the possibility that SERT may still play a role in this context, as it can be expressed in other cells within the tumor microenvironment. What is the expression profiling of Slc6a4 in the HCC tumor microenvironment? In addition, in Figure 1F, the tumor growth of shSlc6a4 in C57 mice displayed a decreased trend, suggesting a possible role of Slc6a4.
To identify the expression patterns of Slc6a4 in different cellular contexts within the HCC tumor microenvironment, we will conduct a thorough screening of HCC datasets that include single-cell sequencing analysis. The possible role of Slc6a4 on tumor growth will be verified with in vitro loss-of-function experiments.
(3) Why did the authors choose to study phagocytosis in Figures 3G-H? As an important player, TAM regulates tumor growth via various mechanisms.
Thank you for your question. We focused on this aspect because citalopram targets C5aR1-expressing TAM. C5aR1 is a receptor for complement component C5a, and complement components play a significant role in mediating the phagocytosis process in macrophages. In the revised manuscript, we will emphasize this rationale clearly.
(4) The information on unchanged deposition of C5a has been mentioned in this manuscript (Figures 3D and 3F), the authors should explain further in the manuscript, for example, C5a could bind to receptors other than C5aR1 and/or C5a bind to C5aR1 by different docking anchors compared with citalopram.
Thank you for your insightful comment. First, we will investigate the docking anchors involved in the binding of C5a to C5aR1 and compare these interactions with those of C5aR1 and citalopram. Additionally, we will discuss the potential binding of C5a to other receptors, providing a broader perspective on the signaling mechanisms.
(5) Figure 3I-M - the flow cytometry data suggested that citalopram treatment altered the proportions of total TAM, M1 and M2 subsets, CD4+ and CD8+T cells, DCs, and B cells. Why does the author conclude that the enhanced phagocytosis of TAM was one of the major mechanisms of citalopram? As the overall TAM number was regulated, the contribution of phagocytosis to tumor growth may be limited.
As suggested, we will restate the conclusion to enhance clarity and better articulate the relationship between citalopram treatment, TAM populations, and their phagocytic activity. Thank you for your valuable input.
(6) Figure 4 - what is the rationale for using the MASH-associated HCC mouse model to study metabolic regulation in CD8+T cells? The tumor microenvironment and tumor growth would be quite different. In addition, how does this part link up with the mechanisms related to C5aR1 and TAM? The authors also brought GLUT1 back in the last part and focused on CD8+T cell metabolism, which was totally separated from previous data.
We chose the MASH-associated HCC mouse model because it closely mimics the etiology of metabolic-associated fatty liver disease (MAFLD), which is a significant contributor to the development of cirrhosis and HCC. The inclusion of CD8<sup>+</sup> T cells in our study is based on the understanding that citalopram targets GLUT1, which plays a crucial role in glucose uptake. CD8<sup>+</sup> T cell function is heavily reliant on glycolytic metabolism, making it essential to investigate how citalopram’s effects on GLUT1 influence the metabolic pathways and functionality of these immune cells. The data presented in this section primarily aim to demonstrate how citalopram influences peripheral 5-HT levels, which subsequently affects CD8<sup>+</sup> T cell functionality. By linking these findings, we will clarify how citalopram impacts both TAM and CD8<sup>+</sup> T cells. In the revised manuscript, we will enhance the background information and provide relevant data support to avoid any gaps.
(7) Figure 5, the authors illustrated their mechanism that citalopram regulates CD8+T cell anti-tumor immunity through proinflammatory TAM with no experimental evidence. Using only CD206 and MHCII to represent TAM subsets obviously is not sufficient.
As suggested, more relevant experimental data will be included in the revised manuscript to better characterize the TAM populations and their roles in mediating the effects of citalopram on CD8<sup>+</sup> T cells.
Reviewer #2 (Public review):
Summary:
Dong et al. present a thorough investigation into the potential of repurposing citalopram, an SSRI, for hepatocellular carcinoma (HCC) therapy. The study highlights the dual mechanisms by which citalopram exerts anti-tumor effects: reprogramming tumor-associated macrophages (TAMs) toward an anti-tumor phenotype via C5aR1 modulation and suppressing cancer cell metabolism through GLUT1 inhibition while enhancing CD8+ T cell activation. The findings emphasize the potential of drug repurposing strategies and position C5aR1 as a promising immunotherapeutic target. However, certain aspects of experimental design and clinical relevance could be further developed to strengthen the study's impact.
Thank you for your thoughtful review and constructive feedback, and we look forward to improving our manuscript accordingly.
Strength:
It provides detailed evidence of citalopram's non-canonical action on C5aR1, demonstrating its ability to modulate macrophage behavior and enhance CD8+ T cell cytotoxicity. The use of DARTS assays, in silico docking, and gene signature network analyses offers robust validation of drug-target interactions. Additionally, the dual focus on immune cell reprogramming and metabolic suppression presents a thorough strategy for HCC therapy. By emphasizing the potential for existing drugs like citalopram to be repurposed, the study also underscores the feasibility of translational applications.
Your insights reinforce the significance of our findings, and we will ensure that these points are clearly articulated in the revised manuscript to enhance its impact.
Major weaknesses/suggestions:
The dataset and signature database used for GSEA analyses are not clearly specified, limiting reproducibility. The manuscript does not fully explore the potential promiscuity of citalopram's interactions across GLUT1, C5aR1, and SERT1, which could provide a deeper understanding of binding selectivity. The absence of GLUT1 knockdown or knockout experiments in macrophages prevents a complete assessment of GLUT1's role in macrophage versus tumor cell metabolism. Furthermore, there is minimal discussion of clinical data on SSRI use in HCC patients. Incorporating survival outcomes based on SSRI treatment could strengthen the study's translational relevance.
By addressing these limitations, the manuscript could make an even stronger contribution to the fields of cancer immunotherapy and drug repurposing.
We appreciate your valuable suggestions. As suggested, we will take the following actions:
(1) GSEA analysis: we will clearly specify the datasets and signature databases used for the GSEA in the revised manuscript.
(2) Exploration of binding selectivity: we recognize the importance of exploring the potential promiscuity of citalopram’s interactions across GLUT1, C5aR1, and SERT1. As suggested, we will include a more detailed analysis of these interactions, which will help elucidate binding selectivity and its implications for therapeutic outcomes.
(3) GLUT1 knockdown in macrophages: to address the gap in our assessment of GLUT1’s role in macrophages, we will incorporate GLUT1 knockdown or knockout experiments in macrophages upon citalopram treatment. Moreover, a DARTS assay for GLUT1 in THP-1 cells will be conducted.
(4) Clinical data on SSRI use in HCC patients: Related data have been reported previously in PMID: 39388353 (Cell Rep. 2024 Oct 22;43(10):114818.). As detailed below:
“SSRIs use is associated with reduced disease progression in HCC patients
We determined whether SSRIs for alleviating HCC are supported by real-world data. A total of 3061 patients with liver cancer were extracted from the Swedish Cancer Register. Among them, 695 patients had been administrated with post-diagnostic SSRIs. The Kaplan-Meier survival analysis suggested that patients who utilized SSRIs exhibited a significantly improved metastasis-free survival compared to those who did not use SSRIs, with a P value of log-rank test at 0.0002. Cox regression analysis showed that SSRI use was associated with a lower risk of metastasis (HR = 0.78; 95% CI, 0.62-0.99).”
Author response image 1.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
In a previous work, Prut and colleagues had shown that during reaching, high-frequency stimulation of the cerebellar outputs resulted in reduced reach velocity. Moreover, they showed that the stimulation produced reaches that deviated from a straight line, with the shoulder and elbow movements becoming less coordinated. In this report, they extend their previous work by the addition of modeling results that investigate the relationship between the kinematic changes and torques produced at the joints. The results show that the slowing is not due to reductions in interaction torques alone, as the reductions in velocity occur even for movements that are single joints. More interestingly, the experiment revealed evidence for the decomposition of the reaching movement, as well as an increase in the variance of the trajectory.
Strengths:
This is a rare experiment in a non-human primate that assessed the importance of cerebellar input to the motor cortex during reaching.
Weaknesses:
My major concerns are described below.
If I understand the task design correctly, the monkeys did not need to stop their hand at the target. I think this design may be suboptimal for investigating the role of the cerebellum in control of reaching because a number of earlier works have found that the cerebellum's contributions are particularly significant as the movement ends, i.e., stopping at the target. For example, in mice, interposed nucleus neurons tend to be most active near the end of the reach that requires extension, and their activation produces flexion forces during the reach (Becker and Person 2019). Indeed, the inactivation of interposed neurons that project to the thalamus results in overshooting of reaching movements (Low et al. 2018). Recent work has also found that many Purkinje cells show a burst-pause pattern as the reach nears its endpoint, and stimulation of the mossy fibers tends to disrupt endpoint control (Calame et al. 2023). Thus, the fact that the current paper has no data regarding endpoint control of the reach is puzzling to me.
We appreciate the reviewer’s point that cerebellar contributions can be particularly critical near the endpoint of a reach. In our current task design, monkeys were indeed required to hold at the target briefly (100 ms for Monkeys S and P, and 150 ms for Monkeys C and M) before receiving a reward. However, given the size of the targets and the velocity of the movements, the monkeys often did not have to stop their movement to obtain a reward. Importantly, we relaxed the task’s requirements (by increasing target size and reducing temporal constraints) to allow monkeys to perform the task under cerebellar block, as we found that stricter criteria yielded a low success rate in this condition. This design is suboptimal for studying endpoint accuracy which, as we now appreciate, is an important aspect of cerebellar control. In our revision, we will clarify these aspects of the task design and acknowledge that it is suboptimal for examining the role of the cerebellum in endpoint control. Future studies will address this point explicitly.
Because stimulation continued after the cursor had crossed the target, it is interesting to ask whether this disruption had any effects on the movements that were task-irrelevant. The reason for asking this is because we have found that whereas during task-relevant eye or tongue movements the Purkinje cells are strongly modulated, the modulations are much more muted when similar movements are performed but are task-irrelevant (Pi et al., PNAS 2024; Hage et al. Biorxiv 2024). Thus, it is interesting to ask whether the effects of stimulation were global and affected all movements, or were the effects primarily concerned with the task-relevant movements.
This is a very interesting suggestion. Although our main analysis focused on target-directed reaching movements, we have the data for the between-trial movements under continuous stimulation (e.g., return to center movements). In our revised supplementary material, we will examine the effect of cerebellar block on endpoint velocities in inter-trial movements versus task-related movements.
If the schematic in Figure 1 is accurate, it is difficult for me to see how any of the reaching movements can be termed single joint. In the paper, T1 is labeled as a single joint, and T2-T4 are labeled as dual-joint. The authors should provide data to justify this.
The reviewer is right: movements to all targets engage both the shoulder and the elbow, but the participation of each joint varied in a target-specific manner. In the manuscript, we used the term “single-joint” to indicate a target direction in which one joint remains stationary, resulting in minimal coupling torque at the adjacent joint. Specifically, for Targets 1 and 5 in our experiments, the net torque (and thus acceleration) at the elbow was negligible, and hence the shoulder experienced correspondingly low coupling torque (as illustrated in Figure 3c of our manuscript). To avoid confusion, we will use the term ‘predominantly single-joint’ movements in our revised manuscript to indicate targets with low coupling torques. We will also include an additional figure in the revised supplementary material displaying the net torques at the shoulder and elbow, similar to Figures 2c and 3c. Our goal is to demonstrate that movements to targets 1 and 5 are characterized by predominantly one-joint engagement (i.e., the elbow is stationary with low net torque) and low coupling torques, rather than implying a purely isolated, single-joint motion.
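To illustrate why a stationary elbow implies low coupling torque at the shoulder, the sketch below uses the textbook equations of motion for a two-link planar arm moving in the horizontal plane (no gravity). It is not the inverse-dynamics model used in the manuscript: the segment parameters are hypothetical, and the sign convention (net torque = muscle torque + coupling torque) is one common choice among several.

```python
from math import cos, sin


def shoulder_torques(th2, d1, d2, dd1, dd2,
                     m1=1.0, m2=0.8, l1=0.15, r1=0.07, r2=0.08,
                     I1=0.002, I2=0.0015):
    """Decompose shoulder torque for a two-link planar arm.

    th2      : elbow angle (rad), measured relative to the upper arm
    d1, d2   : shoulder and elbow angular velocities (rad/s)
    dd1, dd2 : shoulder and elbow angular accelerations (rad/s^2)
    All segment parameters (masses, lengths, inertias) are hypothetical.
    """
    # Inertia terms of the standard two-link model.
    M11 = I1 + I2 + m1 * r1**2 + m2 * (l1**2 + r2**2 + 2 * l1 * r2 * cos(th2))
    M12 = I2 + m2 * (r2**2 + l1 * r2 * cos(th2))
    h = m2 * l1 * r2 * sin(th2)

    # Net torque: the shoulder's own inertia times its acceleration.
    net = M11 * dd1
    # Coupling (interaction) torque: generated at the shoulder by elbow motion.
    coupling = -(M12 * dd2 - h * (2 * d1 * d2 + d2**2))
    # Muscle torque: what the muscles must supply so that net = muscle + coupling.
    muscle = net - coupling
    return {"net": net, "coupling": coupling, "muscle": muscle}


# Elbow stationary: coupling torque vanishes and muscle torque equals net torque.
print(shoulder_torques(th2=1.2, d1=2.0, d2=0.0, dd1=5.0, dd2=0.0))
# Elbow moving: a non-zero coupling torque appears at the shoulder.
print(shoulder_torques(th2=1.2, d1=2.0, d2=1.0, dd1=5.0, dd2=3.0))
```

In this model every coupling term at the shoulder is multiplied by the elbow's velocity or acceleration. A target direction that leaves the elbow nearly stationary therefore produces little coupling torque, even though the shoulder itself is fully engaged.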
Because at least part of this work was previously analyzed and published, information should be provided regarding which data are new.
We will include a clear statement in the Methods section specifying which components of the dataset and analyses are entirely new. While some of the same animals and stimulation protocol were presented in prior work, the inverse-dynamics modeling, analyses of progressive movement changes across trials under stimulation and invariance of motor noise to movement velocity are newly reported in this manuscript.
Reviewer #2 (Public review):
This manuscript asks an interesting and important question: what part of 'cerebellar' motor dysfunction is an acute control problem vs a compensatory strategy to the acute control issue? The authors use a cerebellar 'blockade' protocol, consisting of high-frequency stimuli applied to the cerebellar peduncle which is thought to interfere with outflow signals. This protocol was applied in monkeys performing center-out reaching movements and has been published from this laboratory in several preceding studies. I found the take-home-message broadly convincing and clarifying - that cerebellar block reduces muscle activation acutely particularly in movements that involve multiple joints and therefore invoke interaction torques, and that movements progressively slow down to in effect 'compensate' for these acute tone deficits. The manuscript was generally well written, and the data was clear, convincing, and novel. My comments below highlight suggestions to improve clarity and sharpen some arguments.
Primary comments:
(1) Torque vs. tone: Is it known whether this type of cerebellar blockade is reducing muscle tone or inducing any type of acute co-contraction that could influence limb velocity through mechanisms different than 'atonia'? If so, the authors should discuss this information in the discussion section starting around line 336, and clarify that this motivates (if it does) the focus on 'torques' rather than muscle activation. Relatedly, besides the fact that there are joints involved, is there a reason there is so much emphasis on torque per se? If the muscle is deprived of sufficient drive, it would seem that it would be more straightforward to conceptualize the deficit as one of insufficient timed drive to a set of muscles than joint force. Some text better contextualizing the choices made here would be sufficient to address this concern. I found statements like those in the introduction "hand velocity was low initially, reflecting a primary muscle torque deficit" to be lacking in substance. Either that statement is self-evident or the alternative was not made clear. Finally, emphasize that it is a loss of self-generated torque at the shoulder that accounts for the velocity deficits. At times the phrasing makes it seem that there is a loss of some kind of passive torque.
We appreciate the reviewer’s emphasis on distinguishing reduced muscle tone and altered co-contraction patterns as possible explanations for decreased limb velocity. Our focus on torques arises from previous studies suggesting that the core deficit in cerebellar ataxia is impaired prediction of coupling torques. This point will be added in the discussion section of our revised manuscript where we will explain why we prioritize muscle torques and how muscle-level activation collectively contributes to net joint torques. Also, we will underscore that the observed velocity deficits primarily reflect a reduction of self-generated torque at the shoulder (whether acute or adaptive), rather than any reduction in passive torques.
(2) Please clarify some of the experimental metrics: Ln 94 RESULTS. The success rate is used as a primary behavioral readout, but what constitutes success is not clearly defined in the methods. In addition to providing a clear definition in the methods section, it would also be helpful for the authors to provide a brief list of criteria used to determine a 'successful' movement in the results section before the behavioral consequences of stimulation are described. In particular, the time and positional error requirements should be clear.
Successful trials were those in which monkeys did not leave the center position before the go signal and reached the peripheral target within specific time criteria. These criteria varied across monkeys. We will include detailed definitions of our success criteria in the revised methods section of our manuscript. Specifically, we will update our methods section to include (i) the timing criteria of each phase of the trials and (ii) the size of the peripheral targets indicating the tolerance for endpoint accuracy.
(3) Based on the polar plot in Figure 1c, it seemed odd to consider Targets 1-4 outward and 5-8 inward movements, when 1 and 5 are side-to-side. Is there a rationale for this grouping or might results be cleaner by cleanly segregating outward (targets 2-4) and inward (targets 6-8) movements? Indeed, by Figure 3 where interaction torques are measured, this grouping would seem to align with the hypothesis much more cleanly since it is with T2,T3,and T4 where clear coupling torques deficits are seen with cerebellar block.
We acknowledge the reviewer’s observation regarding Targets 1 and 5 being side-to-side rather than strictly “outward” or “inward.” In the first section of our results, we grouped the targets in this way to emphasize the notably stronger effect of the cerebellar block on targets involving shoulder flexion (‘outward’) as compared to those involving shoulder extension (‘inwards’). For subsequent analyses we focused on the effects of cerebellar block on outward targets where movements were single-joint (Target 1) vs. multi-joint (Targets 2-4). To clarify this aspect, in our revised manuscript we will explain the rationale for grouping T1–T4 as “outward” and T5–T8 as “inward,” including how we defined them.
(4) I did not follow Figure 3d. Both the figure axis labels and the description in the main text were difficult to follow. Furthermore, the color code per animal made me question whether the linear regression across the entire dataset was valid, or would be better performed within animal, and the regressions summarized across animals. The authors should look again at this section and figure.
We will revise the figure labels and legend to clarify how each axis is defined. Please note that the data were pooled only after confirming that each animal showed a similar trend. Specifically, the correlation coefficients were positive in all four monkeys and statistically significant in three of them. Moreover, following the reviewers’ feedback, we also performed a partial correlation analysis (which controls for the variability across monkeys) and found a significant correlation (r = 0.33, p < 0.001). These points will be described in the revised manuscript.
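One way to control for between-animal variability in such an analysis is sketched below. It assumes that animal identity enters as a categorical covariate, in which case the partial correlation equals the Pearson correlation of the within-animal demeaned values. The response does not state how the partial correlation was actually computed, so the function names and toy data are purely illustrative.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def demean_within(values, groups):
    """Subtract each group's mean from that group's values."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def partial_corr_given_group(x, y, groups):
    """Correlation between x and y after removing per-group offsets."""
    return pearson(demean_within(x, groups), demean_within(y, groups))


# Toy data: two animals with different offsets but the same within-animal trend.
x = [1, 2, 3, 11, 12, 13]
y = [2, 4, 6, 0, 2, 4]
animal = ["A", "A", "A", "B", "B", "B"]
print("pooled:", pearson(x, y))
print("within-animal:", partial_corr_given_group(x, y, animal))
```

In the toy data the pooled correlation is negative while the within-animal correlation is positive. This is the kind of spurious pooled trend the reviewers asked the authors to rule out.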
(5) Line 206+ The rationale for examining movement decomposition with a cerebellar block is presented as testing the role of the cerebellum in timing. Yet it is not spelled out what movement decomposition and trajectory variability have to do with motor timing per se.
The reviewer is right: the relations between timing, decomposition and variability need to be explicitly presented. In our revision, we will explain how decomposed movements may reflect impaired temporal coordination across multiple joints, a critical cerebellar function. We will also clarify how increased variability in joint coordination can result in increased trial-to-trial variability of trajectories.
Reviewer #3 (Public review):
Summary:
In their manuscript, "Disentangling acute motor deficits and adaptive responses evoked by the loss of cerebellar output," Sinha and colleagues aim to identify distinct causes of motor impairments seen when perturbing cerebellar circuits. This goal is an important one, given the diversity of movement-related phenotypes in patients with cerebellar lesions or injuries, which are especially difficult to dissect given the chronic nature of the circuit damage. To address this goal, the authors use high-frequency stimulation (HFS) of the superior cerebellar peduncle in monkeys performing reaching movements. HFS provides an attractive approach for transiently disrupting cerebellar function previously published by this group. First, they found a reduction in hand velocities during reaching, which was more pronounced for outward versus inward movements. By modeling inverse dynamics, they find evidence that shoulder muscle torques are especially affected. Next, the authors examine the temporal evolution of movement phenotypes over successive blocks of HFS trials. Using this analysis, they find that in addition to the acute, specific effects on muscle torques in early HFS trials, there was an additional progressive reduction in velocity during later trials, which they interpret as an adaptive response to the inability to effectively compensate for interaction torques during cerebellar block. Finally, the authors examine movement decomposition and trajectory, finding that even when low-velocity reaches are matched to controls, HFS produces abnormally decomposed movements and higher than expected variability in trajectory.
Strengths:
Overall, this work provides important insight into how perturbation of cerebellar circuits can elicit diverse effects on movement across multiple timescales.
The HFS approach provides temporal resolution and enables analysis that would be hard to perform in the context of chronic lesions or slow pharmacological interventions. Thus, this study describes an important advance over prior methods of circuit disruption, and their approach can be used as a framework for future studies that delve deeper into how additional aspects of sensorimotor control are disrupted (e.g., response to limb perturbations).
In addition, the authors use well-designed behavioral approaches and analysis methods to distinguish immediate from longer-term adaptive effects of HFS on behavior. Moreover, inverse dynamics modeling provides important insight into how movements with different kinematics and muscle dynamics might be differentially disrupted by cerebellar perturbation.
Weaknesses:
The argument that there are acute and adaptive effects to perturbing cerebellar circuits is compelling, but there seems to be a lost opportunity to leverage the fast and reversible nature of the perturbations to further test this idea and strengthen the interpretation. Specifically, the authors could have bolstered this argument by looking at the effects of terminating HFS - one might hypothesize that the acute impacts on muscle torques would quickly return to baseline in the absence of HFS, whereas the longer-term adaptive component would persist in the form of aftereffects during the 'washout' period. As is, the reversible nature of the perturbation seems underutilized in testing the authors' ideas.
We agree that our approach could more explicitly exploit the rapid reversibility of high-frequency stimulation (HFS) by examining post-stimulation ‘washout’ periods. However, for the present dataset, we ended the session after the set of cerebellar block trials. We plan to study the effect of cerebellar block on immediate post-block washout trials in the future.
The analysis showing that there is a gradual reduction in velocity during what the authors call an adaptive phase is convincing. That said, the argument is made that this is due to difficulty in compensating for interaction torques. Even if the inward targets (i.e., targets 6-8) do not show a deficit during the acute phase, these targets still have significant interaction torques (Figure 3c). Given the interpretation of the data as presented, it is not clear why disruption of movement during the adaptive phase would not be seen for these targets as well since they also have large interaction torques. Moreover, it is difficult to delve into this issue in more detail, as the analyses in Figures 4 and 5 omit the inward targets.
The reviewer is right: movements to Targets 6–8 (inward) were seemingly unaffected despite also involving significant interaction torques. In fact, we attempted to address this issue in the discussion section of version 1 of our manuscript. Specifically, we note that while outward targets (2–4) tend to involve higher coupling torque impulses on average, this alone does not fully explain the differential impact of cerebellar block, as illustrated by discrepancies at the individual target level (e.g., target 7 vs. target 1). We proposed two possible explanations: (1) a bias toward shoulder flexion in the effect of cerebellar block, consistent with earlier studies showing ipsilateral flexor activation or tone changes following stimulation or lesioning of the deep cerebellar nuclei; and (2) a posture-related facilitation of inward (shoulder extension) movements from the central starting position.
The text in the Introduction and in the prior work developing the HFS approach overstates the selectivity of the perturbations. First, there is an emphasis on signals transmitted to the neocortex. As the authors state several times in the Discussion, there are many subcortical targets of the cerebellar nuclei as well, and thus it is difficult to disentangle target-specific behavioral effects using this approach. Second, the superior cerebellar peduncle contains both cerebellar outputs and inputs (e.g., spinocerebellar). Therefore, the selectivity in perturbing cerebellar output feels overstated. Readers would benefit from a more agnostic claim that HFS affects cerebellar communication with the rest of the nervous system, which would not affect the major findings of the study.
The reviewer is right that the superior cerebellar peduncle carries both descending and ascending fibers, and that cerebellar nuclei project to subcortical as well as cortical targets. However, it is also important to note that in primates the cerebello-thalamo-cortical (CTC) pathway greatly expanded (at the expense of the cerebello-rubro-spinal tract) in mediating cerebellar control of voluntary movements (Horne and Butler, 1995). The cerebello-subcortical pathways lost their importance over the course of evolution (Nathan and Smith, 1982, Padel et al., 1981, ten Donkelaar, 1988). In our previous study we found that the ascending spinocerebellar axons which enter the cerebellum through the SCP are weakly task-related and the descending system is quite small (Cohen et al, 2017). However, we cannot rule out an effect of HFS mediated in part through other systems. In the revised introduction section, we will clarify this point and use more careful language about the scope of our stimulation, emphasizing that HFS disrupts cerebellar communication broadly, rather than solely the cerebello-thalamo-cortical pathway.
The text implies that increased movement decomposition and variability must be due to noise. However, this assumption is not tested. It is possible that the impairments observed are caused by disrupted commands, independent of whether these command signals are noisy. In other words, commands could be low noise but still faulty.
We recognize the reviewer’s concern about linking movement decomposition and trial-to-trial trajectory variability with motor noise. As presented in our discussion section, we interpret these motor abnormalities as a form of motor noise in the sense that they are generated by faulty motor commands. We base this interpretation on previous research showing that the cerebellum aids in estimating the state of the limb and in subsequently generating accurate feedforward commands. Therefore, disruption of the cerebellar output may lead to faulty motor commands resulting in the observed asynchronous joint activations (i.e., movement decomposition) and unpredictable trajectories (i.e., increased trial-to-trial variability). Both observed deficits resemble increased motor noise.
Throughout the text, the use of the term 'feedforward control' seems unnecessary. To dig into the feedforward component of the deficit, the authors could quantify the trajectory errors only at the earliest time points (e.g., in Figure 5d), but even with this analysis, it is difficult to disentangle feedforward- and feedback-mediated effects when deficits are seen throughout the reach. While outside the scope of this study, it would be interesting to explore how feedback responses to limb perturbation are affected in control versus HFS conditions. However, as is, these questions are not explored, and the claim of impaired feedforward control feels overstated.
We agree that to strictly focus on feedforward control, we could have examined the measured variables in the first 50-100 ms of the movement which has been shown to be unaffected by feedback responses (Pruszynski et al. 2008, Todorov and Jordan 2002, Pruszynski and Scott 2012, Crevecoeur et al. 2013). However, in our task the amplitude of movements made by our monkeys was small and therefore the response measures we used were too small in the first 50-100 ms for a robust estimation. Also, fixing a time window led to an unfair comparison between control and cerebellar block trials, in which velocity was significantly reduced and therefore movement time was longer. Therefore, we used the peak velocity, torque-impulse at the peak velocity and maximum deviation of the hand trajectory as response measures. We will acknowledge this point in the discussion section of our revised manuscript. We will also tone down references to feedforward control throughout the text of our revised manuscript as suggested by the reviewer.
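The three response measures named above can be computed from sampled data roughly as follows. This sketch makes several assumptions that the response does not confirm: hand position is sampled at a fixed interval, the torque impulse is the integral of torque from movement onset up to the time of peak velocity, and the deviation is the largest perpendicular distance of the hand path from the straight line joining its first and last samples. All names and numbers are hypothetical.

```python
from math import hypot


def peak_speed(xs, ys, dt):
    """Return (peak hand speed, index of the sampling interval where it occurs)."""
    speeds = [hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i]) / dt
              for i in range(len(xs) - 1)]
    i_peak = max(range(len(speeds)), key=speeds.__getitem__)
    return speeds[i_peak], i_peak


def torque_impulse(torque, dt, i_end):
    """Trapezoidal integral of torque from sample 0 up to sample i_end."""
    return sum(0.5 * (torque[i] + torque[i + 1]) * dt for i in range(i_end))


def max_deviation(xs, ys):
    """Largest perpendicular distance of the path from the start-to-end line."""
    x0, y0, x1, y1 = xs[0], ys[0], xs[-1], ys[-1]
    length = hypot(x1 - x0, y1 - y0)
    return max(abs((x1 - x0) * (y0 - y) - (x0 - x) * (y1 - y0)) / length
               for x, y in zip(xs, ys))


# Hypothetical trajectory (cm) and shoulder torque (arbitrary units), dt in seconds.
xs = [0.0, 1.0, 3.0, 4.0, 4.5]
ys = [0.0, 0.5, 0.5, 0.0, 0.0]
torque = [0.0, 1.0, 2.0, 1.0, 0.0]
dt = 0.1

speed, i_peak = peak_speed(xs, ys, dt)
print("peak speed:", speed)
print("impulse up to peak:", torque_impulse(torque, dt, i_peak))
print("max deviation:", max_deviation(xs, ys))
```

Because each measure is anchored to an event within the movement (the velocity peak, the endpoints) instead of a fixed time window, it remains comparable between control trials and the slower cerebellar block trials. That is the concern raised in the response about fixing a 50-100 ms window.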
The terminology 'single-joint' movement is a bit confusing. At a minimum, it would be nice to show kinematics during different target reaches to demonstrate that certain targets are indeed single joint movements. More of an issue, however, is that it seems like these are not actually 'single-joint' movements. For example, Figure 2c shows that target 1 exhibits high elbow and shoulder torques, but in the text, T1 is described as a 'single-joint' reach (e.g. lines 155-156). The point that I think the authors are making is that these targets have low interaction torques. If that is the case, the terminology should be changed or clarified to avoid confusion.
Indeed, as reviewer #1 also noted, movements to targets 1 and 5 are not purely single-joint but rather have relatively low coupling torques. Our intention in using the term “single-joint” was to indicate a target direction in which one joint remains stationary, resulting in minimal coupling torque at the adjacent joint. Specifically, for Targets 1 and 5 in our experiments, the net torque (and thus acceleration) at the elbow was negligible, and hence the shoulder experienced correspondingly low coupling torque (as illustrated in Figure 3c of our manuscript). To avoid confusion, we will use the term ‘predominantly single-joint’ movements in our revised manuscript to indicate targets with low coupling torques. We will also include an additional figure in the revised supplementary material displaying the net torques at the shoulder and elbow, similar to Figures 2c and 3c. Our goal is to demonstrate that movements to targets 1 and 5 are characterized by predominantly one-joint engagement (i.e., the elbow is stationary with low net torque) and low coupling torques, rather than implying a purely isolated, single-joint motion.
The labels in Figure 3d are confusing and could use more explanation in the figure legend.
In Figure 3d, it is stated that data from all monkeys is pooled. However, if there is a systematic bias between animals, this could generate spurious correlations. Were correlations also calculated for each animal separately to confirm the same trend between velocity and coupling torques holds for each animal?
We will revise the figure legend and main-text explanation for Figure 3d. Please note that the data were pooled only after confirming that each animal showed a similar trend. Specifically, the correlation coefficients were positive, though significant for only 3 of the 4 monkeys. Moreover, following the reviewers’ feedback, we also ran a partial correlation analysis (which controls for the variability across monkeys) and found a significant correlation (r = 0.33, p < 0.001). These points will be described in the revised manuscript.
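To make the pooling concern concrete, here is a minimal sketch of one common way to compute a correlation that controls for animal identity: subtract each animal's mean from both variables, then correlate the residuals. This is an illustration under assumptions, not the analysis code used in the study. The function names and the toy numbers are hypothetical and were chosen so that pooling produces a strong positive correlation even though the within-animal trend is weakly negative.

```python
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def center_within_groups(values, groups):
    """Subtract each group's mean, i.e. regress out a categorical factor."""
    sums, counts = {}, {}
    for v, g in zip(values, groups):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]

def partial_corr_by_group(x, y, groups):
    """Correlation of x and y after removing between-group offsets."""
    return pearson(center_within_groups(x, groups),
                   center_within_groups(y, groups))

# Hypothetical toy data: two animals whose means differ strongly,
# with a weak negative trend inside each animal.
animal = ["A"] * 4 + ["B"] * 4
velocity = [1.0, 2.0, 3.0, 4.0, 11.0, 12.0, 13.0, 14.0]
torque = [2.0, 1.0, 2.0, 1.0, 12.0, 11.0, 12.0, 11.0]

pooled_r = pearson(velocity, torque)                          # driven by the offset
partial_r = partial_corr_by_group(velocity, torque, animal)   # within-animal trend
print(round(pooled_r, 3), round(partial_r, 3))
```

In this toy case the pooled coefficient is large and positive while the group-centered coefficient is negative, which is the kind of discrepancy the reviewer's question is meant to rule out.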
In Table S1, it would be nice to see target-specific success rates. The data would suggest that targets with the highest interaction torques will have the largest reduction in success rates, especially during later HFS trials. Is this the case?
We will provide a breakdown of the success rates as a function of targets. However, one should note that success/failure may depend on several factors beyond impaired limb dynamics. In a previous study (Nashef et al. 2019) we identified several causes of failure such as (i) not entering the central target in time, (ii) moving out too early from the peripheral target, (iii) reaction time longer than permitted, or (iv) premature exit from the central target before permitted.
-
-
osf.io osf.io
-
Author response:
eLife Assessment
This valuable short paper is an ingenious use of clinical patient data to address an issue in imaging neuroscience. The authors clarify the role of face-selectivity in human fusiform gyrus by measuring both BOLD fMRI and depth electrode recordings in the same individuals; furthermore, by comparing responses in different brain regions in the two patients, they suggested that the suppression of blood oxygenation is associated with a decrease in local neural activity. While the methods are compelling and provide a rare dataset of potentially general importance, the presentation of the data in its current form is incomplete.
We thank the Reviewing editor and Senior editor at eLife for their positive assessment of our paper. After reading the reviewers’ comments – to which we reply below – we agree that the presentation of the data could be completed. We provide additional presentation of data in the responses below and we will slightly modify Figure 2 of the paper. However, in keeping with the short format of the paper, the revised version will have the same number of figures, which support the claims made in the paper.
Reviewer #1 (Public review):
Summary:
Measurement of BOLD MR imaging has regularly found regions of the brain that show reliable suppression of BOLD responses during specific experimental testing conditions. These observations are to some degree unexplained, in comparison with more usual association between activation of the BOLD response and excitatory activation of the neurons (most tightly linked to synaptic activity) in the same brain location. This paper finds two patients whose brains were tested with both non-invasive functional MRI and with invasive insertion of electrodes, which allowed the direct recording of neuronal activity. The electrode insertions were made within the fusiform gyrus, which is known to process information about faces, in a clinical search for the sites of intractable epilepsy in each patient. The simple observation is that the electrode location in one patient showed activation of the BOLD response and activation of neuronal firing in response to face stimuli. This is the classical association. The other patient showed an informative and different pattern of responses. In this person, the electrode location showed a suppression of the BOLD response to face stimuli and, most interestingly, an associated suppression of neuronal activity at the electrode site.
Strengths:
Whilst these results are not by themselves definitive, they add an important piece of evidence to a long-standing discussion about the origins of the BOLD response. The observation of decreased neuronal activation associated with negative BOLD is interesting because, at various times, exactly the opposite association has been predicted. It has been previously argued that if synaptic mechanisms of neuronal inhibition are responsible for the suppression of neuronal firing, then it would be reasonable
Weaknesses:
The chief weakness of the paper is that the results may be unique in a slightly awkward way. The observation of positive BOLD and neuronal activation is made at one brain site in one patient, while the complementary observation of negative BOLD and neuronal suppression actually derives from the other patient. Showing both effects in both patients would make a much stronger paper.
We thank reviewer #1 for their positive evaluation of our paper. Obviously, we agree with the reviewer that the paper would be much stronger if BOTH effects – spike increase and decrease – were found in BOTH patients in their corresponding fMRI regions (lateral and medial fusiform gyrus) (also in the same hemisphere). Nevertheless, we clearly acknowledge this limitation in the (revised) version of the manuscript (p.8: Material and Methods section).
In the current paper, one could think that P1 shows only increases to faces, and P2 would show only decreases (irrespective of the region). However, that is not the case since 11% of P1’s face-selective units are decreases (89% are increases) and 4% of P2’s face-selective units are increases. This has now been made clearer in the manuscript (p.5).
As the reviewer is certainly aware, the number and position of the electrodes are based on strict clinical criteria, and we will probably never encounter a situation with two neighboring (macro-micro hybrid) electrodes, one with microelectrodes ending up in the lateral MidFG, the other in the medial MidFG, in the same patient. If there is no clinical value for the patient, this cannot be done.
The only thing we can do is to strengthen these results in the future by collecting data on additional patients with an electrode either in the lateral or the medial FG, together with fMRI. But these are the only two patients we have been able to record so far with electrodes falling unambiguously in such contrasted regions and with large (and comparable) measures.
While we acknowledge that the results may be unique because of the use of 2 contrasted patients only (and this is why the paper is a short report), the data is compelling in these 2 cases, and we are confident that it will be replicated in larger cohorts in the future.
Reviewer #2 (Public review):
Summary:
This is a short and straightforward paper describing BOLD fMRI and depth electrode measurements from two regions of the fusiform gyrus that show either higher or lower BOLD responses to faces vs. objects (which I will call face-positive and face-negative regions). In these regions, which were studied separately in two patients undergoing epilepsy surgery, spiking activity increased for faces relative to objects in the face-positive region and decreased for faces relative to objects in the face-negative region. Interestingly, about 30% of neurons in the face-negative region did not respond to objects and decreased their responses below baseline in response to faces (absolute suppression).
Strengths:
These patient data are valuable, with many recording sessions and neurons from human face-selective regions, and the methods used for comparing face and object responses in both fMRI and electrode recordings were robust and well-established. The finding of absolute suppression could clarify the nature of face selectivity in human fusiform gyrus since previous fMRI studies of the face-negative region could not distinguish whether face < object responses came from absolute suppression, or just relatively lower but still positive responses to faces vs. objects.
Weaknesses:
The authors claim that the results tell us about both 1) face-selectivity in the fusiform gyrus, and 2) the physiological basis of the BOLD signal. However, I would like to see more of the data that supports the first claim, and I am not sure the second claim is supported.
(1) The authors report that ~30% of neurons showed absolute suppression, but those data are not shown separately from the neurons that only show relative reductions. It is difficult to evaluate the absolute suppression claim from the short assertion in the text alone (lines 105-106), although this is a critical claim in the paper.
We thank reviewer #2 for their positive evaluation of our paper. We understand the reviewer’s point, and we partly agree. Where we respectfully disagree is that the finding of absolute suppression is critical for the claim of the paper: finding an identical contrast between the two regions in terms of RELATIVE increase/decrease of face-selective activity in fMRI and spiking activity is already novel and informative. Where we agree with the reviewer is that the absolute suppression could be better documented: it was not, due to space constraints (brief report). We provide below an example of a neuron showing absolute suppression to faces. In the frequency domain, there is only a face-selective response (1.2 Hz and harmonics) but no significant response at 6 Hz (common general visual response). In the time domain, relative to face onset, the response drops below baseline level. This means that the neuron has baseline (non-periodic) spontaneous spiking activity that is actively suppressed when a face appears.
Author response image 1.
(2) I am not sure how much light the results shed on the physiological basis of the BOLD signal. The authors write that the results reveal "that BOLD decreases can be due to relative, but also absolute, spike suppression in the human brain" (line 120). But I think to make this claim, you would need a region that exclusively had neurons showing absolute suppression, not a region with a mix of neurons, some showing absolute suppression and some showing relative suppression, as here. The responses of both groups of neurons contribute to the measured BOLD signal, so it seems impossible to tell from these data how absolute suppression per se drives the BOLD response.
It is a fact that we find both kinds of responses in the same region. We cannot tell with this technique if neurons showing relative vs. absolute suppression of responses are spatially segregated for instance (e.g., forming two separate sub-regions) or are intermingled. And we cannot tell from our data how absolute suppression per se drives the BOLD response. In our view, this does not diminish the interest and originality of the study, but the statement "that BOLD decreases can be due to relative, but also absolute, spike suppression in the human brain” will be rephrased in the revised manuscript, in the following way: "that BOLD decreases can be due to relative, or absolute (or a combination of both), spike suppression in the human brain”.
Reviewer #3 (Public review):
In this paper the authors conduct two experiments: an fMRI experiment and intracranial recordings of neurons in two patients, P1 and P2. In both experiments, they employ an SSVEP paradigm in which they show images at a fast rate (e.g. 6Hz) and then they show face images at a slower rate (e.g. 1.2Hz), where the rest of the images are a variety of object images. In the first patient, they record from neurons over a region in the mid fusiform gyrus that is face-selective and in the second patient, they record neurons from a region more medially that is not face selective (it responds more strongly to objects than faces). Results find similar selectivity between the electrophysiology data and the fMRI data in that the location which shows higher fMRI responses to faces also contains face-selective neurons and the location which shows a preference for non-faces also contains non-face-preferring neurons.
Strengths:
The data is important in that it shows that there is a relationship between category selectivity measured from electrophysiology data and category selectivity measured from fMRI. The data is unique as it contains a lot of single and multiunit recordings (245 units) from the human fusiform gyrus - which the authors point out - is a hominoid-specific gyrus.
Weaknesses:
My major concerns are two-fold:
(i) There is a paucity of data; Thus, more information (results and methods) is warranted; and in particular there is no comparison between the fMRI data and the SEEG data.
We thank reviewer #3 for their positive evaluation of our paper. If the reviewer means paucity of data presentation, we agree and we provide more presentation below, although the methods and results information appear as complete to us. The comparison between fMRI and SEEG is there, but can only be indirect (i.e., collected at different times and not related on a trial-by-trial basis for instance). In addition, our manuscript aims at providing a short empirical contribution to further our understanding of the relationship between neural responses and BOLD signal, not to provide a model of neurovascular coupling.
(ii) One main claim of the paper is that there is evidence for suppressed responses to faces in the non-face selective region. That is, the reduction in activation to faces in the non-face selective region is interpreted as a suppression in the neural response and consequently the reduction in fMRI signal is interpreted as suppression. However, the SSVEP paradigm has no baseline (it alternates between faces and objects) and therefore it cannot distinguish between lower firing rate to faces vs suppression of response to faces.
We understand the concern of the reviewer, but we respectfully disagree that our paradigm cannot distinguish between lower firing rate to faces vs. suppression of response to faces. Indeed, since the stimuli are presented periodically (6 Hz), we can objectively distinguish stimulus-related activity from spontaneous neuronal firing. The baseline corresponds to spikes that are non-periodic, i.e., unrelated to the (common face and object) stimulation. For a subset of neurons, even this non-periodic baseline activity is suppressed, above and beyond the suppression of the 6 Hz response illustrated on Figure 2. We mention it in the manuscript, but we agree that we do not present illustrations of such decrease in the time-domain for SU, which we did not consider as being necessary initially (please see below for such presentation).
(1) Additional data: the paper has 2 figures: figure 1 which shows the experimental design and figure 2 which presents data, the latter shows one example neuron raster plot from each patient and group average neural data from each patient. In this reader's opinion this is insufficient data to support the conclusions of the paper. The paper will be more impactful if the researchers would report the data more comprehensively.
We respond to more specific requests for additional evidence below, but the reviewer should be aware that this is a short report, which reaches the word limit. In our view, the group average neural data should be sufficient to support the conclusions, and the example neurons are there for illustration. And while we cannot provide the raster plots for a large number of neurons, the anonymized data will be made available upon publication of the final version of the paper.
(a) There is no direct comparison between the fMRI data and the SEEG data, except for a comparison of the location of the electrodes relative to the statistical parametric map generated from a contrast (Fig 2a,d). It will be helpful to build a model linking between the neural responses to the voxel response in the same location - i.e., estimate from the electrophysiology data the fMRI data (e.g., Logothetis & Wandell, 2004).
As mentioned above, the comparison between fMRI and SEEG is indirect (i.e., collected at different times and not related on a trial-by-trial basis for instance) and would not allow us to build such a model.
(b) More comprehensive analyses of the SSVEP neural data: It will be helpful to show the results of the frequency analyses of the SSVEP data for all neurons to show that there are significant visual responses and significant face responses. It will be also useful to compare and quantify the magnitude of the face responses compared to the visual responses.
The data has been analyzed comprehensively, but we would not be able to show all neurons with such significant visual responses and face-selective responses.
(c) The neuron shown in E shows cyclical responses tied to the onset of the stimuli, is this the visual response?
Correct, it’s the visual response at 6 Hz.
If so, why is there an increase in the firing rate of the neuron before the face stimulus is shown in time 0?
Because the stimulation is continuous. What is displayed at 0 is the onset of the face stimulus, with each face stimulus being preceded by 4 images of nonface objects.
The neuron's data seems different than the average response across neurons; This raises a concern about interpreting the average response across neurons in panel F which seems different than the single neuron responses
The reviewer is correct, and we apologize for the confusion. This is because the average data on panel F has been notch-filtered for the 6 Hz (and harmonic responses), as indicated in the methods (p.11): ‘a FFT notch filter (filter width = 0.05 Hz) was then applied on the 70 s single or multi-units time-series to remove the general visual response at 6 Hz and two additional harmonics (i.e., 12 and 18 Hz)’.
Here is the same data without the notch-filter (the 6Hz periodic response is clearly visible):
Author response image 2.
For sake of clarity, we prefer presenting the notch-filtered data in the paper, but the revised version will make it clear in the figure caption that the average data has been notch-filtered.
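For readers unfamiliar with frequency-domain notch filtering, the sketch below shows the general idea on a synthetic signal: the Fourier bins within a narrow band around 6, 12 and 18 Hz are removed, and everything else is left untouched. It is only an illustration under assumptions and not the authors' Matlab code. The function name `fft_notch`, the 50 Hz sampling rate and the test signal are hypothetical; only the 70 s duration, the three target frequencies and the 0.05 Hz width come from the quoted Methods text.

```python
import cmath
import math

def fft_notch(signal, fs, targets_hz, width_hz):
    """Remove the DFT bins within width_hz/2 of each target frequency.

    Equivalent to zeroing those bins (and their negative-frequency mirrors)
    in the Fourier transform and inverse transforming, but only the affected
    bins are computed. Assumes a real signal and targets between 0 and fs/2.
    """
    n = len(signal)
    df = fs / n  # frequency resolution in Hz
    bins = set()
    for f0 in targets_hz:
        lo = math.ceil((f0 - width_hz / 2) / df)
        hi = math.floor((f0 + width_hz / 2) / df)
        bins.update(range(lo, hi + 1))
    out = list(signal)
    for k in bins:
        w = [cmath.exp(-2j * math.pi * k * i / n) for i in range(n)]
        coeff = sum(x * wi for x, wi in zip(signal, w))  # DFT coefficient X[k]
        for i in range(n):
            # Subtract the joint contribution of bins k and n-k to the signal.
            out[i] -= (2.0 / n) * (coeff * w[i].conjugate()).real
    return out

# Synthetic 70 s series (hypothetical 50 Hz sampling): a strong 6 Hz
# "general visual" component plus a weaker 1.2 Hz "face-selective" one.
fs = 50.0
n = 3500
t = [i / fs for i in range(n)]
raw = [math.sin(2 * math.pi * 6.0 * ti) + 0.5 * math.sin(2 * math.pi * 1.2 * ti)
       for ti in t]

filtered = fft_notch(raw, fs, [6.0, 12.0, 18.0], 0.05)
```

With a 70 s window the frequency resolution is 1/70 Hz, so a 0.05 Hz notch spans three bins around each target, and the 1.2 Hz component passes through unchanged.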
(d) Related to (c) it would be useful to show raster plots of all neurons and quantify if the neural responses within a region are homogeneous or heterogeneous. This would add data relating the single neuron response to the population responses measured from fMRI. See also Nir 2009.
We agree with the reviewer that this is interesting, but again we do not think that it is necessary for the point made in the present paper. Responses in these regions appear rather heterogeneous, and we are currently working on a longer paper with additional SEEG data (other patients tested for shorter sessions) to define and quantify the face-selective neurons in the MidFusiform gyrus with this approach (without relating it to the fMRI contrast as reported here).
(e) When reporting group average data (e.g., Fig 2C,F) it is necessary to show standard deviation of the response across neurons.
We agree with the reviewer and have modified Figure 2 accordingly in the revised manuscript.
(f) Is it possible to estimate the latency of the neural responses to face and object images from the phase data? If so, this will add important information on the timing of neural responses in the human fusiform gyrus to face and object images.
The fast periodic paradigm to measure neural face-selectivity has been used in tens of studies since its original reports:
- in EEG: Rossion et al., 2015: https://doi.org/10.1167/15.1.18
- in SEEG: Jonas et al., 2016: https://doi.org/10.1073/pnas.1522033113
In this paradigm, the face-selective response spreads to several harmonics (1.2 Hz, 2.4 Hz, 3.6 Hz, etc.) (which are summed for quantifying the total face-selective amplitude). This is illustrated below by the averaged single units’ SNR spectra across all recording sessions for both participants.
Author response image 3.
There is no unique phase-value, each harmonic being associated with a phase-value, so that the timing cannot be unambiguously extracted from phase values. Instead, the onset latency is computed directly from the time-domain responses, which is more straightforward and reliable than using the phase. Note that the present paper is not about the specific time-courses of the different types of neurons, which would require a more comprehensive report, but which is not necessary to support the point made in the present paper about the SEEG-fMRI sign relationship.
(g) Related to (e): In total, the authors recorded data from 245 units (some single units and some multiunits) and found that in both the face-selective and non-face-selective regions most of the recorded neurons exhibited face-selectivity, which this reader found confusing. They write: “Among all visually responsive neurons, we found a very high proportion of face-selective neurons (p < 0.05) in both activated and deactivated MidFG regions (P1: 98.1%; N = 51/52; P2: 86.6%; N = 110/127)”. Is the face selectivity in P1 an increase in response to faces and in P2 a reduction in response to faces, or is it an increase in response to faces in both?
Face-selectivity is defined as a DIFFERENTIAL response to faces compared to objects, not necessarily a larger response to faces. So yes, face-selectivity in P1 is an increase in response to faces and P2 a reduction in response to faces.
(1) Additional methods
(a) it is unclear if the SSVEP analyses of neural responses were done on the spikes or the raw electrical signal. If the former, how is the SSVEP frequency analysis done on discrete data like action potentials?
The FFT is applied directly on spike trains using Matlab’s discrete Fourier Transform function. This function is suitable to be applied to spike trains in the same way as to any sampled digital signal (here, the microwires signal was sampled at 30 kHz, see Methods).
In complementary analyses, we also applied the FFT to spike trains that had been temporally smoothed by convolving them with a 20ms square window (Le Cam et al., 2023, cited in the paper). This did not change the outcome of the frequency analyses in the frequency range we are interested in.
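As a rough illustration of why Fourier analysis works on discrete spike data, the sketch below treats a spike train as a sum of unit impulses and evaluates its Fourier component at chosen frequencies. This is equivalent to reading single bins of the discrete Fourier transform of the binary sampled train. It is not the authors' pipeline: the function name, the 10 s duration, the spike latencies and the toy neuron are all hypothetical, and only the first four 1.2 Hz harmonics are summed for brevity.

```python
import cmath
import math

def spike_train_amplitude(spike_times, freq_hz, duration_s):
    """Fourier amplitude at freq_hz for a train of unit spikes (times in s).

    Matches evaluating the DFT of the binary (0/1) sampled spike train at
    one frequency bin, normalised by the recording duration.
    """
    total = sum(cmath.exp(-2j * math.pi * freq_hz * t) for t in spike_times)
    return abs(total) / duration_s

# Hypothetical neuron: one spike 50 ms after every image (6 Hz) plus one
# extra spike 120 ms after every face (every 5th image, i.e. 1.2 Hz).
duration = 10.0
image_rate, face_rate = 6.0, 1.2
n_images, n_faces = 60, 12
spikes = [k / image_rate + 0.050 for k in range(n_images)]
spikes += [k / face_rate + 0.120 for k in range(n_faces)]

# General visual response at the image presentation rate.
amp_general = spike_train_amplitude(spikes, image_rate, duration)
# Face-selective response summed over the first four 1.2 Hz harmonics.
amp_face = sum(spike_train_amplitude(spikes, h * face_rate, duration)
               for h in (1, 2, 3, 4))
# A frequency unrelated to the stimulation should carry almost nothing.
amp_off = spike_train_amplitude(spikes, 1.0, duration)
```

In this toy train the periodic image-locked spikes contribute nothing at 1.2 Hz and its first four harmonics, so the summed face-rate amplitude reflects only the face-locked spikes.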
(b) it is unclear why the onset time was shifted by 33ms; one can measure the phase of the response relative to the cycle onset and use that to estimate the delay between the onset of a stimulus and the onset of the response. Adding phase information will be useful.
The onset time was shifted by 33ms because the stimuli are presented with a sinewave contrast modulation (i.e., at 0ms, the stimulus has 0% contrast). 100% contrast is reached at half a stimulation cycle, which is 83.33ms here, but a response is likely triggered before reaching 100% contrast. To estimate the delay between the start of the sinewave (0% contrast) and the triggering of a neural response, we tested 7 SEEG participants with the same images presented in FPVS sequences either as a sinewave contrast (black line) modulation or as a squarewave (i.e. abrupt) contrast modulation (red line). The 33ms value is based on these LFP data obtained in response to such sinewave stimulation and squarewave stimulation of the same paradigm. This delay corresponds to 4 screen refresh frames (120 Hz refresh rate = 8.33ms per frame) and 35% of the full contrast, as illustrated below (please see also Retter, T. L., & Rossion, B. (2016). Uncovering the neural magnitude and spatio-temporal dynamics of natural image categorization in a fast visual stream. Neuropsychologia, 91, 9–28).
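The arithmetic behind these numbers can be checked with a short calculation. The sketch below assumes a raised-cosine contrast profile (0% at cycle onset, 100% at half a cycle), which is one natural reading of "sinewave contrast modulation"; the variable names are illustrative.

```python
import math

refresh_hz = 120.0    # screen refresh rate stated in the response
stim_hz = 6.0         # image presentation rate
frames_shift = 4      # onset shift expressed in refresh frames

frame_ms = 1000.0 / refresh_hz        # duration of one frame
shift_ms = frames_shift * frame_ms    # onset shift in ms
cycle_ms = 1000.0 / stim_hz           # one full stimulation cycle
half_cycle_ms = cycle_ms / 2          # time at which contrast peaks

# Assumed raised-cosine profile: 0% at 0 ms, 100% at half a cycle.
contrast = 0.5 * (1 - math.cos(2 * math.pi * shift_ms / cycle_ms))

print(f"{frame_ms:.2f} ms/frame, shift {shift_ms:.2f} ms, "
      f"peak at {half_cycle_ms:.2f} ms, contrast {contrast:.1%}")
```

Under that assumption, four frames correspond to about 33.3 ms and a contrast of roughly 35% of the maximum, consistent with the values quoted above.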
Author response image 4.
(2) Interpretation of suppression:
The SSVEP paradigm alternates between 2 conditions: faces and objects and has no baseline; In other words, responses to faces are measured relative to the baseline response to objects so that any region that contains neurons that have a lower firing rate to faces than objects is bound to show a lower response in the SSVEP signal. Therefore, because the experiment does not have a true baseline (e.g. blank screen, with no visual stimulation) this experimental design cannot distinguish between lower firing rate to faces vs suppression of response to faces.
The strongest evidence put forward for suppression is the response of non-visual neurons that was also reduced when patients looked at faces, but since these are non-visual neurons, it is unclear how to interpret the responses to faces.
We understand this point, but how does the reviewer know that these are non-visual neurons? Because these neurons are located in the visual cortex, they are likely to be visual neurons that are not responsive to non-face objects. In any case, as the reviewer writes, we think it’s strong evidence for suppression.
We thank all three reviewers for their positive evaluation of our paper and their constructive comments.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We thank all reviewers for the highly detailed review and the time and effort which has been invested in this review. We have read their perspectives, questions and suggested improvements with great interest. We have reflected on the public review in detail and have made the first provisional responses which are outlined below. First, we would like to respond to four main issues pointed out by the editor and reviewers:
(1) Lack of yield data in the manuscript: There have been yield data collected in most of the sites and years of our study, and these have already been published and cited in our manuscript. In the appendix of our manuscript, we included a table with yield data for the sites and years in which the beetle diversity was studied. These data show that strip cropping does not cause a systematic yield reduction.
(2) Sampling design clarification: Our paper combines data from trials conducted at different locations and years. On the one hand this allows an analysis of a comprehensive dataset, but on the other hand in some cases there were slight inconsistencies in how data were collected or processed (e.g. taxonomic level of species identification). We will explain the sampling design and data analysis in more detail to increase clarity and transparency.
(3) Additional data analysis: In the revised manuscript we will present an analysis on the responses of abundances of the 12 most common ground beetle genera to strip cropping. This will give better insight of the variation in responses among ground beetle taxa.
(4) Restrict findings to our system: We will nuance our findings further and will focus more strongly on the implications of our data on ground beetle communities, rather than on agrobiodiversity in a broader sense.
We will continue improving the manuscript based on the reviewers’ feedback in the coming weeks, aiming to submit a revised version of the manuscript at the end of February.
Detailed response to editor and reviewers:
Editor Comments:
(1) You only have analyzed ground beetle diversity, it would be important to add data on crop yields, which certainly must be available (note that in normal intercropping these would likely be enhanced as well).
Most yield data have been published in three previous papers, which we already cited or will cite (one was not yet published at the time of submission). Our argumentation is based on these studies. We had also already included a table in the appendix showing the yield data that relate specifically to our locations and years of measurement. The conclusion that strip cropping does not majorly affect yield is based on these data. We will consider changing the title of our manuscript to remove the explicit focus on yield.
(2) Considering the heterogeneous data involving different experiments it is particularly important to describe the sampling design in detail and explain how various hierarchical levels were accounted for in the analysis.
We agree that some important details to our analysis were not described in sufficient detail. Especially reviewer 2 pointed out several relevant points that we did account for in our analyses, but which were not clear from the text in the methods section. We are convinced that our data analyses are robust and that our conclusions are supported by the data. We will revise the methods section to make our approach clearer and more transparent.
(3) In addition to relative changes in richness and density of ground beetles you should also present the data from which these have been derived. Furthermore, you could also analyze and interpret the response of the different individual taxa to strip cropping.
With our heterogeneous dataset it was quite complicated to show overall patterns of absolute changes in ground beetle abundance and richness, especially for the field-level analyses. As the sampling design was not always the same and occasionally samples were missing, the number of year series that made up a datapoint differed among locations and years. However, for each comparison of a paired monoculture and strip cropping field, we always made the number of year series equal through rarefaction. That is, the numbers of ground beetles and of ground beetle species are always expressed per 2 to 6 samples. Therefore, we prefer to stick to relative changes as we are convinced that this gives a fairer representation of our complex dataset.
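To illustrate what equalizing sampling effort through rarefaction can look like, the sketch below computes the expected species richness when only m of n samples are retained, using the standard sample-based formula (the sum over species of one minus the probability that all m retained samples miss that species). The response does not state whether rarefaction was done analytically or by random subsampling, so this is an assumed variant. The function name and the toy samples are hypothetical.

```python
from math import comb

def rarefied_richness(samples, m):
    """Expected number of species in m samples drawn without replacement.

    samples: list of sets, each holding the species found in one sample.
    """
    n = len(samples)
    if not 0 < m <= n:
        raise ValueError("m must be between 1 and the number of samples")
    species = set().union(*samples)
    expected = 0.0
    for sp in species:
        n_with = sum(sp in s for s in samples)        # samples containing sp
        p_missed = comb(n - n_with, m) / comb(n, m)   # all m draws miss sp
        expected += 1.0 - p_missed
    return expected

# Hypothetical pair of fields with unequal effort: 4 samples vs. 2 samples.
strip = [{"sp_a", "sp_b"}, {"sp_a"}, {"sp_a", "sp_c"}, {"sp_b"}]
mono = [{"sp_a"}, {"sp_b"}]

m = min(len(strip), len(mono))               # common effort: 2 samples
strip_rich = rarefied_richness(strip, m)     # richness per 2 samples
mono_rich = rarefied_richness(mono, m)
relative_change = strip_rich / mono_rich - 1
```

Expressing both fields per the same number of samples is what makes a relative change between them interpretable despite unequal sampling effort.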
We agree with the second point that both the editor and several reviewers pointed out. The indicator species analyses that we used were biased by rare species, and we now omit this analysis. Instead, we will include a GLM analysis of the responses of the abundances of the 12 most common ground beetle genera to strip cropping. We chose genera here (and not species) because we could then include all locations and years in the analysis, and in most cases a genus was dominated by a single species (notable exceptions were Amara and Harpalus, which were made up of several species). We will still illustrate these findings in a similar fashion as we did for the indicator species analysis.
(4) Keep to your findings and don't overstate them but try to better connect them to basic ecological hypotheses potentially explaining them.
After careful consideration of the important points raised by the reviewers, we decided to nuance our points about biodiversity conservation along two key lines: (1) the extent to which ground beetles can be indicators of wider biodiversity changes; and (2) the fact that our findings are not as straightforwardly positive as our narrative suggests. We still believe that strip cropping contributes positively to carabid communities, and will carefully check the text to avoid overstatements.
Reviewer 1:
Summary:
This study demonstrates that strip cropping enhances the taxonomic diversity of ground beetles across organically-managed crop systems in the Netherlands. In particular, strip cropping supported 15% more ground beetle species and 30% more individuals compared to monocultures.
Strengths:
A well-written study with well-analyzed data of a complex design. The data could have been analyzed differently e.g. by not pooling samples, but there are pros and cons for each type of analysis and I am convinced this will not affect the main findings. A strong point is that data were collected for 4 years. This is especially strong as most data on biodiversity in cropping systems are only collected for one or two seasons. Another strong point is that several crops were included.
We thank reviewer 1 for their kind words and agree with this strength of the paper. The paper combines data from trials conducted at different locations and years. On the one hand this allows an analysis of a comprehensive dataset, but on the other hand in some cases there were slight inconsistencies in how data were collected or processed (e.g. taxonomic level of species identification).
Weaknesses:
This study focused on the biodiversity of ground beetles and did not examine crop productivity. Therefore, I disagree with the claim that this study demonstrates biodiversity enhancement without compromising yield. The authors should present results on yield or, at the very least, provide a stronger justification for this statement.
We acknowledge that we indeed did not formally analyze yield in our study, but we have good reason for this. The claim that strip cropping does not compromise yield comes from several extensive studies (Juventia et al., 2024; Ditzler et al., 2023; Carillo-Reche et al., 2023) that were conducted in nearly all the sites and years that we included in our study. We chose not to include formal analyses of productivity for two key reasons: (1) a yield analysis would duplicate already published analyses, and (2) we prefer to focus more on the ecology of ground beetles and the effect of strip cropping on biodiversity, rather than diverging our focus also towards crop productivity. Nevertheless, we have shown the results on yield in Table S6 and refer extensively to the studies that have previously analyzed this data.
Reviewer 2:
Summary:
The authors aimed to investigate the effects of organic strip cropping on carabid richness and density as well as on crop yields. They find on average higher carabid richness and density in strip cropping and organic farming, but not in all cases.
Strengths:
Based on highly resolved species-level carabid data, the authors present estimates for many different crop types, some of them rarely studied, at the same time. The authors did a great job investigating different aspects of the assemblages (although some questions remain concerning the analyses) and they present their results in a visually pleasing and intuitive way.
We appreciate the kind words of reviewer 2 and their acknowledgement of the extensiveness of our dataset. In our opinion, the inclusion of many different crops is indeed a strength, rarely seen in similar studies; and we are happy that the figures are appreciated.
Weaknesses:
The authors used data from four different strip cropping experiments and there is no real replication in space as all of these differed in many aspects (different crops, different areas between years, different combinations, design of the strip cropping (orientation and width), sampling effort and sample sizes of beetles (differing more than 35 fold between sites; L 100f); for more differences see L 237ff). The reader gets the impression that the authors stitched data from various places together that were not made to fit together. This may not be a problem per se but it surely limits the strength of the data as results for various crops may only be based on small samples from one or two sites (it is generally unclear how many samples were used for each crop/crop combination).
The paper indeed combines data from trials conducted at different locations and in different years. On the one hand this allows an analysis of a comprehensive dataset, but on the other hand in some cases there were slight differences in the experimental design. At the time that we did our research, only a handful of farmers were employing strip cropping within the Netherlands, which greatly reduced the number of fields available for our study. Therefore, we worked in the sites that were available and studied as many crops as possible at these sites. Since there was variation in the crops grown at the sites, for some crops we have limited replication. In the revision we will explain this more clearly.
One of my major concerns is that it is completely unclear where carabids were collected. As some strips were 3m wide, some others were 6m and the monoculture plots large, it can be expected that carabids were collected at different distances from the plot edge. This alone, however, was conclusively shown to affect carabid assemblages dramatically and could easily outweigh the differences shown here if not accounted for in the models (see e.g. Boetzl et al. (2024) or Knapp et al. (2019) among many other studies on within field-distributions of carabids).
Point well taken, and we will present a more detailed description of the sampling design in the methods. Samples were always taken at least 10 meters into the field, and always in the middle of the strip. This would indeed mean that there is a small difference between the 3 m and 6 m wide strips regarding distance from another strip, but this was then only a difference of 1.5 to 3 meters from the edge. Based on our own extensive experience with ground beetle communities, such a difference will not have a large impact on the ground beetle findings. The distance from field/plot edges was similar between monocultures and strip cropped fields.
The authors hint at a related but somewhat different problem in L 137ff - carabid assemblages sampled in strips were sampled in closer proximity to each other than assemblages in monoculture fields which is very likely a problem. The authors did not check whether their results are spatially autocorrelated and this shortcoming is hard to account for as it would have required a much bigger, spatially replicated design in which distances are maintained from the beginning. This limitation needs to be stated more clearly in the manuscript.
This is a limitation that is hard to avoid in comparisons between strip cropping and monoculture systems, because it is often not possible to combine a statistically robust design with sufficient replication and field sizes that are representative of farming practice. We will acknowledge this limitation in the revised manuscript. To allow a fair comparison based on a sufficient number of replicates, we chose to combine data from several years and locations (despite this not being the ideal experimental design). This approach has the drawback that ground beetle communities are difficult to compare. Therefore, we chose to further investigate two years of data from Wageningen, as the factorial design allowed a fair comparison between monocultures and strip cropping. We analyzed three crop combinations during two years, but we still cannot exclude a potential influence of spatial autocorrelation. We acknowledged this limitation in our original submission, and we will clarify this point further in the revision.
Similarly, we know that carabid richness and density depend strongly on crop type (see e.g. Toivonen et al. (2022)) which could have biased results if the design is not balanced (this information is missing but it seems to be the case, see e.g. Celeriac in Almere in 2022).
The sample size ranges between 2 and 6 per combination of cropping design, crop, location and year. We believe that this will allow a meaningful analysis. Moreover, our main focus is the comparison between monoculture and strip cropping, and not the comparison between different crops. Even though we show that crop types have different ground beetle communities, we are most interested in the contrast between ground beetle communities in strip cropping and monoculture systems.
A more basic problem is that the reader neither learns where traps were located, how missing traps were treated for analyses, how many samples there were per crop or crop combination (in a simple way, not through Table S7 - there has to have been a logic in each of these field trials) nor why there are differences in the number of samples from the same location and year (see Table S7). This information needs to be added to the methods section.
Point well taken. We will clarify this further in the revised manuscript. As we combined data from several experimental designs that originally had slightly different research questions, this in part caused differences between numbers of rounds or samples per crop, location or year.
As carabid assemblages undergo rapid phenological changes across the year, assemblages that are collected at different phenological points within and across years cannot easily be compared. The authors would need to standardize for this and make sure that the assemblages they analyze are comparable prior to analyses. Otherwise, I see the possibility that the reported differences might simply be biased by phenology.
We agree, and we dealt with this issue by using year series instead of individual samples from different rounds. While this approach is not perfect, it allows us to get the best possible impression of the entire ground beetle community across seasons. For our analyses we had the choice to include only data from sampling rounds that were conducted at the same time, or to include all available data. We chose to analyze all data, and made sure that the number of samples between strip cropping and monoculture fields per location, year and crop was always the same by pooling and rarefaction. In this way we have analyzed a complex multi-year, multi-crop and multi-location dataset as well as we could.
Surrounding landscape structure is known to affect carabid richness and density and could thus also bias observed differences between treatments at the same locations (lower overall richness => lower differences between treatments). Landscape structure has not been taken into account in any way.
We did not include landscape structure as there are only 4 sites, which does not allow a meaningful analysis of potential effects of landscape structure. Studying how landscape interacts with strip cropping to influence insect biodiversity would require at least, say, 15 to 20 sites, which was not feasible for this study. However, such an analysis may be possible in an ongoing project (CropMix), which includes many farms that work with strip cropping.
In the statistical analyses, it is unclear whether the authors used estimated marginal means (as they should) - this needs to be clarified.
In the revised manuscript we will further clarify this point.
In addition, and as mentioned by Dr. Rasmann in the previous round (comment 1), the manuscript, in its current form, still suffers from simplified generalizations that 'oversell' the impact of the study and should be avoided. The authors restricted their analyses to ground beetles and based their conclusions on a design with many 'heterogeneities' - they should not draw conclusions for farmland biodiversity but stick to their system and report what they found. Although I understand the authors have previously stated that this is 'not practically feasible', the reason for this comment is simply to say that the authors should not oversell their findings.
In the revised manuscript, we will nuance our findings by explaining that strip cropping is a potentially useful tool to support ground beetle biodiversity in agricultural fields, but that the effects on other taxa still need to be further explored.
Reviewer 3:
Summary:
In this paper, the authors made a sincere effort to show the effects of strip cropping, a technique of alternating crops in small strips of several meters wide, on ground beetle diversity. They state that strip cropping can be a useful tool for bending the curve of biodiversity loss in agricultural systems as strip cropping shows a relative increase in species diversity (i.e. abundance and species richness) of the ground beetle communities compared to monocultures. Moreover, strip cropping has the added advantage of not having to compromise on agricultural yields.
Strengths:
The article is well written; it has an easily readable tone of voice without too much jargon or overly complicated sentence structure. Moreover, as far as reviewing the models in depth without raw data and R scripts allows, the statistical work done by the authors looks good. They have thought out well how to handle heterogeneous, yet spatially and temporally correlated field data. The models applied and the model checks performed are appropriate for the data at hand. Combining RDA and PCA axes together is a nice touch.
We thank reviewer 3 for their kind words and appreciation for the simple language and analysis that we used.
Weaknesses:
The evidence for strip cropping bringing added value for biodiversity is mixed at best. Yes, there is an increase in relative abundance and species richness at the field level, but it is not convincingly shown this difference is robust or can be linked to clear structural and hypothesised advantages of the strip cropping system. The same results could have been used to conclude that there are only very limited signs of real added value of strip cropping compared to monocultures.
Point well taken. We agree that the effects of strip cropping on carabid beetle communities are subtle and we will nuance the text in the revised version to reflect this.
There are a number of reasons for this:
(1) Significant differences disappear at crop level, as the authors themselves clearly acknowledge, meaning that there are no differences between pairs of similar crops in the strip cropping fields and their respective monoculture. This would mean the strips effectively function as "mini-monocultures".
This is indeed in line with our conclusions. Based on our data and results, the advantages of strip cropping seem to occur mostly because crops with different communities are now in the same field, rather than because mixtures of communities related to different crops arise within the strips. We discussed this in the first paragraph of the discussion in the original submission.
The significant relative differences at the field level could be an artifact of aggregation instead of structural differences between strip cropping and monocultures; with enough data points things tend to get significant despite large variance. This should have been elaborated further upon by the authors with additional analyses, designed to find out where differences originate and what it tells about the functioning of the system. Or it should have provided ample reason for cautioning in drawing conclusions about the supposed effectiveness of strip cropping based on these findings.
We believe that this is a misunderstanding of our approach. In the field-level analyses we pooled samples from the same field (i.e. pseudo-replicates were pooled), resulting in a relatively small sample size of 50 samples. We will explain this better in the methods section. Therefore, the statement “with enough data points things tend to get significant” is not applicable here.
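To illustrate what pooling pseudo-replicates means in practice, the following is a minimal Python sketch, not the analysis code used for the manuscript. The function name `pool_by_field`, the field identifiers and the species labels are hypothetical placeholders, and the counts are toy values rather than study data. It sums the counts of all pitfall samples taken in the same field, so that each field contributes a single data point.

```python
from collections import Counter


def pool_by_field(samples):
    """Sum species counts over all pitfall samples that share a field ID.

    `samples` is an iterable of (field_id, {species: count}) pairs.
    Returns {field_id: Counter}, i.e. one pooled record per field.
    """
    pooled = {}
    for field_id, counts in samples:
        # Counter.update adds counts instead of overwriting them.
        pooled.setdefault(field_id, Counter()).update(counts)
    return pooled


# Toy data (hypothetical): three pitfall samples from two fields.
samples = [
    ("field_A", {"sp_a": 5, "sp_b": 2}),
    ("field_A", {"sp_a": 3}),
    ("field_B", {"sp_b": 4}),
]

pooled = pool_by_field(samples)
print(len(pooled))                # 2 pooled records, one per field
print(pooled["field_A"]["sp_a"])  # 8
```

After pooling, the number of records equals the number of fields rather than the number of traps, which is why the field-level sample size stays small.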
(2) The authors report percentages calculated as relative change of species richness and abundance in strip cropping compared to monocultures after rarefaction. This is in itself correct, however, it can be rather tricky to interpret because the perspective on actual species richness and abundance in the fields and treatments is completely lost; the reported percentages are dimensionless. The authors could have provided the average cumulative number of species and abundance after rarefaction. Also, range and/or standard error would have been useful to provide information as to the scale of differences between treatments. This could provide a new perspective on the magnitude of differences between the two treatments which a dimensionless percentage cannot.
We agree that this would be the preferred approach if we had a perfectly balanced dataset. However, this approach is not feasible with our unbalanced design and differences in sampling effort. While we acknowledge the limitations of interpreting percentages, they do allow us to report relative changes for each combination of location, year and crop. The number of samples on which the percentages were based was always kept equal (through rarefaction) between the cropping systems (for each combination of location, year and crop), but not among crops, years and locations. The reason for this is that we did not always have an equal number of samples available for both cropping systems, and this approach allowed us to make a better estimate whenever more samples were available. For example, sometimes we had 2 samples from a strip cropped field and 6 from the monoculture; here we would use rarefaction to 2 samples (which simply gives a better estimate for the monoculture). In other cases, we had 4 samples in both the strip cropped and the monoculture field; here we chose to use rarefaction to 4 samples to get a better estimate altogether. Adding a value for actual richness or abundance to the figures would have distorted these findings, as the variation would be huge (it would represent the number of ground beetle species per 2 to 6 pitfall samples). Furthermore, the unit that reviewer 3 describes would thus be “the number of ground beetle species / individuals per 2 to 6 samples”, which is not a very informative unit either. We chose to prioritize better estimates of the differences between cropping systems over a more readily interpretable unit.
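The logic described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the code used for the manuscript: the function names `rarefied_richness` and `relative_change` are hypothetical, the estimator (mean richness over every subset of n samples) is one simple form of sample-based rarefaction chosen for illustration, and the species sets are toy values rather than study data.

```python
from itertools import combinations
from statistics import mean


def rarefied_richness(samples, n):
    """Mean species richness over every subset of n samples.

    `samples` is a list of sets of species names, one set per pitfall
    sample. Exhaustive enumeration is cheap here because there are at
    most a handful of samples per cropping system.
    """
    if not 1 <= n <= len(samples):
        raise ValueError("n must be between 1 and the number of samples")
    return mean(len(set().union(*subset)) for subset in combinations(samples, n))


def relative_change(strip_samples, mono_samples):
    """Percent change in rarefied richness, strip cropping vs. monoculture."""
    # Rarefy both systems to the smaller of the two sample counts.
    n = min(len(strip_samples), len(mono_samples))
    strip = rarefied_richness(strip_samples, n)
    mono = rarefied_richness(mono_samples, n)
    # Note: undefined (ZeroDivisionError) if the monoculture richness is 0.
    return 100.0 * (strip - mono) / mono


# Toy data (hypothetical): 2 strip samples, 3 monoculture samples.
strip_samples = [{"a", "b", "c"}, {"b", "c", "d"}]
mono_samples = [{"a", "b"}, {"a"}, {"b", "c"}]

print(round(relative_change(strip_samples, mono_samples), 1))  # 50.0
```

In this toy case the two strip samples together hold 4 species, while the three possible pairs of monoculture samples hold 2, 3 and 3 species (mean 8/3), giving a relative change of +50%. The extra monoculture sample does not change n; it only makes the monoculture estimate less dependent on any single trap.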
(3) The authors appear to not have modelled the abundance of any of the dominant ground beetle species themselves. Therefore it becomes impossible to assess which important species are responsible (if any) for the differences found in activity density between strip cropping and monocultures and the possible life history traits related reasons for the differences, or lack thereof, that are found. A big advantage of using ground beetles is that many life history traits are well studied and these should be used whenever there is reason, as there clearly is in this case. Moreover, it is unclear which species are responsible for the difference in species richness found at the field level. Are these dominant species or singletons? Do the strip cropping fields contain species that are absent in the monoculture fields and are not the cause of random variation or sampling? Unfortunately, the authors do not report on any of these details of the communities that were found, which makes the results much less robust.
Thank you for raising this point. We have reconsidered our indicator species analysis and found that it is rather sensitive to rare species and insensitive to changes in common species. Therefore, we will replace the indicator species analyses with a GLM analysis for the 12 most common genera of ground beetles in the revised manuscript. This will allow us to go more in depth on specific traits of the genera whose abundances change depending on the cropping system. In the revised manuscript, we will also discuss these common genera in more depth, rather than focusing on rarer species. Furthermore, we will add information on rarity and habitat preference to the table that shows species abundances per location (Table S2).
(4) In the discussion they conclude that there is only a limited amount of interstrip movement by ground beetles. Otherwise, the results of the crop-level statistical tests would have shown significant deviation from corresponding monocultures. This is a clear indication that the strips function more like mini-monocultures instead of being more than the sum of its parts.
This is in line with our point in the first paragraph of the discussion and an important message of our manuscript.
(5) The RDA results show a modelled variable of differences in community composition between strip cropping and monoculture. Percentages of explained variation of the first RDA axis are extremely low, and even then, the effects of location and/or year appear to show through (Figure S3), even though these are not part of the modelling. Moreover, there is no indication of clustering of strip cropping on the RDA axis, or in fact on the first principal component axis in the larger RDA models. This means the explanatory power of different treatments is also extremely low. The crop-level RDAs show some clustering, but hardly any consistent pattern in either communities of crops or species correlations, indicating that differences between strip cropping and monocultures are very small.
We agree and we make a similar point in the first paragraph of the discussion.
Furthermore, there are a number of additional weaknesses in the paper that should be addressed:
The introduction lacks focus on the issues at hand. Too much space is taken up by facts on insect decline and land sharing vs. land sparing and not enough attention is spent on the scientific discussion underlying the statements made about crop diversification as a restoration strategy. They are simply stated as facts or as hypotheses with many references that are not mentioned or linked to in the text. An explicit link to the results found in the large number of references should be provided.
We will streamline the introduction by omitting the land sharing vs. land sparing topic and better linking references to our research findings.
The mechanistic understanding of strip cropping is what is at stake here. Does strip cropping behave similarly to intercropping, a technique that has been proven to be beneficial to biodiversity because of added effects due to increased resource efficiency and greater plant species richness? This should be the main testing point and agenda of strip cropping. Do the biodiversity benefits that have been shown for intercropping also work in strip cropping fields? The ground beetles are one way to test this. Hypotheses should originate from this and should be stated clearly and mechanistically.
We agree with the reviewer and will state this research direction more clearly in the introduction of the revised manuscript.
One could question how useful indicator species analysis (ISA) is for a study in which predominantly highly eurytopic species are found. These are by definition uncritical of their habitat. Is there any mechanistic hypothesis underlying a suspected difference to be found in preferences for either strip cropping or monocultures of the species that were expected to be caught? In other words, did the authors have any a priori reasons to suspect differences, or has this been an exploratory exercise from which unexplained significant results should be used with great caution?
Point well taken. We agree that the indicator species analysis has limitations and have therefore now replaced it with a GLM analysis for the 12 most common ground beetle genera.
However, setting these objections aside there are in fact significant results with strong species associations both with monocultures and strip cropping. Unfortunately, the authors do not dig deeper into the patterns found a posteriori either. Why would some species associate so strongly with strip cropping? Do these species show a pattern of pitfall catches that deviate from other species, in that they are found in a wide range of strips with different crops in one strip cropping field and therefore may benefit from an increased abundance of food or shelter? Also, why would so many species associate with monocultures? Is this in any way logical? Could it be an artifact of the data instead of a meaningful pattern? Unfortunately, the authors do not progress along these lines in the methods and discussion at all.
We thank reviewer 3 for these valuable perspectives. In the revised manuscript, we will further explore the species/genera that respond to cropping systems and discuss these findings in more detail.
A second question raised in the introduction is whether the arable fields that form part of this study contain rare species. Unfortunately, the authors do not elaborate further on this. Do they expect rare species to be more prevalent in the strip cropping fields? Why? Has it been shown elsewhere that intercropping provides room for additional rare species?
The answer is simply no: we did not find more rare species in strip cropping. In the revised manuscript, we will add a column for rarity (according to waarneming.nl) to the table showing abundances of species per location. We only found two rare species; of one we found only a single individual, and the other was more related to the open habitat created by a failed wheat field. We will discuss this in more depth in the discussion.
Considering the implications the results of this research can have on the wider discussion of bending the curve and the effects of agroecological measures, bold claims should be made with extreme restraint and be based on extensive proof and robust findings. I am not convinced by the evidence provided in this article that the claim made by the authors that strip cropping is a useful tool for bending the curve of biodiversity loss is warranted.
We believe that strip cropping can be a useful tool because farmers readily adopt it and it can result in modest biodiversity gains without yield loss. However, strip cropping is indeed not a silver bullet (which we also don’t claim). We will nuance the implications of our study in the revised manuscript.
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
Goal: Find downstream targets of cmk-1 phosphorylation, identify one that also seems to act in thermosensory habituation, test for genetic interactions between cmk-1 and this gene, and assess where these genes are acting in the thermosensory circuit during thermosensory habituation.
Methods: Two in vitro analyses of cmk-1 phosphorylation of C. elegans proteins. Thermosensory habituation of cmk-1 and tax-6 mutants and double mutants was assessed by measuring the rate of heat-evoked reversals (reversal probability) of C. elegans before and after 20s ISI repeated heat pulses over 60 minutes.
Conclusions: cmk-1 and tax-6 act in separate habituation processes, primarily in AFD, that interact complexly, but both serve to habituate the thermosensory reversal response. They found that cmk-1 primarily acts in AFD and tax-6 primarily acts in RIM (and FLP for naïve responses). They also identified hundreds of potential cmk-1 phosphorylation substrates in vitro.
Strengths:
The effect size in the genetic data is quite strong and a large number of genetic interaction experiments between cmk-1 and tax-6 demonstrate a complex interaction.
Thanks a lot for these positive remarks.
Weaknesses:
The major concern about this manuscript is the assumption that the process they are observing is habituation. The two previously cited papers using this (or a very similar) protocol, Lia and Glauser 2020 and Jordan and Glauser 2023, both use the word 'adaptation' to describe the observed behavioral decrement. Jordan and Glauser 2023 use the words 'habituation' or 'habituation-like' 10 times; however, they use 'adaptation' over 100 times. It is critical to distinguish habituation from sensory adaptation (or fatigue) in this thermal reversal protocol. These processes are often confused/conflated; however, they are very different. Sensory adaptation is a process that decreases how much the nervous system is activated by a repeated stimulus, therefore it can even occur outside of the nervous system. Habituation is a learning process where the nervous system responds less to a repeated stimulus, despite the nervous system (or at least part of it) still being similarly activated by the stimulus. Habituation is considered an attentional process, while adaptation is due to the fatigue of sensory transduction machinery. Control experiments such as tests for dishabituation (where the application of a different stimulus causes recovery of the decremented response) or rate of spontaneous recovery (more rapid recovery after short inter-stimulus intervals) are required to determine if habituation or sensory adaptation is occurring. These experiments will allow the results to be interpreted with clarity; without them, it isn't clear what biological process is actually being studied.
Thanks for the comment. As this reviewer points out, “adaptation” and “habituation” are often conflated. Many scientists (maybe not the majority though) use a less stringent definition of the word habituation than the one presented by this reviewer. More particularly, the term habituation is used in human pain research to refer solely to the reduction of response to repeated stimuli, in the absence of a detailed assessment of the more stringent criteria mentioned here. In addition to the practice in pain research, the main reason why we steered toward ‘habituation’ from our previous publication is that it immediately conveys the idea of a response reduction, whereas ‘adaptation’ could in principle be either an up-regulation or a down-regulation of the response (again, based on various definitions). But we agree that using the word “habituation” came at the cost of causing confusion about the exact nature of the process for those considering the stricter definition of the word. In the manuscript under revision, we are changing this terminology to “adaptation”. Also following suggestions from Reviewer 2, we are strengthening the description of the protocol in the Results section and clarifying why the adaptation phenomenon is not a ‘thermal damage’ effect or ‘fatigue’ effect in the neuro-muscular circuit controlling reversal.
While the discrepancy between the in vitro phosphorylation experiments and the in silico predictions was discussed, the substantial discrepancy (over 85% of the substrates in the smaller in vitro dataset were not identified in the larger dataset) between the two different in vitro datasets was not discussed. This is surprising, as these approaches were quite similar, and it may indicate a measure of unreliability in the in vitro datasets (or high false negative rates).
Thanks for the comment. This is an important aspect which we will more extensively cover in the Discussion section of the revised manuscript.
The strong consistency of the CMK-1 recognition consensus sequences across the two in vitro datasets speaks against the unreliability of the analyses. Instead, there are a few points to highlight that explain the somewhat low degree of overlap between the two datasets, which indeed relate to the false negative rates as this reviewer suggests.
(1) In the peptide library analysis, Trypsin cleavage prior to kinase treatment will leave a charged N- or C-terminus and in addition remove part of the protein context required for efficient kinase recognition. This will have a variable effect across the different substrates in the peptide library, depending on the distance between the cleavage site and the phosphosite, but will not affect the native protein library. This effect increases the false negative rate in the peptide library.
(2) The number and distribution of “available substrate phosphosites” diverge in the two libraries. Indeed, the peptide library is expected to contain a markedly larger diversity of potential CMK-1 substrate sites than the protein library (because the Trypsin digestion will reveal substrates that are normally buried in a native protein), but the depth of MS analysis is the same for the two libraries. In somewhat simplistic terms, the peptide-library analysis is prone to be saturated with abundant phosphorylated peptides, which prevents the detection of all phosphosites. If the peptide analysis could have been made deeper, we would probably have increased the overlap (at the cost of increasing the number of false positives too).
(3) We have chosen quite strict criteria and applied them separately to define each hit list; therefore, we know we have many false negatives in each list, which will naturally reduce the expected overlap.
As we will clarify in the revised manuscript, we tend to place more trust in the protein-library dataset (since substrates are in a configuration closer to that in vivo), and we regard those hits that are also present in the peptide dataset (as TAX-6 was) as the most convincing, since they could be validated in a second type of experiment.
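For readers who want to see how the overlap figure discussed above is computed, the following is a minimal Python sketch. It is illustrative only: the function name `overlap_summary` and all identifiers other than TAX-6 are hypothetical placeholders, and the toy hit lists are far smaller than, and not drawn from, the actual datasets.

```python
def overlap_summary(peptide_hits, protein_hits):
    """Return the shared hits and the fraction of the smaller list
    that is absent from the larger one."""
    shared = peptide_hits & protein_hits
    smaller = min(peptide_hits, protein_hits, key=len)
    if not smaller:
        raise ValueError("both hit lists must be non-empty")
    missing_fraction = 1 - len(shared) / len(smaller)
    return shared, missing_fraction


# Toy hit lists (hypothetical); only TAX-6 is taken from the text,
# where it is reported as present in both datasets.
peptide_hits = {"TAX-6", "candidate-1", "candidate-2", "candidate-3"}
protein_hits = {"TAX-6", "candidate-4", "candidate-5",
                "candidate-6", "candidate-7", "candidate-8"}

shared, missing_fraction = overlap_summary(peptide_hits, protein_hits)
print(sorted(shared))    # ['TAX-6']
print(missing_fraction)  # 0.75
```

Because each hit list is filtered independently with strict criteria, false negatives in either list lower the shared set and raise the missing fraction, even when the underlying kinase specificity is consistent.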
Additionally, the rationale for, and distinction between, the two separate in vitro experiments is not made clear.
We reasoned that both substrate types have their own benefits and limitations (as discussed in the manuscript), so it was an added value to run both. We proposed that the subset of targets present in both datasets is the most solid list of candidates. We will also reinforce our point in the revised discussion that the protein library is likely to contain far fewer false positives.
Line 207: After reporting that both tax-6 and cnb-1 mutants have high spontaneous reversals, it is not made clear why cnb-1 is not further explored in the paper. Additionally, this spontaneous reversal data should be in a supplementary figure.
We kept the focus of the article primarily on TAX-6, because it was identified as a CMK-1 target in vitro; CNB-1 was not. Moreover, we didn’t have cnb-1(gf) mutants to pursue the analysis, and the constitutively high reversal rate of cnb-1(lf) prevented any further follow-up. We have added a supplementary file to present the spontaneous reversal rates.
Figure 3 -S1: This model doesn't explain why the cmk-1(gf) group and the cmk-1(gf) +cyclo A group cause enhanced response decrement (presumably by reducing the inhibition by tax-6) but the +cyclo A group (inhibited tax-6) showed weaker response decrement, as here there is even further weakened inhibition of tax-6 on this process. Also, the cmk-1(lf) +cyclo A group is labeled as constitutive habituation, however, this doesn't appear to be the case in Figure 3 (seems like a similar initial level and response decrement phenotype to wildtype).
Thanks a lot for the comment. We are glad that the presentation of our complex dataset was clear enough to bring the reader to that level of detailed reflection and interpretation on the proposed model. To address the two points raised in this reviewer’s comment, we are modifying the model presentation and provide additional clarifications below, where we use the term adaptation instead of habituation (as in the revised Figure):
Regarding the first point, “why the cmk-1(gf) group and the cmk-1(gf) +cyclo A group cause enhanced response decrement … but the +cyclo A group showed weaker response decrement”. This is really a very good point, which cannot be easily explained if all the branches (arrows) in the model have the same weight or work as ON/OFF switches. We tried to convey the relative importance of each regulatory effect via the thickness of the arrow lines (which we will clarify in the legend in the revised ms). The main ‘quantitative’ nuances to take into consideration here originate from two assumptions of the model (which we are clarifying in the revised manuscript):
Assumption 1: the inhibitory effect of TAX-6 on the CMK-1 anti-adaptation branch and the inhibitory effect of TAX-6 on the CMK-1 pro-adaptation branch are not of the same magnitude (we have further enhanced the line thickness differences in the revised model, top left panel for wild type).
Assumption 2: the two antagonistic direct effects of CMK-1 on adaptation are not of the same magnitude, most strikingly in the context of CMK-1(gf) mutants.
In our model, the cyclosporin A treatment alone (bottom left panel) causes a strong boost on the CMK-1 inhibitory branch and a less marked boost on the CMK-1 activator branch (following assumption 1). This causes an imbalance between the two antagonistic direct CMK-1-dependent drives, which reduces (but doesn’t fully block) adaptation. Indeed, we don’t observe a total block of adaptation with cyclosporin A in wild type, the effect being significantly milder than the totally non-adapting phenotypes seen, e.g., in TAX-6(gf) mutants. From there, the question is what happens in the CMK-1(gf) background that would mask the anti-adaptation effect of cyclosporin A. Here assumption 2 is relevant: the CMK-1(gf) pro-adaptation direct branch is always prevalent and shifts the balance of regulation toward faster adaptation (the role of TAX-6 becoming negligible in the CMK-1(gf) background and ipso facto that of cyclosporin A).
Regarding the second point, “the cmk-1(lf) +cyclo A group is labeled as constitutive habituation”. We regret a confusing word choice in the first version of the manuscript; we intended to mean “normal habituation phenotype” but in the joint absence of antagonistic CMK-1 and TAX-6 regulatory signaling (so the regulation is not like in wild-type, but the phenotype ends up like in wild type). We are modifying the label to “normal adaptation” and will leave a note in the legend that an apparently normal adaptation phenotype seems to be the “default” situation when the two antagonistic regulatory pathways are shut off.
More discussion of the significance of the sites of cmk-1 and tax-6 function in the neural circuit should take place. Additionally, incorporating the suspected loci of cmk-1 and tax-6 in the neural circuit into the model would be interesting (using proper hypothetical language). For example, as it seems like AFD is not required for the naïve reversal response but just its reduction, cmk-1 activity in AFD might be generating inhibition of the reversal response by AFD. It certainly would be understandable if this isn't workable, given extrasynaptic signaling and other unknowns, but it potentially could also be helpful in generating a working model for these complex interactions. For example, cmk-1 induces AIZ inhibition of AVA (AIZ is electrically coupled to AFD), and tax-6 reduces RIM activation of AVA (these neurons are also electrically coupled according to the diagram). RIM is also a neuropeptide-rich neuron, so this could allow it to interact with the cmk-1-related process(es) in AFD. Some discussion of possibilities like this could be informative.
Thanks for the comment. These hypothetical inter-cellular communication pathways are indeed nice possibilities. On the other hand, we could envision several additional pathways. Following this helpful suggestion, we will expand the discussion of hypothetical models in the revised manuscript.
Provide an explanation for why some of the experiments in Figure 4 have such a high N, compared to other experiments.
The conditions with the highest n correspond to conditions that we have also used as ‘control’ conditions for other types of experiments in the lab and as part of side projects, and whose data could be gathered for the present article. We have been working with cmk-1(lf) and tax-6(gf) mutants for many years, and the robust non-adapting phenotype was a reference point and a quality control when analyzing other non-adapting mutants.
Because the loss of function and gain of function mutations in cmk-1 have a similar effect, it is likely that this thermosensory plasticity phenotype is sensitive to levels of cmk-1 activity. Therefore, it is not surprising that the cmk-1 promoter failed to rescue very well as these plasmid-driven rescues often result in overexpression. Given this and that the cmk-1p rescue itself was so modest, these rescue experiments are not entirely convincing (and very hard to interpret; for example, is the AFD rescue or the ASER rescue more complete? The ASER one is actually closer to the cmk-1p rescue). Given the sensitivity to cmk-1 activity levels, a degradation strategy would be more likely to deliver clear results (or perhaps even the overactivation approach used for tax-6).
Thanks for the comment. We respectfully disagree with this reviewer’s statement “the loss of function and gain of function mutations in cmk-1 have a similar effect”. We suspect a confusion here, because our data clearly show that these two mutant types have opposite phenotypes. That being said, we interpret the weak rescue effect with cmk-1p as a probable result of overexpression or incomplete/imbalanced expression across neurons (as the promoter used might not include all the relevant regulatory regions). We dedicated considerable effort to establishing an endogenous CMK-1::degron knock-in for tissue-specific auxin-induced degradation (AID), but we were unfortunately not able to obtain consistent results. As a result, the only useful data regarding the CMK-1 place of action are the cell-specific rescue data already included in the report.
Reviewer #2 (Public review):
Summary:
The reduction in a response to a specific stimulus after repeated exposures is called habituation. Alterations in habituation to noxious stimuli are associated with chronic pain in humans, however, the underlying molecular mechanisms involved are not clear. This study uses the nematode C. elegans to study genes and mechanisms that underlie habituation to a form of noxious stimuli based on heat, termed thermo-noxious stimuli. The authors previously showed that the Calcium/Calmodulin-dependent protein kinase (CMK-1) regulates thermo-nociceptive habituation in the nematode C. elegans. Although CMK-1 is a kinase with many known substrates, the downstream targets relevant for thermo-nociceptive habituation are not known. In this study, the authors use two different kinase screens to identify phosphorylation targets of CMK-1. One of the targets they identify is Calcineurin (TAX-6). The authors show that CMK-1 phosphorylates a regulatory domain of Calcineurin at a highly conserved site (S443). In a series of elegant experiments, the authors use genetic and pharmacological approaches to increase or decrease CMK-1 and Calcineurin signaling to study their effects on thermo-nociceptive habituation in C. elegans. They also combine these various approaches to study the interactions between these two signaling proteins. The authors use specific promoters to determine in which neurons CMK-1 and Calcineurin function to regulate thermo-nociceptive habituation. The authors propose a model based on their findings illustrating that CMK-1 and Calcineurin act mostly in different neurons to antagonistically regulate habituation to thermo-nociceptive stimuli in a complex manner.
Strengths:
(1) Given the conservation of habituation across phylogeny, identifying genes and mechanisms that underlie nociceptive habituation in C. elegans may be relevant for understanding chronic pain in humans.
(2) The identification of canonical CaM Kinase phosphorylation motifs in the substrates identified in the CMK-1 substrate screen validates the screen.
(3) The use of loss and gain of function approaches to study the effects of CMK-1 and Calcineurin on thermo-nociceptive responses and habituation is elegant.
(4) The ability to determine the cellular place of action of CMK-1 and Calcineurin using neuron-specific promoters in the nematode is a clear strength of the genetic model system.
Thanks a lot for these positive remarks.
Weaknesses:
(1) The manuscript begins by identifying Calcineurin as a direct substrate of CMK-1 but ends by showing that CMK-1 and Calcineurin mostly act in different neurons to regulate nociceptive habituation which disrupts the logical flow of the manuscript.
We understand this point and we have carefully considered (and reconsidered) the way to articulate the report. However, we could not present the story much differently, as we would have no justification to investigate the role of TAX-6 and its interaction with CMK-1 if we had not first identified it as a phospho-target in vitro. Carefully considering this point, we found that the abstract of the first manuscript version was probably too cursory and likely to trigger wrong expectations among readers. We will extensively revise the abstract to clarify this point. Furthermore, we will reinforce this point in the last paragraph of the introduction.
(2) The physiological relevance of CMK-1 phosphorylation of Calcineurin is not clear.
We do agree and will explicitly discuss this aspect in the revised Discussion section, and also make it clear from the abstract onward.
(3) It is not clear if Calcineurin is already a known substrate of CaM Kinases in other systems or if this finding is new.
We are not aware of any studies showing that Calcineurin is a direct target of CaM kinase I. However, it was found to be a substrate of CaM kinase II as well as of other kinases, as we explicitly presented in the discussion section. We will complement the text by mentioning that we are not aware of Calcineurin having so far been reported to be a CaM kinase I substrate.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
The paper by Lee and Ouellette explores the role of cyclic-d-AMP in chlamydial developmental progression. The manuscript uses a collection of different recombinant plasmids to up- and down-regulate cdAMP production, and then uses classical molecular and microbiological approaches to examine the effects of expression induction in each of the transformed strains.
Strengths:
This laboratory is a leader in the use of molecular genetic manipulation in Chlamydia trachomatis and their efforts to make such efforts mainstream is commendable. Overall, the model described and defended by these investigators is thorough and significant.
Weaknesses:
The biggest weakness in the document is their reliance on quantitative data that is statistically not significant, in the interpretation of results. These challenges can be addressed in a revision by the authors.
Thank you for these comments. We have generated new data, which we hope the reviewer will find more compelling. These will be included in a revised manuscript.
Reviewer #2 (Public review):
Summary:
This manuscript describes the role of the production of c-di-AMP on the chlamydial developmental cycle. Chlamydia are obligate intracellular bacterial pathogens that rely on eukaryotic host cells for growth. The chlamydial life cycle depends on a cell form developmental cycle that produces phenotypically distinct cell forms with specific roles during the infectious cycle. The RB cell form replicates, amplifying chlamydia numbers, while the EB cell form mediates entry into new host cells, disseminating the infection to new hosts. Regulation of cell form development is a critical question in chlamydia biology and pathogenesis. Chlamydia must balance amplification (RB numbers) and dissemination (EB numbers) to maximize survival in its infection niche. The main findings in this manuscript show that overexpression of the dacA-ybbR operon results in increased production of c-di-AMP and early expression of the transitionary gene hctA and late gene omcB. The authors also knocked down the expression of the dacA-ybbR operon and reported a reduction in the expression of both hctA and omcB. The authors conclude with a model suggesting the amount of c-di-AMP determines the fate of the RB, continued replication, or EB conversion. Overall, this is a very intriguing study with important implications; however, the data is very preliminary and the model is very rudimentary and is not well supported by the data.
Thank you for your comments. Chlamydia is not an easy experimental system, but we will do our best to address the reviewer’s concerns in a revised submission.
Describing the significance of the findings:
The findings are important and point to very exciting new avenues to explore the important questions in chlamydial cell form development. The authors present a model that is not quantified and does not match the data well.
Describing the strength of evidence:
The evidence presented is incomplete. The authors do a nice job of showing that overexpression of the dacA-ybbR operon increases c-di-AMP and that knockdown or overexpression of the catalytically dead DacA protein decreases the c-di-AMP levels. However, the effects on the developmental cycle and how they fit the proposed model are less well supported.
dacA-ybbR ectopic expression:
For the dacA-ybbR ectopic expression experiments they show that hctA is induced early but there is no significant change in OmcB gene expression. This is problematic as when RBs are treated with Pen (this paper) and (DOI 10.1128/MSYSTEMS.00689-20) hctA is expressed in the aberrant cell forms but these forms do not go on to express the late genes suggesting stress events can result in changes in the developmental expression kinetic profile. The RNA-seq data are a little reassuring as many of the EB/Late genes were shown to be upregulated by dacA-ybbR ectopic expression in this assay.
As the reviewer notes, we also generated RNAseq data, which validate that late gene transcripts (including sigma28 and sigma54 regulated genes) are statistically significantly increased earlier in the developmental cycle in parallel to increased c-di-AMP levels. The lack of statistical significance in the RT-qPCR data for omcB, which shows a trend of higher transcripts, is less concerning given the statistically significant RNAseq dataset. We have reported the data from three replicates for the RT-qPCR and do not think it would be worthwhile to attempt more replicates in an attempt to “achieve” statistical significance.
The authors also demonstrate that this ectopic expression reduces the overall growth rate but produces EBs earlier in the cycle but overall fewer EBs late in the cycle. This observation matches their model well as when RBs convert early there is less amplification of cell numbers.
dacA knockdown and dacA(mut)
The authors showed that dacA knockdown and ectopic expression of the dacA mutant both reduced the amount of c-di-AMP. The authors show that for both of these conditions, hctA and omcB expression is reduced at 24 hpi. This was also partially supported by the RNA-seq data for the dacA knockdown as many of the late genes were downregulated. However, a shift to an increase in RB-only genes was not readily evident. This is maybe not surprising as the chlamydial inclusion would just have an increase in RB forms and changes in cell form ratios would need more time points.
Thank you for this comment. We agree that it is not surprising given the shift in cell forms. The reduction in hctA transcripts argues against a stress state as noted above by the reviewer, and the RNAseq data from dacA-KD conditions indicates at least that secondary differentiation has been delayed. We will try to clarify this in a revision.
Interestingly, the overall growth rate appears to differ in these two conditions, growth is unaffected by dacA knockdown but is significantly affected by the expression of the mutant. In both cases, EB production is repressed. The overall model they present does not support this data well as if RBs were blocked from converting into EBs then the growth rate should increase as the RB cell form replicates while the EB cell form does not. This should shift the population to replicating cells.
We agree that it seems that perturbing c-di-AMP production, whether by knockdown or overexpressing the mutant DacA(D164N), has an overall negative impact on chlamydial growth. We have generated new data, which we think will address this. These new data will be included in a revised manuscript.
Overall this is a very intriguing finding that will require more gene expression data, phenotypic characterization of cell forms, and better quantitative models to fully interpret these findings.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Reviewer #1 (Public review):
Summary:
In this manuscript, the authors Eapen et al. investigated the peptide inhibitors of Cdc20. They applied a rational design approach, substituting residues found in the D-box consensus sequences to better align the peptides with the Cdc20-degron interface. In the process, the authors designed and tested a series of more potent binders, including ones that contain unnatural amino acids, and verified binding modes by elucidating the Cdc-20-peptide structures. The authors further showed that these peptides can engage with Cdc20 in the cellular context, and can inhibit APC/C<sup>Cdc20</sup> ubiquitination activity. Finally, the authors demonstrated that these peptides could be used as portable degron motifs that drive the degradation of a fused fluorescent protein.
Strengths:
This manuscript is clear and straightforward to follow. The investigation of different peptide variations was comprehensive and well-executed. This work provided the groundwork for the development of peptide drug modalities to inhibit degradation or apply peptides as portable motifs to achieve targeted degradation. Both of which are impactful.
Weaknesses:
A few minor comments:
(1) In my opinion, more attention to the solubility issue needs to be discussed and/or tested. On page 10, what is the solubility of D2 before a modification was made? The authors mentioned that position 2 is likely solvent exposed, it is not immediately clear to me why the mutation made was from one hydrophobic residue to another. What was the level of improvement in solubility? Are there any affinity data associated with the peptide that differ with D2 only at position 2?
The reviewer is correct that we have not done any detailed solubility characterisation; we refer only to observations rather than quantitative analysis. We wrote that we reverted from Leu to Ala due to solubility - we will clarify this statement to say that we reverted to Ala, as it was the residue present in D1, for which we observed a measurable affinity by SPR and saw a concentration-dependent response in the thermal shift analysis. We do not have any peptides or affinity data that explore single-site mutations with the parental peptide of D2. D2 is included in the paper because of its link to the consensus D-box sequence and thus was the logical path to the investigations into positions 3 and 7 that come later in the manuscript.
(2) I'm not entirely convinced that the D19 density not observed in the crystal structure was due to crystal packing. This peptide is peculiar as it also did not induce any thermal stabilization of Cdc20 in the cellular thermal shift assay. Perhaps the binding of this peptide could be investigated in more detail (i.e., NMR?) Or at least more explanation could be provided.
This section will be clarified. The lack of observed density was likely due to the relatively low affinity of D19 and also to the lack of binding of the three C-terminal residues in the crystal, and consequently it has a further reduced affinity. The current wording in the manuscript puts greater emphasis on this second aspect being a D19-specific issue, even though it applies to all four soaked peptides. The extent of peptide-induced thermal stabilisations observed by TSA and CETSA is different, with the latter experiment consistently showing smaller shifts. This observation may be due to the more complex medium (cell lysate vs. purified protein) and/or different concentrations of the proteins in solution. In the CETSA, we over-expressed a HiBiT-tagged Cdc20, which is present in addition to any endogenously expressed Cdc20. Although we did not investigate it, the near identical D-box binding sites on Cdc20 and Cdh1 would suggest that there will be cross-specificity, which could further influence the CETSA experiments.
Reviewer #2 (Public review):
Summary:
The authors took a well-characterised (partly by them), important E3 ligase, in the anaphase-promoting complex, and decided to design peptide inhibitors for it based on one of the known interacting motifs (called D-box) from its substrates. They incorporate unnatural amino acids to better occupy the interaction site, improve the binding affinity, and lay foundations for future therapeutics - maybe combining their findings with additional target sites.
Strengths:
The paper is mostly strengths - a logical progression of experiments, very well explained and carried out to a high standard. The authors use a carefully chosen variety of techniques (including X-ray crystallography, multiple binding analyses, and ubiquitination assays) to verify their findings - and they impressively achieve their goals by honing in on tight-binders.
Weaknesses:
Some things are not explained fully and it would be useful to have some clarification. Why did the authors decide to model their inhibitors on the D-box motif and not the other two SLiMs that they describe?
For completeness, in addition to the D-box we did originally construct peptides based on the ABBA and KEN-box motifs, but they did not show any shift in melting temperature of Cdc20 in the thermal shift assay whereas the D-box peptides did; consequently, we focused our efforts on the D-box peptides. Moreover, there is much evidence from the literature that points to the unique importance of the D-box motif in mediating productive interactions of substrates with the APC/C (i.e. those leading to polyubiquitination & degradation). One of the clearest examples is a study by Mark Hall’s lab (described in Qin et al. 2016), which tested the degradation of 15 substrates of yeast APC/C in strains carrying alleles of Cdh1 in which the docking sites for D-box, KEN or ABBA were mutated. They observed that whereas degradation of all 15 substrates depended on D-box binding, only a subset required the KEN binding site on Cdh1 and only one required the ABBA binding site. A more recent study from David Morgan’s lab (Hartooni et al. 2022) looking at binding affinities of different degron peptides concluded that the KEN motif has very low affinity for Cdc20 and is unlikely to mediate degradation of APC/C-Cdc20 substrates. Engagement of substrate with the D-box receptor is therefore the most critical event mediating APC/C activity and the interaction that needs to be blocked for most effective inhibition of substrate degradation.
What exactly do they mean when they say their 'observation is consistent with the idea that high-affinity binding at degron binding sites on APC/C, such as in the case of the yeast 'pseudo-substrate' inhibitor Acm1, acts to impede polyubiquitination of the bound protein'? It's an interesting thing to think about, and probably the paper they cite explains it more but I would like to know without having to find that other paper.
Interesting results from a number of labs (Choi et al. 2008, Enquist-Newman et al. 2008, Burton et al. 2011, Qin et al. 2019) have shown that mutations of degron SLiMs in Acm1 that weaken interaction with the APC/C have the unexpected consequence of converting Acm1 from APC/C inhibitor to APC/C substrate. A necessary conclusion of these studies is that the outcome of degron binding (i.e. whether the binder functions as substrate or inhibitor) depends on factors other than D-box affinity and that D-box affinity can counteract them. One idea is that if a binder interacts too tightly, this removes some flexibility required for the polyubiquitination process. The most recent study on this question (Qin et al. 2019) specifically pins the explanation for the inhibitory function of the high affinity D-box in Acm1 on its ‘D-box Extension’ (i.e. residues 8-12) preventing interaction with APC10. In our current study, the binding affinity of peptides is measured against Cdc20. In cellular assays, however, the D-box must also engage APC10 for degradation to occur. It may be that the peptide binding most strongly to the D-box pocket on Cdc20 is less able to bind to APC10 and therefore less effective in triggering APC10-dependent steps in the polyubiquitination pathway. The important Hartooni et al. paper from David Morgan’s lab confirms that even though the binding of D-box residues to APC10 is very weak on its own, it can contribute a 100-fold increase in the affinity of a peptide by adding cooperativity to the interaction of D-box with co-activator.
After further reading on this topic, we will modify the relevant piece of text from:
“However, we found the opposite effect: D2 and D3 showed increased rates of mNeon degradation compared to D1 and D19 (Fig. 8C,D). This observation is consistent with the idea that high-affinity binding at degron binding sites on APC/C, such as in the case of the yeast ‘pseudo-substrate’ inhibitor Acm1, acts to impede polyubiquitination of the bound protein (Qin et al. 2019). Indeed, there is no evidence that Hsl1, which is the highest affinity natural D-box (D1) used in our study, is degraded any more rapidly than other substrates of APC/C in yeast mitosis. As shown in Qin et al., mutation of the high affinity D-box in Acm1 converts it from inhibitor to substrate (Qin et al. 2019). Overall, our results support the conclusions that all the D-box peptides engage productively with the APC/C and that the highest affinity interactors act as inhibitors rather than functional degrons of APC/C.”
to:
“However, we found the opposite effect: D2 and D3 showed increased rates of mNeon degradation compared to D1 and D19 (Fig. 8C,D). This observation is consistent with conclusions from other studies that affinity of degron binding does not necessarily correlate with efficiency of degradation. Indeed, there is no evidence that Hsl1, which is the highest affinity natural D-box (D1) used in our study, is degraded any more rapidly than other substrates of APC/C in yeast mitosis. A number of studies of a yeast ‘pseudo-substrate’ inhibitor Acm1, have shown that mutation of the high affinity D-box in Acm1 converts it from inhibitor to substrate (Choi et al. 2008, Enquist-Newman et al. 2008, Burton et al. 2011) through a mechanism that governs recruitment of APC10 (Qin et al. 2019). Our study does not consider the contribution of APC10 to binding of our peptides to APC/C<sup>Cdc20</sup> complex, but since there is strong cooperativity provided by this additional interaction (Hartooni et al. 2022) we propose this as the critical factor in determining the ability of the different peptides to mediate degradation of associated mNeon.”
Re Figure 6 and the fact that we did look at peptide binding in cells, these experiments were done in unsynchronised cells, so most Cdc20 would not be bound to APC/C.
Reviewer #3 (Public review):
Summary:
Eapen and coworkers use a rational design approach to generate new peptide-inspired ligands at the D-box interface of cdc20. These new peptides serve as new starting points for blocking APC/C in the context of cancer, as well as manipulating APC/C for targeted protein degradation therapeutic approaches.
Strengths:
The characterization of new peptide-like ligands is generally solid and multifaceted, including binding assays, thermal stability enhancement in vitro and in cells, X-ray crystallography, and degradation assays.
Weaknesses:
One important finding of the study is that the strongest binders did not correlate with the fastest degradation in a cellular assay, but explanations for this behavior were not supported experimentally. Some minor issues regarding experimental replicates and details were also noted.
Interesting results from a number of labs (Choi et al. 2008, Enquist-Newman et al. 2008, Burton et al. 2011, Qin et al. 2019) have shown that mutations of degron SLiMs in Acm1 that weaken interaction with the APC/C have the unexpected consequence of converting Acm1 from APC/C inhibitor to APC/C substrate. A necessary conclusion of these studies is that the outcome of degron binding (i.e. whether the binder functions as substrate or inhibitor) depends on factors other than D-box affinity and that D-box affinity can counteract them. One idea is that if a binder interacts too tightly, this removes some flexibility required for the polyubiquitination process. The most recent study on this question (Qin et al. 2019) specifically pins the explanation for the inhibitory function of the high affinity D-box in Acm1 on its ‘D-box Extension’ (i.e. residues 8-12) preventing interaction with APC10. In our current study, the binding affinity of peptides is measured against Cdc20. In cellular assays, however, the D-box must also engage APC10 for degradation to occur. It may be that the peptide binding most strongly to the D-box pocket on Cdc20 is less able to bind to APC10 and therefore less effective in triggering APC10-dependent steps in the polyubiquitination pathway. The important Hartooni et al. paper from David Morgan’s lab confirms that even though the binding of D-box residues to APC10 is very weak on its own, it can contribute a 100-fold increase in the affinity of a peptide by adding cooperativity to the interaction of D-box with co-activator.
After further reading on this topic, we will modify the relevant piece of text from:
“However, we found the opposite effect: D2 and D3 showed increased rates of mNeon degradation compared to D1 and D19 (Fig. 8C,D). This observation is consistent with the idea that high-affinity binding at degron binding sites on APC/C, such as in the case of the yeast ‘pseudo-substrate’ inhibitor Acm1, acts to impede polyubiquitination of the bound protein (Qin et al. 2019). Indeed, there is no evidence that Hsl1, which is the highest affinity natural D-box (D1) used in our study, is degraded any more rapidly than other substrates of APC/C in yeast mitosis. As shown in Qin et al., mutation of the high affinity D-box in Acm1 converts it from inhibitor to substrate (Qin et al. 2019). Overall, our results support the conclusions that all the D-box peptides engage productively with the APC/C and that the highest affinity interactors act as inhibitors rather than functional degrons of APC/C.”
to:
“However, we found the opposite effect: D2 and D3 showed increased rates of mNeon degradation compared to D1 and D19 (Fig. 8C,D). This observation is consistent with conclusions from other studies that affinity of degron binding does not necessarily correlate with efficiency of degradation. Indeed, there is no evidence that Hsl1, which is the highest affinity natural D-box (D1) used in our study, is degraded any more rapidly than other substrates of APC/C in yeast mitosis. A number of studies of a yeast ‘pseudo-substrate’ inhibitor Acm1, have shown that mutation of the high affinity D-box in Acm1 converts it from inhibitor to substrate (Choi et al. 2008, Enquist-Newman et al. 2008, Burton et al. 2011) through a mechanism that governs recruitment of APC10 (Qin et al. 2019). Our study does not consider the contribution of APC10 to binding of our peptides to APC/C<sup>Cdc20</sup> complex, but since there is strong cooperativity provided by this additional interaction (Hartooni et al. 2022) we propose this as the critical factor in determining the ability of the different peptides to mediate degradation of associated mNeon.”
Re Figure 6 and the fact that we did look at peptide binding in cells, these experiments were done in unsynchronised cells, so most Cdc20 would not be bound to APC/C.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
In the manuscript entitled "A VgrG2b fragment cleaved by caspase-11/4 promotes Pseudomonas aeruginosa infection through suppressing the NLRP3 inflammasome", Qian et al. found an activation of the non-canonical inflammasome, but not the downstream NLRP3 inflammasome, during the infection of macrophage by P. aeruginosa, which is in sharp contrast to that by E. coli (Figure 1). In realizing that the suppression of the NLRP3 inflammasome is Caspase-11 dependent, the authors performed a screening among P. aeruginosa proteins and identified VgrG2b being a major substrate of Caspase-11 (Figure 2). Next, the authors mapped the cleavage site on VgrG2b to D883, and demonstrated that cleavage of VgrG2b by Caspase-11 is essential for the suppression of the NLRP3 inflammasome (Figure 3). Furthermore, they found that a binding between the C-terminal fragment of the cleaved VgrG2b and NLRP3 existed (Figure 4), which was then proved to block the association of NLRP3 with NEK7 (Figure 5). Finally, the authors demonstrated that blocking of VgrG2b cleavage, by either mutation of the D883 or administration of a designed peptide, effectively improved the survival rate of the P. aeruginosa-infected mice (Figure 6). This is a well-designed and executed study, with the results clearly presented and stated.
We are deeply grateful for your recognition and positive comments on our article. Thank you for your effort and dedication in reviewing our manuscript. We are honored to have the opportunity to receive feedback from professional reviewers like you.
Reviewer #2 (Public review):
Summary:
In their manuscript, Qian and colleagues identified a novel mechanism by which Pseudomonas control inflammatory responses upon inflammasome activation. They identified a caspase-11 substrate (VgrG2b) which, upon cleavage, binds and inhibits the NLRP3 to reduce the production of pro-inflammatory cytokines. This is a unique mechanism that allows for the tailoring of the innate immune response upon bacterial recognition.
Strengths:
The authors are presenting here a novel conceptual framework in host-pathogen interactions. Their work is supported by a range of approaches (biochemical, cellular immunology, microbiology, animal models), and their conclusions are supported by multiple independent lines of evidence. The work is likely to have an important impact on the innate immunity field and host-pathogen interactions field and may guide the development of novel inhibitors.
Weaknesses:
Although quite exhaustive, a few of the authors' conclusions are not fully supported (e.g., caspase-11 directly cleaving VgrG2b, the unique affinity of VgrG2b-C for NLRP3) and would require complementary approaches to validate their findings fully. This is minimal.
We sincerely appreciate your professional review and kind appraisal on our article. These comments are really valuable and helpful for improving our manuscript. According to your suggestions, we have made some modifications and added some supplemental data to make our results more convincing. The detailed responses are listed point-by-point below.
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
I really enjoyed reading your manuscript and believe this is an important conceptual advance for the innate immunity field. Your conclusions are in general well-supported, you used a range of methodologies and the quality of the presentation of the results is excellent. I have a few comments here that I hope will contribute to improving an already great piece of work:
Elements to be improved:
Line 109-110: the authors claim that the release of mito DNA is required for NLRP3 activation. I would support this with a reference. I believe this may not be fully agreed on in the field. Cleavage of GSDMD by caspase4/11 is required, however. A few groups showed the requirement for K+ efflux in this context (Broz, Brough, Schroder labs).
It is a very good suggestion. Indeed, there is still controversy over this issue, and we have revised our text to make our manuscript more neutral. We have also cited these important references to help readers understand where the controversy lies.
I disagree that OMV + Pseudomonas is a natural way to simulate natural infection. I would argue it is even quite artificial. Pseudomonas alone should be sufficient to generate OMV without the addition of extra OMVs.
This is a good point. Before we infected BMDM cells with PAO1 strains, we washed them with PBS at least three times to exclude interference from the contents of the LB medium. Moreover, in our experimental system, the time for co-incubation between bacteria and host cells is very limited. During this time, the amount of OMV secreted by bacteria may not reach the level needed to activate inflammasomes, and this concentration is also relatively low compared to the OMV concentration secreted by bacteria under physiological conditions. Therefore, we added extra OMVs to simulate the chronic infection condition in a short time.
The authors co-express caspase with VgrG2b and assume the cleavage is direct. However, the study lacks experiments with recombinant proteases (commercially available), which would strengthen their conclusions regarding the ability of caspase-4/11 to directly cleave the protein. Based on the recognised sequence (DXXD), I believe caspase-4/11 is not directly responsible for this. These caspases were shown to cleave caspase-3/7, which can cleave such sequences (DXXX). As caspase-4 can cleave caspase-3/7 in their lysates, I would recommend testing this hypothesis to further strengthen the authors' conclusions.
These are very good points. As shown in Fig. 3F, we used recombinant VgrG2b and caspase-11 p22/p10 to demonstrate direct cleavage by caspase-11. To exclude the effect of caspase-3/7, we treated cells with inhibitors of caspase-3/7 and found that caspase-3/7 are not the executors of VgrG2b cleavage (new Fig. S3E, F).
The affinity between caspase-11 and VgrG2b-C is puzzling as one would normally expect the caspase and its substrates to quickly dissociate. Does VgrG2b-C impact the activity of caspase-4/11 upon cleavage? Can VgrG2b-C also interact with p20/p10 caspase-1? I believe the authors only tried the full-length version of caspase-1 in supplemental.
These are very good questions. We agree that enzymes and substrates normally interact only transiently, which makes the interaction difficult to capture. However, we used the caspase-11(C254A) mutant, which cannot cleave its substrates, so that the binding of VgrG2b or VgrG2b-C to caspase-11(C254A) could be detected. This mutation is frequently used in immunoprecipitation (Wang K, Cell, 2020). We tested the impact of VgrG2b-C on the enzyme activity of caspase-4/11 and showed that VgrG2b-C did not affect the cleavage of GSDMD by caspase-11 (Fig. 5C). We also tested caspase-1 p20/p10 and found that it did not interact with VgrG2b-C (new Fig. S4G).
Can more details be provided about the generation of recombinant caspase-11, VgrG2b-C, and other recombinant proteins tested?
Thanks for your suggestion. We have revised our description in the new version.
The authors assumed that VgrG2b-C does not impact other inflammasomes (such as NLRC4) based on their X-gal assay. I would also confirm this with a functional assay (e.g., transfection of flagellin in macrophages).
This is a good suggestion. We have tested the impact of VgrG2b-C on NLRC4 inflammasome and found that VgrG2b-C does not affect NLRC4 activation with the transfection of flagellin (new Fig. S5K).
Often, representative experiments are shown. For ELISA, cell death assays and quantitative experiments, pooling the data would be appropriate. Appropriate statistical analysis should be conducted based on this as well.
Thanks for your suggestions. In the revised manuscript, we pooled the data of three independent experiments for our analysis of ELISA and cell death assays. We also added descriptions of statistical analysis in our revised text.
VgrG2b has been suggested to be a metalloprotease (PMID: 31577948). Is its protease activity required for the phenomenon observed?
This is a very good question. The active region of the metalloprotease VgrG2b-C is aa932-941, especially the core HEXXH sequence. Structural data also confirm that H935, E936, H939, and E983 play key roles in coordinating Zn ions (Sana TG, mBio, 2015; Wood TE, Cell reports, 2019). In our study, the cleavage of VgrG2b by caspase-4/11 depends on the recognition of the tetrapeptide sequence at aa880-883. We added data showing that the cleavage of VgrG2b and the inhibition of the NLRP3 inflammasome were not affected by VgrG2b enzymatic activity (new Fig. S4I-K).
What is the affinity of VgrG2b-C for NLRP3? Is it higher than NEK7? A quantitative experiment would be required to claim this.
This is a great point. We added quantitative data to the revised manuscript showing that VgrG2b-C has a higher affinity for NLRP3 than NEK7 (326 nM vs. 681 nM).
The Material and Method section is a bit light and would benefit from adding more information (e.g. cell density, microscopy details, number of cells imaged, etc).
Thanks for your suggestion. We have added more details to the Material and Method section in the revised manuscript.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We thank the reviewers for their concise and detailed summaries, and appreciate the constructive feedback on the article’s strengths and weaknesses. In response, we plan to strengthen our work in a revised version by presenting the model assumptions for the electrocyte more explicitly and further elaborating on the generalisability of the results to other cell types with different ion channels, including calcium and chloride.
Experimental work is beyond the scope of our modelling-based study. However, we would like our work to serve as a framework for future experimental studies into the role of the electrogenic pump current (and its possible compensatory currents) in disease, and its role in evolution of highly specialised excitable cells (such as electrocytes).
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Reviewer #1:
I am satisfied with all clarifications and additional analyses performed by the authors.
The only concern I have is about changes in running after [AM+VM] mismatches.
The authors reported that they "found no evidence of a change in running speed or pupil diameter following [AM + VM] mismatch (Figures S5A)" (line 197).
Nevertheless, it seems that there is a clear increase in running speed for the [AM+VM] condition (S5A). Could this be more specifically quantified? I am concerned that part of the [AM+VM] could stem from this change in running behavior. Could one factor out the running contribution?
Our apologies, this was unintentionally omitted. We have added the quantification to Table S1 and included the results of the significance test in Fig S2A, Fig S4A and Fig S5A. The increase in running speed upon MM presentation (0.5 – 1 s), compared to the baseline running speed in the time window preceding MM presentation (-0.5 – 0 s), was not significant in any of the tested conditions.
In the process of adding the statistics, we noticed an unfortunate inconsistency in our figures that relates to Figure S5A. The data shown in all other Figures is aligned to the onset of audiomotor mismatch. In Figure S5A, however, the data were aligned to the onset of the visuomotor mismatch. As there is a differential delay in the closed loop coupling of auditory and visual feedback of approximately 170 ms (as described in the methods), visuomotor mismatch onset is slightly before audiomotor mismatch onset. We have corrected this now in the manuscript but have done the statistical analysis for both old and new versions of the figure. In neither case do we find evidence of a running speed response.
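The window comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the response does not say which significance test was used, so the exact paired sign-flip permutation test below is a stand-in, and the function names, the half-open window convention, and the example numbers are all hypothetical.

```python
from itertools import product
from statistics import mean


def window_mean(times, values, start, stop):
    """Mean of the samples whose timestamps fall in [start, stop).

    Times are in seconds relative to mismatch onset. The half-open
    interval is an assumption made for this sketch.
    """
    selected = [v for t, v in zip(times, values) if start <= t < stop]
    return mean(selected)


def paired_sign_flip_p(diffs):
    """Exact two-sided sign-flip permutation p-value for paired differences.

    Enumerates every sign assignment, so it is only practical for a small
    number of pairs. This is a stand-in test, not the authors' method.
    """
    observed = abs(mean(diffs))
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = [s * d for s, d in zip(signs, diffs)]
        # Small tolerance guards against floating-point ties.
        if abs(mean(flipped)) >= observed - 1e-12:
            hits += 1
    return hits / total


if __name__ == "__main__":
    # Hypothetical running-speed traces (cm/s), one list per trial.
    times = [-0.5, -0.25, 0.0, 0.25, 0.5, 0.75, 1.0]
    trials = [
        [20, 21, 20, 22, 21, 22, 23],
        [15, 14, 15, 15, 14, 15, 16],
        [30, 31, 30, 29, 31, 30, 30],
    ]
    diffs = [
        window_mean(times, tr, 0.5, 1.0) - window_mean(times, tr, -0.5, 0.0)
        for tr in trials
    ]
    print(diffs, paired_sign_flip_p(diffs))
```

Because the figure was realigned by roughly 170 ms, the same comparison would simply be run twice, once with timestamps relative to visuomotor mismatch onset and once relative to audiomotor mismatch onset.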
The authors thoroughly addressed the concerns raised. In my opinion, this has substantially strengthened the manuscript, enabling much clearer interpretation of the results reported. I commend the authors for the response to review. Overall, I find the experiments elegantly designed, and the results robust, providing compelling evidence for non-hierarchical interactions across neocortical areas and more specifically for the exchange of sensorimotor prediction error signals across modalities.
We are happy to hear this!
Reviewer #2:
The incorporation of the analysis of the animal's running speed and the pupil size upon sound interruption improves the interpretation of the data. The authors can now conclude that responses to the mismatch are not due to behavioral effects.
The issue of the relationship between mismatch responses and offset responses remains uncommented. The auditory system is sensitive to transitions, also to silence. See the work of the Linden or the Barkat labs (including the work of the first author of this manuscript) on offset responses, and also that of the Mesgarani lab (Khalighinejad et al., 2019) on responses to transitions 'to clean' (Figure 1c) in human auditory cortex. Offset responses, as the first author knows well, are modulated by intensity and stimulus length (after adaptation?). That responses to the interruption of the sound are similar in quality, if not quantity, in the closed and open loop conditions suggests that offset response might modulate the mismatch response. A mismatch response that reflects a break in predictability would presumably be less modulated by the exact details of the sensory input than an offset response. Therefore, what is the relationship between the mismatch response and the mean sound amplitude prior to the sound interruption (for example during the preceding 1 second)? And between the mismatch response and the mean firing rate over the same period?
Finally, how do visual stimuli modulate sound responses in the absence of a mismatch? Is the multimodal response potentiation specific to a mismatch?
There are probably two points important to clarify before answering the question – just to make sure there is no semantic misunderstanding.
(1) In the jargon of predictive processing, a prediction error is a deviation from a predictable relationship. This can be sensorimotor coupling (as in audio- and visuomotor mismatch), stimulus history (as in oddball, or sound offset responses), surround sensory input (as in endstopping response and center-surround effects in visual processing), etc. A sound offset perceived by an animal in an open loop condition is thus a negative prediction error based on stimulus history (this assumes the animal has no way to predict the time of offset – as is the case in our experiments). We are primarily interested in our work here in characterizing negative prediction errors that result from motor-related predictions – hence the comparison we use is unpredictable sound offset in closed-loop coupling vs. unpredictable sound offset in open-loop coupling. The first is a mixture of an audiomotor prediction error and a stimulus history prediction error. The second is just a stimulus history prediction error. Thus, we compare the two types of responses to isolate the component that can only be attributed to audiomotor prediction errors.
(2) Audiomotor mismatch responses can of course be explained in a large variety of ways. For example, one could consider a sound offset a sensory stimulus. One could further assume that locomotion increases sensory responses. If so, one could explain audiomotor mismatch responses as a locomotion related gain of a sensory offset response. However, we need to further postulate that this locomotion related gain is stimulus specific, as for sound onset responses there is no detectable difference between locomotion and sitting. Thus, we are left with a model that explains audiomotor mismatch responses as a “stimulus specific locomotion gain of sensory responses”. This is correct – it is just not very satisfying, has no computational basis, and makes no useful predictions (see e.g. https://pubmed.ncbi.nlm.nih.gov/36821437/ for an extended treatise of exactly this point for visuomotor mismatch responses).
That responses to the interruption of the sound are similar in quality, if not quantity, in the closed and open loop conditions suggests that offset response might modulate the mismatch response.
Conceptually both a “sound offset” and an “audiomotor mismatch” are negative prediction errors. Could one describe the effect we see as an audiomotor mismatch modulating a sound offset? Certainly. But if the reviewer means modulate in the sense of neuromodulatory – we are not aware of a neuromodulatory response that would be fast enough or strong enough to have these effects (we have looked into ACh, NA, and Ser; unpublished, no MM response). Alternatively, they could simply add linearly (as predictive processing would predict). Given that AM mismatch responses are likely computed in auditory cortex, we see no reason to speculate that anything more complicated is happening than a linear summation of different prediction error responses.
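Under the linear-summation view, the closed-loop versus open-loop comparison reduces to a subtraction. The symbols below are introduced here purely for illustration and do not appear in the manuscript: R denotes the measured response to an unpredictable sound offset in each coupling condition, E_AM the audiomotor prediction error, and E_hist the stimulus-history prediction error.

```latex
\[
\begin{aligned}
R_{\text{closed}} &= E_{\text{AM}} + E_{\text{hist}} \\
R_{\text{open}}   &= E_{\text{hist}} \\
E_{\text{AM}}     &\approx R_{\text{closed}} - R_{\text{open}}
\end{aligned}
\]
```

The approximation holds only if the stimulus-history component is the same in both conditions and the two components sum without interaction.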
A mismatch response that reflects a break in predictability would presumably be less modulated by the exact details of the sensory input than an offset response. Therefore, what is the relationship between the mismatch response and the mean sound amplitude prior to the sound interruption (for example during the preceding 1 second)? And between the mismatch response and the mean firing rate over the same period?
The reviewer’s intuition here – that mismatch responses have a lower resolution than what one thinks of as sensory responses (or sound offset responses) – is probably not warranted. Experiments that quantify the resolution of mismatch responses are relatively data intense – and to the best of our knowledge this has only been done once in the visual system for visuomotor mismatch responses (Zmarz and Keller, 2016). Here we found that visuomotor mismatch responses exhibited matched spatial (in visual space) resolution to that of visual responses.
Regarding the suggested analyses: In a closed loop session, the sound amplitude preceding the mismatch is directly related to the running speed of the mouse. In visual cortex, the amplitude of visuomotor mismatch responses linearly scales with running speed (and consequently visual flow speed) prior to the mismatch – as predicted by predictive processing. See e.g. figure 4B in (Zmarz and Keller, 2016). We have tried this analysis for audiomotor mismatches in the previous round of reviews, but we fear we do not have sufficient data to address this question properly. If we look at how mismatch responses change as a function of locomotion speed (sound amplitude) across the entire population of neurons, we have no evidence of a systematic change (and the effects are highly variable as a function of speed bins we choose). However, just looking at the most audiomotor mismatch responsive neurons, we find a trend for increased responses with increasing running speed (Author response image 1). We analyzed the top 5% of cells that showed the strongest response to mismatch (MM) and divided the MM trials into three groups based on running speed: slow (10-20 cm/s), middle (20-30 cm/s), and fast (>30 cm/s). Given the fact that we have on average 14 mismatch events in total per neuron, the analysis when split by running speed is under-powered.
Author response image 1.
The average response of strongest AM MM responders to AM mismatches as a function of running speed (data are from 51 cells, 11 fields of view, 6 mice).
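The speed-binned analysis behind this image can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the handling of bin edges (a trial at exactly 20 or 30 cm/s goes to the upper bin), the exclusion of trials below 10 cm/s, and the example values are all assumptions.

```python
from collections import defaultdict
from statistics import mean


def speed_bin(speed_cm_s):
    """Assign a running speed (cm/s) to the slow/middle/fast groups.

    Returns None for speeds below 10 cm/s, which fall outside the
    three groups named in the text. Edge handling is an assumption.
    """
    if 10 <= speed_cm_s < 20:
        return "slow"
    if 20 <= speed_cm_s < 30:
        return "middle"
    if speed_cm_s >= 30:
        return "fast"
    return None


def response_by_speed_bin(trials):
    """Group (speed, response) pairs by speed bin.

    Returns {bin_label: (mean_response, n_trials)} so that the small
    per-bin trial counts stay visible next to each average.
    """
    grouped = defaultdict(list)
    for speed, response in trials:
        label = speed_bin(speed)
        if label is not None:
            grouped[label].append(response)
    return {label: (mean(vals), len(vals)) for label, vals in grouped.items()}


if __name__ == "__main__":
    # Hypothetical (running speed in cm/s, mismatch response) pairs.
    example = [(12, 1.0), (18, 3.0), (25, 4.0), (35, 8.0), (5, 99.0)]
    print(response_by_speed_bin(example))
```

Reporting the trial count per bin alongside the mean makes the power problem concrete: with about 14 mismatch events per neuron in total, each bin holds only a handful of trials.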
Regarding the relationship between mismatch response and firing rate prior to mismatch, we are not sure we understand the intuition. Does the reviewer mean the average firing rate of the mismatch neuron? Or the population mean? The first is likely uninterpretable as it is bound to be confounded by regression-to-the-mean type artefacts. But in either case, we would have no prediction of what to expect.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Summary:
The authors demonstrate that, while the loss of Ezrin increases lysosomal biogenesis and function, its presence is required for the specific endocytosis of EGFR. Upon further investigation, the authors reveal that Ezrin is a crucial intermediary protein that links EGFR to AKT, leading to the phosphorylation and inhibition of TSC. TSC is a critical negative regulator of the mTORC1 complex, which is dysregulated in various diseases, making their findings a valuable addition to multiple fields of study. Their cell signaling findings are translatable to an in vivo Medaka fish model and suggest that Ezrin may play a crucial role in retinal degeneration.
Strengths:
Giamundo, Intartaglia, et al. utilized unbiased proteomic and transcriptomic screens in Ezrin KO cells to investigate the mechanistic function of Ezrin in lysosome and cell signaling pathways. The authors' findings are consistent with past literature demonstrating Ezrin's role in the EGFR and mTORC1 signaling pathways. They used several cell lines, small molecule inhibitors, and cellular and in vivo knockout models to validate signaling changes through biochemical and microscopy assays. Their use of multiple advanced microscopy techniques is also impressive.
We are grateful to the Editor and the Reviewers for their important and constructive comments, which helped us to improve our manuscript. We have now carried out new experiments and analyses to further support our findings.
Weaknesses:
While the authors demonstrated activation of TSC1 (lysosomal accumulation) and inactivation of Akt (decreased phosphorylation in TSC1), as well as decreased mTORC1 signaling in Ezrin knockout cells, direct experiments showing the rescue of mTORC1 activity by AKT and TSC1 mutants are required to confirm the linear signaling pathway and establish Ezrin as a mediator of EGFR-AKT-TSC1-mTORC1 signaling. Although the authors presented representative images from advanced microscopy techniques to support their claims, there is insufficient quantification of these experiments. Additionally, several immunoblots in the manuscript lack vital loading controls, such as input lanes for immunoprecipitations and loading controls for western blots.
We wish to thank the Reviewer for his/her important and constructive comments on our manuscript and for considering that our study provides new information for understanding the mechanism regulating the TSC/mTORC1 pathway. We have now extensively revised the manuscript according to his/her suggestions. Indeed, to expand on the evidence demonstrating Ezrin as a mediator of EGFR-AKT-TSC1-mTORC1 signaling, the revised manuscript includes quantification of all advanced microscopy images, rescue experiments demonstrating the role of Ezrin in the AKT/TSC/mTORC1 molecular network, and controls for WBs and immunoprecipitations.
Reviewer #2 (Public Review):
Summary:
The authors begin with the stated goal of gaining insight into the known repression of autophagy by Ezrin, a major membrane-actin linker that assembles signaling complexes on membranes. RNA and protein expression analysis is consistent with upregulation of lysosomal proteins in Ezrin-deficient MEFs, which the authors confirm by immunostaining and western blotting for lysosomal markers. Expression analysis also implicates EGF signaling as being altered downstream of Ezrin loss, and the authors demonstrate that Ezrin promotes relocalization of EGFR from the plasma membrane to endosomes. Ezrin loss impacts downstream MAPK/Akt/mTORC1 signaling, although the mechanistic links remain unclear. An Ezrin mutant Medaka fish line was then generated to test Ezrin's role in retinal cells, which are known to be sensitive to changes in autophagy regulation. Phenotypes in this model appear generally consistent with observations made in cultured cells, though mild overall.
Strengths:
Data on the impact of Ezrin-loss on relocalization of EGFR from the plasma membrane are extensive, and thoroughly demonstrate that Ezrin is required for EGFR internalization in response to EGF.
A new Ezrin-deficient in vivo model (Medaka fish) is generated.
Strong data demonstrates that Ezrin loss suppresses Akt signaling. Ezrin loss also clearly suppresses mTORC1 signaling in cell culture, although examination of mTORC1 activity is notably missing in Ezrin-deficient fish.
We thank the Reviewer for the recognition of our study and apologize for the insufficient evidence reported in the previous version of the manuscript. As requested by the Reviewer, we considerably expanded the number of experiments to support the EZRIN/EGFR/TSC molecular network in regulating the autophagy pathway in the revised manuscript. Furthermore, following the Reviewer’s comment, we have expanded the interpretation of our findings in the “Discussion” section. We hope the new version of our manuscript will address the Reviewer’s concerns.
Weaknesses:
LC3 is used as a readout of autophagy, however the lipidated/unlipidated LC3 ratio generally does not appear to change, thus there does not appear to be evidence that Ezrin loss is affecting autophagy in this study.
We certainly agree with the Reviewer on the importance of this issue and apologize for the lack of clarity. Ezrin is already widely characterized as a protein participating in the autophagy pathway. Several studies, including our previous ones, demonstrated that both silencing and pharmacological inhibition of Ezrin may promote autophagy by promoting activation of TFEB, in part through the TRPML1-calcineurin signaling pathway (Naso et al 2020; Intartaglia et al 2022; Lou et al 2024). However, a full elucidation of how Ezrin controls autophagy is still lacking. As suggested by the Reviewer, to reinforce our data, we have now addressed this point by better elucidating this aspect in the revised manuscript. Accordingly, we have monitored the autophagic flux and LC3 expression level following the guidelines for the use and interpretation of assays for monitoring autophagy (4th edition) by Klionsky et al. 2021. The data presented in the new Figure supplement 1 now better support the notion that depletion of Ezrin increases autophagic flux. We hope the new version of our manuscript will address the Reviewer’s concerns.
The conclusion is drawn that Ezrin loss suppresses EGF signaling, however this is complicated by a strong increase in phosphorylation of the p38 MAPK substrate MK2. Without additional characterization of MAPK and Erk signaling, the effect of Ezrin loss remains unclear. Causative conclusions between effects on MAPK, Akt, and mTORC1 signaling are frequently drawn, but the data only demonstrate correlations. For example, many signaling pathways can activate mTORC1 including MAPK/Erk, thus reduced mTORC1 activity upon Ezrin-loss cannot currently be attributed to reduced Akt signaling. Similarly, other kinases can phosphorylate TSC2 at the sites examined here, so the conclusion cannot be drawn that Ezrin-loss causes a reduction in Akt-mediated TSC2 phosphorylation.
We agree with the Reviewer that this is an interesting and important question. However, we respectfully feel that addressing this point with additional studies on both the MAPK and ERK pathways, as the Reviewer suggests, is outside the scope of this manuscript. We therefore prefer to address these questions in future studies. However, following the Reviewer’s comment, we have expanded the interpretation of our findings in the “Discussion” section. We hope the new version of our manuscript will address the Reviewer’s concerns.
In Figure 7, the conclusion cannot be drawn that retinal degeneration results from aberrant EGFR signaling.
We certainly agree with the Reviewer on the importance of this issue. We have now addressed it by adding TUNEL staining, which shows retinal degeneration in Ezrin KO medaka fish. The results of these assays are described in the Results section and documented in revised Figure 7, panel H.
It is unclear why TSC1 is highlighted in the title, as there does not appear to be any specific regulation of TSC1 here.
We modified the title accordingly.
In Figure 1 the conclusion is drawn that there is an increase in lysosome number with Ezrin KO, however it does not appear that the current analysis can distinguish an increased number from increased lysosome size or activity. Similarly, conclusions about increased lysosome "biogenesis" could instead reflect decreased turnover.
Following this Reviewer’s observation, we changed the text according to his/her suggestion.
Immunoprecipitation data for a role for Ezrin as a signaling scaffold appear minimal and seem to lack important controls.
We apologize for these inaccuracies. We have now carried out new experiments to further support our findings. Moreover, all blots were replaced with better-exposed images. The controls are shown in the revised figures.
In Figure 3A it seems difficult to conclude that EGFR dimerization is reduced since the whole blot, including the background between lanes, is lighter on that side.
We have now fixed this inaccuracy. The blots in revised Figure 3, panel A were replaced with better-exposed images and quantified.
In Figure 6C specificity controls for the TSC1 and TSC2 antibodies are not included but seem necessary since their localization patterns appear very different from each other in WT cells.
We apologize for the confusion. We have now corrected this mistake and revised all panels in Figure 6C (now Figure 6D) for consistency between figures and text. Concerning the specificity of the TSC1 and TSC2 antibodies and staining, the antibody labelling showed the usual TSC pattern in cells, as reported in Menon et al. 2014. We would like to point out that the antibodies are the same as those used in Menon et al. 2014, and that our data are based not only on TSC1 and TSC2 staining but also on a considerable number of in vivo and in vitro experiments in which many different markers were used across several complementary approaches (e.g. immunofluorescence, western blot analysis, omics, etc.).
Menon S, Dibble CC, Talbott G, Hoxhaj G, Valvezan AJ, Takahashi H, Cantley LC, Manning BD. Spatial control of the TSC complex integrates insulin and nutrient regulation of mTORC1 at the lysosome. Cell. 2014 Feb 13;156(4):771-85.
In Figure 7 the signaling effects in Ezrin-deficient fish are mild compared to cultured cells, and effects on mTORC1 are not examined. Further data on the retinal cell phenotypes would strengthen the conclusions.
We thank the Reviewer for his/her comment. We have now fixed this inaccuracy in the revised manuscript. We added the analysis of p4EBP1 (S65), an mTORC1 substrate (Figure 7, panel D).
In Figure 7F there appears to be more EGFR throughout the cell, so it is difficult to conclude that more EGFR at the PM in Ezrin-/- fish means reduced internalization.
We agree with the Reviewer that this is an important question, and it helped us to improve the quality of the data presented. As correctly noted by the Reviewer, the EGFR protein level is increased due to EZRIN deletion. This is evident in Figure 7 panel F, in line with both proteomic analysis and in vitro experiments (Figure 2I; Figure 3E; Figure 5C). We also agree that the increase in EGFR protein level could strengthen the immunofluorescence background. Therefore, to better represent EGFR membrane translocation on flat-mount RPE from medaka lines, we added a highlighted box showing it in both the WT and KO medaka lines in the revised Figure 7 panel F.
Reviewer #3 (Public Review):
Summary:
In this study, the authors have attempted to demonstrate a critical role for the cytoskeletal scaffold protein Ezrin, in the upstream regulation of EGFR/AKT/MTOR signaling. They show that in the absence of Ezrin, ligand-induced EGFR trafficking and activation at the endosomes is perturbed, with decreased endosomal recruitment of the TSC complex, and a corresponding decrease in AKT/MTOR signaling.
Strengths:
The authors have used a combination of novel imaging techniques, as well as conventional proteomic and biochemical assays to substantiate their findings. The findings expand our understanding of the upstream regulators of the EGFR/AKT MTOR signaling and lysosomal biogenesis, appear to be conserved in multiple species, and may have important implications for the pathogenesis and treatment of diseases involving endo-lysosomal function, such as diabetes and cancer, as well as neuro-degenerative diseases like macular degeneration. Furthermore, pharmacological targeting of Ezrin could potentially be utilized in diseases with defective TFEB/TFE3 functions like LSDs. While a majority of the findings appear to support the hypotheses, there are substantial gaps in the findings that could be better addressed. Since Ezrin appears to directly regulate MTOR activity, the effects of Ezrin KO on MTOR-regulated, TFEB/TFE3 -driven lysosomal function should be explored more thoroughly. Similarly, a more convincing analysis of autophagic flux should be carried out. Additionally, many immunoblots lack key controls (Control IgG in co-IPs) and many others merit repetition to either improve upon the quality of the existing data, validate the findings using orthogonal approaches, or provide a more rigorous quantitative assessment of the findings, as highlighted in the recommendation for authors.
We thank the Reviewer for recognizing the value of our study, and we apologize for the inaccuracies in the previous version. We also greatly appreciate the Reviewer's effort and support in helping us improve our manuscript. As requested, we have considerably expanded the number of experiments supporting the role of the EZRIN/EGFR/AKT network in controlling the mTORC1 pathway in the revised manuscript. We hope the new version of our manuscript will address the Reviewer’s concerns.
Reviewer #1 (Recommendations for The Authors):
Major comments:
(1) While the authors show that, in the absence of Ezrin, TSC accumulates on the lysosome and suppresses mTORC1 signaling, they should perform additional genetic experiments to strengthen their conclusions. Can they knockout or knockdown TSC1/2 in Ezrin-deficient cells to rescue mTORC1 activity? Can they mutate the lysosomal localization signal on TSC1 (TSC1Q149E/R204E/K238E) in Ezrin-deficient cells to rescue mTORC1 activity? Does constitutively active AKT (myr-AKT or AKT-E40K) restore mTORC1 activity in Ezrin-deficient cells?
We agree with the Reviewer that this is an important concern, which helped us to improve the quality of the data presented. In the revised Figure supplement 4F, we now provide the results of pharmacological inhibition of Ezrin in TSC2 KO MEF cells. In line with our findings, the lack of TSC2 rescues mTORC1 signaling in the absence of Ezrin activity. Thus, these data strongly support that Ezrin is required for the mTORC1 pathway via TSC complex targeting.
(2) In the absence of Ezrin, TSC1 constitutively localizes on the lysosome and suppresses mTORC1. Does this suppression hold in the presence of other mTORC1-activating signals (i.e., amino acids, insulin, oxygen)?
Following the Reviewer’s suggestion, we now provide this information in the revised Figure 6C, in which we show that stimulation with insulin does not exert its activating effect on mTORC1 signaling (i.e. phosphorylation of p70 S6K at T389). These new data, together with the experiments on TSC2 KO MEF cells, clearly support the model in which Ezrin works as a scaffold protein connecting AKT signaling to the TSC complex. The lack of Ezrin induces a disconnection between AKT and the TSC complex, which is translocated to lysosomes and insensitive to inhibition of AKT signaling.
(3) In Figure 3A, the authors showed EGFR dimerization through a western blot of a crosslinking assay. However, the western blot data are unclear and do not strongly support their statement. Additionally, the authors mentioned that the dimerization is confirmed by immunofluorescence analysis, but this statement should be revised since the imaging analysis only indirectly shows the copresence of EZR and EGFR, not necessarily the dimerized EGFR. The authors should perform additional experiments to strengthen their claim or tone down their statements in the text and model figure.
We certainly agree with the Reviewer on the importance of this issue, and we have now fixed this inaccuracy in the revised manuscript. The crosslinking blots were replaced with better-exposed images in the revised Figure 3, panel A. Moreover, we quantified the signals to support our conclusion.
(4) It is interesting that Ezrin binds EGFR, AKT, and TSC as a scaffolding protein. To define the mechanisms by which Ezrin interacts with AKT, EGFR, and TSC, can the authors perform domain analyses to determine which regions of Ezrin are required for its binding with AKT, EGFR, and TSC in mediating EGFR-AKT-TSC-mTORC1 signaling?
We thank the Reviewer for his/her comment, which improves our manuscript. Conducting domain analysis in the lab would be ideal, although it seems to us a long tour de force that might be associated with several technical and experimental issues. In silico approaches provide a helpful alternative for generating initial hypotheses about domain-domain interactions, though they should be seen as a starting point rather than a complete solution. Recent advances in fold prediction suggest that AlphaFold3 could be used to predict dimer formation and, consequently, domain-domain interactions. However, such an approach is challenging in this case because some of the proteins considered are transmembrane, and all are prone to form multimeric complexes with multiple partners, making them poor candidates for reliable fold predictions. In fact, the predicted dimers are poorly supported, and AlphaFold3 lacks confidence in the relative positioning of the interactors, limiting the interpretability of the predictions. Alternatively, database mining and machine-learning methods, such as HINT, Domine, and PPIDomainMiner, provide more robust evidence. Indeed, these tools allowed us to consistently identify a strong interaction between Ezrin's FERM central domain and EGFR's PK domain, now shown in Figure supplement 2C and Figure supplement 3C-H. Importantly, these findings generate valuable hypotheses, but experimental validation is still necessary, and we prefer to leave it for future studies.
Minor Comments:
(1) There are several immunoblots that did not have adequate controls:
- In Figure 2D, an input lane should be shown for each of the cell lysates to demonstrate the presence of other proteins in the cell lysate used for the IP.
We have now fixed this inaccuracy in the revised manuscript.
- Figure 3A does not have a loading control. Also, immunoblot quality should be significantly improved.
We have now fixed this inaccuracy in the revised manuscript.
- The HER2 western blot in Figure 5C does not accurately represent the data shown in the quantification graph.
We have now fixed this inaccuracy by replacing HER2 western blot in the revised Figure 5C.
- In Figure 6A, the authors should include an input as a control for the IP. To further support their claim in the model figure, can the authors also probe the IP lysate for Ezrin and Tsc2? If all are indeed in a complex together, they should be present.
Following the Reviewer’s observation, we added the input as a control for the IP in the revised Figure 6A. Moreover, we included the immunoprecipitation data for the EZRIN and TSC2 interaction (Figure 6A).
- Phosphorylation sites across figures should be uniformly annotated for consistency and ease of understanding, e.g., pTSC2(S939), pS6K1(T389), and pAKT(S473).
We have now fixed this inaccuracy in the revised text.
(2) There are several microscopy data that lack adequate quantification. For instance, Figures 2E, 2F, 3C, 4A, 5A, and 6F only show very few cells as representative images, which is not sufficient to support their claims.
We thank the Reviewer for his/her comment, which improves our manuscript. Accordingly, we added adequate quantification and statistical analysis to the revised figures.
(3) Some suggestions to improve the readability of the manuscript:
- In the abstract (line 32): "Loss of Ezrin was deficient in TSC repression by EGF and culminated in translocation of TSC to lysosomes triggering suppression of mTORC1 signaling." The wording is somewhat confusing, please change such as "Loss of Ezrin was not sufficient to repress TSC by EGF and culminated..." or "Loss of Ezrin blunted EGF-induced TSC suppression and culminated..."
We apologize for the lack of clarity and now we have fixed this inaccuracy by better elucidating this aspect in the revised manuscript.
- Figure 3D has a typo in the western blot labeling. Please change Citosol to Cytosol.
We have now fixed this inaccuracy in the revised text.
- Line 291: "Moreover, TSC2 resulted activated and AKT/mTOR signaling..." The wording is confusing.
We have now fixed this inaccuracy in the revised text. The text now reads: “Moreover, we found that TSC2 was dephosphorylated in response to light in the retina, when inactive Ezrin (Naso et al., 2020) and EGFR are weakly expressed (Figure supplement 6C) as a consequence of a decrease of the AKT/mTORC1 signaling…”
- The model in Figure 8 indicates that upon EGF stimulation, the activated Ezrin interacts with EGFR, causing its dissociation from actin filaments and leading to its endosome incorporation. However, the authors did not provide supporting data for this claim. Can the authors either cite literature or provide data for this? Otherwise, the model should be edited to remove actin filaments in the model.
We have now fixed this inaccuracy by removing actin filaments in the revised model.
Reviewer #2 (Recommendations For The Authors):
The data and written text seem to deal entirely with mTORC1, rather than mTORC2, thus it seems "mTOR" should be changed to "mTORC1" throughout.
We have now fixed this inaccuracy in the revised manuscript.
For clarification, the TSC protein complex should be referred to as the "TSC complex", whereas "TSC" generally refers to the tumor syndrome Tuberous Sclerosis Complex.
We have now fixed this inaccuracy in the revised manuscript.
Quantification of colocalization would be helpful in all the panels where it is currently missing.
We thank the Reviewer for his/her comment, which improves our manuscript. Accordingly, we added adequate quantification of colocalization for each immunofluorescence experiment in the revised figures.
Line 84 typo "thorough" should be "through"
We have now fixed this inaccuracy in the revised manuscript.
Line 178 - typo
We have now fixed this inaccuracy in the revised manuscript.
Line 209 - typo
We have now fixed this inaccuracy in the revised manuscript.
Reviewer #3 (Recommendations For The Authors):
Fig. 1 The data showing an increase in lysosomal biogenesis suggests an increase in transcriptional activity. This should be confirmed by one or more of the following: 1) Increased TFEB/TFE3 nuclear localization following EZR loss, 2) Increased CLEAR promoter luciferase activity assays, 3) Increased expression of multiple CLEAR transcripts (https://www.science.org/doi/10.1126/science.1174447) or 4) Increased TFEB/ TFE3/ CLEAR gene signatures by RNA seq. Similarly, data showing increased autophagic flux should be confirmed in the presence of chloroquine or bafilomycin.
We agree with the Reviewer that this is an important concern, which helped us to improve the quality of the data presented. It is well established that nuclear translocation is a major mechanism regulating TFEB activity. We have now carried out new experiments demonstrating that depletion of Ezrin induces TFEB nuclear translocation in Ezrin-/- cells. These findings are in line with our previous data, in which pharmacological inhibition and silencing of Ezrin induced the same cellular phenotype. We also apologize for the confusion regarding autophagic flux: we had already carried out experiments with bafilomycin to confirm its increase. We have therefore replaced the autophagic flux blots with better-exposed images in the revised Figure supplement 1H and modified the text to emphasize these findings.
Fig 2D, the lanes with EZR-/- cells expressing the EZR mutants should be repeated on the same gel as the first 2 lanes (with the WT and EZR-/- cells)
We thank the Reviewer for his/her comment, which improves our manuscript. To avoid any confusion, we have modified Figure 2D and now provide the required controls, as described in our responses to Reviewers #1 and #2. We hope the new version of our data will address the Reviewer’s concerns.
Fig 2F- The presence of reduced EGFR in intracellular compartments in Ezrin KO/ -/- cells should be quantified, and shown for a 2nd EZR null cell line as well (Ezrin null MEFs)
We added EGFR quantification to Figure 2F. We have also carried out new experiments demonstrating that EGFR is localized on the cytoplasmic membrane in Ezrin KO MEFs (Figure supplement 2H).
Fig 2G, did the authors test the effects of EZR depletion on basal and EGF stimulated EGFR autophosphorylation on Y1068 and Y1045 as well as downstream activation of p42/44 ERK MAPK? Those should be tested in the HeLa system as well as the MEFs cells with EZR KO.
Following the Reviewer’s request, we have now added western blot data for EGFR autophosphorylation on Y1068 and for p42/44 ERK MAPK in Figure 5C. Moreover, we have added western blot data for p42/44 ERK MAPK in MEF cells in Figure supplement 2F. However, we cannot provide data for EGFR autophosphorylation on Y1068 in MEFs, because the antibody did not work on proteins from MEF cells.
Also, why would HER3 levels be expected to decrease? There seems to be minimal change in HER3 expression. Also, the significance of increased MK2 phosphorylation should be further elaborated.
The Reviewer raised justified concerns about HER3 and MK2. We have discussed these aspects in the “Results” section.
Fig 3A- Crosslinking of EGFR is not very apparent in this blot. The crosslinking blots should be repeated 3 times and quantified.
We certainly agree with the Reviewer on the importance of this issue, and we have now fixed this inaccuracy in the revised manuscript. The crosslinking blots were replaced with better-exposed images in the revised Figure 3, panel A. Moreover, we quantified the signals to support our conclusion.
Fig 3D- How were membrane endosomes isolated? This should be stated in the methods. Membrane/ Cytosol and Endosome fractionation showing EGFR levels should be shown in Ezrin null MEFs as well, and membrane expression should be further substantiated with surface biotinylation for cell surface EGFR.
We now report more information about the method used for membrane endosome isolation in the Materials and Methods section. Following the Reviewer’s request, we also show that EGFR was not localized on endosomes upon EGF stimulation in Ezrin-null MEFs. These data are reported in the new revised Figure supplement 2G. Moreover, we have carried out new experiments demonstrating the membrane localization of EGFR in Ezrin KO MEF cells. These findings are shown in Figure supplement 2H.
Fig 5C: Similar to 2G, EGFR autophosphorylation on Y1068 and Y1045 should also be measured, as well as downstream activation of p42/44 ERK MAPK?
Following the Reviewer’s request, we have now carried out new experiments to assess the EGFR autophosphorylation on Y1068 and Y1045, as well as downstream activation of p42/44 ERK MAPK. We added these new data in the revised Figure 5C, accordingly.
Fig 5D: Similar to 3D, Membrane/ Cytosol and Endosome fractionation showing EGFR levels should be shown in Ezrin null MEFs as well, and further substantiated with surface biotinylation for cell surface EGFR.
Following the Reviewer’s request, we show that EGFR was not localized on endosomes upon EGF stimulation (Figure supplement 2G).
Supplement 2E: The blots show lower expression of EGFR and higher MAPK activation in EZR KO cells, contradicting the data in the other cells.
We apologize for the confusion. The error occurred during the preparation of Figure supplement 2E, which showed an image from a previous, non-final version of the figure. We have now removed the error and replaced it with the correct WB panel.
Supplement 2F: The authors should repeat the NSC668394 experiment using: 1) multiple doses, 2) In both the Ezrin KO and null cell lines 3) and repeat 3X to quantify differences in total EGFR.
We respectfully disagree with the Reviewer and feel that addressing this point with additional dose-response studies of NSC668394, as the Reviewer suggests, is outside the scope of this manuscript. However, we would like to point out that we have already conducted extensive studies on the dose-response effects of NSC668394 administration in vitro (Patent: WO2020070333A1).
Moreover, we apologize for not having provided enough information about the number of independent biological replicates for the WB analyses. To fill this gap, we have expanded the Materials and Methods section accordingly.
Patent: WO2020070333A1 - Ezrin inhibitors and uses thereof
Fig 6A: The IP experiments should be repeated with Control IgG
We have now fixed this inaccuracy in the revised manuscript.
Typos:
(1) Figure 3D: Citosol
We have now fixed this inaccuracy in the revised manuscript.
(2) Line 216-217: "increased EGFR protein 217 levels on purified membranes and endosomes (Figure 3D and E)" - That should be decreased EGFR on endosomes in accordance with Figure 3D (lower panels)
We have now fixed this inaccuracy in the revised manuscript.
(3) Abstract: "Consistently, Medaka fish deficient for Ezrin exhibit defective endo-lysosomal pathway"
We have now fixed this inaccuracy in the revised manuscript.
Author response:
The following is the authors’ response to the original reviews.
Public Review:
Reviewer #1:
(1) To support the finding that texture is not represented in a modular fashion, additional possibilities must be considered. These include (a) the effectiveness and specificity of the texture stimulus and control stimuli, (b) further analysis of possible structure in images that may have been missed, and (c) limitations of imaging resolution.
Thank you for your comments. To address your concerns, we have conducted a new 3T fMRI experiment to demonstrate the effectiveness and specificity of our stimuli, performed further analyses to investigate possible structure of texture-selective activation, and discussed the limitations of imaging resolution.
(a) To demonstrate the effectiveness and specificity of our stimuli, we conducted a new 3T fMRI experiment in five participants using an experimental design and texture families similar to those in Freeman (2013). Six texture stimuli in the 7T experiment were also included. To assess the effectiveness of each stimulus type, different texture families and their corresponding noise patterns were presented in separate blocks for 24 seconds, at a high presentation rate of 5 frames per second. In Figure S7, all texture families showed significantly stronger activation in V2 compared to their corresponding noise patterns, even for those that ‘appeared’ to have residual texture (e.g., the third texture family). These results demonstrate that our texture vs. noise stimuli were effective in producing texture-selective activations in area V2. Compared to the 7T results, the 3T data showed a notable increase in texture-selective activations in V2, likely due to increased stimulus presentation speed (1.25 vs. 5 frames/second). Future studies should use stimuli with faster presentation speed to validate our results in the 7T experiment.
(b) Thank you for pointing out the possible structures of texture-selective activations in the peripheral visual field (Figure S1). In further analyses, we also found stronger texture selectivity in more peripheral visual fields (Figure 2D), and there were weak but significant correlations in the texture-noise activation patterns in a split-half analysis (Author response image 2). Although this is not strong evidence for a columnar organization of naturalistic textures, it suggests the possibility of modular organization in the peripheral visual field.
(c) Although our fMRI result at 1-mm isotropic resolution did not show strong evidence for modular processing of naturalistic texture in V2 stripe columns, this does not exclude the possibility that smaller modules exist beyond the current fMRI resolution. We have discussed this possibility in the revised manuscript.
We hope this response clarifies our findings, and we have revised the conclusions in the manuscript accordingly.
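The split-half analysis mentioned in (b) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes the halves are formed from odd and even runs, that each run yields one texture-minus-noise value per vertex, and that reliability is summarized with a Pearson correlation. The function names and the numbers are hypothetical, and the significance test is not shown.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def split_half_maps(runs):
    """Average per-run difference maps separately over odd and even runs.

    runs: list of maps, each a list of texture-minus-noise values per vertex.
    The odd/even split is an assumption made for this sketch.
    """
    def mean_map(subset):
        return [sum(values) / len(values) for values in zip(*subset)]

    return mean_map(runs[0::2]), mean_map(runs[1::2])


# Hypothetical texture-minus-noise values for four runs and four vertices.
runs = [
    [0.9, 0.1, 0.4, -0.2],
    [0.7, 0.3, 0.5, -0.1],
    [1.1, 0.0, 0.2, -0.3],
    [0.8, 0.2, 0.6, 0.0],
]

half_a, half_b = split_half_maps(runs)
r = pearson_r(half_a, half_b)
print(f"split-half correlation: {r:.3f}")
```

A reliably positive correlation across halves would indicate that the spatial pattern of texture selectivity is reproducible, which is a weaker claim than showing that it forms columns.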
(2) More in-depth analysis of subject data is needed. The apparent structure in the texture images in peripheral fields of some subjects calls for more detailed analysis, e.g., the relationship to eccentricity and the need for a 'modularity index' to quantify the degree of modularity. A possible relationship to eccentricity should also be considered.
Based on your recommendations, we have performed further analysis and found interesting results regarding the modularity index in relation to eccentricity. As shown in Figure 2D, the texture-selectivity index increased with eccentricity. This may suggest a higher possibility of modular organization for texture representation in the peripheral than in the central visual field. We have updated our results in Figure 2C, and discussed this possibility in the revised manuscript.
(3) Given what is known as a modular organization in V4 and V3 (e.g. for color, orientation, curvature), did images reveal these organizations? If so, connectivity analysis would be improved based on such ROIs. This would further strengthen the hierarchical scheme.
Following your recommendations, we have conducted further analysis to investigate the potential modular organizations in V4 and V3ab. In Figure S9, the vertices that are most responsive to color, disparity and texture are shown for a representative subject. Indeed, texture-selective patches can be found in both V4 and V3ab, along with the color- and disparity-selective patches. We agree with you that there should be pathway-specific connectivity among the same type of functional modules. In the informational connectivity analyses, we already used highly informative voxels obtained by feature selection, which should mainly represent information from the modular organizations in these higher visual areas.
Reviewer #2:
(1) In lines 162-163, it is stated that no clear columnar organization exists for naturalistic texture processing in V2. In my opinion, this should be rephrased. As far as I understand, Figure 2B refers to the analysis used to support the conclusion. The left and middle bar plots only show a circular analysis since ROIs were based on the color and disparity contrast used to define thin and thick stripes. The interesting graph is the right plot, which shows no statistically significant overlap of texture processing with thin, thick, and pale stripe ROIs. It should be pointed out that this analysis does not dismiss a columnar organization per se but instead only supports the conclusion of no coincidence with the CO-stripe architecture.
Thank you for your suggestions. Reviewer #1 also raised a similar concern. We agree that there may be a smaller functional module of textures in area V2 at a finer spatial scale than our fMRI resolution. We have rephrased our conclusions to be more precise.
(2) In Figure 3, cortical depth-dependent analyses are presented for color, disparity, and texture processing. I acknowledge that the authors took care of venous effects by excluding outlier voxels. However, the GE-BOLD signal at high magnetic fields is still biased to extravascular contributions from around larger veins. Therefore, the highest color selectivity in superficial layers might also result from the bias to draining veins and might not be of neuronal origin. Furthermore, it is interesting that cortical profiles with the highest selectivity in superficial layers show overall higher selectivity across cortical depth. Could the missing increase toward the pial surface in other profiles result from the ROI definition or overall smaller signal changes (effect size) of selected voxels? At least, a more careful interpretation and discussion would be helpful for the reader.
We agree with you that there will be residual venous effects even after removing voxels containing large veins. However, calculating the selectivity index largely removed the superficial bias (Figure 3). In the revised manuscript, we discussed the limitations of cortical depth-dependent analysis using GE-BOLD fMRI.
In Line 397-403: “Due to the limitations of the T2*w GE-BOLD signal in its sensitivity to large draining veins (Fracasso et al., 2021; Parkes et al., 2005; Uludag & Havlicek, 2021), the original BOLD responses were strongly biased towards the superficial depth in our data (Figure S8). Compared to GE-BOLD, VASO-CBV and SE-BOLD fMRI techniques have higher spatial specificity but much lower sensitivity (Huber et al., 2019). As shown in a recent study (Qian et al., 2024), using differential BOLD responses in a continuous stimulus design can significantly enhance the laminar specificity of the feature selectivity measures in our results (Figure 3).”
It is unlikely that the strongest color selectivity index in the superficial depth is a result of stronger signal change or larger effect size in this condition. As shown by the original BOLD responses in Figure S8, all stimulus conditions produced robust activations that were strongly biased toward the superficial depth. High texture selectivity was also found in V4 and V3ab across cortical depth, with a flat laminar profile.
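The argument that a selectivity index reduces the superficial bias can be illustrated with a toy calculation. This sketch rests on two assumptions that are not taken from the manuscript: that the index is a normalized difference, (A - B) / (A + B), and that the draining-vein effect acts as a purely multiplicative, depth-dependent gain on both conditions. All numbers are hypothetical.

```python
def selectivity_index(resp_a, resp_b):
    """Normalized difference (A - B) / (A + B); an assumed form of the index."""
    return (resp_a - resp_b) / (resp_a + resp_b)


# Hypothetical underlying responses at deep, middle and superficial depths,
# identical across depth so that any depth trend comes from the gain alone.
neural_a = [1.0, 1.0, 1.0]
neural_b = [0.5, 0.5, 0.5]

# Hypothetical multiplicative vascular gain that grows toward the pial surface.
vascular_gain = [1.0, 1.5, 2.5]

bold_a = [g * r for g, r in zip(vascular_gain, neural_a)]
bold_b = [g * r for g, r in zip(vascular_gain, neural_b)]

# The raw difference inherits the superficial bias of the gain ...
raw_difference = [a - b for a, b in zip(bold_a, bold_b)]

# ... whereas the normalized index cancels a gain shared by both conditions.
indices = [selectivity_index(a, b) for a, b in zip(bold_a, bold_b)]

print("raw difference by depth:", raw_difference)
print("selectivity index by depth:", [round(i, 3) for i in indices])
```

The cancellation is exact only for a shared multiplicative gain. Additive or nonlinear vascular contributions would not cancel in this way, which is consistent with the residual venous effects acknowledged in the response.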
(3) I was slightly surprised that no retinotopy data was acquired. The ROI definition in the manuscript was based on a retinotopy atlas plus manual stripe segmentation of single columns. Both steps have disadvantages because they neglect individual differences and are based on subjective assessment. A few points might be worth discussing: (1) In lines 467-468, the authors state that V2 was defined based on the extent of stripes. This classical definition of area V2 was questioned by a recent publication (Nasr et al., 2016, J Neurosci, 36, 1841-1857), which showed that stripes might extend into V3. Could this have been a problem in the present analysis, e.g., in the connectivity analysis? (2) The manual segmentation depends on the chosen threshold value, which is inevitably arbitrary. Which value was used?
A previous study showed that the retinotopic atlas of early visual areas (V1-V3) aligned very well across participants on the standard surface after surface-based registration using anatomical landmarks (Benson et al., 2018). Thus, the group-averaged atlas should be accurate in defining the boundaries of early visual areas. To directly demonstrate the accuracy of this method, retinotopic data were acquired in five participants in a 3T fMRI experiment. A phase-encoded method was used to define the boundaries of early visual areas (black lines in Author response image 1), which were highly consistent with the Benson atlas.
Although a few feature-selective stripes may extend into V3, these stripe patterns were mainly represented in V2. Thus, the signal contribution from V3 is likely to be small and should not affect the pattern of results. The activation map threshold for manual segmentation was abs(T)>2. We have clarified this in the revised methods.
Author response image 1.
Retinotopic ROIs defined by the Benson atlas (left) and the polar angle map (right) of the representative subject. Black lines denote the boundaries of early visual areas based on the retinotopic map from the subject.
Benson, N. C., Jamison, K. W., Arcaro, M. J., Vu, A. T., Glasser, M. F., Coalson, T. S., Van Essen, D. C., Yacoub, E., Ugurbil, K., Winawer, J., & Kay, K. (2018). The Human Connectome Project 7 Tesla retinotopy dataset: Description and population receptive field analysis. J Vis, 18(13), 23. https://doi.org/10.1167/18.13.23
(4) The use of 1-mm isotropic voxels is relatively coarse for cortical depth-dependent analyses, especially in the early visual cortex, which is highly convoluted and has a small cortical thickness. For example, most layer-fMRI studies use a voxel size of around isotropic 0.8 mm, which has half the voxel volume of 1 mm isotropic voxels. With increasing voxel volume, partial volume effects become more pronounced. For example, partial volume with CSF might confound the analysis by introducing pulsatility effects.
We agree that a 1-mm isotropic voxel is much larger in volume than a 0.8-mm isotropic voxel, but the difference in resolution along the cortical depth is small. In addition to our study, previous studies have shown that fMRI at 1-mm isotropic resolution is capable of resolving cortical depth-dependent signals (Roefs et al., 2024; Shao et al., 2021). We have discussed these issues about fMRI resolution in the revised manuscript.
In Line 403-408: “Compared to the submillimeter voxels, as used in most laminar fMRI studies, our fMRI resolution at 1-mm isotropic voxel may have a stronger partial volume effect in the cortical depth-dependent analysis. However, consistent with our results, previous studies have also shown that 7T fMRI at 1-mm isotropic resolution can resolve cortical depth-dependent signals in human visual cortex (Roefs et al., 2024; Shao et al., 2021).”
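The arithmetic behind the two statements above (about half the volume, but a modest change along the cortical depth) can be checked directly. The cortical thickness used here is a hypothetical round value chosen only for illustration and is not taken from the manuscript.

```python
def voxel_volume_mm3(edge_mm):
    """Volume of an isotropic voxel with the given edge length in mm."""
    return edge_mm ** 3


small_edge_mm = 0.8
large_edge_mm = 1.0

# Volume scales with the cube of the edge length.
volume_ratio = voxel_volume_mm3(small_edge_mm) / voxel_volume_mm3(large_edge_mm)

# Sampling along the cortical depth scales only linearly with the edge length.
edge_ratio = small_edge_mm / large_edge_mm

# Hypothetical cortical thickness, assumed purely for illustration.
assumed_thickness_mm = 2.5
voxels_across_small = assumed_thickness_mm / small_edge_mm
voxels_across_large = assumed_thickness_mm / large_edge_mm

print(f"volume ratio (0.8 mm / 1 mm): {volume_ratio:.3f}")
print(f"edge ratio (0.8 mm / 1 mm): {edge_ratio:.1f}")
print(f"voxel widths across depth: {voxels_across_small:.2f} vs {voxels_across_large:.2f}")
```

Under these assumptions the volume roughly halves while the sampling along the depth axis changes by 20%, which is why the two statements are compatible. Partial volume effects depend on both.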
Shao, X., Guo, F., Shou, Q., Wang, K., Jann, K., Yan, L., Toga, A. W., Zhang, P., & Wang, D. J. J. (2021). Laminar perfusion imaging with zoomed arterial spin labeling at 7 Tesla. NeuroImage, 245, 118724. https://doi.org/10.1016/j.neuroimage.2021.118724
Roefs, E. C., Schellekens, W., Báez-Yáñez, M. G., Bhogal, A. A., Groen, I. I., van Osch, M. J., ... & Petridou, N. (2024). The Contribution of the Vascular Architecture and Cerebrovascular Reactivity to the BOLD signal Formation across Cortical Depth. Imaging Neuroscience, 2, 1–19.
(5) The SVM analysis included a feature selection step stated in lines 531-533. Although this step is reasonable for the training of a machine learning classifier, it would be interesting to know if the authors think this step could have reintroduced some bias to draining vein contributions.
We excluded vertices with extremely large signal change and their corresponding voxels in the gray matter when defining ROIs. The same number of voxels were selected from each cortical depth for the SVM analysis, thus there was no bias in the number of voxels from the superficial layers susceptible to large draining veins.
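The depth-balanced selection described above can be sketched as follows. This is a hypothetical illustration rather than the authors' code: the ranking score, the depth labels and the voxel identifiers are all assumptions, and the manuscript's actual feature-selection criterion is not reproduced here.

```python
def select_balanced(voxels, n_per_depth):
    """Pick the n_per_depth highest-scoring voxels separately within each depth.

    voxels: list of (voxel_id, depth_label, score) tuples.
    Returns a dict mapping each depth label to its selected voxel ids.
    """
    by_depth = {}
    for voxel_id, depth, score in voxels:
        by_depth.setdefault(depth, []).append((score, voxel_id))

    selected = {}
    for depth, scored in by_depth.items():
        top = sorted(scored, reverse=True)[:n_per_depth]
        selected[depth] = [voxel_id for _, voxel_id in top]
    return selected


# Hypothetical scores that are larger near the surface, as a draining-vein
# bias might produce.
voxels = [
    ("v1", "superficial", 4.0), ("v2", "superficial", 3.5), ("v3", "superficial", 3.0),
    ("v4", "middle", 2.0), ("v5", "middle", 1.5), ("v6", "middle", 0.5),
    ("v7", "deep", 1.0), ("v8", "deep", 0.8), ("v9", "deep", 0.2),
]

# Depth-balanced selection: the same number of voxels from every depth.
selected = select_balanced(voxels, n_per_depth=2)

# For comparison, a global ranking with the same total number of voxels.
global_top = sorted(voxels, key=lambda v: v[2], reverse=True)[:6]
global_counts = {}
for _, depth, _ in global_top:
    global_counts[depth] = global_counts.get(depth, 0) + 1

print("balanced:", selected)
print("global counts:", global_counts)
```

In this toy case the global ranking over-represents superficial voxels, whereas the balanced selection equalizes the count per depth. Equal counts do not by themselves remove differences in signal amplitude between depths.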
Reviewer #3:
The authors tend to overclaim their results.
Re: Thank you for your comments. We added more control analyses to strengthen our findings, and gave more appropriate discussion of results.
Recommendations for the authors:
Reviewer #1:
(1) Controls: There is a bit more complexity than is expressed in the introduction. The authors hypothesize that the emergence of computational features such as texture may be reflected in specialized columns. That is, if texture is generated in V2, there may be texture columns (perhaps in the pale stripes of V2); but if generated at a higher level, then no texture columns would be needed. This is a very interesting and fundamental hypothesis. While there may be merit to this hypothesis, the demonstration that color and disparity are modular but not texture falls short of making a compelling argument. At a minimum, the finding that texture is not organized in V2 requires additional controls. (a) To boost the texture signal, additional texture stimuli or a sequence of multiple texture stimuli per trial could be considered. (b) Unfortunately, the comparison noise pattern also seems to contain texture; perhaps a less textured control could be designed. (c) It also appears that some of the texture images in Supplementary Figure S1 contain possible structure, e.g. in more peripheral visual fields. (d) Is it possible that the current imaging resolution is not sufficient for revealing texture domains? (e) Note that 'texture' may be a property that defines surfaces and not contours. Thus, while texture may have orientation content, its function may be associated with the surface processing pathways. A control stimulus might contain oriented elements of a texture stimulus that do not elicit texture percept; such a control might activate pale and/or thick stripes (both of which contain orientation domains), while the texture percept stimulus may activate surface-related bands in V4.
Thank you for your suggestions. They are extremely helpful in improving our manuscript. For the controls you mentioned in (a-d), we discussed them in the public review that we also attached below.
(a) and (b): To demonstrate the effectiveness and specificity of our stimuli, we conducted a new 3T fMRI experiment in five participants using an experimental design and texture families similar to those in Freeman (2013). All texture stimuli in the 7T experiment were also included. To assess the effectiveness of each stimulus type, different texture families and their corresponding noise patterns were presented in separate blocks for 24 seconds, at a high presentation rate of 5 frames per second. In Figure S7, all texture families showed significantly stronger activation in V2 compared to their corresponding noise patterns, even for those that ‘appeared’ to have residual texture (e.g., the third texture family). These results suggest that our texture stimuli were effective in producing texture-selective activations in area V2 compared to the noise control. Compared to the 7T results, the 3T data showed a notable increase in texture-selective activations in V2, likely due to the increased stimulus presentation speed (1.25 vs. 5 frames/second). Weak texture activations might preclude the detection of columnar representations in the 7T experiment.
(c) Thank you for pointing out the possible structures of texture-selective activations in the peripheral visual field (Figure S1). In further analyses, we also found stronger texture selectivity in more peripheral visual fields (Figure 2D), and there were weak but significant correlations in the texture-noise activation patterns during split-half analysis (Author response image 2). Although this is not strong evidence for a columnar organization of naturalistic textures, it suggests that such an organization may exist in the peripheral visual field.
(d) Although our fMRI result at 1-mm isotropic resolution did not show strong evidence for modular processing of naturalistic texture in V2 stripe columns, this does not exclude the possibility that smaller modules exist beyond the current fMRI resolution. We have discussed these limitations in the revised manuscript.
We fully agree with your explanation in (e). It fits our data very well. Both texture and control stimuli strongly activated the CO-stripes (Figure 2 and Figure 2D), while modular organizations for texture were found in V4 and V3ab (Figure S9). We have discussed this explanation in the revised manuscript.
In Line 371-374: “Consistently, our pilot results also revealed modular organizations for textures in V4 and V3ab (Figure S9). These texture-selective organizations may be related to surface representations in these higher order visual areas (Wang et al., 2024).”
(2) Overly simple description of FF, FB circuitry. The classic anatomical definition of feedforward is output from a 'lower' area, in most cases predominantly arising from superficial layers and projecting to middle layers of a 'higher area' (Felleman and Van Essen 1991). This description holds for V1-to-V2, V2-to-V3, and V2-to-V4. [Note there are also feedforward projections from central 5 degrees of V1-to-V4 (cf. Ungerleider) as well as V3-to-V4.] The definition of feedback can be more varied but is generally considered from cells in superficial and deep layers of 'higher' areas projecting to superficial and deep layers of 'lower' areas. Feedback inputs to V1 heavily innervate Layer 1 and superficial Layer 2, as well as the deep layers. Note that feedback connections from V2 to V1, similar to that from V1 to V2, are functionally specific, i.e. thin-to-blob and pale/thick-to interblob (Federer...Angelucci 2021, Hu...Roe 2022). Thus, current views are moving away from the dogma that feedback is diffuse. Recognition that feedback may be modular introduces new ideas about analysis.
Thanks for your detailed recommendations. We have expanded the discussion of circuit models of functional connectivity in the introduction. Our model and experiments primarily aim to investigate how higher-level areas provide feedback to area V2. While we acknowledge that feedback may indeed be functionally specific, our methodology has certain advantages: it ensures signal stability and avoids the double-dipping issue. In the functional connectivity analysis, we performed feature selection to use the most informative voxels. These voxels with high feature selectivity should already be included in the modular organizations of early visual areas. Identifying functionally specific feedback connections between modular areas will be important and meaningful work for future research. We have added a discussion of this topic in the revised manuscript.
In Line 136-138: “Only major connections were shown here. There are also other connections, such as V1 interblobs projecting to thick stripes (Federer et al., 2021; Hu & Roe, 2022; Sincich and Horton, 2005).”
(3) Imaging superficial layers: Although removal of the top layer of cortical voxels (top 5% of voxels) is a common method for dealing with surface vascular artifact contribution to BOLD signal, it likely removes a portion of the Layer 1&2 feedback signals. Is this why the authors define feedback and deep layer to deep layer? If so, both superficial and deep-layer data in Figure 4 should be explicitly explained and discussed.
Thank you for pointing this out. We would like to clarify the surface-based method for removing vascular artifacts. The vertices influenced by large pial veins were first defined on the cortical surface, and then voxels were removed from the entire columns corresponding to these vertices to avoid sampling bias along the cortical depth. Thus, the remaining columns should contain complete data from all cortical depths. We defined the feedback connectivity from deep layers to deep layers because it represents strong feedback connections according to the literature (Markov et al., 2014; Ullman, 1995) and also avoids confounding by feedforward signals from superficial layers.
Markov, N. T., Vezoli, J., Chameau, P., Falchier, A., Quilodran, R., Huissoud, C., Lamy, C., Misery, P., Giroud, P., Ullman, S., Barone, P., Dehay, C., Knoblauch, K., & Kennedy, H. (2014). Anatomy of hierarchy: feedforward and feedback pathways in macaque visual cortex. The Journal of comparative neurology, 522(1), 225–259. https://doi.org/10.1002/cne.23458
Ullman S. (1995). Sequence seeking and counter streams: a computational model for bidirectional information flow in the visual cortex. Cerebral cortex, 5(1), 1–11. https://doi.org/10.1093/cercor/5.1.1
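The column-wise exclusion described in the response to comment (3) can be illustrated with a minimal sketch. It assumes that each gray-matter voxel has already been assigned to the surface vertex whose column contains it; the function name, the input arrays, and the `max_change` cutoff are hypothetical and are not taken from the authors' analysis code.

```python
import numpy as np

def exclude_vein_columns(voxel_vertex, vertex_signal_change, max_change):
    """Return a boolean mask of voxels to keep.

    voxel_vertex: for each voxel, the index of the surface vertex whose
        cortical column contains that voxel (assumed precomputed).
    vertex_signal_change: per-vertex signal change used to flag large veins.
    max_change: hypothetical cutoff above which a vertex is treated as
        vein-dominated.
    """
    voxel_vertex = np.asarray(voxel_vertex)
    bad_vertices = np.asarray(vertex_signal_change, dtype=float) > max_change
    # Dropping every voxel of a flagged column, regardless of its depth,
    # leaves the remaining columns complete across cortical depth.
    return ~bad_vertices[voxel_vertex]
```

Because a flagged vertex removes its whole column, the surviving voxels are not depleted preferentially at the superficial depth.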
(4) More detail on other subjects in Figure S1. Ten subjects conducted visual fixation and used a bite bar. Imaging data are illustrated in detail from one subject and the remaining subjects are depicted in graphs and in Supplemental Figure S1. Please provide arrowheads in each image to help guide the reader. Some kind of summary or index of modularity would also be helpful.
Thanks for your suggestions. There are arrowheads in each image in our original manuscript and we have revised Figure S1 for better illustration. Additionally, we have added a table summarizing the number of stripes to provide a clearer overview.
(5) How are ROIs in V3ab and V4 defined? V2 ROIs were defined (thin, thick, and pale stripe), but V3ab and V4 averaged across the whole area. Why not use the most activated "domains" from V3ab and V4? How does this influence connectivity analysis?
Thank you for your question. We defined V4 and V3ab on the cortical surface using a retinotopic atlas (Benson 2018), which has been shown to be quite accurate in defining ROIs for the early visual areas. Since all ‘domains’ showed robust BOLD activation to our stimuli, we used voxels from the entire ROI in the depth-dependent analysis. In the functional connectivity analysis, we used the most informative voxels by feature selection, which should already be included in the feature domains.
Minor:
English language editing is needed.
Thank you for your feedback. We have carefully revised the manuscript for clarity and readability.
Line 31 "its" should be "their".
Thank you. We have corrected "its" to "their".
Replace 'representative subject' with 'subject'.
We have replaced "representative subject" with "subject" in the manuscript.
Replace 'naturalistic texture' with 'texture'.
Thank you for your suggestion. The textures used in our experiment were generated based on the algorithm by Portilla and Simoncelli (2000), and the term "naturalistic texture" was used to be consistent with literature. The textures used in our study are different from traditional artificial textures, as they contain higher-order statistical dependencies. Following your recommendations, we have replaced ‘naturalistic texture’ with ‘texture’ in some places in the main text to improve readability.
Typo: Line 126, Fig 2B should be 1B.
Thank you. We have corrected "Fig 2B" to "Fig 1B" in Line 128.
Fig. 2A: point out where are texture domains in anterior V2.
The texture-selective activations in anterior V2 (corresponding to the peripheral visual field) have been highlighted by arrowheads.
Fig 2B, 3 legend: Round symbols are for each subject?
Yes, the round symbols in Figure 2B represent data for individual participants. We have revised the legend for clarity.
Fig. 3: Disparity and texture values do not look different across depth (except may the V2 texture values).
While the differences in feature selectivity across cortical depths are small, they are highly consistent across participants. We have provided a figure showing the original BOLD responses in the revised manuscript (Figure S8). Data from individual subjects are also available at the Open Science Framework (OSF, https://doi.org/10.17605/OSF.IO/KSXT8; see ‘rawBetaValues.mat’ in the data directory).
Line 57-59 The statement is not strictly accurate. V1 also has color, orientation, and motion representations.
Thank you for your feedback. Our statement was intended to convey that M and P information from the geniculate input are transformed into representations of color, orientation, disparity, and motion in the primary visual cortex. We have clarified this point in the revised manuscript.
In Line 58-60: “In the primary visual cortex (V1), the M and P information from the geniculate input are transformed into higher-level visual representations, such as motion, disparity, color, orientation, etc. (Tootell & Nasr, 2017).”
Fig. 1B V1 interblobs also project to thick stripes (Sincich and Horton).
Thank you for the additional information. We appreciate your input. Our figure is intended as a simplified schematic and does not fully represent all the connections. We have discussed this reference in the revised manuscript.
In Line 136-138: “Only major connections were shown here. There are also other connections, such as V1 interblobs projecting to thick stripes (Federer et al., 2021; Hu & Roe, 2022; Sincich and Horton, 2005).”
Line 207 "suggesting that both local and feedforward connections are involved in processing color information in area V2." Logic? English?
Thank you for pointing this out. The superficial layers are involved in local intracortical processing through lateral connections and also send output to higher-order visual areas along the feedforward pathway. Thus, the strongest color selectivity in the superficial depth of V2 supports the view that color information is processed in local neural circuits in area V2 and transmitted to higher-order areas along the feedforward pathway. We have revised the manuscript for clarity.
In Line 241-245: “According to the hierarchical model, the strongest color selectivity in the superficial cortical depth is consistent with the fact that color blobs locate in the superficial layers of V1 (Figure 1B, Felleman & Van Essen, 1991; Hubel & Livingstone, 1987; Nassi & Callaway, 2009). The strongest color selectivity in superficial V2 suggests that both local and feedforward connections are involved in processing color information (Figure 1C).”
Line 254 "Laminar". Please use "cortical depth" or explicitly state that 'laminar' refers to superficial, middle, and deep as defined by cortical depth.
Thank you for your suggestion. We have clarified the term "laminar" in the manuscript as referring to superficial, middle, and deep layers as defined by cortical depth.
In Line 96-99: “To better understand the mesoscale functional organizations and neural circuits of information processing in area V2, the present study investigated laminar (or cortical depth-dependent) and columnar response profiles for color, disparity, and naturalistic texture in human V2 using 7T fMRI at 1-mm isotropic resolution.”
Fig. S5 Please add a unit of isoluminance.
Thank you for your suggestion. Supplementary Figure S10A and S10B illustrate the blue-matched luminance levels in RGB index. In our isoluminance experiment, blue was set as the reference color (RGB [0 0 255]) to measure the red and gray isoluminance.
Line 448-449 To make this rationale clearer, refer to:
Wang J, Nasr S, Roe AW, Polimeni JR. 2022. Critical factors in achieving fine‐scale functional MRI: Removing sources of inadvertent spatial smoothing. Human Brain Mapping. 43:3311-3331.
Thank you for your suggestion. We have added this reference to better support the rationale of data analysis.
Reviewer #2:
(1) Line 126 should refer to Figure 1B.
Thank you. We have corrected the reference in the revised manuscript as Figure 1B.
(2) Even if only one naturalistic texture session was acquired per participant, it might be interesting to see the within-session repeatability by, e.g., splitting the texture runs into two halves.
Thank you for your suggestion. We performed a split-half correlation analysis for participants who completed 10 runs in the naturalistic texture session. The result from one representative subject is shown in the figure below (for the other participants, r = 0.38, 0.38, 0.24, and 0.23, respectively).
Author response image 2.
Split-half correlations for the texture-selective activation maps in a representative subject (S01) in V2.
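A split-half analysis of this kind can be sketched as follows. This is an illustrative outline rather than the authors' code: it assumes per-run beta estimates for the texture and noise conditions are available as arrays of shape (runs, voxels), and the odd/even assignment of runs to halves is an assumption, since the response does not state how the runs were divided.

```python
import numpy as np

def split_half_correlation(texture_betas, noise_betas):
    """Pearson r between texture-minus-noise maps from two halves of the runs.

    texture_betas, noise_betas: arrays of shape (n_runs, n_voxels).
    Runs are split into odd- and even-numbered halves (an assumption).
    """
    texture_betas = np.asarray(texture_betas, dtype=float)
    noise_betas = np.asarray(noise_betas, dtype=float)
    contrast = texture_betas - noise_betas      # one contrast map per run
    half_a = contrast[0::2].mean(axis=0)        # runs 1, 3, 5, ...
    half_b = contrast[1::2].mean(axis=0)        # runs 2, 4, 6, ...
    return float(np.corrcoef(half_a, half_b)[0, 1])
```

With 10 runs, each half averages five contrast maps, so the correlation reflects how much of the voxel-wise selectivity pattern is shared between independent subsets of the data.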
(3) Unfortunately, Figure S2 only shows the stripe ROIs but not V3ab or V4 ROIs. Including another figure that shows all ROIs in more detail would be interesting.
Thank you for your suggestion. We have included a figure showing the ROIs for V4 and V3ab (the black dotted lines in Figure S9).
(4) It would be helpful for the reader to have a more detailed discussion about methodological limitations, including the unspecificity of the GE-BOLD signal (Engel et al., 1997, Cereb Cortex, 7, 181-192; Parkes et al., 2005, MRM, 54, 1465-1472; Fracasso et al., 2021, Prog Neurobiol, 202, 102187) and the used voxel sizes.
Thank you for your suggestion. We have added a more detailed discussion about the methodological limitations, including the unspecificity of the GE-BOLD signal and the voxel sizes used.
In Line 397-408: “Due to the limitations of the T2*w GE-BOLD signal in its sensitivity to large draining veins (Fracasso et al., 2021; Parkes et al., 2005; Uludag & Havlicek, 2021), the original BOLD responses were strongly biased towards the superficial depth in our data (Figure S8). Compared to GE-BOLD, VASO-CBV and SE-BOLD fMRI techniques have higher spatial specificity but much lower sensitivity (Huber et al., 2019). As shown in a recent study (Qian et al., 2024), using differential BOLD responses in a continuous stimulus design can significantly enhance the laminar specificity of the feature selectivity measures in our results (Figure 3). Compared to the submillimeter voxels, as used in most laminar fMRI studies, our fMRI resolution at 1-mm isotropic voxel may have a stronger partial volume effect in the cortical depth-dependent analysis. However, consistent with our results, previous studies have also shown that 7T fMRI at 1-mm isotropic resolution can resolve cortical depth-dependent signals in human visual cortex (Roefs et al., 2024; Shao et al., 2021).”
(5) If I understand correctly, different numbers of runs/sessions were acquired for different subjects. It would be good to discuss if this could have impacted the results, e.g., different effect sizes could have biased the manual ROI definition.
Thank you for your suggestion. Although there were differences in the number of runs/sessions acquired for different subjects, there were at least four runs of data for each experiment, which should be enough to examine the within-subject effect. We have discussed this point in the revised manuscript.
In Line 481-484: “Although the number of runs were not equal across participants, there were at least four runs (twenty blocks for each stimulus condition) of data in each experiment, which should be sufficient to investigate within-subject effects.”
(6) It would be good to add the software used for layer definition. Was it LayNii?
We have provided more details in the revised methods.
In Line 523-526: “An equi-volume method was used to calculate the relative cortical depth of each voxel to the white matter and pial surface (0: white matter surface, 1: pial surface, Supplementary Figure S11A), using mripy (https://github.com/herrlich10/mripy).”
(7) It would be interesting to see (at least for one subject) the contrasts of color-selective thin stripes and disparity-selective thick stripes from single sessions to demonstrate the repeatability of measurements.
Thank you for your suggestion. We have shown the test-retest reliability of the response pattern of color-selective thin stripes and disparity-selective thick stripes in a representative subject in Figure S5.
(8) By any chance, do the authors also have resting-state data from the same subjects? It would be interesting to see the connectivity analysis between stripes and V3ab, V4 with resting-state data.
Thank you for your suggestion. Unfortunately, we do not have resting-state data from the same subjects at this time. We agree with you that layer-specific connectivity analysis with resting-state data is very interesting and worth investigating in future studies.
Reviewer #3:
(1) For investigating information flow across areas, the authors rely on layer-specific informational connectivity analyses, which is an exciting approach. Covariation in decoding accuracy for a specific dependent variable between the superficial layers of a lower area and the middle layer of a higher area is taken as evidence for feedforward connectivity, whereas FB was defined as the connection between the two deep layers. Yet this method is not assumption-free. For example, the canonical idea (Figure 1C) of FF terminals exclusively arriving in layer 4 and FB terminals exclusively terminating in supra- or infragranular layers is not entirely correct. This is not even the case for area V1; see, for example, Kathy Rockland's exquisite tractography studies, showing that even single axons have branches terminating in different layers. Also, feedback signals do not arrive only in the deep layers of a lower area. Although these informational connectivity analyses can be suggestive of information flow, this reviewer doubts that they can be considered conclusive evidence. Therefore, the authors should drastically tone down their language in this respect, throughout the text. They present suggestive, not conclusive evidence. To obtain truly conclusive evidence, one likely has to perform laminar electrophysiological recordings simultaneously across multiple areas and infer the directionality of information flow using, for example, Granger causality.
Thank you for pointing out this important issue. In our response to a previous question (Reviewer #1, the 2nd comment), we have discussed other possible connections in addition to the canonical feedforward and feedback pathways. In the revised manuscript, the conclusion has been toned down to properly reflect our findings. However, we would also like to emphasize that our conclusion about laminar circuits was supported by converging lines of evidence. For example, in addition to the depth-dependent connectivity results, the role of feedback circuit in processing texture information was also supported by greater selectivity in V4 than V2, and the strongest deep layer selectivity in V2 (Figure 3C).
(2) In the same realm, how reproducible are the information connectivity results? In the first part of the study, the authors performed a split-half analyses. This should be also done for Figure 4.
Thank you for your suggestion. We have performed a split-half analysis for the informational connectivity results. As shown in Author response image 3, the results for the color experiment were robust and reproducible, while the disparity and texture connectivity results were less consistent between the two halves. The results from the second half (Author response image 3, below) are more consistent with the original findings (Figure 4). Overall, the pattern of results was qualitatively similar between the two halves. The inconsistency may be due to the fact that some participants had only four runs of data, which could make the split-half analysis less reliable.
Author response image 3.
Split-half analysis of informational connectivity.
(3) Most of the other layer-specific claims (not the ones about the flow of information) are based on indices. It is unclear which ROIs contributed to these indices. Was it the entire extent of V1, V2, ...? Or only the visually-driven voxels within these areas? How exactly were the voxels selected? For V2, it would make sense to calculate the selectivity indices independently for the disparity and color-selective (putative) thick and (putative) thin stripe compartments, respectively. Adding voxels of non-selective compartments (e.g. putative thick stripe voxels for calculating the color-index; or adding putative thin-stripe voxels for calculating the disparity index), will only add noise.
In the revised manuscript, we have clarified that we selected the entire ROI in the depth-dependent analysis. Since our study does not have an independent functional localizer, using the entire ROI avoids the problem of double dipping. The processing of visual features is not confined solely to specific stripes. We have also provided a more comprehensive explanation of this issue in the discussion section.
In Line 541-544: “For the cortical depth-dependent analyses in Figure 3, we used all voxels in the retinotopic ROI. Pooling all voxels in the ROI avoids the problem of double-dipping and also increases the signal-to-noise ratio of ROI-averaged BOLD responses.”
(4) It is apparent from Figure 3, that the indices are largely (though not exclusively) driven by 2 subjects. Therefore, this reviewer wishes to see the raw data in addition to a table for calculating the color, disparity, and texture selectivity indices -along with the number of voxels that contributed to it.
Thank you for your suggestion. We have provided a figure showing the original BOLD responses (Figure S8). Data from individual subjects are also available at the Open Science Framework (OSF, https://doi.org/10.17605/OSF.IO/KSXT8; see ‘rawBetaValues.mat’ in the data directory).
Minor:
(1) I typically find inferences about 'layer fMRI' vastly overstated. We all know that fMRI does not (yet) provide laminar-specific resolution, i.e., whereby meaningful differences in fMRI signals can be extracted from all 6 individual layers of neocortex, without partial volume effects, or without taking into account pre- and postsynaptic contributions of neurons to the fMRI signal (the cell bodies may very well lie in different layers than the dendritic trees etc.), or without taking into account the vascular anatomy, etc. The authors should use the term cortical depth-dependent fMRI throughout the text, as they do in the abstract and intro.
Thank you for pointing out this important issue. We have now defined the meaning of layer or laminar as “cortical depth-dependent” in the introduction, to be consistent with the terminology in most published papers on this topic.
(2) 1st sentence abstract: I disagree with this statement. The parallel streams in intermediate-level areas are probably equally well studied as the geniculostriate pathway -already starting with the seminal work of Hubel, Livingstone, and more recently by Angelucci and co-workers who looked in detail at the anatomical and functional interactions across sub-compartments of V1 and V2.
Thank you for your feedback. In the revised manuscript, we have removed the term "much" from the first sentence of the abstract. Although there have been seminal studies of V2 sub-compartments in monkeys, only a few fMRI studies investigated this issue in humans.
(3) The authors show inter-session correlations for color and disparity. This reviewer would like to see test-retest images since the explained variance is not terribly good. Also, show the correlation values for the inter-session texture beta values.
Thank you for your suggestion. We have performed the test-retest reliability analysis of texture-selective patterns in the response to a previous question (Reviewer #2, the 2nd comment, Author response image 2).
(4) The stripe definitions are threshold dependent. Please clarify whether the reported results are threshold-independent.
Thank you for your question. To address your concern, we defined the stripe ROIs using different thresholds, and the results remained consistent. Specifically, we ranked the voxels in manually defined stripe ROIs by the color-disparity response. We then defined the lowest 10% as the thick stripe voxels, the highest 10% as thin stripe voxels, and the middle 10% as pale stripe voxels. Additionally, we adjusted the thresholds to 20% and 30% to define the three stripes (with 30% being the least strict threshold). Feature selectivities at different thresholds are shown in Figure S6 (from left to right: 10%, 20%, 30%). Notably, in all threshold conditions, there was no significant difference in texture selectivity across different stripes.
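The percentile-based stripe definition described in this response can be sketched as below. This is a hypothetical illustration, not the authors' implementation: it assumes the "color-disparity response" is a single value per voxel (color minus disparity), so that the lowest values are disparity-preferring (thick stripes) and the highest are color-preferring (thin stripes). The function name and the guard on `fraction` are assumptions.

```python
import numpy as np

def define_stripes(color_minus_disparity, fraction=0.10):
    """Return boolean masks (thick, pale, thin) over the input voxels.

    color_minus_disparity: one value per voxel in the manually defined
        stripe ROI (assumed to be the color response minus the disparity
        response).
    fraction: share of voxels assigned to each stripe type (0.10, 0.20,
        or 0.30 in the response above).
    """
    values = np.asarray(color_minus_disparity, dtype=float)
    if not 0 < fraction <= 1 / 3:
        raise ValueError("fraction must be in (0, 1/3] to avoid overlap")
    n = values.size
    k = int(round(fraction * n))
    order = np.argsort(values)                  # ascending rank of voxels
    thick = np.zeros(n, dtype=bool)
    pale = np.zeros(n, dtype=bool)
    thin = np.zeros(n, dtype=bool)
    thick[order[:k]] = True                     # lowest values
    thin[order[n - k:]] = True                  # highest values
    start = (n - k) // 2
    pale[order[start:start + k]] = True         # values around the median
    return thick, pale, thin
```

Calling the function with `fraction` set to 0.10, 0.20, and 0.30 yields the three threshold conditions, and the selectivity comparison across stripes can then be repeated for each set of masks.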
(5) How were the visual areas defined?
In the revised manuscript, we have provided a detailed description of the methods.
In Line 531-535: “ROIs were defined on the inflated cortical surface. Surface ROIs for V1, V2, V3ab, and V4 were defined based on the polar angle atlas from the 7T retinotopic dataset of Human Connectome Project (Benson et al., 2014, 2018). Moreover, the boundary of V2 was edited manually based on columnar patterns. All ROIs were constrained to regions where mean activation across all stimulus conditions exceeded 0.”
(6) "According to the hierarchical model in Figure 1B and 1C, the strongest color selectivity in the superficial cortical depth is consistent with the fact that color blobs mainly locate in the superficial layers of V1, suggesting that both local and feedforward connections are involved in processing color information in area V2." But color-selective activation within V2 could also be consistent with feedback from other areas (some of which were not covered in the present experiments), all the more so since most parts of the brain were not covered (i.e., a slab of 4 cm was covered)?
Thank you for reminding us about this issue. We have discussed the possibility of feedback influence in explanation of the superficial bias of color selectivity in area V2.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Point-by-point responses to the reviewers' comments:
All three reviewers found our analysis of focal adhesion-associated oncogenic pathways (Figs 3 and S3) to be inconsistent (Reviewer 1), not convincing/consistent (Reviewer 2, #2), and too variable and not well supported (Reviewer 3, #2). This was probably the basis for the eLife assessment, which stated: “However, the study is incomplete because the downstream molecular activities of PLECTIN that mediate the cancer phenotypes were not fully evaluated.” We agree with the reviewers that the degree of attenuation of the FAK, MAPK/Erk, and PI3K/Akt signaling pathways differs depending on the cell line used (Huh7 and SNU-475) and the mode of inactivation (CRISPR/Cas9-generated plectin KO, functional KO (∆IFBD), and organoruthenium-based inhibitor plecstatin-1). However, we do not share the reviewers' skepticism about the unconvincing nature of the data presented.
Several previous studies have shown that plectin inactivation invariably leads to dysregulation of cell adhesions and associated signaling pathways in various cell systems. The molecular mechanisms driving these changes are not fully understood, but the most convincingly supported scenarios are uncoupling of keratin filaments (hemidesmosomes; (Koster et al., 2004)) and vimentin filaments (focal adhesions; (Burgstaller et al., 2010; Gregor et al., 2014)) from adhesion sites in conjunction with altered actomyosin contractility (Osmanagic-Myers et al., 2015; Prechova et al., 2022; Wang et al., 2020). This results in altered morphometry (Wang et al., 2020), dynamics (Gregor et al., 2014), and adhesion strength (Bonakdar et al., 2015) of adhesions. These changes are accompanied by reduced mechanotransduction capacity and attenuation of downstream signaling such as FAK, Src, Erk1/2, and p38 in dermal fibroblasts (Gregor et al., 2014); decrease in pFAK, pSrc, and pPI3K levels in prostate cancer cells (Wenta et al., 2022); increase in pErk and pSrc in keratinocytes (Osmanagic-Myers et al., 2006); decrease in pERK1/2 in HCC cells (Xu et al., 2022) and head and neck squamous carcinoma cells (Katada et al., 2012).
Consistent with these published findings, we show that upon plectin inactivation, the HCC cell line SNU-475 exhibits aberrant cytoskeletal organization (vimentin and actin; Figs 4A-D, S4A-F), altered number, topography and morphometry of focal adhesions (Figs 4A, E-G, S4H,I), and ineffective transmission of traction forces (Fig 4H,I). Similar, although not quantified, phenotypes are present in Huh7 with inactivated plectin (data not shown). It is worth noting that even robust cytoskeletal (e.g. #ventral stress fibers, Fig 4A,D and vimentin architecture, Fig S4A-C) and focal adhesion (%central FA, Fig 4A,E) phenotypes differ significantly between different modes of plectin inactivation and would certainly do so if compared between cell lines. These phenotypes are heterogeneous but not inconsistent. Interestingly, both SNU-475 and Huh7 plectin-inactivated cells show similar functional consequences, such as a prominent decrease in migration speed (Fig 5B). This suggests that while specific aspects of cytoarchitecture are differentially affected in different cell lines, the functional consequences of plectin inactivation are shared between HCC cell lines.
It is therefore not surprising that the activation status of downstream effectors, resulting from different degrees of cytoskeletal and focal adhesion reconfiguration, is not identical (or even comparable) between cell lines and treatment conditions. Furthermore, we compare highly epithelial (keratin- and almost no vimentin-expressing) Huh7 cells with highly dedifferentiated (low keratin- and high vimentin-expressing) SNU-475 cells, which differ significantly in their cytoskeleton, adhesions, and signaling networks. Alternative approaches to plectin inactivation are not expected to result in the same degree of dysregulation of specific signaling pathways. Effects of adaptation (CRISPR/Cas9-generated KOs and ∆IFBDs), engagement of different binding domains (CRISPR/Cas9-generated ∆IFBDs), and pleiotropic modes of action (plecstatin-1) are expected.
In our study, we provide the reader with an unprecedentedly complex comparison of adhesion-associated signaling between WT and plectin-inactivated HCC cell lines. First, we compared the proteomes of WT, KO and PST-treated WT SNU-475 cells using MS-based shotgun proteomics and phosphoproteomics (Fig 3A-C). Second, we extensively and quantitatively immunoblotted the major molecular denominators of MS-identified dysregulated pathways (such as “FAK signaling”, “ILK signaling”, and “Integrin signaling”) with the following results. Data (shown in Figs 3D and S3C) are expressed as a percentage of untreated WT, with downregulated values highlighted in red:
Author response table 1.
In addition, we show dysregulated expression (mostly downregulation) of focal adhesion constituents ITGβ1 and αv, talin, vinculin, and paxillin, which nicely complements the fewer and larger focal adhesions in plectin-inactivated HCC cells. In light of these results, we believe that our statement that “Although these alterations were not found systematically in both cell lines and conditions (reflecting thus presumably their distinct differentiation grade and plectin inactivation efficacy), collectively these data confirmed plectin-dependent adhesome remodeling together with attenuation of oncogenic FAK, MAPK/Erk, and PI3K/Akt pathways upon plectin inactivation” (see pages 8-9) is fully supported. Furthermore, in support of the results of MS-based (phospho)proteomic and immunoblot analyses we show strong correlation between plectin expression and the signatures of “Integrin pathway” (R<sup>2</sup>=0.15, p= 2x10<sup>-45</sup>), “FAK pathway” (R<sup>2</sup>=0.11, p= 2x10<sup>-34</sup>), “PI3K Akt/mTOR signaling” (R<sup>2</sup>=0.06, p= 2x10<sup>-20</sup>) or “Erk pathway” (R<sup>2</sup>=0.10, p= 6x10<sup>-30</sup>) in HCC samples from 1268 patients (Fig S7-2C and S7-3).
In conclusion, we show that plectin is required for proper/physiological adhesion-associated signaling pathways in HCC cells. The HCC adhesome and associated pathways are dysregulated upon plectin inactivation and we show context-dependent varying degrees of attenuation of the FAK, MAPK/Erk, and PI3K/Akt pathways. In our view, presenting context-dependent variability in expression/activation of pathway molecular denominators is a trade-off for our intention to address this aspect of plectin inactivation in the complexity of different cell lines, tissues, and modes of inactivation. We prefer this complex approach to presenting “more convincing” black-and-white data assessed in a single cell line (Qi et al., 2022) or upon plectin inactivation by a single approach (compare with otherwise excellent studies such as (Xu et al., 2022) or (Buckup et al., 2021)). In fact, unlike the reviewers, we consider this complexity (and the resulting heterogeneity of the data) to be a strength rather than a weakness of our study.
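To make the relationship between the reported R<sup>2</sup> values and the cohort size easier to follow, the sketch below converts each R<sup>2</sup> into the t statistic used to test for a zero correlation. It assumes that each R<sup>2</sup> comes from a simple (Pearson) linear correlation across all 1268 samples, which is not stated explicitly above; the function name `t_statistic_from_r2` is hypothetical and the snippet is illustrative only, not the analysis pipeline used for the figure.

```python
import math


def t_statistic_from_r2(r_squared: float, n: int) -> float:
    """Magnitude of the t statistic for testing a zero Pearson correlation.

    Uses t = sqrt(R^2 * (n - 2) / (1 - R^2)) with n - 2 degrees of freedom.
    Assumption: R^2 is the squared Pearson correlation of n paired samples.
    """
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must lie in [0, 1)")
    if n <= 2:
        raise ValueError("n must be greater than 2")
    return math.sqrt(r_squared * (n - 2) / (1.0 - r_squared))


# R^2 values as quoted in the response; sample size as quoted (1268 patients).
N_PATIENTS = 1268
REPORTED_R2 = {
    "Integrin pathway": 0.15,
    "FAK pathway": 0.11,
    "PI3K Akt/mTOR signaling": 0.06,
    "Erk pathway": 0.10,
}

for signature, r2 in REPORTED_R2.items():
    t = t_statistic_from_r2(r2, N_PATIENTS)
    print(f"{signature}: R^2={r2:.2f}, |r|={math.sqrt(r2):.2f}, "
          f"t={t:.1f} (df={N_PATIENTS - 2})")
```

Under this assumption, the size of the test statistic depends on both R<sup>2</sup> and the number of samples, which is why the two quantities are reported together.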
Reviewer 1:
(1) The authors suggest that plectin controls oncogenic FAK, MAPK/Erk, and PI3K/Akt signaling in HCC cells, representing the mechanisms by which plectin promotes HCC formation and progression. However, the effect of plectin inactivation on these signaling was inconsistent in Huh7 and SNU-475 cells (Figure 3D), despite similar cell growth inhibition in both cell lines (Figure 2G). For example, pAKT and pERK were only reduced by plectin inhibition in SNU-475 cells but not in Huh7 cells.
We agree with the reviewer that plectin inactivation yields varying degrees of attenuation of the FAK, MAPK/Erk, and PI3K/Akt pathways depending on the cell type (Huh7 vs SNU-475 cells) and mode of plectin inactivation (CRISPR/Cas9-generated plectin KO vs functional KO (∆IFBD) vs organoruthenium-based inhibitor plecstatin-1). This context-dependent heterogeneity in the expression/activation of molecular denominators of signaling pathways reflects different degrees of cytoskeletal (e.g. #ventral stress fibers, Fig 4A,D and vimentin architecture, Fig S4A-C) and focal adhesion (e.g. %central FA, Fig 4A,E) phenotypes under different conditions. We expect that functional consequences (such as reduced migration and anchorage-independent proliferation) arise from a combination of changes in individual pathways. The sum of often subtle changes will result in comparable effects not only on cell growth, but also on migration or transmission of traction forces. For a more detailed comment, please see our response to all Reviewers on the first three pages of this letter.
We believe that our data show that both pAkt and pErk are attenuated upon plectin inactivation in both Huh7 and SNU-475 cells. The following data (shown in Figs 3D and S3C) are expressed as a percentage of untreated WT, with downregulated values highlighted in red:
Author response table 2.
(2) In addition, pFAK was not changed by plectin inhibition in both cells, and the ratio of pFAK/FAK was increased in both cells.
We agree with the reviewer that pFAK/FAK levels are either comparable or slightly higher upon plectin inactivation. However, we believe that our data convincingly show that FAK expression is downregulated in both Huh7 and SNU-475 cells. In our opinion, this results in an overall attenuation of FAK signaling (see percentage for Normalized pFAK × Normalized FAK), which is expectedly more pronounced in migratory SNU-475 cells. The following data (shown in Figs 3D and S3C) are expressed as a percentage of untreated WT, with downregulated values highlighted in red:
Author response table 3.
Given these results, we feel that our statement that “inhibition of plectin attenuates FAK signaling” (pages 8-9) is well supported.
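The arithmetic behind the composite measure can be sketched as follows. This is a minimal illustration that reads “Normalized pFAK × Normalized FAK” as the product of the pFAK/FAK ratio and the total FAK level, each expressed as a percentage of untreated WT; that reading is an assumption, the function name `composite_fak_signal` is hypothetical, and the input values are invented for illustration and are not taken from the immunoblot data.

```python
def composite_fak_signal(pfak_over_fak_pct: float, fak_pct: float) -> float:
    """Return normalized pFAK x normalized FAK as a percentage of untreated WT.

    Both inputs are percentages of untreated WT (WT = 100).
    Assumption: the composite is a simple product of the two normalized values.
    """
    if pfak_over_fak_pct < 0 or fak_pct < 0:
        raise ValueError("percentages must be non-negative")
    return pfak_over_fak_pct * fak_pct / 100.0


# Hypothetical example values (NOT data from the study): a slightly raised
# pFAK/FAK ratio combined with reduced total FAK expression.
example = composite_fak_signal(pfak_over_fak_pct=110.0, fak_pct=60.0)
print(f"Composite FAK signal: {example:.0f}% of untreated WT")
```

With these illustrative inputs, a pFAK/FAK ratio above the WT level still yields a composite below the WT level once the lower total FAK expression is taken into account.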
(3) Thus, it is hard to convince me that plectin promotes HCC formation and progression by regulating these signalings.
Previous studies have shown that dysregulation of cell adhesions and attenuation of adhesion-associated FAK, MAPK/Erk, and PI3K/Akt signaling has inhibitory effects on HCC formation and progression. We show that plectin is required for the proper/physiological functioning of adhesion-associated signaling pathways in selected HCC cells. The HCC adhesome and associated pathways are dysregulated upon plectin inactivation and we show context-dependent varying degrees of attenuation of the FAK, MAPK/Erk, and PI3K/Akt pathways. We support these conclusions by providing the reader with proteomic and phosphoproteomic comparisons of adhesion-associated signaling between WT and plectin-inactivated HCC cell lines (Figs 3B,C and S3A,B). We further validate our findings by extensive and quantitative immunoblotting analysis (Figs 3D and S3C). In addition, we show a strong correlation between plectin expression and the signatures of “Integrin pathway” (R<sup>2</sup>=0.15, p= 2x10<sup>-45</sup>), “FAK pathway” (R<sup>2</sup>=0.11, p= 2x10<sup>-34</sup>), “PI3K Akt/mTOR signaling” (R<sup>2</sup>=0.06, p= 2x10<sup>-20</sup>) or “Erk pathway” (R<sup>2</sup>=0.10, p= 6x10<sup>-30</sup>) in HCC samples from 1268 patients (Fig S7E).
Our data and conclusions are fully consistent with previously published studies in HCC cells. For instance, even a mild decrease in FAK levels leads to a significant reduction in colony size (see effects of KD (Gnani et al., 2017), effects of FAK inhibitor and sorafenib in xenografts (Romito et al., 2021), or effects of inhibitors in soft agars and xenografts (Wang et al., 2016)). Similar effects were observed upon partial Akt inhibition (compare with Akt inhibitors in soft agars (Cuconati et al., 2013; Liu et al., 2020)). Of course, we cannot rule out synergistic plectin-dependent effects mediated via adhesion-independent mechanisms. Identifying these mechanisms and distinguishing the contributions of the various consequences of cytoskeletal dysregulation to the phenotypes described in this manuscript would be experimentally challenging, and we feel that these studies go beyond the scope of our current study.
As we feel that the adhesion-independent mechanisms were not sufficiently discussed in the original manuscript, we have removed the original sentence “Given the well-established oncogenic activation of these pathways in human cancer(33), our study identifies a new set of potential therapeutic targets.” (page 15) from the Discussion and added the following text: “However, it is conceivable that dysregulated cytoskeletal crosstalk could affect HCC through multiple mechanisms independent from FA-associated signaling. Indeed, we and others (Jirouskova et al., 2018; Xu et al., 2022) have shown that upon plectin inactivation, liver cells acquire epithelial characteristics that promote increased intercellular cohesion and reduced migration. Further studies will be required to identify and investigate synergistic adhesion-independent effects of plectin inactivation on HCC growth and metastasis.” (page 15). See also our response to Reviewer 2, #4 and Reviewer 3, #3 and #4.
(4) The authors claimed that Plectin inactivation inhibits HCC invasion and metastasis using in vitro and in vivo models. However, the results from in vivo models were not as compelling as the in vitro data. The lung colonization assay is not an ideal in vivo model for studying HCC metastasis and invasion, especially when Plectin inhibition suppresses HCC cell growth and survival. Using an orthotopic model that can metastasize into the lung or spleen could be much more convincing for an essential claim.
We agree with the reviewer that the orthotopic in vivo model would be an ideal setting to address HCC metastasis experimentally. There are several published models of HCC extrahepatic metastasis, including an orthotopic model of lung metastasis (Fan et al., 2012; Voisin et al., 2024; You et al., 2016), but to our knowledge, none of these orthotopic models are commonly used in the field. In contrast, the administration of tumor cells via the tail vein of mice is a standard, well-established approach of first choice for modelling lung metastasis in a variety of tumor types (e.g. (Hiratsuka et al., 2011; Jakab et al., 2024; Lu et al., 2020)), including HCC (Jin et al., 2017; Lu et al., 2020; Tao et al., 2015; Zhao et al., 2020).
Furthermore, we do not believe that the use of an orthotopic model would provide a comparable advantage in terms of plectin-mediated effects on metastatic growth compared to tail vein delivery of tumor cells. Importantly, the lung colonization model used in our study allows for the injection of a defined number of HCC cells into the bloodstream, thus eliminating the effect of the primary tumor size on the number of metastasizing cells. To distinguish between effects of plectin inhibition on HCC cell growth/survival and dissemination, we carefully evaluated both the number and volume of lung metastases (Figs 6I and S6C-F). The observed reduction in the number of metastases (Figs 6I and S6D) reflects the initiation/early phase of metastasis formation, which is strongly influenced by the adhesion, migration, and invasion properties of the HCC cells and corresponds well with the phenotypes described after plectin inactivation in vitro (Figs 4H,I; 5; 6A-E; S5; and S6A,B). The reduction in the volume of metastases (Figs 6I and S6E) reflects the effects of plectin inhibition on HCC cell growth and metastatic outgrowth and corresponds well with the in vitro data shown in Figs 2G,H and S2F,G.
(5) Also, in Figure 6H, histology images of lungs from this experiment need to be shown to understand plectin's effect on metastasis better.
We are grateful to the reviewer for bringing our attention to the lung colonization assay results presented. The description of the experiments in the text of the original manuscript was incorrect. The animals monitored by in vivo bioluminescence imaging (shown in Fig 6H) are the same as the mice from which cleared whole lung lobes were analyzed by lattice light sheet fluorescence microscopy (shown in Fig. 6I). The corrected description is now provided in the revised manuscript as follows: “To identify early phase of metastasis formation, we next monitored the HCC cell retention in the lungs using in vivo bioluminescence imaging (Fig. 6H). This experimental cohort was expanded for WT-injected mice which were administered PST…” (page 11).
Therefore, lungs from all animals shown in Fig 6H,I were CUBIC-cleared and analyzed by lattice light sheet fluorescence microscopy. As requested by Reviewer 2, Recommendation #1, we provide in the revised manuscript (Fig S6F) “whole slide scan results for all the groups”, which could help to “understand plectin's effect on metastasis better”. To address the reviewer's concern, we also post-processed cleared and visualized lungs for hematoxylin staining and immunolabeled them for HNF4α. A representative image is shown as panel A in Author response image 1. Post-processing of CUBIC-cleared and immunolabeled lung lobes resulted in partial tissue destruction and some samples were lost. In addition, as the entire experimental setup was designed for the early phase of metastasis formation, only small Huh7 foci were formed (compared to the larger metastases that developed within 13 weeks after inoculation, shown in panel B). As the IHC for HNF4α provides significantly lower sensitivity compared to the immunofluorescence images provided in the manuscript, we were only able to identify a few HNF4α-positive foci. Overall, we consider our immunofluorescence images to be qualitatively and quantitatively superior to IHC sections. However, if the reviewer or the editor considers it beneficial, we are prepared to show our current data as a part of the manuscript.
Author response image 1.
(A) HNF4α staining of lung tissue after CUBIC clearing from mice inoculated with WT Huh7 from the timepoint of BLI, when the positive signal in chest area has been detected. This timepoint was then selected for the comparison of initial stages of lung colonization. (B) H&E and HNF4α staining from lung tissue of mice inoculated with WT Huh7 cells from the survival experiment. Scale bars, 50 µm.
(6) Figure 6G, it is unclear how many mice were used for this experiment. Did these mice die due to the tumor burdens in the lungs?
The number of animals is given in the legend to Fig 6G (page 34; N = 14 (WT), 13 (KO)). Large Huh7 metastases were identified in the lungs of animals that could be analyzed post-mortem by IHC (see panel B in the figure above). No large metastases were found in other organs examined, such as the liver, kidney and brain. It is therefore highly likely that these mice died as a result of the tumor burden in the lungs. A similar conclusion was drawn from the results of the lung colonization model in the previous studies (Jin et al., 2017; Zhao et al., 2020).
(7) The whole paper used inhibition strategies to understand the function of plectin. However, the expression of plectin in Huh7 cells is low (Figure 1D). It might be more appropriate to overexpress plectin in this cell line or others with low plectin expression to examine the effect on HCC cell growth and migration.
For this study, we selected two model HCC cell lines – Huh7 and SNU-475. Our intention was to investigate the role of plectin in “well-differentiated” (Huh7) and “poorly differentiated” (SNU-475) HCC cells, including thus early and advanced stages of HCC development (as categorized before (Boyault et al., 2007; Yuzugullu et al., 2009a); see also our description and rationale on page 6). As anticipated, less migratory “epithelial-like” Huh7 cells are characterized by relatively high E-cadherin, low vimentin, and low plectin expression levels (Fig 1D). In contrast, migratory “mesenchymal-like” SNU-475 cells are characterized by relatively low E-cadherin, high vimentin, and high plectin expression levels (Fig 1D). Therefore, the majority of analyses were performed in both relatively low plectin-expressing Huh7 and high plectin-expressing SNU-475 cells. It is noteworthy that inactivation of plectin had similar (although less pronounced) inhibitory effects on growth and migration in both Huh7 and SNU-475 cells.
We agree with the reviewer that “It might be more appropriate to overexpress plectin in this cell line or others with low plectin expression to examine the effect on HCC cell growth and migration”. In fact, we have received similar suggestions since we started publishing our studies on plectin. There are two reasons that preclude successful overexpression experiments. First, there are about 14 known isoforms of plectin (Prechova et al., 2023). Although previous studies have analyzed the phenotypic rescue potential of some plectin isoforms using transient transfection (e.g. (Burgstaller et al., 2010; Osmanagic-Myers et al., 2015; Prechova et al., 2022)), the isoform variability precludes rescue/overexpression experiments if the causative isoform is not known. Second, plectin is a giant cytoskeletal crosslinker protein of more than 4,500 amino acids with binding sites for intermediate filaments, F-actin, and microtubules. Overexpression of the approximately 500 kDa-large crosslinker invariably leads to the collapse of cytoskeletal networks in every cell type we have tested so far. See also our response to Reviewer 3, #2.
Reviewer 2:
(1) The annotation of mouse numbers is confusing. In Figures 2A B D E F, it should be the same experiment, but the N numbers in A are 6 and 5. In E and F they are 8 and 3. Similarly, in Figure 2H, in the tumor size curve, the N values are 4,4,5,6. In the table, N values are 8,8,10,11 (the authors showed 8,7,8,7 tumors that formed in the picture).
We are grateful to the reviewer for bringing our attention to the inconsistency in the number of animals in DEN-induced hepatocarcinogenesis. Results from two independent cohorts are presented in the manuscript. The first cohort was used for MRI screening (Fig 2A-C) and at the second screening timepoint of 44 weeks, approximately 75% of animals died during anesthesia. Therefore, the second cohort of Ple<sup>ΔAlb</sup> and Ple<sup>fl/fl</sup> mice was used for macroscopic confirmation and histology (Figs 2D-F and S2A). We agree with the reviewer that the original presentation of the data may be misleading; therefore, we have rephrased the sentence describing macroscopic confirmation and histology (Figs 2D-F and S2A) as follows: “Decreased tumor burden in the second cohort of Ple<sup>ΔAlb</sup> mice was confirmed macroscopically…” (page 7).
For the experiments shown in Fig 2H, mice were injected in both hind flanks. We have added this information to the figure legend along with the correct number of tumors.
(2) In Figure 3D and Figure S3C, the changes in most of the proteins/phosphorylation sites are not convincing/consistent. These data are not essential for the conclusion of the paper and WB is semi-quantitative. Maybe including more plots of the proteins from proteomic data could strengthen their detailed conclusions about the link between Plectin and the FAK, MAPK/Erk, PI3K/Akt pathways as shown in 3E.
We agree with the reviewer that plectin inactivation yields varying degrees of attenuation of the FAK, MAPK/Erk, and PI3K/Akt pathways depending on the cell type (Huh7 vs SNU-475 cells) and mode of plectin inactivation (CRISPR/Cas9-generated plectin KO vs functional KO (∆IFBD) vs organoruthenium-based inhibitor plecstatin-1). This context-dependent heterogeneity in the expression/activation of pathway molecular denominators reflects different degrees of cytoskeletal (e.g. #ventral stress fibers, Fig 4A,D and vimentin architecture, Fig S4A-C) and focal adhesion (e.g. %central FA, Fig 4A,E) phenotypes under different conditions. See also the detailed response to all reviewers (on the first three pages of this letter) and the responses to Reviewer 1, #1 and #2, Reviewer 3, #4.
Our immunoblot analysis is based on NIR fluorescent secondary antibodies which were detected and quantified using an Odyssey imaging system (LI-COR Biosciences). This approach allows a wider linear detection range than chemiluminescence without a signal loss and is considered to provide quantitative immunoblot detection (Mathews et al., 2009; Pillai-Kastoori et al., 2020) (see also manufacturer's website: https://www.licor.com/bio/applications/quantitative-western-blots/).
Following the reviewer's recommendation, we have carefully reviewed our proteomic and phosphoproteomic data. There are no further MS-based data (other than those already presented in the manuscript) to support the association of plectin with the FAK, MAPK/Erk, PI3K/Akt pathways.
(3) Figure S7A and B, The pictures do not show any tumor, which is different from Figure 7A and B (and from the quantification in S7A lower right). Is it just because male mice were used in Figure 7 and female mice were used in Figure S7? Is there literature supporting the sex difference for the Myc-sgP53 model?
As indicated in the Figure legends and in the corresponding text in the Results section (page 12), Fig 7A,B shows Myc;sgTp53-driven hepatocarcinogenesis in male mice, whereas Fig S7C,D shows results from the female cohort. In general, the HDTVI-induced HCC onset and progression differ considerably between individual experiments, and it is therefore crucial to compare data within an experimental cohort (as we have done for Ple<sup>ΔAlb</sup> and Ple<sup>fl/fl</sup> mice). Nevertheless, we cannot exclude the influence of sexual dimorphism on the results presented. The existence of sexual dimorphism in liver cancer is supported by a substantial body of evidence derived from various studies (e.g. (Bigsby and Caperell-Grant, 2011; Bray et al., 2024)). To date, no reports have specifically addressed sexual dimorphism in Myc;sgTp53 HDTVI-induced liver cancer. This is likely due to the fact that the vast majority of studies using this model have only presented data for one sex. However, a study using an HDTVI-administered combination of c-MET and mutated beta-catenin oncogenes to induce HCC in mice observed elevated levels of alpha-fetoprotein (AFP) in males when compared to females (Bernal et al., 2024). The study suggests that estrogen may have a protective effect in female mice, as ovariectomized females had AFP levels comparable to those observed in males. Our data suggest that female hormones may have a similar effect in the Myc;sgTp53 HDTVI-induced liver cancer model.
(4) Figure 2F, S2A, Ple<sup>ΔAlb</sup> mice more frequently formed larger tumors, as reflected by overall tumor size increase. The interpretation of the authors is "possibly implying reduced migration or increased cohesion of plectin-depleted cells". It is quite arbitrary to make this suggestion in the absence of substantial data or literature to support this theory.
We agree with the reviewer that our statement “Notably, Ple<sup>ΔAlb</sup> mice more frequently formed larger tumors, as reflected by overall tumor size increase (Fig. 2F; Figure 2—figure supplement 1A), possibly implying reduced migration or increased cohesion of plectin-depleted cells(25).” (page 7) is rather speculative. As we did not further address the formation of larger tumors in Ple<sup>ΔAlb</sup> mice in the current study, we wanted to provide the readers with some, even speculative, hypotheses. In support of our hypothesis, we cite our own publication (#26; Jirouskova et al., J Hepatol., 2018), where we show that plectin inactivation in Ple<sup>ΔAlb</sup> livers results in upregulation of the epithelial marker E-cadherin. Previous studies have shown that a similar increase in E-cadherin expression levels reflects mesenchymal-to-epithelial transition (e.g. (Adhikary et al., 2014; Auersperg et al., 1999; Wendt et al., 2011)) and is often associated with reduced cancer cell migration/invasion. This is consistent with our finding that “migrating plectin-disabled SNU-475 cells exhibited more cohesive, epithelial-like features while progressing collectively. By contrast, WT SNU-475 leader cells were more polarized and found to migrate into scratch areas more frequently than their plectin-deficient counterparts (Figure 5—figure supplement 1B). Consistent with this observation, individually seeded SNU-475 cells less frequently assumed a polarized, mesenchymal-like shape upon plectin inactivation in both 2D and 3D environments (Fig. 5C). Moreover, plectin-inactivated SNU-475 cells exhibited a decrease in N-cadherin and vimentin levels when compared to WT counterparts (Figure 5—figure supplement 1C).” (page 10).
In conclusion, we have shown that plectin-deficient hepatocytes express higher levels of E-cadherin and hepatocyte-derived SNU-475 cells express less N-cadherin and vimentin. In addition, we show that SNU-475 cells exhibited more cohesive, epithelial-like features in scratch-wound experiments. To address the reviewer's concern and to further support our statement about the increased cohesiveness of plectin-deficient HCC cells, we have included the citation of the recent study #27 (Xu et al., 2022). Using the MHCC97H and MHCC97L HCC cell lines, this study shows that plectin downregulation “inhibits HCC cell migration and epithelial mesenchymal transformation”, which is fully consistent with our hypothesis. To mitigate the impression of an unsubstantiated statement, we also discuss adhesion-independent plectin-mediated mechanisms in the revised Discussion section as follows: “However, it is conceivable that dysregulated cytoskeletal crosstalk could affect HCC through multiple mechanisms independent from FA-associated signaling. Indeed, we and others (Jirouskova et al., 2018; Xu et al., 2022) have shown that upon plectin inactivation, liver cells acquire epithelial characteristics that promote increased intercellular cohesion and reduced migration. Further studies will be required to identify and investigate synergistic adhesion-independent effects of plectin inactivation on HCC growth and metastasis.” (page 15).
(5) Mutation or KO PLEC has been shown to cause severe diseases in humans and mice, including skin blistering, muscular dystrophy, and progressive familial intrahepatic cholestasis. Please elaborate on the potential side effects of targeting Plectin to treat HCC.
Indeed, mutation or ablation of plectin has been implicated in many diseases (collectively known as plectinopathies). These multisystem disorders include an autosomal dominant form of epidermolysis bullosa simplex (EBS), limb-girdle muscular dystrophy, aplasia cutis congenita, and an autosomal recessive form of EBS that may be associated with muscular dystrophy, pyloric atresia, and/or congenital myasthenic syndrome. Several mutations have also been associated with cardiomyopathy and malignant arrhythmias. Progressive familial intrahepatic cholestasis has also been reported. In genetic mouse models, loss of plectin leads to skin fragility, extensive intestinal lesions, instability of the biliary epithelium, and progressive muscle wasting (for more details see (Vahidnezhad et al., 2022)).
It is therefore important to evaluate potential side effects, and plectin inactivation therefore presents challenges comparable to other anti-HCC targets. For instance, Sorafenib, the most widely used chemotherapy in recent decades, targets numerous serine/threonine and tyrosine kinases (RAF1, BRAF, VEGFR 1, 2, 3, PDGFR, KIT, FLT3, FGFR1, and RET) that are critical for proper non-pathological functions (Strumberg et al., 2007; Wilhelm et al., 2006; Wilhelm et al., 2004). The combinatorial therapy of atezolizumab and bevacizumab targets also PD-L1 in conjunction with VEGF, which plays an essential role in bone formation (Gerber et al., 1999), hematopoiesis (Ferrara et al., 1996), or wound healing (Chintalgattu et al., 2003). To allow readers to read a comprehensive account of the pathological consequences of plectin inactivation, we included two additional citations (Prechova et al., 2023; Vahidnezhad et al., 2022) and rephrased Introduction section as follows: “…multiple reports have linked plectin with tumor malignancy(12) and other pathologies (Prechova et al., 2023; Vahidnezhad et al., 2022), mechanistic insights…” (page 4-5).
Reviewer 3:
(1) The rationale for using Huh7 cells in the manuscript is not well explained as it has the lowest Plectin expression levels.
For this study, we selected two model HCC cell lines - Huh7 and SNU-475. Our intention was to address the role of plectin in “well-differentiated” (Huh7) and “poorly differentiated” (SNU-475) HCC cells, thus including early and advanced stages of HCC development (as categorized before (Boyault et al., 2007; Yuzugullu et al., 2009b); see also our description and reasoning on page 6). The Huh7 cell line is also a well-established and widely used model suitable for both in vitro and in vivo settings (e.g. (Du et al., 2024; Fu et al., 2018; Si et al., 2023; Zheng et al., 2018)).
As anticipated, less migratory “epithelial-like” Huh7 cells are characterized by relatively high E-cadherin, low vimentin, and low plectin expression levels (Fig 1D). In contrast, migratory “mesenchymal-like” SNU-475 cells are characterized by relatively low E-cadherin, high vimentin, and high plectin expression levels (Fig 1D). Therefore, the majority of analyses were performed in both relatively low plectin-expressing Huh7 and high plectin-expressing SNU-475 cells. It is noteworthy that inactivation of plectin had similar (although less pronounced) inhibitory effects on the phenotypes in both Huh7 and SNU-475 cells. We believe that these findings highlight the importance of plectin in HCC growth and metastasis, as plectin inactivation has inhibitory effects on both early (low plectin) and advanced (high plectin) stages of HCC.
(2) The KO cell experiments should be supplemented with overexpression experiments.
We agree with the reviewer that it would be helpful to complement our plectin inactivation experiments by overexpressing plectin in the HCC cell lines used in this study. In fact, we have received similar suggestions since we started to publish our studies on plectin. There are two reasons that preclude successful overexpression experiments. First, there are about 14 known isoforms of plectin (Prechova et al., 2023). Although previous studies have analyzed the phenotypic rescue potential of some plectin isoforms using transient transfection (e.g. (Burgstaller et al., 2010; Osmanagic-Myers et al., 2015; Prechova et al., 2022)), the isoform variability precludes rescue/overexpression experiments if the causative isoform is not known. Second, plectin is a giant cytoskeletal crosslinker protein of more than 4,500 amino acids with binding sites for intermediate filaments, F-actin, and microtubules. Overexpression of the approximately 500 kDa-large crosslinker invariably leads to the collapse of cytoskeletal networks in every cell type we have tested so far. See also our response to Reviewer 1, #7.
(3) There is significant concern that while ablation of Ple led to reduced tumor number, these mice had larger tumors. The data indicate that Plectin may have distinct roles in HCC initiation versus progression. The data are not well explained and do not fully support that Plectin promotes hepatocarcinogenesis.
In the DEN-induced HCC model MRI screening revealed fewer tumors and also tumor volume was reduced at 32 and 44 weeks post-induction (Fig 2A-C). Larger tumors formed in Ple<sup>ΔAlb</sup> compared to Ple<sup>fl/fl</sup> livers (Figs 2F and S2A) refer only to a subset of macroscopic tumors visually identified at necropsy. Larger Ple<sup>ΔAlb</sup> tumors were not observed in the Myc;sgTp53 HDTVI-induced HCC model (data not shown). In contrast, plectin deficiency reduced the size of xenografts formed in NSG mice (Fig 2H), and agar colonies grown from Huh7 and SNU-475 cells with inactivated plectin were also smaller (Fig S2F). In all in vivo and in vitro approaches presented in the manuscript, plectin inactivation reduced the number of colonies/xenografts/tumors. As hepatocarcinogenesis is a multistep process including initiation, promotion, and progression (Pitot, 2001), we feel confident in concluding that plectin inactivation inhibits hepatocarcinogenesis and we consider this conclusion to be fully supported by the data presented in the manuscript.
However, we agree with the reviewer that larger macroscopic Ple<sup>ΔAlb</sup> tumors in the DEN-induced HCC model are intriguing. As we do not see similar effects (or even trends) in other approaches used in this study, we cannot exclude the contribution of plectin-deficient environment in Ple<sup>ΔAlb</sup> livers during longterm (44 weeks) tumor formation and growth. In our previous study (Jirouskova et al., 2018), we showed that plectin deficiency in Ple<sup>ΔAlb</sup> livers leads to biliary tree malformations, collapse of bile ducts and ductules, and mild ductular reaction. We could speculate that Ple<sup>ΔAlb</sup> livers suffer from continuous bile leakage into the parenchyma, which would exacerbate all models of long-term pathology.
As we did not further address the formation of larger tumors in Ple<sup>ΔAlb</sup> mice in the current study, we offered the reader the hypothesis that large tumors could “…possibly implying reduced migration or increased cohesion of plectin-depleted cells25.” In support of our hypothesis, we cite our own publication (#26; Jirouskova et al., J Hepatol., 2018), where we show that plectin inactivation in Ple<sup>ΔAlb</sup> livers results in upregulation of the epithelial marker E-cadherin. Previous studies have shown that a similar increase in E-cadherin expression levels reflects mesenchymal-to-epithelial transition (e.g. (Adhikary et al., 2014; Auersperg et al., 1999; Wendt et al., 2011)) and is often associated with reduced cancer cell migration/invasion. This is consistent with our finding that “migrating plectin-disabled SNU475 cells exhibited more cohesive, epithelial-like features while progressing collectively. By contrast, WT SNU-475 leader cells were more polarized and found to migrate into scratch areas more frequently than their plectin-deficient counterparts (Figure 5—figure supplement 1B). Consistent with this observation, individually seeded SNU-475 cells less frequently assumed a polarized, mesenchymal-like shape upon plectin inactivation in both 2D and 3D environments (Fig. 5C). Moreover, plectin-inactivated SNU-475 cells exhibited a decrease in N-cadherin and vimentin levels when compared to WT counterparts (Figure 5—figure supplement 1C).” (page 10).
In conclusion, we have shown that plectin-deficient hepatocytes express higher levels of E-cadherin and that hepatocyte-derived SNU-475 cells express less N-cadherin and vimentin. In addition, we show that SNU-475 cells exhibited more cohesive, epithelial-like features in scratch-wound experiments. To address the reviewer's concern and to further support our claim of increased cohesiveness of plectin-deficient HCC cells, we included a citation of a recent study(27). Using the MHCC97H and MHCC97L HCC cell lines, this study shows that plectin downregulation “inhibits HCC cell migration and epithelial mesenchymal transformation” and is therefore fully consistent with our hypothesis. To mitigate the impression of an unsubstantiated statement, we also discuss adhesion-independent plectin-mediated mechanisms in the revised Discussion section as follows: “However, it is conceivable that dysregulated cytoskeletal crosstalk could affect HCC through multiple mechanisms independent from FA-associated signaling. Indeed, we and others (Jirouskova et al., 2018; Xu et al., 2022) have shown that upon plectin inactivation, liver cells acquire epithelial characteristics that promote increased intercellular cohesion and reduced migration. Further studies will be required to identify and investigate synergistic adhesion-independent effects of plectin inactivation on HCC growth and metastasis.” (page 15).
(4) Figure 3 showed that Plectin does not regulate p-FAK/FAK expression. Therefore, the statement that Plectin regulates the FAK pathway is not valid. Furthermore, there are too many variables in terms of p-AKT and p-ERK expression, making the conclusion not well supported.
We agree with the reviewer that pFAK/FAK levels are either comparable or slightly higher upon plectin inactivation. However, we believe that our data convincingly show that FAK expression is downregulated in both Huh7 and SNU-475 cells. In our opinion, this results in an overall attenuation of FAK signaling (see percentage for Normalized pFAK × Normalized FAK), which is, as expected, more pronounced in migratory SNU-475 cells. The following data (shown in Figs 3D and S3C) are expressed as a percentage of untreated WT, with downregulated values highlighted in red:
Author response table 4.
Given these results, we believe that our statement that “inhibition of plectin attenuates FAK signaling” (pages 8-9) is well supported.
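For illustration only, the composite value referred to above can be sketched as a simple product. This sketch assumes that "Normalized pFAK" is the pFAK/FAK ratio and "Normalized FAK" is the total FAK level, each expressed as a percentage of untreated WT; the function name and the input numbers are hypothetical and are not taken from Author response table 4.

```python
def composite_fak_signal(pfak_ratio_pct: float, fak_level_pct: float) -> float:
    """Return normalized pFAK x normalized FAK as a percentage of untreated WT.

    Both inputs are percentages of the untreated WT value (WT = 100).
    The interpretation of the two terms is an assumption of this sketch.
    """
    return (pfak_ratio_pct / 100.0) * (fak_level_pct / 100.0) * 100.0


# Hypothetical example values, not measured data:
# a slightly higher pFAK/FAK ratio combined with reduced total FAK.
example = composite_fak_signal(pfak_ratio_pct=110.0, fak_level_pct=60.0)
print(f"composite signal: {example:.1f}% of WT")
```

With these made-up inputs the composite falls below 100% even though the pFAK/FAK ratio is above WT, which is the kind of pattern the argument above relies on.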
We believe that our data show that both pAkt and pErk are attenuated upon plectin inactivation in both Huh7 and SNU-475 cells. The following data (presented in Figs 3D and S3C) are shown as a percentage of untreated WT, with downregulated values highlighted in red:
Author response table 5.
We agree with the reviewer that plectin inactivation yields varying degrees of attenuation of the FAK, MAPK/Erk, and PI3K/Akt pathways depending on the cell type (Huh7 vs SNU-475 cells) and mode of plectin inactivation (CRISPR/Cas9-generated plectin KO vs functional KO (∆IFBD) vs organoruthenium-based inhibitor plecstatin-1). This context-dependent heterogeneity in the expression/activation of pathway molecular denominators reflects different degrees of cytoskeletal (e.g. #ventral stress fibers, Fig 4A,D and vimentin architecture, Fig S4A-C) and focal adhesion (e.g. %central FA, Fig 4A,E) phenotypes under different conditions. See also the detailed response to all Reviewers (on the first three pages of this letter) and the responses to Reviewer 1, #1 and #2 and Reviewer 2, #4.
(5) The studies of plecstatin-1 in HCC should be expanded to a panel of human HCC cells with various Plectin expression levels in terms of cell growth and cell migration. The IC50 values should be determined and correlated with Plectin expression.
Following the reviewer's suggestion, we have included graphs showing IC50 values for Huh7 (low plectin) and SNU-475 (high plectin) cells as Fig S2E. As expected, the IC50 values are higher for SNU-475 cells. Corresponding parts of the Figure legends have been changed. We refer to new data in the Results section as follows: “If not stated otherwise, we applied PST in the final concentration of 8 µM, which corresponds to the 25% of IC50 for Huh7 cells (Figure 2—figure supplement 1E).” (page 7). We also provide details of the IC50 determination in the revised Supplement Materials and methods section (pages 5-6).
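As a quick arithmetic check of the quoted dosing statement, the sketch below back-calculates the Huh7 IC50 implied by "8 µM corresponds to 25% of IC50". It takes the two quoted numbers at face value; the function name is hypothetical, and the measured IC50 values are those reported in Fig S2E, not the output of this snippet.

```python
def implied_ic50_um(working_conc_um: float, fraction_of_ic50: float) -> float:
    """Back-calculate the IC50 (in µM) implied by a working concentration
    that is stated to be a given fraction of the IC50."""
    if not 0.0 < fraction_of_ic50 <= 1.0:
        raise ValueError("fraction_of_ic50 must be in (0, 1]")
    return working_conc_um / fraction_of_ic50


# 8 µM PST stated to be 25% of the Huh7 IC50 (both numbers from the quoted text).
huh7_ic50 = implied_ic50_um(working_conc_um=8.0, fraction_of_ic50=0.25)
print(f"implied Huh7 IC50: {huh7_ic50:.0f} µM")
```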
(6) One of the major issues is the mechanistic studies focusing on Plectin regulating HCC migration/metastasis, whereas the in vivo mouse studies focus on HCC formation (Figures 3 and 7). These are distinct processes and should not be mixed.
In our study, we investigated the role of plectin in the development and dissemination of HCC. Using DEN- and Myc;sgTp53 HDTVI-induced HCC models (Figs 2A-F, S2A, 7A-C, and S7A-D), we show the effects of plectin inactivation on HCC formation in vivo. These studies are complemented by xenografts (Figs 2H and S2G) and in vitro colony formation assay (Figs 2G and S2F). Using an in vivo lung colonization assay (Figs 6G-I and S6C-F), we show the effects of plectin inactivation on the metastatic potential of HCC cells. In complementary in vitro studies, we show how plectin deficiency affects migration (Figs 5 and S5) and invasion (Figs 6A-E and S6A,B).
Our mechanistic studies show that plectin inactivation leads to dysregulation of cytoskeletal networks, adhesions, and adhesion-associated signaling. We believe that we have provided substantial experimental data suggesting that the proposed mechanisms play a role in plectin-mediated inhibition of both HCC development and dissemination. Of course, we cannot rule out additional, adhesionindependent mechanisms for HCC formation. To clarify this, we have revised the Discussion section as follows: “However, it is conceivable that dysregulated cytoskeletal crosstalk could affect HCC through multiple mechanisms independent from FA-associated signaling. Indeed, we and others (Jirouskova et al., 2018; Xu et al., 2022) have shown that upon plectin inactivation, liver cells acquire epithelial characteristics that promote increased intercellular cohesion and reduced migration. Further studies will be required to identify and investigate synergistic adhesion-independent effects of plectin inactivation on HCC growth and metastasis.” (page 15).
(7) Figure 7B showed that Ple KO mice were treated with PST, but the data are not presented in the manuscript. Tumor cell proliferation and apoptosis rates should be analyzed as well.
We do not show any effects of PST in Ple<sup>ΔAlb</sup> mice. As stated in the Fig 7B legend: “Myc;sgTp53 HCC was induced in Ple<sup>fl/fl</sup>, Ple<sup>ΔAlb</sup>, and PST-treated Ple<sup>fl/fl</sup> (Ple<sup>fl/fl</sup>+PST) male mice as in (A). Shown are representative images of Ple<sup>fl/fl</sup>, Ple<sup>ΔAlb</sup>, and Ple<sup>fl/fl</sup>+PST livers from mice with fully developed multifocal HCC sacrificed 6 weeks post-induction.”.
Following the reviewer's recommendation, we include the analysis of proliferation and apoptosis rates as revised Fig S7A,B. Please note that no differences in apoptosis and proliferation rates were found between experimental conditions. Due to additional data, the original Fig S7 – 1 has been split into revised Fig S7 – 1 and Fig S7 – 2.
(8) The status of FAK, AKT, and ERK pathway activation was not analyzed in mouse liver samples. In Figure 7D, most of the adjusted p-values are not significant.
We are aware that the majority of FDR-corrected p-values shown in Fig 7D are not significant. In fact, we deliberated with our colleagues from the laboratory of Prof. Samuel Meier-Menches (Department of Analytical Chemistry, University of Vienna), who conducted all the proteomic studies presented in this manuscript, on whether to present such "weak" data. Following a lengthy discussion, we decided to include them despite anticipating criticism from the reviewers. The rationale for including these data is that, despite the lack of statistical significance, the findings are consistent with those of MS/immunoblot analyses of HCC cells (Figs 3 and S3) and patient data (Figs 7E, S7-2). The lack of statistical significance is a consequence of the limited number of animals included in the Ple<sup>fl/fl</sup>, Ple<sup>ΔAlb</sup>, and PST-treated Ple<sup>fl/fl</sup> cohorts, which has resulted in a high degree of variability in the MS results. We agree with the reviewer that the inclusion of immunoblot analysis would provide further support for our conclusions. However, we do not have any remaining liver tissue that could be analyzed.
(9) There is no evidence to support that PST is capable of overcoming therapy resistance in HCC. For example, no comparison with the current standard care was provided in the preclinical studies.
We are grateful to the reviewer for bringing our attention to the incorrect statement in the Abstract: “…we show that plectin inhibitor plecstatin-1 (PST) is well-tolerated and capable of overcoming therapy resistance in HCC”. To address the reviewer's concern, we rephrased the Abstract as follows: “…we show that plectin inhibitor plecstatin-1 (PST) is well-tolerated and potently inhibits HCC progression”.
Recommendations for the authors:
Reviewer 2 (Recommendations for the authors):
(1) In Figures 6I and S6C, it would be better to show the whole slide scan result for all the groups.
Following the reviewer's recommendation, we include the whole slide scan result for all the groups as revised Fig S6F.
(2) In Figures S7C and D, what do the highlighted/colored dots represent? They are not mentioned in the figure legend or the results.
Following the reviewer's recommendation, we include the explanation in the revised Figure legends (page 30).
(3) In Figure 2H, the experiment schedule showed "6w Huh7 t.v.i.", but should it be subcutaneous injection?
We are grateful to the reviewer for bringing our attention to the incorrect description of the experiment. The schematic has been corrected. We have also noticed an error in the table summarizing the number of tumors formed (N) and have corrected the values for the WT+PST and KO conditions.
(4) Supplemental Materials and Methods, Xenograft tumorigenesis, Error: 2.5×10<sup>6</sup> Huh7 cells in 250 ml PBS mice were administered subcutaneously in the left and right hind flanks. It probably should be "250 µl".
We are grateful to the reviewer for bringing our attention to the incorrect description of the experiment. The corresponding part of the Materials and Methods section has been corrected (page 2).
(5) In Figure legend Supplementary Figure 6 C,D,E : "Representative magnified images from lung lobes with GFP-positive WT, KO, and WT+PST SNU-475 nodules". There is no picture for the WT+PST SNU-475 group.
We are grateful to the reviewer for bringing our attention to the incorrect description of the experiment. The corresponding part of the Figure legend (“WT+PST SNU-475”) has been deleted (page 27).
(6) In the Figure legend for Figure 6H, "Representative BLI images of WT, KO, and PST-treated WT (WT+PST) SNU-475 cells-bearing mice are shown". Should it be Huh7, not SNU-475?
We are grateful to the reviewer for bringing our attention to the incorrect description of the experiment. The description of the cell line has been corrected (page 34).
(7) The statement that current therapies rely on multikinase inhibitors is no longer correct.
We are grateful to the reviewer for bringing our attention to the incorrect statement. To address the reviewer's concern, we rephrased the original part of Discussion section: “Current therapies for HCC rely on multikinase inhibitors (such as sorafenib) that provide only moderate survival benefit(60,61) due to primary resistance and the plasticity of signaling networks(62)” as follows: “Current systemic therapies for advanced HCC rely on a combination of multikinase inhibitor (such as sorafenib) or anti-VEGF /VEGF inhibitor (such as bevacizumab) treatment with immunotherapy(59). Multikinase inhibitors provide only moderate survival benefit(60,61) due to primary resistance and the plasticity of signaling networks(62), and only a subset of patients benefits from addition of immunotherapy in HCC treatment(63)” (page 15).
References
Adhikary, A., S. Chakraborty, M. Mazumdar, S. Ghosh, S. Mukherjee, A. Manna, S. Mohanty, K.K. Nakka, S. Joshi, A. De, S. Chattopadhyay, G. Sa, and T. Das. 2014. Inhibition of epithelial to mesenchymal transition by E-cadherin up-regulation via repression of slug transcription and inhibition of E-cadherin degradation: dual role of scaffold/matrix attachment region-binding protein 1 (SMAR1) in breast cancer cells. The Journal of biological chemistry. 289:25431-25444.
Auersperg, N., J. Pan, B.D. Grove, T. Peterson, J. Fisher, S. Maines-Bandiera, A. Somasiri, and C.D. Roskelley. 1999. E-cadherin induces mesenchymal-to-epithelial transition in human ovarian surface epithelium. Proc Natl Acad Sci U S A. 96:6249-6254.
Bernal, A., M. McLaughlin, A. Tiwari, F. Cigarroa, and L. Sun. 2024. Abstract 772: Investigation of gender disparity in liver tumor formation using a hydrodynamic tail vein injection mouse model. Cancer Research. 84:772-772.
Bigsby, R.M., and A. Caperell-Grant. 2011. The role for estrogen receptor-alpha and prolactin receptor in sex-dependent DEN-induced liver tumorigenesis. Carcinogenesis. 32:1162-1166.
Bonakdar, N., A. Schilling, M. Sporrer, P. Lennert, A. Mainka, L. Winter, G. Walko, G. Wiche, B. Fabry, and W.H. Goldmann. 2015. Determining the mechanical properties of plectin in mouse myoblasts and keratinocytes. Exp Cell Res. 331:331-337.
Boyault, S., D.S. Rickman, A. de Reynies, C. Balabaud, S. Rebouissou, E. Jeannot, A. Herault, J. Saric, J. Belghiti, D. Franco, P. Bioulac-Sage, P. Laurent-Puig, and J. Zucman-Rossi. 2007. Transcriptome classification of HCC is related to gene alterations and to new therapeutic targets. Hepatology. 45:42-52.
Bray, F., M. Laversanne, H. Sung, J. Ferlay, R.L. Siegel, I. Soerjomataram, and A. Jemal. 2024. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 74:229-263.
Buckup, M., M.A. Rice, E.C. Hsu, F. Garcia-Marques, S. Liu, M. Aslan, A. Bermudez, J. Huang, S.J. Pitteri, and T. Stoyanova. 2021. Plectin is a regulator of prostate cancer growth and metastasis. Oncogene. 40:663-676.
Burgstaller, G., M. Gregor, L. Winter, and G. Wiche. 2010. Keeping the vimentin network under control: cell-matrix adhesion-associated plectin 1f affects cell shape and polarity of fibroblasts. Mol Biol Cell. 21:3362-3375.
Chintalgattu, V., D.M. Nair, and L.C. Katwa. 2003. Cardiac myofibroblasts: a novel source of vascular endothelial growth factor (VEGF) and its receptors Flt-1 and KDR. J Mol Cell Cardiol. 35:277-286.
Cuconati, A., C. Mills, C. Goddard, X. Zhang, W. Yu, H. Guo, X. Xu, and T.M. Block. 2013. Suppression of AKT anti-apoptotic signaling by a novel drug candidate results in growth arrest and apoptosis of hepatocellular carcinoma cells. PLoS One. 8:e54595.
Du, Y.Q., B. Yuan, Y.X. Ye, F.L. Zhou, H. Liu, J.J. Huang, and Y.F. Wei. 2024. Plumbagin Regulates Snail to Inhibit Hepatocellular Carcinoma Epithelial-Mesenchymal Transition in vivo and in vitro. J Hepatocell Carcinoma. 11:565-580.
Fan, Z.C., J. Yan, G.D. Liu, X.Y. Tan, X.F. Weng, W.Z. Wu, J. Zhou, and X.B. Wei. 2012. Real-time monitoring of rare circulating hepatocellular carcinoma cells in an orthotopic model by in vivo flow cytometry assesses resection on metastasis. Cancer Res. 72:2683-2691.
Ferrara, N., K. Carver-Moore, H. Chen, M. Dowd, L. Lu, K.S. O'Shea, L. Powell-Braxton, K.J. Hillan, and M.W. Moore. 1996. Heterozygous embryonic lethality induced by targeted inactivation of the VEGF gene. Nature. 380:439-442.
Fu, Q., Q. Zhang, Y. Lou, J. Yang, G. Nie, Q. Chen, Y. Chen, J. Zhang, J. Wang, T. Wei, H. Qin, X. Dang, X. Bai, and T. Liang. 2018. Primary tumor-derived exosomes facilitate metastasis by regulating adhesion of circulating tumor cells via SMAD3 in liver cancer. Oncogene. 37:6105-6118.
Gerber, H.P., T.H. Vu, A.M. Ryan, J. Kowalski, Z. Werb, and N. Ferrara. 1999. VEGF couples hypertrophic cartilage remodeling, ossification and angiogenesis during endochondral bone formation. Nat Med. 5:623-628.
Gnani, D., I. Romito, S. Artuso, M. Chierici, C. De Stefanis, N. Panera, A. Crudele, S. Ceccarelli, E. Carcarino, V. D'Oria, M. Porru, E. Giorda, K. Ferrari, L. Miele, E. Villa, C. Balsano, D. Pasini, C. Furlanello, F. Locatelli, V. Nobili, R. Rota, C. Leonetti, and A. Alisi. 2017. Focal adhesion kinase depletion reduces human hepatocellular carcinoma growth by repressing enhancer of zeste homolog 2. Cell Death Differ. 24:889-902.
Gregor, M., S. Osmanagic-Myers, G. Burgstaller, M. Wolfram, I. Fischer, G. Walko, G.P. Resch, A. Jorgl, H. Herrmann, and G. Wiche. 2014. Mechanosensing through focal adhesion-anchored intermediate filaments. FASEB J. 28:715-729.
Hiratsuka, S., S. Goel, W.S. Kamoun, Y. Maru, D. Fukumura, D.G. Duda, and R.K. Jain. 2011. Endothelial focal adhesion kinase mediates cancer cell homing to discrete regions of the lungs via E-selectin up-regulation. Proc Natl Acad Sci U S A. 108:3725-3730.
Jakab, M., K.H. Lee, A. Uvarovskii, S. Ovchinnikova, S.R. Kulkarni, S. Jakab, T. Rostalski, C. Spegg, S. Anders, and H.G. Augustin. 2024. Lung endothelium exploits susceptible tumor cell states to instruct metastatic latency. Nat Cancer. 5:716-730.
Jin, H., C. Wang, G. Jin, H. Ruan, D. Gu, L. Wei, H. Wang, N. Wang, E. Arunachalam, Y. Zhang, X. Deng, C. Yang, Y. Xiong, H. Feng, M. Yao, J. Fang, J. Gu, W. Cong, and W. Qin. 2017. Regulator of Calcineurin 1 Gene Isoform 4, Down-regulated in Hepatocellular Carcinoma, Prevents Proliferation, Migration, and Invasive Activity of Cancer Cells and Metastasis of Orthotopic Tumors by Inhibiting Nuclear Translocation of NFAT1. Gastroenterology. 153:799-811 e733.
Jirouskova, M., K. Nepomucka, G. Oyman-Eyrilmez, A. Kalendova, H. Havelkova, L. Sarnova, K. Chalupsky, B. Schuster, O. Benada, P. Miksatkova, M. Kuchar, O. Fabian, R. Sedlacek, G. Wiche, and M. Gregor. 2018. Plectin controls biliary tree architecture and stability in cholestasis. J Hepatol. 68:1006-1017.
Katada, K., T. Tomonaga, M. Satoh, K. Matsushita, Y. Tonoike, Y. Kodera, T. Hanazawa, F. Nomura, and Y. Okamoto. 2012. Plectin promotes migration and invasion of cancer cells and is a novel prognostic marker for head and neck squamous cell carcinoma. J Proteomics. 75:1803-1815.
Koster, J., S. van Wilpe, I. Kuikman, S.H. Litjens, and A. Sonnenberg. 2004. Role of binding of plectin to the integrin beta4 subunit in the assembly of hemidesmosomes. Mol Biol Cell. 15:1211-1223.
Liu, H., Q. Chen, D. Lu, X. Pang, S. Yin, K. Wang, R. Wang, S. Yang, Y. Zhang, Y. Qiu, T. Wang, and H. Yu. 2020. HTBPI, an active phenanthroindolizidine alkaloid, inhibits liver tumorigenesis by targeting Akt. FASEB J. 34:12255-12268.
Lu, H.H., S.Y. Lin, R.R. Weng, Y.H. Juan, Y.W. Chen, H.H. Hou, Z.C. Hung, G.A. Oswita, Y.J. Huang, S.Y. Guu, K.H. Khoo, J.Y. Shih, C.J. Yu, and H.C. Tsai. 2020. Fucosyltransferase 4 shapes oncogenic glycoproteome to drive metastasis of lung adenocarcinoma. EBioMedicine. 57:102846.
Mathews, S.T., E.P. Plaisance, and T. Kim. 2009. Imaging systems for westerns: chemiluminescence vs. infrared detection. Methods in molecular biology (Clifton, N.J.). 536:499-513.
Osmanagic-Myers, S., M. Gregor, G. Walko, G. Burgstaller, S. Reipert, and G. Wiche. 2006. Plectin-controlled keratin cytoarchitecture affects MAP kinases involved in cellular stress response and migration. J Cell Biol. 174:557-568.
Osmanagic-Myers, S., S. Rus, M. Wolfram, D. Brunner, W.H. Goldmann, N. Bonakdar, I. Fischer, S. Reipert, A. Zuzuarregui, G. Walko, and G. Wiche. 2015. Plectin reinforces vascular integrity by mediating crosstalk between the vimentin and the actin networks. J Cell Sci. 128:4138-4150.
Pillai-Kastoori, L., A.R. Schutz-Geschwender, and J.A. Harford. 2020. A systematic approach to quantitative Western blot analysis. Analytical biochemistry. 593:113608.
Pitot, H.C. 2001. Pathways of progression in hepatocarcinogenesis. Lancet (London, England). 358:859-860.
Prechova, M., Z. Adamova, A.L. Schweizer, M. Maninova, A. Bauer, D. Kah, S.M. Meier-Menches, G. Wiche, B. Fabry, and M. Gregor. 2022. Plectin-mediated cytoskeletal crosstalk controls cell tension and cohesion in epithelial sheets. J Cell Biol. 221.
Prechova, M., K. Korelova, and M. Gregor. 2023. Plectin. Curr Biol. 33:R128-R130.
Qi, L., T. Knifley, M. Chen, and K.L. O'Connor. 2022. Integrin alpha6beta4 requires plectin and vimentin for adhesion complex distribution and invasive growth. J Cell Sci. 135.
Romito, I., M. Porru, M.R. Braghini, L. Pompili, N. Panera, A. Crudele, D. Gnani, C. De Stefanis, M. Scarsella, S. Pomella, S. Levi Mortera, E. de Billy, A.L. Conti, V. Marzano, L. Putignani, M. Vinciguerra, C. Balsano, A. Pastore, R. Rota, M. Tartaglia, C. Leonetti, and A. Alisi. 2021. Focal adhesion kinase inhibitor TAE226 combined with Sorafenib slows down hepatocellular carcinoma by multiple epigenetic effects. J Exp Clin Cancer Res. 40:364.
Si, T., L. Huang, T. Liang, P. Huang, H. Zhang, M. Zhang, and X. Zhou. 2023. Ruangan Lidan decoction inhibits the growth and metastasis of liver cancer by downregulating miR-9-5p and upregulating PDK4. Cancer Biol Ther. 24:2246198.
Strumberg, D., J.W. Clark, A. Awada, M.J. Moore, H. Richly, A. Hendlisz, H.W. Hirte, J.P. Eder, H.J. Lenz, and B. Schwartz. 2007. Safety, pharmacokinetics, and preliminary antitumor activity of sorafenib: a review of four phase I trials in patients with advanced refractory solid tumors. Oncologist. 12:426-437.
Tao, Q.F., S.X. Yuan, F. Yang, S. Yang, Y. Yang, J.H. Yuan, Z.G. Wang, Q.G. Xu, K.Y. Lin, J. Cai, J. Yu, W.L. Huang, X.L. Teng, C.C. Zhou, F. Wang, S.H. Sun, and W.P. Zhou. 2015. Aldolase B inhibits metastasis through Ten-Eleven Translocation 1 and serves as a prognostic biomarker in hepatocellular carcinoma. Mol Cancer. 14:170.
Vahidnezhad, H., L. Youssefian, N. Harvey, A.R. Tavasoli, A.H. Saeidian, S. Sotoudeh, A. Varghaei, H. Mahmoudi, P. Mansouri, N. Mozafari, O. Zargari, S. Zeinali, and J. Uitto. 2022. Mutation update: The spectra of PLEC sequence variants and related plectinopathies. Human mutation. 43:1706-1731.
Voisin, L., M. Lapouge, M.K. Saba-El-Leil, M. Gombos, J. Javary, V.Q. Trinh, and S. Meloche. 2024. Syngeneic mouse model of YES-driven metastatic and proliferative hepatocellular carcinoma. Dis Model Mech. 17.
Wang, D.D., Y. Chen, Z.B. Chen, F.J. Yan, X.Y. Dai, M.D. Ying, J. Cao, J. Ma, P.H. Luo, Y.X. Han, Y. Peng, Y.H. Sun, H. Zhang, Q.J. He, B. Yang, and H. Zhu. 2016. CT-707, a Novel FAK Inhibitor, Synergizes with Cabozantinib to Suppress Hepatocellular Carcinoma by Blocking Cabozantinib-Induced FAK Activation. Mol Cancer Ther. 15:2916-2925.
Wang, W., A. Zuidema, L. Te Molder, L. Nahidiazar, L. Hoekman, T. Schmidt, S. Coppola, and A. Sonnenberg. 2020. Hemidesmosomes modulate force generation via focal adhesions. J Cell Biol. 219.
Wendt, M.K., M.A. Taylor, B.J. Schiemann, and W.P. Schiemann. 2011. Down-regulation of epithelial cadherin is required to initiate metastatic outgrowth of breast cancer. Mol Biol Cell. 22:2423-2435.
Wenta, T., A. Schmidt, Q. Zhang, R. Devarajan, P. Singh, X. Yang, A. Ahtikoski, M. Vaarala, G.H. Wei, and A. Manninen. 2022. Disassembly of alpha6beta4-mediated hemidesmosomal adhesions promotes tumorigenesis in PTEN-negative prostate cancer by targeting plectin to focal adhesions. Oncogene. 41:3804-3820.
Wilhelm, S., C. Carter, M. Lynch, T. Lowinger, J. Dumas, R.A. Smith, B. Schwartz, R. Simantov, and S. Kelley. 2006. Discovery and development of sorafenib: a multikinase inhibitor for treating cancer. Nat Rev Drug Discov. 5:835-844.
Wilhelm, S.M., C. Carter, L. Tang, D. Wilkie, A. McNabola, H. Rong, C. Chen, X. Zhang, P. Vincent, M. McHugh, Y. Cao, J. Shujath, S. Gawlak, D. Eveleigh, B. Rowley, L. Liu, L. Adnane, M. Lynch, D. Auclair, I. Taylor, R. Gedrich, A. Voznesensky, B. Riedl, L.E. Post, G. Bollag, and P.A. Trail. 2004. BAY 43-9006 exhibits broad spectrum oral antitumor activity and targets the RAF/MEK/ERK pathway and receptor tyrosine kinases involved in tumor progression and angiogenesis. Cancer Res. 64:7099-7109.
Xu, R., S. He, D. Ma, R. Liang, Q. Luo, and G. Song. 2022. Plectin Downregulation Inhibits Migration and Suppresses Epithelial Mesenchymal Transformation of Hepatocellular Carcinoma Cells via ERK1/2 Signaling. Int J Mol Sci. 24.
You, A., M. Cao, Z. Guo, B. Zuo, J. Gao, H. Zhou, H. Li, Y. Cui, F. Fang, W. Zhang, T. Song, Q. Li, X. Zhu, H. Yin, H. Sun, and T. Zhang. 2016. Metformin sensitizes sorafenib to inhibit postoperative recurrence and metastasis of hepatocellular carcinoma in orthotopic mouse models. J Hematol Oncol. 9:20.
Yuzugullu, H., K. Benhaj, N. Ozturk, S. Senturk, E. Celik, A. Toylu, N. Tasdemir, M. Yilmaz, E. Erdal, K.C. Akcali, N. Atabey, and M. Ozturk. 2009a. Canonical Wnt signaling is antagonized by noncanonical Wnt5a in hepatocellular carcinoma cells. Molecular Cancer. 8:90.
Yuzugullu, H., K. Benhaj, N. Ozturk, S. Senturk, E. Celik, A. Toylu, N. Tasdemir, M. Yilmaz, E. Erdal, K.C. Akcali, N. Atabey, and M. Ozturk. 2009b. Canonical Wnt signaling is antagonized by noncanonical Wnt5a in hepatocellular carcinoma cells. Mol Cancer. 8:90.
Zhao, J., Y. Hou, C. Yin, J. Hu, T. Gao, X. Huang, X. Zhang, J. Xing, J. An, S. Wan, and J. Li. 2020. Upregulation of histamine receptor H1 promotes tumor progression and contributes to poor prognosis in hepatocellular carcinoma. Oncogene. 39:1724-1738.
Zheng, H., Y. Yang, C. Ye, P.P. Li, Z.G. Wang, H. Xing, H. Ren, and W.P. Zhou. 2018. Lamp2 inhibits epithelial-mesenchymal transition by suppressing Snail expression in HCC. Oncotarget. 9:30240-30252.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Response to the Reviewer #1 (Public review):
We greatly appreciate the reviewer’s high evaluation of our paper and helpful comments. As expected, we revealed that the CCL17/CCL22–CCR4 axes play an important role in guiding Tregs to the atherosclerotic aorta. Interestingly, we also demonstrated that these axes are critical for Treg-dependent regulation of proinflammatory T cell responses in lymphoid tissues and atherosclerotic aortas, which is a previously unrecognized role for CCR4 in regulating inflammatory immune responses. However, the role of the CCL17/CCL22–CCR4 axes in regulating inflammatory immune responses and atherosclerosis has not been fully elucidated and further investigation is needed.
Response to the Reviewer #2 (Public review):
We greatly appreciate the reviewer’s high evaluation of our paper and helpful comments and suggestions. We isolated CD4<sup>+</sup>CD25<sup>+</sup> T cells and used them as Tregs in several experiments. As the reviewer pointed out, we realize that the CD4<sup>+</sup>CD25<sup>+</sup> T cell population contains some activated effector T cells. However, given the high expression levels of the most reliable Treg marker, Foxp3, in isolated CD4<sup>+</sup>CD25<sup>+</sup> T cells as determined by flow cytometry, we believe that our method for separating Tregs is acceptable.
Regarding the role of Th17 cells in atherosclerosis, conflicting results have been reported. Therefore, it is unclear whether augmented Th17 cell immune responses contribute to accelerated atherosclerosis in Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice.
As the reviewer pointed out, it is important to consider the clinical relevance of our findings. We analyzed a public database to determine whether Ccr4 single nucleotide polymorphisms correlate with a higher incidence of atherosclerotic cardiovascular disease. However, no evidence supporting the clinical relevance of our findings was found.
Response to the Reviewer #3 (Public review):
We greatly appreciate the reviewer’s high evaluation of our paper and helpful comments and suggestions. In accordance with the reviewer’s suggestion, we described the detailed methods and carefully performed data analysis regarding flow cytometry, which would strengthen the conclusion of this study.
We understand the importance of the reviewer’s point that CCR4 deficiency does not shift the Th1 cell/Treg balance toward Th1 cell responses in all lymphoid tissues. CCR4 deficiency promoted the accumulation of Th1 cells but did not affect the accumulation of Tregs in the atherosclerotic aorta, which led to the shift of the Th1 cell/Treg balance toward Th1 cell responses. The frequencies of both Tregs and Th1 cells in peripheral lymphoid tissues were increased by CCR4 deficiency, while these CCR4-deficient Tregs exhibited impaired suppressive function. Given this, we speculate that CCR4 deficiency may shift the Th1 cell/Treg balance toward Th1 cell responses in peripheral lymphoid tissues. However, it is difficult to show this clearly. We revised the manuscript accordingly.
Although the reviewer pointed out the possibility that modulation of the Th1 cell/Th17 cell balance might be responsible for the changes in aortic inflammatory cells in Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice, the role of Th17 cells in atherosclerosis remains controversial. However, we cannot completely exclude the possibility that modulation of the Th17 response is involved in accelerated atherosclerosis in Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice.
As a limitation of this study, the phenotypic heterogeneity and dynamics of aortic leukocytes could not be revealed by flow cytometric analysis. Single-cell proteomic and transcriptomic approaches would provide additional important information on various aortic cells including immune cells and vascular cells.
Reviewer #1 (Recommendations for the authors):
Issue (1) Ideally, CCR4 could be deleted on Foxp3+ cells and some staining on double positive Rorg+Foxp3+ done. On the other side, a whole gene expression of infiltrated Foxp3 and effector could be also helpful. More challenging, it would be important to see whether those CCR4-specific Tregs could or could not regulate effector infiltrating cells.
As the reviewer suggested, single-cell proteomic and transcriptomic approaches would be helpful to reveal the phenotypic heterogeneity and dynamics of aortic leukocytes including Tregs. Also, the use of conditional knockout mice would reveal the precise role of CCR4-expressing Tregs in regulating aortic immune cell infiltration and atherosclerosis.
Reviewer #2 (Recommendations for the authors):
Minor Suggestions:
Issue (1) In supplementary Figure 1, CCR4 expression would be better represented by dot plots rather than histograms.
We revised Supplementary Figure 1A through 1C.
Issue (2) The reduction in CD103 expression shown in Figure 2E at 8 weeks should be discussed.
In Figure 2E, we found that the expression of CD103 in peripheral LN Tregs was slightly lower in 8-week-old Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice than in age-matched Apoe<sup>-/-</sup> mice, while there was no difference in its expression levels between 18-week-old Apoe<sup>-/-</sup> and Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice. In addition, there was no significant difference in the mRNA expression of this molecule in splenic Tregs between 8-week-old Apoe<sup>-/-</sup> and Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice. Based on the minor effect of CCR4 deficiency on CD103 expression in Tregs, reduced CD103 expression in Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice does not seem to be an important change.
Issue (3) The increased expression of CD86 by DCs should be discussed.
The upregulated CD86 expression on DCs in Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice might be explained by the data on a Treg-DC coculture experiment showing the impaired cell–cell contacts between CCR4-deficient Tregs and DCs. On the other hand, the expression of another important costimulatory molecule CD80 on DCs was not altered in these mice, which is not consistent with the data on the above coculture experiment. The reason why only CD86 expression on DCs was upregulated in Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice remains unclear.
Issue (4) In Figures 5F-H, using larger dots would enhance visibility.
We revised the graphs in Figure 5F-H.
Issue (5) In Figure 5I, since the data is normalized, a one-sample t-test is more appropriate.
In accordance with the reviewer’s suggestion, we reconsidered the data analysis. Because there was a dramatic difference in the absolute number of Kaede-expressing Tregs accumulated in the aorta among experiments, we were worried that the statistical analysis of the combined data from multiple experiments might lead to a wrong conclusion. We have decided to show the representative data from 3 independent experiments in Figure 5I.
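To illustrate the reviewer's suggestion, the following is a minimal sketch of a one-sample t-test applied to normalized values, where each experiment contributes one ratio and the null hypothesis is that the mean ratio equals 1. The function name `one_sample_t` and the example ratios are hypothetical and are not data from this study; the sketch only computes the t statistic and degrees of freedom.

```python
import math
import statistics


def one_sample_t(values, mu0=1.0):
    """Return (t statistic, degrees of freedom) for H0: mean(values) == mu0."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    if sd == 0:
        raise ValueError("zero variance: t statistic is undefined")
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical normalized values (test group / control group), one per experiment.
ratios = [0.5, 0.7, 0.6]
t_stat, df = one_sample_t(ratios, mu0=1.0)
print(f"t = {t_stat:.3f}, df = {df}")
```

The statistic would then be compared with the critical value of the t distribution for the given degrees of freedom. With only three experiments the test has 2 degrees of freedom, so it has little power when the values vary widely between experiments.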
Issue (6) On page 11, line 256, the text mentions IL4 and IL10 being detected by cytokine array; however, the figures do not show these cytokines.
We are afraid that the reviewer might have misunderstood the data. The cytokine levels of IL-4 and IL-10 could not be detected by cytokine array analysis. Accordingly, we carefully revised the text in the manuscript.
Issue (7). On page 14, lines 326-330, the text should be revised for clarity.
We revised the text in the manuscript.
Issue (8) Several data are marked as "not shown"; some of this information is relevant and should be included in the supplementary figures.
We showed the data on CCL17 and CCL22 expression in peripheral LNs in Supplementary Figure 2.
Major Suggestions:
Issue (1) FoxP3 expression should be evaluated post-isolation of CD4<sup>+</sup>CD25<sup>+</sup> T cells, and FoxP3- CD4<sup>+</sup>CD25<sup>+</sup> T cells should be characterized. Tregs could be more effectively isolated using FoxP3eGFP mice.
After isolation of CD4<sup>+</sup>CD25<sup>+</sup> T cells (the purity was >95%), we examined Foxp3 expression by flow cytometry and found that most of these cells express Foxp3 (Supplementary Figure 10). Therefore, CD4<sup>+</sup>CD25<sup>+</sup> T cells without Foxp3 expression, which are considered contaminating effector T cells, are a minor population and would not substantially affect the results. Nonetheless, the use of Foxp3-eGFP mice would enable us to isolate Tregs more accurately.
Issue (2) In Figure 3, it would be interesting to evaluate whether there are RORgt+Tbet+ (IL17+IFNg+) cells. These cells would be pathogenic, whereas RORgt+CD73+ cells would be non-pathogenic.
We analyzed CD4<sup>+</sup> T cells producing both IL-17 and IFN-γ in the peripheral lymphoid tissues of Apoe<sup>-/-</sup> and Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice. We found that this cell population was quite rare and that there was no significant difference in its proportion between the 2 groups, suggesting that this cell population makes at most a minor contribution to the atherosclerosis phenotype.
Author response image 1.
Issue (3) Different time points after adoptive cell transfer should be evaluated to confirm reduced migration to the atherosclerotic aorta.
It would be interesting to evaluate Treg migration to the atherosclerotic aorta at different time points after Treg transfer. However, it seems difficult to accurately evaluate the migration of Tregs at later time points because they would proliferate in the aorta.
Issue (4) The authors could evaluate whether Ccr4 SNPs correlate with an increased risk of atherosclerosis.
As the reviewer pointed out, it is important to consider the clinical relevance of our findings. However, there is no evidence that Ccr4 single nucleotide polymorphisms correlate with a higher incidence of atherosclerotic cardiovascular disease.
Issue (5) The authors could evaluate if the transfer of Apoe<sup>-/-</sup> Tregs rescues early atherosclerosis development in Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice.
To confirm whether transfer of CCR4-intact Tregs rescues the development of early atherosclerotic lesions in Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice, we injected Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice with saline or Tregs from Apoe<sup>-/-</sup> or Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice and analyzed the aortic root atherosclerotic lesions of recipient Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice. However, we found no significant difference in the aortic sinus plaque area among the 3 groups. We described this result in the results section and included the data in Supplementary Figure 8.
Reviewer #3 (Recommendations for the authors):
Analysis of TCD4<sup>+</sup> cell populations in different tissues:
Issue (1) The description of flow cytometry analysis is incomplete and requires clarification. Please detail the use of controls to ensure correct analysis, including the following: i) cell viability; ii) staining controls to define positive and negative cells; iii) the gating strategy used to identify cell populations in each lymphoid tissue and aorta (please provide them as supplementary figures).
As we thought that most of the prepared cells would be viable, we did not check their viability. Based on our previous work where various immune cells including Tregs, effector memory T cells, and helper T cell subsets were clearly detected, in this study we performed flow cytometric analysis of these immune cells without preparing negative controls stained with isotype control antibodies. The gating strategy of flow cytometric analysis of various immune cells in peripheral lymphoid tissues was reported in our previous report (J Am Heart Assoc 2024; 13: e031639). We provided the gating strategy of flow cytometric analysis of helper T cells and Tregs in the aorta in Supplementary Figure 9.
Issue (2) The phenotype/differentiation markers used for analysing T CD4<sup>+</sup> cell subsets differ between lymphoid tissues and aortic lesions; might this influence results? If so, please comment on that.
As the number of aortic T cells was quite small compared with that in peripheral lymphoid tissues, it seemed difficult to precisely detect aortic T cells including various helper T cell subsets and Tregs by intracellular cytokine staining. Therefore, we decided to analyze these cells by evaluating transcription factors specific for helper T cell subsets. The difference in the markers used for analyzing T cell subsets would not considerably influence the results.
Issue (3) Considering my observations about the effect of CCR4 deficiency on the T CD4<sup>+</sup> differentiation profile in different tissues, I suggest comparing Th1/Treg and Th17/Treg ratios in all examined tissues. The modulation of the Th17/Th1 balance could shape inflammation.
The Th1 cell/Treg balance is shifted toward Th1 cell responses in the atherosclerotic aorta of Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice, while this balance would not be altered in the peripheral lymphoid tissues. It remains unclear whether CCR4 deficiency affects the Th17 cell/Treg ratio. We do not think that it is important to investigate the effect of CCR4 deficiency on the balance of Th17 cell/Treg or Th17 cell/Th1 cell because the role of Th17 cell responses in atherosclerosis remains controversial.
Issue (4) Cell numbers of recovered Treg from para-aortic lymphoid nodes and aortic tissues might not allow Treg functional assays. Analysis by flow cytometry of biomarkers of Treg activation state would be more informative than by quantifying mRNA expression levels. In particular, TGFβ analysis at the mRNA level does not provide much more information about the suppressive activity of Treg, and even at the protein level, the recognition of the active form of this cytokine is required. Analysis of PD1 (for exhausted cell phenotype) and Treg apoptosis along the stages of atherosclerosis could also yield useful information.
We performed flow cytometric analysis of activation markers CTLA-4 and CD103, cell exhaustion marker PD1, and apoptosis in Tregs in the para-aortic LNs of Apoe<sup>-/-</sup> or Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice, and found no major differences in the expression levels of these molecules or the proportion of apoptotic cells between the 2 groups. We show these data below.
Author response image 2.
Unfortunately, we failed to evaluate the activity of TGF-β in Tregs because an appropriate experimental method for precisely detecting its active form was unavailable.
Issue (5) Regarding the interpretation of the results, I recommend being precise when drawing conclusions to avoid misunderstanding. A shift in the T CD4<sup>+</sup> response in lymphoid tissues might be interpreted as a modulation of the T cell differentiation process, which strongly depends on signals derived from DCs, which were not the focus of this study.
There are two possible mechanisms for the altered CD4<sup>+</sup> T cell responses in peripheral lymphoid tissues, which include the modulation of their differentiation and proliferation processes. These processes are substantially regulated by DCs whose function could be favorably modulated by CCR4-expressing Tregs as described in the manuscript. Therefore, we think that the interactions between Tregs and DCs are crucial for shifting the CD4<sup>+</sup> T cell responses in peripheral lymphoid tissues, though it remains unclear which process plays a major role in regulating CD4<sup>+</sup> T cell polarization.
Suppression studies:
Issue (1) In vitro assays. According to the methodology suppression studies were performed using Treg collected from peripheral lymphoid nodes and spleen, but it is unclear whether these cells were analysed separately or as a pool (this was not clarified in the legend of Figure 5 either). Besides, be precise about which cells were used as antigen-presenting cells in the Treg suppression assay.
In the in vitro Treg suppression assay, we used Tregs purified from peripheral lymph nodes and spleen as a pool, and we used splenocytes as antigen-presenting cells. We revised the manuscript accordingly.
Issue (2) Obtaining CD4<sup>+</sup>CD25<sup>+</sup> and CD4<sup>+</sup>CD25-. The control of the purity and viability of cell preparations from CCR4 deficient and CCR4 sufficient Apoe<sup>-/-</sup> mice should be included as a supplementary material; these purified cells were used in in vitro suppressive assays and in vivo cell transfer experiments, being relevant information to guarantee results. Since this control was performed by flow cytometry, I wonder whether Foxp3 levels were also checked.
We included the data on the purity and viability of CD4<sup>+</sup>CD25<sup>+</sup> Tregs and CD4<sup>+</sup>CD25<sup>-</sup> T cells from Apoe<sup>-/-</sup> or Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice in Supplementary Figure 10. After the isolation of CD4<sup>+</sup>CD25<sup>+</sup> T cells, we examined Foxp3 expression by flow cytometry and found that most of these cells express Foxp3.
Issue (3) For in vitro assays, IL-2, IL-10, and TGFβ measurement in culture supernatants could confirm and provide more information about Treg function.
As both CD4<sup>+</sup>CD25<sup>+</sup> Tregs and CD4<sup>+</sup>CD25<sup>-</sup> T cells would produce various cytokines in the in vitro Treg suppression assay, it is difficult to determine which cells mainly produce the above cytokines. Therefore, measurement of these cytokines would not provide more information about Treg function.
Issue (4) It would be interesting to assess whether CCR4-mediated DC-Treg interaction is equally important to regulate Th1 than Th17 and Th2 activation; this likely requires using different settings to favour each activation profile.
Based on our findings, we speculate that CCR4 may play an important role in regulating not only Th1 cell responses but also Th2 and Th17 cell responses by maintaining the interactions between Tregs and DCs. However, it may not be meaningful to investigate the effect of CCR4 deficiency on these T cell responses because the roles of Th2 and Th17 cell responses in atherosclerosis remain controversial.
Issue (5) The authors showed that the presence of Treg decreased CD80 and CD86 surface levels in DCs in vitro, remarking a lower capacity of Treg derived from CCR4-deficient mice (Figure 5B). However, the fact that CD86 on splenic CD11c+MHC-II+ DCs in 8-week-old Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice was significantly higher than in Apoe<sup>-/-</sup> was underestimated (Supplementary Figure 4). This data needs reconsideration as it might indicate an in vivo more permissive activation state of DCs in Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice than in Apoe<sup>-/-</sup> mice, explaining the augmented effector T cell response observed in these mice (Figure 2).
Our finding of upregulated CD86 expression on DCs in Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice could be explained by the data from a Treg-DC coculture experiment showing the impaired ability of CCR4-deficient Tregs to downregulate CD80 and CD86 expression on DCs. As the reviewer pointed out, our data may indicate a more permissive activation state of DCs and subsequent augmentation of effector T cell responses in Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice, which may be derived from impaired Treg suppressive function.
Assays for chemokine levels and influence on T cell activation and traffic:
Issue (1) Considering the findings described by Döring et al. (reference 24 in the paper), monitoring CCL22, CCL17, and CCL3 levels in the aorta and lymph nodes along atherosclerosis development would help in understanding when and how CCL17/CCL20-CCR4 might influence T cell activation and traffic. I wonder whether these chemokines were assayed by qPCR in lymphoid nodes and aorta from CCR4-deficient and sufficient Apoe<sup>-/-</sup> mice. The authors report that CCR8 (capable also of binding CCL17) was unaltered by CCR4 deficiency in splenic and para-aortic lymph nodes Treg from 8 and 18 weeks-old mice, respectively (Supplementary Figure 5 and 6), although a trend towards a high-level was observed for splenic Treg. It would be informative to evaluate CCR8 Treg levels along with atherosclerosis progress.
As it is considered that the mRNA expression levels of chemokines do not necessarily reflect their protein expression levels, we did not analyze the mRNA expression of Ccl17 or Ccl22 by quantitative reverse transcription PCR. Instead of this, we evaluated the protein expression of CCL17 and CCL22 not only in the aorta but also in the peripheral lymph nodes of 18-week-old wild-type, Apoe<sup>-/-</sup>, and Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice by immunohistochemistry. We found no marked differences in their expression levels in peripheral lymph nodes among these mice and included the data in Supplementary Figure 2.
As we focused on the role of the CCL17/CCL22–CCR4 axes in atherosclerosis, we did not examine the expression of CCL3 that is not directly related to these axes. The evaluation of CCR8+ Treg proportion is beyond the scope of this study, though we are interested in the change of this population by CCR4 deficiency associated with atherosclerotic lesion development.
Issue (2) According to IFNγ and IL-17 expressing TCD4<sup>+</sup> subclasses, Th1 and Th17 cell subset levels increase in the spleen (Figure 3B-D) and para-aortic lymphoid nodes (Figure 4E) in CCR4 absence. A comparison of the CCR4 dependence for the migration of Th17 and Th1 cell subsets to the aorta was not performed in this atherosclerosis model; this study could help to understand the mechanisms associated with the aortic inflammation development.
To evaluate the migration of Th1 or Th17 cells to the aorta, we need to specifically isolate them from the peripheral lymphoid tissues of Apoe<sup>-/-</sup> or Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice and adoptively transfer them into recipient Apoe<sup>-/-</sup> mice. However, it is impossible to isolate live Th1 or Th17 cells because specific cell surface markers that enable us to separate these cells are unavailable.
Issue (3) The numbers of Kaede Treg cells detected in the aorta were extremely low in both Apoe<sup>-/-</sup> and Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice (Figure 5I), opening results to question. Besides, the flow cytometry assay used for determining Kaede Treg cells in tissues was not well described. How were cell viability and formation of doublets examined to avoid artefacts? The gating strategy used to ensure a confident analysis of Kaede Tregs, particularly in the aorta, should be included as supplementary material.
The extremely low number of Kaede-expressing Tregs that migrated to the aorta of Apoe<sup>-/-</sup> and Ccr4<sup>-/-</sup>Apoe<sup>-/-</sup> mice may be due to the small number of transferred Tregs. Alternatively, Tregs may rarely migrate to the aorta under hypercholesterolemic conditions. We did not check the viability or doublets of Kaede-expressing Tregs because we thought that such experimental procedures would not considerably affect the results. We provided the gating strategy of flow cytometric analysis of Kaede-expressing Tregs in peripheral lymphoid tissues and aortas in Supplementary Figure 11.
Other comments:
Issue (1) As an alternative for statistical data analysis from independent experiments, two-way ANOVA with Tukey's post hoc (for data normally distributed) or the Mack Skillings exact test with Conover's post hoc multiple comparison test (for a two-way layout in non-parametric conditions) could improve analysis.
We performed statistical analysis in Figure 5A according to the reviewer’s suggestion.
Issue (2) For future work, employing recombinant pseudo-receptor proteins capable of neutralizing chemokines (doi: 10.1016/j.jhep.2021.08.029) might help as an alternative to complete knockout mice.
We thank the reviewer for giving us the information on an interesting approach as an alternative to CCR4-deficient mice.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This study uses state-of-the-art methods to label endogenous dopamine receptors in a subset of Drosophila mushroom body neuronal types. The authors report that DopR1 and Dop2R receptors, which have opposing effects on intracellular cAMP, are present in axon termini of Kenyon cells, as well as those of two classes of dopaminergic neurons that innervate the mushroom body, indicative of autocrine modulation by dopaminergic neurons. Additional experiments showing opposing effects of starvation on DopR1 and DopR2 levels in mushroom body neurons are consistent with a role for dopamine receptor levels increasing the efficiency of learned food-odour associations in starved flies. Supported by solid data, this is a valuable contribution to the field.
We thank the editors for the assessment, but request that “DopR2” be changed to “Dop2R”. The dopamine receptors in Drosophila have confusing names, but the ones we characterized in this study are called Dop1R1 (according to FlyBase; aka DopR1, dDA1, Dumb) and Dop2R (ibid; aka Dd2R). DopR2 is the name of a different dopamine receptor.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
This is an important and interesting study that uses the split-GFP approach. Localization of receptors and correlating them to function is important in understanding the circuit basis of behavior.
Strengths:
The split-GFP approach allows visualization of subcellular enrichment of dopamine receptors in the plasma membrane of GAL4-expressing neurons allowing for a high level of specificity.
The authors resolve the presynaptic localization of DopR1 and Dop2R, in "giant" Drosophila neurons differentiated from cytokinesis-arrested neuroblasts in culture as it is not clear in the lobes and calyx.
Starvation-induced opposite responses of dopamine receptor expression in the PPL1 and PAM DANs provide key insights into models of appetitive learning.
Starvation-induced increase in D2R allows for increased negative feedback that the authors test in D2R knockout flies where appetitive memory is diminished.
This dual autoreceptor system is an attractive model for how amplitude and kinetics of dopamine release can be fine-tuned and controlled depending on the cellular function and this paper presents a good methodology to do it and a good system where the dynamics of dopamine release can be tested at the level of behavior.
Weaknesses:
LI measurements of Kenyon cells and lobes indicate that Dop2R was approximately twice as enriched in the lobe as the average density across the whole neuron, while the lobe enrichment of Dop1R1 was about 1.5 times the average. Are these levels consistent during different times of the day and across states of the animal? How were these conditions controlled, and how sensitive is receptor expression to the time of day of dissection, staining, etc.?
To answer this question, we repeated the experiment in two replicates at different times of day and confirmed that the receptor localization was consistent (Figure 3 – figure supplement 1); LI measurements showed that Dop2R is enriched more in the lobe and less in the calyx compared to Dop1R1 (Figure 3D). The states of animals that could affect LI (e.g. feeding state and anesthesia for sorting, see methods) were kept constant.
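As a rough illustration of how an enrichment measure of this kind can be read, the sketch below computes a localization index as the signal density in one compartment divided by the average density across the whole neuron, following the reviewer's description above. This formula, the function name `localization_index`, and the numbers are assumptions made for illustration; the exact LI definition used in the manuscript is given in its methods and may differ (for example, in how the reference signal is chosen).

```python
def localization_index(region_signal, region_area, total_signal, total_area):
    """Signal density in a region relative to the whole-neuron average density.

    A value above 1 means the region is enriched; below 1 means it is depleted.
    Assumed definition for illustration only.
    """
    if region_area <= 0 or total_area <= 0:
        raise ValueError("areas must be positive")
    if total_signal <= 0:
        raise ValueError("total signal must be positive")
    region_density = region_signal / region_area
    mean_density = total_signal / total_area
    return region_density / mean_density


# Hypothetical values in arbitrary units (not measurements from this study).
li_lobe = localization_index(region_signal=40, region_area=20,
                             total_signal=100, total_area=100)
li_calyx = localization_index(region_signal=10, region_area=20,
                              total_signal=100, total_area=100)
print(li_lobe, li_calyx)
```

Under this assumed definition, a lobe LI of 2 corresponds to the statement that the receptor is twice as enriched in the lobe as the average density across the whole neuron.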
The authors assume without discussion as to why and how presynaptic enrichment of these receptors is similar in giant neurons and MB.
In the revision, we added a short summary to recapitulate that the giant neurons exhibit many characteristics of mature neurons (Lines #152-156): "Importantly, these giant neurons exhibit characteristics of mature neurons, including firing patterns (Wu et al., 1990; Yao & Wu, 2001; Zhao & Wu, 1997) and acetylcholine release (Yao et al., 2000), both of which are regulated by cAMP and CaMKII signaling (Yao et al., 2000; Yao & Wu, 2001; Zhao & Wu, 1997)." In addition, we found punctate Brp accumulations localized to the axon terminals of the giant neurons (former Figure 4D and 4E). Therefore, the giant neuron serves as an excellent model to study the presynaptic localization of dopamine receptors in isolated large cells.
Figures 1-3 show the extensive expression of receptors in alpha and beta lobes while Figure 5 focusses on PAM and localization in γ and β' projections of PAM leading to the conclusion that presynaptic dopamine neurons express these and have feedback regulation. Consistency between lobes or discussion of these differences is important to consider.
In the revised manuscript, we show data in the γ KCs (Figure 4C, Figure 5 - figure supplement 1) in addition to α/β KCs, and demonstrate the consistent synaptic localization of Dop1R1 and Dop2R as in α/β KCs (Figure 4B and 5A).
Receptor expression in any learning-related MBONs is not discussed, and it would be intriguing to see how receptors are organized in those cells. Given that these PAMs input to both KCs and MBONs these will have to work in some coordination.
The subcellular localization of dopamine receptors in MBONs indeed provides important insights into the site of dopaminergic signaling in these neurons (Takemura et al., 2017; Pavlowsky et al., 2018; Pribbenow et al., 2022). Therefore, we added new data for Dop1R1 and Dop2R in MBON-γ1pedc>αβ (Figure 6). Interestingly, these receptors are localized to the dendritic projection in the γ1 compartment as well as to presynaptic boutons (Figure 6).
Although authors use the D2R enhancement post starvation to show that knocking down receptors eliminated appetitive memory, the knocking out is affecting multiple neurons within this circuit including PAMs and KCs. How does that account for the observed effect? Are those not important for appetitive learning?
In the appetitive memory experiment (Figure 9C), we knocked down Dop2R only in the select neurons of the PPL1 cluster, and this manipulation does not directly affect Dop2R expression in PAMs and KCs.
Starvation-induced enhancement of Dop2R expression in the PPL1 neurons (Figure 8F) would attenuate their outputs and therefore disinhibit expression of appetitive memory in starved flies (Krashes et al., 2009). Consistently, Dop2R knock-down in PPL1 impaired appetitive memory in starved flies (Figure 9C). We revised the corresponding text to make this point clearer (Lines #224-227).
The evidence for fine-tuning is completely based on receptor expression and one behavioral outcome which could result from many possibilities. It is not clear if this fine-tuning and presynaptic feedback regulation-based dopamine release is a clear possibility. Alternate hypotheses and outcomes could be considered in the model as it is not completely substantiated by data at least as presented.
The reviewer’s concern is valid, and the presynaptic dopamine tuning by autoreceptors may need more experimental support. We therefore additionally discussed another possibility (Lines #289-291): “Alternatively, these presynaptic receptors could potentially receive extrasynaptic dopamine released from other DANs. Therefore, the autoreceptor functions need to be experimentally clarified by manipulating the receptor expression in DANs.”
Reviewer #2 (Public Review):
Summary:
Hiramatsu et al. investigated how cognate neurotransmitter receptors with antagonizing downstream effects localize within neurons when co-expressed. They focus on mapping the localization of the dopaminergic Dop1R1 and Dop2R receptors, which correspond to the mammalian D1- and D2-like dopamine receptors, which have opposing effects on intracellular cAMP levels, in neurons of the Drosophila mushroom body (MB). To visualize specific receptors in single neuron types within the crowded MB neuropil, the authors use existing dopamine receptor alleles tagged with 7 copies of split GFP to target reconstitution of GFP tags only in the neurons of interest as a read-out of receptor localization. The authors show that both Dop1R1 and Dop2R, with differing degrees, are enriched in axonal compartments of both the Kenyon Cells cholinergic presynaptic inputs and in different dopamine neurons (DANs), which project axons to the MB. Co-localization studies of dopamine receptors with the presynaptic marker Brp suggest that Dop1R1 and, to a larger extent Dop2R, localize in the proximity of release sites. This localization pattern in DANs suggests that Dop1R1 and Dop2R work in dual-feedback regulation as autoreceptors. Finally, they provide evidence that the balance of Dop1R1 and Dop2R in the axons of two different DAN populations is differentially modulated by starvation and that this regulation plays a role in regulating appetitive behaviors.
Strengths:
The authors use reconstitution of GFP fluorescence of split GFP tags knocked into the endogenous locus at the C-terminus of the dopamine receptors as a readout of dopamine receptor localization. This elegant approach preserves the endogenous transcriptional and post-transcriptional regulation of the receptor, which is essential for studies of protein localization.
The study focuses on mapping the localization of dopamine receptors in neurons of the mushroom body. This is an excellent choice of system to address the question posed in this study, as the neurons are well-studied, and their connections are carefully reconstructed in the mushroom body connectome. Furthermore, the role of this circuit in different behaviors and associative memory permits the linking of patterns of receptor localization to circuit function and resulting behavior. Because of these features, the authors can provide evidence that two antagonizing dopamine receptors can act as autoreceptors within the axonal compartment of MB innervating DANs. The differential regulation of the balance of the two receptors under starvation in two distinct DAN innervations provides evidence of the role that regulation of this balance can play in circuit function and behavioral output.
Weaknesses:
The approach of using endogenously tagged alleles to study localization is a strength of this study, but the authors do not provide sufficient evidence that the insertion of 7 copies of split GFP to the C terminus of the dopamine receptors does not interfere with the endogenous localization pattern or function. Both sets of tagged alleles (1X Venus and 7X split GFP tagged) were previously reported (Kondo et al., 2020), but only the 1X Venus tagged alleles were further functionally validated in assays of olfactory appetitive memory. Despite the smaller size of the 7X split-GFP array tag knocked into the same location as the 1X Venus tag, the reconstitution of 7 copies of GFP at the C terminus of the dopamine receptor might substantially increase the molecular bulk at this site, potentially impeding the function of the receptor more significantly than the smaller, single Venus tag. The data presented by Kondo et al. 2020 is insufficient to conclude that the two alleles are equivalent.
In the revision, we validated the function of these engineered receptors with a new set of olfactory learning experiments. Both receptors in KCs were shown to be required for aversive memory (Kim et al., 2007; Scholz-Kornehl et al., 2016). As in the anatomical experiments, we induced GFP<sub>1-10</sub> expression in the KCs of flies homozygous for 7xGFP<sub>11</sub>-tagged receptors using MB-Switch and 3 days of RU486 feeding. We confirmed that the STM performance of these flies was not significantly different from that of the control (Figure 2 – figure supplement 1). Thus, these fusion receptors are functional.
The authors' conclusion that the receptors localize to presynaptic sites is weak. The analysis of the colocalization of the active zone marker Brp whole-brain staining with dopamine receptors labeled in specific neurons is insufficient to conclude that the receptors are localized at presynaptic sites. Given the highly crowded neuropil environment, the data cannot differentiate between the receptor localization postsynaptic to a dopamine release site or at a presynaptic site within the same neuron. The known distribution of presynaptic sites within the neurons analyzed in the study provides evidence that the receptors are enriched in axonal compartments, but co-labeling of presynaptic sites and receptors in the same neuron or super-resolution methods are needed to provide evidence of receptor localization at active zones. The data presented in Figures 5K-5L provides compelling evidence that the receptors localize to neuronal varicosities in DANs where the receptors could play a role as autoreceptors.
Given the highly crowded environment of the mushroom body neuropil, the analysis of dopamine receptor localization in Kenyon cells is not conclusive. The data is sufficient to conclude that the receptors are preferentially localizing to the axonal compartment of Kenyon cells, but co-localization with brain-wide Brp active zone immunostaining is not sufficient to determine if the receptor localizes juxtaposed to dopaminergic release sites, in proximity of release sites in Kenyon cells, or both.
To better resolve the microcircuits of KCs, we triple-labeled the plasma membrane and DAR::rGFP in KCs, and Brp, and examined their localizations with high-resolution imaging with Airyscan. This strategy revealed the receptor clusters associated with Brp accumulation within KCs (Figure 4). To further verify the association of DARs and active zones within KCs, we co-expressed Brp<sup>short</sup>::mStraw and GFP<sub>1-10</sub> and confirmed their colocalization (Figure 5A), suggesting presynaptic localization of DARs in KCs. With these additional characterizations, we now discuss the significance of receptors at the presynaptic sites of KCs.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
This is an important and interesting study that uses the split-GFP approach. Localization of receptors and correlating them to function is important in understanding the circuit basis of behavior.
For Figure 1, the authors show PAM, PPL1 neurons, and the ellipsoid body as a validation of their tools (Dop1R1-T2A-GAL4 and Dop2R-T2A-GAL4) and the idea that these receptors are colocalized. However, it appears that the technique was applied to the whole brain, so it would be great to see the whole brain to understand how much of the labelling is specific and how much is stochastic. Methods could include how dissection conditions were controlled and how sensitive receptor expression is to the time of day of dissection, staining, etc.
The expression patterns of the receptor T2A-GAL4 lines (Figure 1A and 1B) are consistent in the multiple whole brains (Kondo et al., 2020, Author response image 1).
Author response image 1.
The significance of the expression of these two receptors in an active zone is not clearly discussed and presynaptic localization is not elaborated on. Would something like expansion microscopy be useful in resolving this? It would be important to discuss that as giant neurons in culture don't replicate many aspects of the MB system.
In the revised manuscript, we elaborated discussion regarding the function of the two antagonizing receptors at the AZ (Lines #226-275).
Does MB-GeneSwitch > GFP1-10 reliably express in gamma lobes? Most of the figures show alpha/beta lobes.
Yes. MB-GeneSwitch is also expressed in γ KCs, but weakly. 12 hours of RU486 feeding, which we did in the previous experiments, was insufficient to induce GFP reconstitution in the γ KCs. By extending the time of transgene induction, we visualized expression of Dop1R1 and Dop2R more clearly in γ KCs. Their localization is similar to that in the α/β KCs (Figure 4C, Figure 5 - figure supplement 1).
Figure 6, y-axis says protein level. At first, I thought it was related to starvation so maybe authors can be more specific as the protein level doesn't indicate any aspect of starvation.
We appreciate this comment, and the labels on the y-axis have now been changed to “rGFP levels” (Figure 8C and 8F, Figure 8 - figure supplement 1B, 1D and 1F).
Reviewer #2 (Recommendations For The Authors):
Title:
The title of the manuscript focuses on the tagging of the receptors and their synaptic enrichment.
Given that the alleles used in the study were generated in a previously published study (Kondo et al, 2020), which describes the receptor tagging and that the data currently provided is insufficient to conclude that the receptors are localizing to synapses, the title should be changed to reflect the focus on localizing antagonistic cognate neurotransmitter receptors in the same neuron and their putative role as autoreceptors in DANs.
Following this advice, we removed the methodology from the title and revised it to “Synaptic enrichment and dynamic regulation of the two opposing dopamine receptors within the same neurons”.
Minor issues with text and figures:
Figure 1
A conclusion from Figure 1 is that the two receptors are co-expressed in Kenyon cells. Please provide panels equivalent to the ones shown in D-G, with Kenyon cells cell bodies, or mark these cells in the existing panels, if present. Line 111 refers to panel 1D as the Kenyon cells panel, which is currently a PAM panel.
We added images for coexpression of these receptors in the cell bodies of KCs (Figure 1 - figure supplement 1) and revised the text accordingly (Lines #89-90).
Given that most of the study centers on visualizing receptor localization, it would benefit the reader to include labels in Figure 1 that help understand that these panels reflect expression patterns rather than receptor localization. For instance, rCD2::GFP could be indicated in the Dop1R1-LexA panels.
As suggested, labels were added to indicate the UAS and lexAop markers (Figure 1D, 1E, 1G-1I and Figure 1 – figure supplement 1).
Given that panels D-E focus on the cell bodies of the neurons, it could be beneficial for the reader to present the ellipsoid body neurons using a similar view that only shows the cell bodies. Similarly, one could just show the glial cell bodies.
We now show the cell bodies of ring neurons (Figure 1G) and ensheathing glia (Figure 1I).
For panel 1E, please indicate the subset of PPL1 neurons that both expressed Dop1R1 and Dop2R, as indicated in the text, as it is currently unclear from the image.
Dop1R1-T2A-LexA was barely detected in all PPL1 (Figure 1E). We corrected the confusing text (Lines #95-96).
Figure 2
The cartoon of the cell-type-specific labeling should show that the tag is 7XFP-11 and the UAS component FP-10, as the current cartoon leads the reader to conclude that the receptors are tagged with a single copy of split GFP. The detail that the receptors are tagged with 7 copies of split GFP is only provided through the genotype of the allele in the resource table. This design aspect should be made clear in the figure and the text when describing the allele and approach used to tag receptors in specific neuron types.
We have now added the construct design to the scheme (Figure 2A) and revised the corresponding text (Lines #101-103).
Panel A. The arrow representing the endogenous promoter in the yellow gene representation should be placed at the beginning of the coding sequence. Currently, the different colors of what I assume are coding (yellow) and non-coding (white) transcript regions are not described in the legend. I would omit these or represent them in the same color as thinner boxes if the authors want to emphasize that the tag is inserted at the C terminus within the endogenous locus.
The color scheme was revised to be more consistent and intuitive (Figure 2A).
Figure 3
Labels of the calyx and MB lobes would benefit readers not as familiar with the system used in the study. In addition, it would be beneficial to the reader to indicate in panel A the location of the compartments analyzed in panel H (e.g., peduncle, α3).
Figure 3A was amended to clearly indicate the analyzed MB compartments.
Adding frontal and sagittal to panels B-E, as in Figure 2, would help the reader interpret the data.
In Figure 3B, “Frontal” and “Sagittal” were indicated.
Panel F-G. A scale bar should be provided for the data shown in the insets. Could the author comment on the localization of Dop1R1 in KCs? The data in the current panel suggests that only a subset of KCs express high levels of receptors in their axons, as a portion of the membrane is devoid of receptor signals. This would be in line with differential dopamine receptor expression in subsets of Kenyon cells, as shown in Kondo et al., 2020, which is currently not commented on in the paper.
We confirmed that the majority of the KCs express both Dop1R1 and Dop2R genes (Figure 1 - figure supplement 1). LIs should be compared within the same cells rather than the differences of protein levels between cell types as they also reflect the GAL4 expression levels.
Panel H. Some P values are shown as n.s. (p> 0.05). Other non-significant p values in this panel and in other figures throughout the paper are instead reported (e.g. peduncle P=0.164). For consistency, please report the values as n.s. as indicated in the methods for all non-significant tests in this panel and throughout the manuscript.
We now present the new dataset, and the graph represents the appropriate statistical results (Figure 3D; see the methods section for details).
The methods of labeling the receptors through the expression of the GeneSwitch-controlled GFP1-10 in Kenyon cells induced by RU486 are not provided in the methods. Please provide a description of this as referenced in the figure legend and the genotypes used in the analysis shown in the panels.
The method of RU486 feeding has been added. We apologize for the missing method.
Figure 4
Please provide scale bars for the inset in panels A-B.
Scale bars were added to all confocal images.
The current analysis cannot distinguish between postsynaptic and presynaptic dopamine receptors in KCs, and the figure title should reflect this.
We now present new data on dopamine receptors in KCs and clearly distinguish the Brp clusters of the KCs from those of other cell types (Figure 4, Figure 5).
The reader could benefit from additional details of using the giant neuron model, as it is not commonly used, and it is not clear how to relate this to interpret the localization of dopaminergic receptors within Kenyon cells. The use of the venus-tagged receptor variant should be introduced in the text, as using a different allele currently lacks context. Figures 4F-4J show that the receptor is localizing throughout the neuron. Quantifying the fraction of receptor signal colocalizing with Brp could aid in interpreting the data. However, it would still not be clear how to interpret this data in the context of understanding the localization of the receptors in neurons within fly brain circuits. In the absence of additional data, the data provided in Figure 4 is inconclusive and could be omitted, keeping the focus of the study on the analysis of the two receptors in DANs. Co-expressing a presynaptic marker in Kenyon cells (e.g., by expressing Brp::SNAP) in conjunction with rGFP labeled receptor would provide additional evidence of the relationship of release sites in Kenyon cells and tagged dopamine receptors in these same cells and could add evidence in support to the current conclusion.
Following the advice, we added a short summary to recapitulate that the giant neurons exhibit many characteristics of mature neurons (Lines #152-156): "Importantly, these giant neurons exhibit characteristics of mature neurons, including firing patterns (Wu et al., 1990; Yao & Wu, 2001; Zhao & Wu, 1997) and acetylcholine release (Yao et al., 2000), both of which are regulated by cAMP and CaMKII signaling (Yao et al., 2000; Yao & Wu, 2001; Zhao & Wu, 1997)." Therefore, the giant neuron serves as an excellent model to study the presynaptic localization in large cells in isolation.
To show that Brp clusters and dopamine receptors are polarized rather than "localizing throughout the neuron", we now present less magnified images (Figure 5C). They clearly demonstrate punctate Brp accumulations localized to the axon terminals of the giant neurons (former Figure 4D and 4E). This is the same membrane segment where Dop1R1 and Dop2R are localized (Figure 5C). Therefore, the association of Brp clusters and the dopamine receptors in the isolated giant neurons suggests that the subcellular localization in the brain neurons is independent of the circuit context.
As the giant neurons do not form intermingled circuits, venus-tagged receptors are sufficient for this experiment and simpler in genetics.
Following the suggestion to clarify the AZ association of the receptors in KCs, we coexpressed Brp<sup>short</sup>-mStraw and GFP<sub>1-10</sub> in KCs and confirmed their colocalization (Figure 5A).
Figure 6
The data and analysis show that starvation induces changes in the α3 compartment in PPL1 neurons only, while the data provided shows no significant change for PPL1 neurons innervating other MB compartments. This should be clearly stated in lines 174-175, as it is implied that there is a difference in the analysis for compartments other than α3. Panel L of Figure 6 - supplement 1 shows no significant change for all three compartments analyzed and should be indicated as n.s. in all instances, as stated in the methods.
We revised the text to clarify that the starvation-induced differences of Dop2R expression were not significant (Lines #217-219). The reason to highlight the α3 compartment is that both Dop1R1 and Dop2R are coexpressed in this PPL1 neuron (Figure 8D).
Additional minor comments:
There are a few typos and errors throughout the manuscript. The text should be carefully proofread to correct these. Here are the ones that came to my attention:
Please reference all figure panels in the text. For instance, Figure 3A is not mentioned and should be revised in line 112 as Figure 3A-E.
Lines 103-104. The sentence "LI was visualized as the color of the membrane signals" is unclear and should be revised.
Figure 4 legend - dendritic claws should likely be B and C and not B and E.
Line 147 - Incorrect figure panels, should be 5C-L or 5D-E.
Line 241 - DNAs should be DANs.
Methods - please define what the abbreviation CS stands for.
We really appreciate this reviewer's careful reading. All of these have been corrected.
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
Shen et al. conducted three experiments to study the cortical tracking of the natural rhythms involved in biological motion (BM), and whether these involve audiovisual integration (AVI). They presented participants with visual (dot) motion and/or the sound of a walking person. They found that EEG activity tracks the step rhythm, as well as the gait (2-step cycle) rhythm. The gait rhythm specifically is tracked superadditively (power for A+V condition is higher than the sum of the A-only and V-only condition, Experiments 1a/b), which is independent of the specific step frequency (Experiment 1b). Furthermore, audiovisual integration during tracking of gait was specific to BM, as it was absent (that is, the audiovisual congruency effect) when the walking dot motion was vertically inverted (Experiment 2). Finally, the study shows that an individual's autistic traits are negatively correlated with the BM-AVI congruency effect.
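The superadditivity criterion described in the summary (audiovisual power exceeding the sum of the auditory-only and visual-only power) can be sketched as a simple difference. This is only an illustration of the criterion, not the authors' analysis pipeline; the function name and the power values are hypothetical.

```python
def superadditive_gain(power_av, power_a, power_v):
    """Return AV power minus the summed unimodal power.

    A positive value means the audiovisual response exceeds the
    sum of the auditory-only and visual-only responses.
    """
    return power_av - (power_a + power_v)


# Hypothetical spectral power at the gait frequency (arbitrary units).
gain = superadditive_gain(power_av=2.4, power_a=0.9, power_v=1.1)
is_superadditive = gain > 0
```

In practice such a difference would be computed per participant and tested against zero across the group.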
Strengths:
The three experiments are well designed and the various conditions are well controlled. The rationale of the study is clear, and the manuscript is pleasant to read. The analysis choices are easy to follow, and mostly appropriate.
Weaknesses:
On revision, the authors are careful not to overinterpret an analysis where the statistical test is not independent from the data (channel) selection criterion.
Thank you for the suggestion; we have done this according to your recommendations below.
Reviewer #1 (Recommendations for the authors):
Re: the double-dipping concern: I appreciate the revision. Just to clarify: my concern rests with the selection of *electrodes* based on the interaction test for the 1Hz condition. The 2Hz condition analogous test yields no significant electrodes. You perform subsequent tests (t-tests and 3-way interaction) on the data averaged across the electrodes that were significant for the 1Hz condition. Therefore, these tests will be biased to find a pattern reflecting an interaction at 1Hz, while no similar bias exists for an effect at 2Hz. Therefore, there is a bias to observe a 3-way interaction, and simple effects compatible with a 2-way interaction only for 1Hz, not for 2Hz (which is exactly what you found). There is no good statistical alternative here, I appreciate that, but the bias exists nonetheless. I think the wording is improved in this revision, and the evidence is convincing even in light of this bias.
We are grateful for your thoughtful comments on the analytical methods. We appreciate your concerns regarding the potential bias of examining the 3-way interaction based on electrodes yielding a 2-way interaction effect. To address this issue, we have conducted a bias-free analysis based on electrodes across the whole brain. The results showed a similar pattern of 3-way interaction as previously reported (p = 0.051), suggesting that the previous findings might not be caused by electrode selection. Given that the main results of Experiment 2 were not based on whole-brain analysis, we did not include this analysis in the main text, and we have removed the three-way interaction results based on selected electrodes from the manuscript to reduce potential concerns. It is also noteworthy that, when performing analyses based on channels independent of the interaction effect at 1 Hz (i.e., significant congruency effects in the upright and inverted conditions, respectively, at 2 Hz), we obtained results similar to those reported in the main text (i.e., non-significant interaction and correlation at 2 Hz). These results were presented in the supplementary file in previous versions and mentioned in the correlation part of the Results section (see Fig. S2). Once again, we sincerely appreciate your careful review of our research. We hope the above-mentioned points adequately address your concern.
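The selection bias discussed in this exchange can be illustrated with a small simulation. It is a minimal sketch on pure-noise data, assuming hypothetical numbers of subjects and electrodes, and it does not reproduce the study's actual statistics. Electrodes are picked using the same data that are later averaged, which inflates the apparent effect, whereas averaging over all electrodes involves no selection.

```python
import random
import statistics


def t_stat(values):
    """One-sample t statistic of the values against zero."""
    n = len(values)
    return statistics.mean(values) / (statistics.stdev(values) / n ** 0.5)


random.seed(0)
n_subjects, n_electrodes, n_selected = 20, 64, 5  # hypothetical sizes

# Pure noise: there is no true effect at any electrode.
data = [[random.gauss(0.0, 1.0) for _ in range(n_subjects)]
        for _ in range(n_electrodes)]

# Circular step: rank electrodes by the same data that will be summarized.
ranked = sorted(range(n_electrodes), key=lambda e: t_stat(data[e]), reverse=True)
selected = ranked[:n_selected]

# Mean effect over the selected electrodes versus over the whole scalp.
selected_mean = statistics.mean(statistics.mean(data[e]) for e in selected)
whole_scalp_mean = statistics.mean(statistics.mean(row) for row in data)
```

Even though no effect exists in the simulated data, `selected_mean` comes out positive, while `whole_scalp_mean` stays close to zero.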
Reviewer #2 (Public review):
Summary:
The authors evaluate spectral changes in electroencephalography (EEG) data as a function of the congruency of audio and visual information associated with biological motion (BM) or non-biological motion. The results show supra-additive power gains in the neural response to gait dynamics, with trials in which audio and visual information was presented simultaneously producing higher average amplitude than the combined average power for auditory and visual conditions alone. Further analyses suggest that such supra-additivity is specific to BM and emerges from temporoparietal areas. The authors also find that the BM-specific supra-additivity is negatively correlated with autism traits.
Strengths:
The manuscript is well-written, with a concise and clear writing style. The visual presentation is largely clear. The study involves multiple experiments with different participant groups. Each experiment involves specific considered changes to the experimental paradigm that both replicate the previous experiment's finding yet extend it in a relevant manner.
In the first revisions of the paper, the manuscript better relays the results and anticipates analyses, and this version adequately resolves some concerns I had about analysis details. In a further revision, it is clarified better how the results relate to the various competing hypotheses on how biological motion is processed.
Weaknesses:
Still, it is my view that the findings of the study are basic neural correlate results that offer only minimal constraint towards the question of how the brain realizes the integration of multisensory information in the service of biological motion perception, and the data do not address the causal relevance of observed neural effects towards behavior and cognition. The presence of an inversion effect suggests that the supra-additivity is related to cognition, but that leaves open whether any detected neural pattern is actually consequential for multi-sensory integration (i.e., correlation is not causation). In other words, the fact that frequency-specific neural responses to the [audio & visual] condition are stronger than those to [audio] and [visual] combined does not mean this has implications for behavioral performance. While the correlation to autism traits could suggest some relation to behavior and is interesting in its own right, this correlation is a highly indirect way of assessing behavioral relevance. It would be helpful to test the relevance of supra-additive cortical tracking on a behavioral task directly related to the processing of biological motion to justify the claim that inputs are being integrated in the service of behavior. Under either framework, cortical tracking or entrainment, the causal relevance of neural findings toward cognition is lacking.
Overall, I believe this study finds neural correlates of biological motion that offer some constraint toward mechanism, and it is possible that the effects are behaviorally relevant, but based on the current task and associated analyses this has not been shown (or could not have been, given the paradigm).
Reviewer #2 (Recommendations for the authors):
Thank you for your revisions; I have updated the Strengths section, and reworded the weaknesses section. I now concede that the neural effects observed offer some constraint towards what the neural mechanisms for AV integration for BM are, whereas in my previous review, I said too strongly that these results do not offer any information about mechanism.
Thank you again for your insightful thoughts and comments on our research. They have contributed greatly to enhancing the discussion of the article and provided valuable inspiration for future exploration of causal mechanisms.
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Summary:
This paper investigates the mechanism of axon growth directed by the conserved guidance cue UNC-6/Netrin. Experiments were designed to distinguish between alternative models in which UNC-6/Netrin functions as either a short-range (haptotactic) cue or a diffusible (chemotactic) signal that steers axons to their final destinations. In each case, axonal growth cones execute ventrally directed outgrowth toward a proximal source of UNC-6/Netrin. This work concludes that UNC-6/Netrin functions as both a haptotactic and chemotactic cue to polarize the UNC-40/DCC receptor on the growth cone membrane facing the direction of growth. Ventrally directed axons initially contact a minor longitudinal nerve tract (vSLNC) at which UNC-6/Netrin appears to be concentrated before proceeding in the direction of the ventral nerve cord (VNC) from which UNC-6/Netrin is secreted. Time-lapse imaging revealed that growth cones appear to pause at the vSLNC before actively extending ventrally directed filopodia that eventually contact the VNC. Growth cone contacts with the vSLNC were unstable in unc-6 mutants but were restored by the expression of a membrane-tethered UNC-6 in vSLNC neurons. In addition, the expression of membrane-tethered UNC-6/Netrin in the VNC was not sufficient to rescue initial ventral outgrowth in an unc-6 mutant. Finally, dual expression of membrane-tethered UNC-6/Netrin in both vSLNC and VNC partially rescued the unc-6 mutant axon guidance defect, thus suggesting that diffusible UNC-6 is also required. This work is important because it potentially resolves the controversial question of how UNC-6/Netrin directs axon guidance by proposing a model in which both of the competing mechanisms, e.g., haptotaxis vs chemotaxis, are successively employed. The impact of this work is bolstered by its use of powerful imaging and genetic methods to test models of UNC-6/Netrin function in vivo thereby obviating potential artifacts arising from in vitro analysis.
Strengths:
A strength of this approach is the adoption of the model organism C. elegans to exploit its ready accessibility to live cell imaging and powerful methods for genetic analysis.
Weaknesses:
A membrane-tethered version of UNC-6/Netrin was constructed to test its haptotactic role, but its neuron-specific expression and membrane localization are not directly determined although this should be technically feasible. Time-lapse imaging is a key strength of multiple experiments but only one movie is provided for readers to review.
Thank you for your comments. We have now used SNAP labeling to directly visualize the localization of membrane-tethered UNC-6 and confirmed that UNC-6 is only detectable on the sublateral and ventral nerve cords (Figure S3A). These data have been added to the manuscript on page 15, lines 342-347. We have also provided a representative movie for each imaged genotype (Videos S2-10).
Reviewer #2 (Public Review):
Nichols et al studied the role of axon guidance molecules and their receptors and how these work as long-range and/or local cues, using in-vivo time-lapse imaging in C. elegans. They found that the Netrin axon guidance system works in different modes when acting as a long-range (chemotaxis) cue vs local cue (haptotaxis). As an initial context, they take advantage of the postembryonic-born neuron, PDE, to understand how its axon grows and then is guided into its target. They found that this process occurs in various discrete steps, during which the growth cone migrates and pauses at specific structures, such as the vSLNC. The role of the UNC-6/Netrin and UNC-40/DCC axon guidance ligand-receptor pair was then looked at in terms of its requirement for
(1) initial axon outgrowth direction
(2) stabilization at the intermediate target
(3) directional branching from the sublateral region or
(4) ventral growth from the intermediate target to the VNC.
They found that each step is disrupted in the unc-6/Netrin and unc-40/DCC mutants and observed how the localization of these proteins changed during the process of axon guidance in wild-type and mutant contexts. These observations were further supported by analysis of a mutant important for the regulation of Netrin signaling, the E3 ubiquitin ligase madd-2/Trim9/Trim67. Remarkably, the authors identified that this mutant affected axonal adhesion and stabilization, but not directional growth. Using membrane-tethered UNC-6 to specific localities, they then found this to be a consequence of the availability of UNC-6 at specific localities within the axon growth path. Altogether, this data and in-vivo analysis provide compelling evidence of the mechanistic foundation of Netrin-mediated axon guidance and how it works step by step.
The conclusions are well-supported, with both imaging and quantification of each step of axon guidance and localization of UNC-6 and UNC-40. Using a different type of neuron to validate their findings further supports their conclusions and strengthens their model. It's not yet known whether this model holds true for other ligand-receptor pairs, but the current work sets the stage for future analysis of other axon guidance molecules using time-lapse in-vivo imaging. There are still two outstanding questions that are important to address to support the authors' model and conclusions.
(1) The results of UNC-6-TM expression at different locations are clear and support the conclusions, but one needs to consider that no diffusible UNC-6 is available. What would happen if UNC-6 were tethered to the membrane in an otherwise completely 'normal' UNC-6 gradient? Does axon guidance ensue normally, or does the axon get stuck at the site of the membrane-tethered UNC-6 and fail to continue growing out properly? This is an important control (expression of the UNC-6-TM at the vSLNC or VNC in the wild-type background) that would help clarify this question and give better insight into the separability of both axon guidance steps and the ability to manipulate these.
Thank you for your comments. We expressed UNC-6<sup>TM</sup> at vSLNC and VNC in wild-type animals and examined adult morphology of both HSN and PDE in the control conditions you suggested. These data are available in Tables 1 and 2 with no statistical differences compared to wild-type animals. Second, we also provide still images of developing PDE axons near the vSLNC (Figure S3D) to confirm that this axon guidance step is intact when UNC-6<sup>TM</sup> is overexpressed in specific regions. Together, these data suggest that the TM rescue constructs do not interfere with endogenous axon guidance pathways. We have added these results to the manuscript on page 15, lines 347-349.
(2) Axon guidance systems do not work in a vacuum and are generally competing against each other. For example, the SLT-1/Slit and SAX-3/ROBO axon guidance ligand-receptor pair is also required for PDE, and other post-embryonic neurons, axon guidance. It would be interesting to test mutants for these genes with the membrane tethered-UNC-6 to determine if the different steps of axon guidance are disrupted and if so, to what degree these are disrupted.
Thank you for this suggestion. We have performed time-lapse imaging on slt-1 mutants and unc-6; slt-1 double mutants. These data are available in a new figure, Figure 3. Indeed, we found that slt-1 mutants showed abnormal direction of axon emergence and stabilization at the VNC but normal stabilization at vSLNC and axonal branching (Fig. 3). These data can be found in the manuscript on pages 11-12, lines 248-269.
Reviewer #3 (Public Review):
Summary:
This manuscript from Nichols, Lee, and Shen tackles an important question of how unc6/netrin promotes axon guidance: i.e. haptotaxis vs chemotaxis. This has recently been a large topic of investigation and discussion in the axon guidance field. Using live cell imaging of unc6/netrin and unc40/DCC in several neurons that extend axons ventrally during development, as well as TM localized mutants of Unc6, they suggest that unc6 promotes first haptotaxis of the emerging growth cone followed by chemotaxis of the growth cone. This is timely, as a recent preprint from the Lundquist group, using a similar strategy to make only a TM anchored unc6 similarly found that this could rescue only the haptotaxis-like growth of the PDE neuron, but not the second phase of growth. However, their conclusions were quite different based on the overexpression of unc6 everywhere rescuing the second phase, and thus they conclude that a gradient is not present.
Strengths:
As this has been quite a controversy in both the invertebrate and vertebrate field, one strength of this paper is that they use an unc6-neon green to demonstrate unc6 localization, and show a gradient of localization.
Weaknesses:
This is important, although it could be strengthened by first showing a more zoomed-out image of unc6 in the animal, and second demonstrating the localization of the transmembrane anchored unc6 mutants, to help define what may be the "diffusible Unc6".
Thank you for your comments. We have performed both of these experiments. In Figure 6A, we provide a zoomed-out image of the PDE growth cone interacting with UNC-6::mNG prior to reaching the vSLNC. Notably, we do not observe an obvious gradient that extends into this more dorsal region of the animal. We have also shown the membrane localization of UNC-6<sup>TM</sup> through SNAP labeling in Figure S3A. These data have been added to the manuscript on page 15, lines 342-347.
I have two additional experimental or analysis suggestions. First, the authors should clarify the phenotype of ventral emergence of the growth cone. The manuscript images suggest that, regardless of the mutant, the growth cone emerges ventrally and defects arise only later, yet the authors claim ventral emergence defects with the UNC6 tethered mutants without providing a comparison of rose plots. This is confusing and needs to be addressed.
Thank you for your comment. We have now included images (i.e. slt-1(eh15) and unc-6(ev400); slt-1(eh15) genotypes in Figure 3) and movies showing misoriented axon emergence. We have also provided an additional quantification that allows for statistical comparison of emergence angle across genotypes. This quantification takes the sine function of the angle to quantify the relative emergence trajectory across the dorsal-ventral axis. A value of 1 indicates 90° dorsal emergence, and -1 indicates 90° ventral emergence. Statistical comparisons across genotypes demonstrate that axons in both unc-6 and slt-1 mutants are misoriented relative to wild-type axons. These comparisons can be found in Figures S1B, 3C, S2B, S3C.
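The sine-based quantification described above can be sketched as follows. This is a minimal illustration that assumes the emergence angle is measured in degrees from the anterior-posterior axis, with positive angles pointing dorsally; the function name, the angle convention, and the example angles are hypothetical and are not taken from the manuscript.

```python
import math


def emergence_score(angle_deg):
    """Project an emergence angle onto the dorsal-ventral axis.

    Returns +1 for 90 degrees dorsal emergence and -1 for 90 degrees
    ventral emergence (assumed convention: dorsal angles are positive).
    """
    return math.sin(math.radians(angle_deg))


# Hypothetical emergence angles (degrees) for two groups of axons.
ventral_biased = [-85.0, -70.0, -95.0]
misoriented = [-20.0, 45.0, 10.0]

ventral_scores = [emergence_score(a) for a in ventral_biased]
misoriented_scores = [emergence_score(a) for a in misoriented]
```

Collapsing each angle to a single value on the dorsal-ventral axis gives one number per axon, which is what allows a standard statistical comparison between genotypes.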
Second, I have concerns that the analysis of unc40 polarization may be misleading in some cases when there appears to indeed be accumulation in the growth cone, but since the only analysis shown is relative to the rest of the cell, that can be lost.
Thank you for sharing your concerns about the UNC-40 polarization quantifications. We have separately compared the value of the integrated density of UNC-40::GFP in each cellular domain (vSLNC-contacting area and the dorsal soma) between genotypes. While we did not include these comparisons in the original manuscript, we have now included them in the revised manuscript. Overall, these data support our conclusions that UNC-40 mispolarization occurs across the entire cell (Fig. S1F,G; S2E-H; S3E,F).
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer 1:
Comment 1: Within the scope of the current work there are no major weaknesses. That said, the authors themselves note pressing questions beyond the scope of this study that remain unanswered. For instance, the mechanistic nature of the interactions between FMO-4 and the other players in this story, for example in terms of direct protein-protein interactions, is not at all understood yet.
We thank the reviewer for the positive review, and fully agree and acknowledge that there are unanswered questions for future studies that are beyond the scope of this manuscript.
Reviewer 2:
Comment 1: The effects of carbachol and EDTA on intracellular calcium levels are inferred, especially in the tissues where fmo-4 is acting. Validating that these agents and fmo-4 itself have an impact on calcium in relevant subcellular compartments is important to support conclusions on how fmo-4 regulates and responds to calcium.
We thank the reviewer for this important suggestion. We agree that carbachol and EDTA can be broad agents and that validating that they alter calcium levels is very useful. While this is technically challenging, we attempted to address it by using worms that express the calcium indicator GCaMP7f in neurons and measuring their GFP fluorescence upon exposure to carbachol and EDTA. Assessing both short-term and long-term exposure to these agents, we were able to show that carbachol increases GFP fluorescence, indicating an increase in calcium levels, and EDTA decreases GFP fluorescence, indicating a decrease in calcium levels. Unfortunately, because FMO-4 is not neuronally expressed, we were not able to test the effects of FMO-4 on calcium in this strain, which would require hypodermal expression and possibly short-term modification of fmo-4 expression to test. We have made sure to temper our language about the indirect measures we used.
Comment 2: Experiments are generally reliant on RNAi. While in most cases experiments reveal positive results, indicating RNAi efficacy, key conclusions could be strengthened with the incorporation of mutants.
We appreciate and value this suggestion and agree that mutants could be helpful to strengthen our conclusions. We address this caveat in the discussion of the revised manuscript. We explain that we were concerned about knocking out key calcium regulating genes like itr-1 and mcu-1 that either already result in some level of sickness in the worms when knocked down (itr-1) or could lead to confounding metabolic changes if knocked out. We do find that our RNAi lifespan results are robust and reproducible, but we also understand and recognize the caveats that come with using RNAi knockdown instead of full deletion mutants.
Reviewer 3:
Comment 1: no obvious transcriptomic evidence supporting a link between fmo-4 and calcium signaling: either for knockout worms or fmo-4 overexpressing strains.
We thank the reviewer for this feedback. While there is some transcriptomic evidence, we agree that it is not overwhelming. We do think that this evidence, combined with the phenotype observed under thapsigargin (i.e., significant reduction in worm size and significant delay or prevention of development) and the genetic connections to calcium regulation, makes a compelling case that FMO-4 interacts with calcium signaling.
Comment 2: no direct measures of alterations in calcium flux, signalling or binding that strongly support a connection with fmo-4.
As described in reviewer 2 comment 1, we have successfully used GCaMP7f worms to assess calcium flux upon exposure to carbachol and EDTA. This approach confirmed the changes in calcium expected from these compounds. Unfortunately, because FMO-4 is not neuronally expressed, we were not able to test the effects of FMO-4 on calcium in this strain, which would require hypodermal expression and possibly short-term modification of fmo-4 expression to test. We have made sure to temper our language about the indirect measures we used.
Comment 3: no measures of mitochondrial morphology or activity that strongly support a connection with fmo-4.
This is a great point, and something we are currently working on to include for a future manuscript.
Comment 4: lack of a complete model that places fmo-4 function downstream of DR and mTOR signalling (first Results section), fmo-2 (second Results section) and at the same time explains connection with calcium signalling.
We thank the reviewer for this helpful feedback. We have included a more complete working model in our revision.
Recommendations for the authors:
Reviewer 1:
Comment 1: "We utilized fmo-4 (ok294) knockout (KO) animals on five conditions reported to extend lifespan in C. elegans." Here I believe "fmo-4 (ok294)" should be "fmo-4(ok294)". (No space).
We thank the reviewer for this helpful revision. We have made this change as suggested.
Comment 2: "Wild-type (WT) worms on DR experience a ~35% lifespan extension compared to fed WT worms, but when fmo-4 is knocked out this extension is reduced to ~10% and this interaction is significant by cox regression (p-value < 4.50e-6)." Here "cox regression" should be "Cox regression".
We have made this change as suggested.
Comment 3: "Having established this role, we continued lifespan analyses of fmo-4 KO worms exposed to RNAi knockdown of the S6-kinase gene rsks-1 (mTOR signaling), the von hippel lindau gene vhl-1 (hypoxic signaling), the insulin receptor daf-2 (insulin-like signaling), and the cytochrome c reductase gene cyc-1 (mitochondrial electron transport chain, cytochrome c reductase) (Fig 1C-F)." Here "von hippel lindau" should be "Von Hippel-Lindau".
We have made this change as suggested.
Comment 4: In three instances in the caption of Figure 5, the "4" in fmo-4 is not italicized when it should be.
We have made this change as suggested.
Comment 5: In two instances in the caption of Figure 7, the "4" in fmo-4 is not italicized when it should be, and in one instance in the caption of Figure 7, the "6" in atf-6 is not italicized when it should be.
We have made this change as suggested.
Comment 6: "Supplemental Data 3 provides the results of the Log-rank test and Cox regression analysis, which were run in Rstudio." Here Rstudio should be RStudio.
We have made this change as suggested.
Comment 7: In the references, within article titles italicization (e.g. of Caenorhabditis elegans) is frequently missing. While this is often an artifact introduced by reference management software, it should be corrected in the final manuscript.
We thank the reviewer for all the helpful revision suggestions. We have made sure all the references are properly italicized where necessary.
Reviewer 2:
Comment 1: While FMO-4 is clearly placed in the ER calcium pathway genetically, the molecular mechanism by which FMO-4 would alter ER calcium is unclear. Notably, Tuckowski et al. highlight this gap in the discussion as well.
We thank the reviewer for identifying this important caveat. We hope to address the molecular mechanism by which FMO-4 alters ER calcium in upcoming projects.
Comment 2: Determining whether overexpression of catalytically dead FMO-4 or introduction of an inactivating point mutant into the endogenous locus phenocopy FMO-4 OE and KO animals would help distinguish between mechanisms involving protein-protein interactions or downstream metabolic regulation.
We thank the reviewer for this valuable suggestion. This is an experiment we are hoping to do in the near future to better understand molecular mechanisms and protein-protein interactions.
Reviewer 3:
Comment 1: When measuring the effect of thapsigargin on development of fmo-4 mutants it would be great to use a developmental assay rather than quantifying normalized worm area. Also please add scale bars to Figure 3G and 4H, it seems that fmo-4 overexpression decreases worm size even in control conditions, clarify if this is the case.
We thank the reviewer for this feedback. In addition to quantifying normalized worm area in Figure 3G-I, we have added a developmental assay (Figure 3J) that shows the development time of wild-type worms on DMSO or thapsigargin as well as the fmo-4 OE worms on DMSO or thapsigargin. These data validate that the fmo-4 OE worm development is either delayed significantly or even prevented when the worms are treated with thapsigargin.
We have added scale bars to Figure 3G and 4H as suggested.
We also appreciate the reviewer’s observation of the fmo-4 overexpression worms appearing smaller than wild-type worms in control conditions. We looked through the replicates and found that just one replicate showed a significant decrease in worm size, as observed in our unrevised manuscript. We repeated this experiment twice more to gather more data and determined that the fmo-4 overexpression worms were ultimately not significantly different in size compared to wild-type worms. We have included the new images and quantifications in Figure 3G-I and Figure 4H-J in the revised manuscript.
Comment 2: correct or replace Supplementary Table 2, which is not showing a DAVID analysis as the title and text would suggest. We should see biological/molecular processes, effect sizes, p-values, ...
We thank the reviewer for identifying this issue. We have added more detail to Supplementary Table 2 so that it is clearer what is being shown in each tab.
Comment 3: clarify the data presented in Supplementary Data 2 because it does not clearly explain what is shown
This is a great point, and we have added more detail to Supplementary Data 2 to make sure the data are more clearly explained in each tab.
Comment 4: in Figure 5B the fluorescent images do not seem to reflect the quantification in panel 5C.
Thank you for this feedback. We re-analyzed our data to make sure the proper fluorescent images are included with their matching quantifications in Figure 5B-C.
Comment 5: where is Supplementary Data 3?
We thank the reviewer for noticing this. Supplementary Data 3 was accidentally missing from the first submission, and has now been added.
Comment 6: conceptually the last results section (regarding atf-6) does not add much to the story, I would consider removing these results
We appreciate this feedback. We have decided to keep Figure 7 because we think it helps to validate fmo-4’s role in calcium movement from the ER. While we show genetic interactions between fmo-4 and key genes involved in calcium regulation (crt-1, itr-1, and mcu-1), we think that showing how fmo-4 also interacts with atf-6, a known regulator of calcium homeostasis, strengthens and supports the genetic mechanisms of fmo-4 proposed in this manuscript.
Comment 7: the model proposed in Figure 7E is not convincingly supported by the results: the arrows connecting atf-6, fmo-4 and crt-1 (calreticulin) suggest that fmo-4 is downstream of atf-6 and upstream of crt-1: Berkowitz 2020 showed that atf-6 knockdown downregulates calreticulin, so unless the authors show that this downregulation is mediated directly by fmo-4, the more likely explanation is that atf-6 knockdown affects calcium levels which in turn induces fmo-4 expression.
We thank the reviewer for this helpful feedback. We have addressed this by updating our proposed model. We used a solid arrow leading from the reduction of atf-6 to induction of fmo-4, as this is supported by our data in Figure 7A-B. We then used dashed arrows between fmo-4 and crt-1 as well as between atf-6 and crt-1 to indicate that more data is needed to clarify this part of the pathway.
Comment 8: Avoid pointing at a mitochondrial connection in the title as the only evidence supporting this interaction comes from the mcu-1 RNAi epistasis.
We appreciate the reviewer’s suggestion. We added another piece of evidence suggesting an interaction between fmo-4 and the mitochondria to Supplementary Figure 7G-H. Here we show that while fmo-4 OE worms are resistant to paraquat stress, knocking down vdac-1 (a calcium regulator located in the outer mitochondrial membrane) abrogates this effect. We have kept mitochondria in our title but have made sure to temper our language in the main text to avoid pointing to a strong mitochondrial connection, since we have only two pieces of evidence connecting fmo-4 to the mitochondria.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Reviewer #1 (Public review):
Hüppe and colleagues had already developed an apparatus and an analytical approach to capture swimming activity rhythms in krill. In a previous manuscript they explained the system, and here they employ it to show a circadian clock, supplemented by exogenous light, produces an activity pattern consistent with "twilight" diel vertical migration (DVM; a peak at sunset, a midnight sink, and a peak in the latter half of the night).
They used light:dark (LD) followed by dark:dark (DD) photoperiods at two times of the year to confirm the circadian clock, coupled with DD experiments at four times of year to show rhythmicity occurs throughout the year along with DVM in the wild population. The individual activity data show variability in the rhythmic response, which is expected. However, their results showed rhythmicity was sustained in DD throughout the year, although the amplitude decayed quickly. The interpretation of a weak clock is reasonable, and they provide a convincing justification for the adaptive nature of such a clock in a species that has a wide distributional range and experiences various photic environments. These data also show that exogenous light increases the activity response and can explain the morning activity bouts, with the circadian clock explaining the evening and late-night bouts. This acknowledgement that vertical migration can be driven by multiple proximate mechanisms is important.
The work is rigorously done, and the interpretations are sound. I see no major weaknesses in the manuscript. Because a considerable amount of processing is required to extract and interpret the rhythmic signals (see Methods and previous AMAZE paper), it is informative to have the individual activity plots of krill as a gut check on the group data.
The manuscript will be useful to the field as it provides an elegant example of looking for biological rhythms in a marine planktonic organism and disentangling the exogenous response from the endogenous one. Furthermore, as high latitude environments change, understanding how important organisms like krill have the potential to respond will become increasingly important. This work provides a solid behavioral dataset to complement the earlier molecular data suggestive of a circadian clock in this species.
We appreciate the positive evaluation of our work by Reviewer 1, acknowledging our approach to record locomotor activity in krill as well as the importance of the findings in assessing krill’s potential to respond to environmental change in their habitat.
Reviewer #2 (Public review):
Summary:
This manuscript provides experimental evidence on circadian behavioural cycles in Antarctic krill. The krill were obtained directly from krill fishing vessels and the experiments were carried out on board using an advanced incubation device capable of recording activity levels over a number of days. A number of different experiments were carried out where krill were first exposed to simulated light:dark (L:D) regimes for some days followed by continuous darkness (DD). These were carried out on krill collected during late autumn and late summer. A further set of experiments was performed on krill across three different seasons (summer, autumn, winter), where incubations were all DD conditions. Activity was measured as the frequency by which an infrared beam close to the top of the incubation tube was broken over unit time. Results showed that patterns of increased and decreased activity that appeared synchronised to the LD cycle persisted during the DD period. This was interpreted as evidence of the operation of an internal (endogenous) clock. The amplitude of the behavioural cycles decreased with time in DD, which further suggests that this clock is relatively weak. The authors argued that the existence of a weak endogenous clock is an adaptation to life at high latitudes since allowing the clock to be modulated by external (exogenous) factors is an advantage when there is a high degree of seasonality. This hypothesis is further supported by seasonal DD experiments which showed that the periodicity of high and low activity levels differed between seasons.
Strengths
Although there have been many field observations of various circadian-type behaviours in Antarctic krill, relatively few experimental studies have been published considering this behaviour in terms of circadian patterns of activity. Krill are not a model organism and obtaining them and incubating them in suitable conditions are both difficult undertakings. Furthermore, there is a need to consider what their natural circadian rhythms are without the overinfluence of laboratory-induced artefacts. For this reason alone, the setup of the present study is ideal to consider this aspect of krill biology.
Furthermore, the equipment developed for measuring levels of activity is well-designed and likely to minimise artefacts.
We would like to thank Reviewer 2 for their positive assessment of our approach to studying the influence of the circadian clock on krill behavior. We are delighted that Reviewer 2 found our mechanistic approach to understanding daily behavioral patterns of Antarctic krill using the AMAZE set-up convincing, and that the challenging circumstances of working with a polar, non-model species are acknowledged.
Weaknesses
I have little criticism of the rationale for carrying out this work, nor of the experimental design. Nevertheless, the manuscript would benefit from a clearer explanation of the experimental design, particularly aimed at readers not familiar with research into circadian rhythms. Furthermore, I have a more fundamental question about the relationship between levels of activity and DVM on which I will expand below. Finally, it was unclear how the observational results made here related to the molecular aspects considered in the Discussion.
(1) Explanation of experimental design - I acknowledge that the format of this particular journal insists that the Results are the first section that follows the Introduction. This nevertheless presents a problem for the reader since many of the concepts and terms that would generally be in the Methods are yet to be explained to the reader. Hence, right from the start of the Results section, the reader is thrown into the detail of what happened during the LD-DD experiments without being fully aware of why this type of experiment was carried out in the first place. Even after reading the Methods, further explanation would have been helpful. Circadian cycle type research of this sort often entrains organisms to certain light cycles and then takes the light away to see if the cycle continues in complete darkness, but this critical piece of knowledge does not come until much later (e.g. lines 369-372), leaving the reader guessing until this point why the authors took the approach they did. I would suggest the following: (1) that more effort is made in the Introduction to explain the exact LD/DD protocols adopted; (2) that a schematic figure is placed early on in the manuscript where the protocol is explained, including some logical flow charts of e.g. if behavioural cycle continues in DD then internal clock exists versus if cycle does not continue in DD, the exogenous cues dominate - followed by - major decrease in cyclic amplitude = weak clock versus minor decrease = strong clock and so on.
We would like to thank Reviewer 2 for pointing out that the experimental design and the rationale behind it do not become clear early in the manuscript, especially for people outside the field of chronobiology. We think that the suggestion to include a schematic figure early in the manuscript is excellent and we plan to implement this in a revised version of the manuscript.
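The decision logic that the reviewer proposes for the schematic can be written out as a short sketch. The function name, the use of a fractional amplitude drop, and the 0.5 cut-off separating a "major" from a "minor" decrease are all hypothetical choices made for illustration; neither the review nor the response specifies a threshold.

```python
def interpret_dd_outcome(rhythm_persists_in_dd, amplitude_drop,
                         major_drop_threshold=0.5):
    """Classify an LD-then-DD experiment following the reviewer's flow chart.

    rhythm_persists_in_dd: True if the behavioural cycle continues in
        constant darkness (DD).
    amplitude_drop: fractional decrease in cycle amplitude in DD
        relative to LD (0.0 = no decrease, 1.0 = rhythm fully damped).
    major_drop_threshold: hypothetical cut-off between a "major" and a
        "minor" decrease; not a value from the manuscript.
    """
    if not rhythm_persists_in_dd:
        return "exogenous cues dominate"
    if amplitude_drop >= major_drop_threshold:
        return "endogenous clock, weak"
    return "endogenous clock, strong"


# Illustrative inputs only.
print(interpret_dd_outcome(True, 0.7))   # rhythm persists, damps quickly
print(interpret_dd_outcome(True, 0.1))   # rhythm persists, little damping
print(interpret_dd_outcome(False, 1.0))  # rhythm lost in DD
```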
(2) Activity vs kinesis - in this study, we are shown data that (i) krill have a circadian cycle - incubation experiments; (ii) that krill swarms display DVM in this region - echosounder data (although see my later point). My question here is regarding the relationship between what is being measured by the incubation experiments and the in situ swarm behaviour observations. The incubation experiments are essentially measuring the propensity of krill to swim upwards since it logs the number of times an individual (or group) breaks a beam towards the top of the incubation tube. I argue that krill may be still highly active in the rest of the tube but just do not swim close to the surface, so this approach may not be a good measure of "activity". Otherwise, I suggest a more correct term of what is being measured is the level of "upward kinesis". As the authors themselves note, krill are negatively buoyant and must always be active to remain pelagic. What changes over the day-night cycle is whether they decide to expend that activity on swimming upwards, downwards or remaining at the same depth. Explaining the pattern as upward kinesis then also explains why swarms move upwards during the night. Just being more active at night may not necessarily result in them swimming upwards.
We believe that there is a slight misunderstanding about how what we call “activity” is measured. The experimental columns are equipped with five detector modules, evenly distributed over the height of the column. In our analysis we count all beam breaks that are caused by upward movement, i.e. every time a detector module is triggered after a detector module at a lower position has been triggered, and not only when the top detector module is triggered. In this way, we record upward swimming movements throughout the column, and not only when the krill swims all the way to the top of the column. This still means that what we are measuring is swimming activity, caused by upward swimming. We use this measure to deliberately separate increased swimming activity from baseline activity (i.e. swimming which solely compensates for negative buoyancy) and inactivity (i.e. passive sinking).
A higher activity is thus at first interpreted as an increase in swimming activity, which in the field may result in upwards directed swimming but also could mean a horizontal increase in activity, for example representing increased foraging and feeding activity. This would explain the daily activity pattern observed under LD cycles (Fig. 2), which shows a general increase in activity during the dark phase. This nighttime increase could be used for both upward directed migration during sunset as well as horizontal directed swimming for feeding and foraging throughout the night.
We will formulate the description of the activity metric more clearly in the revised version of the manuscript.
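The counting rule described in this response can be sketched as follows. This is a simplified illustration, not the AMAZE analysis code: it assumes the detector modules are indexed 0 (bottom) to 4 (top), that triggers arrive as a time-ordered list of indices, and that "after a detector module at a lower position has been triggered" refers to the immediately preceding trigger. The function name is hypothetical.

```python
def count_upward_breaks(detector_sequence):
    """Count beam breaks attributable to upward movement.

    detector_sequence: time-ordered detector indices, where 0 is the
        lowest module and larger numbers are higher in the column
        (assumed indexing).
    A trigger counts as upward when the previously triggered module
    sits at a lower position than the current one.
    """
    count = 0
    previous = None
    for detector in detector_sequence:
        if previous is not None and detector > previous:
            count += 1
        previous = detector
    return count


# Made-up trigger sequence: the animal climbs, sinks, then climbs again.
events = [0, 1, 2, 1, 0, 0, 1, 3, 4, 2]
print(count_upward_breaks(events))
```

Under this rule, upward movements anywhere in the column contribute to the count, whereas sinking or repeated triggering of the same module does not.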
(3) Molecular relevance - Although I am interested in molecular clock aspects behind these circadian rhythms, it was not made clear how the results of the present study allow any further insight into this. In lines 282 to 284, the findings of the study by Biscontin et al (2017) are discussed with regard to how TIM protein is degraded by light via the clock photoreceptor CRYPTOCHROME 1. This element of the Discussion would be a lot more relevant if the results of the present study were considered in terms of whether they supported or refuted this or any other molecular clock model. As it stands, this paragraph is purely background knowledge and a candidate for deletion in the interest of shortening the Discussion.
We agree that this part is not directly related to the data presented in the manuscript and will therefore omit this part in the revised version of the manuscript to keep the discussion concise and focused on the results.
Other aspects
(i) 'Bimodal swimming' was used in the Abstract and later in the text without the term being fully explained. I could interpret it to mean a number of things so some explanation is required before the term is introduced.
We thank the Reviewer for pointing this out and will provide an explanation for the term “bimodal swimming” in a revised version of the manuscript.
(ii) Midnight sinking - I was struck by Figure 2b with regards to the dip in activity after the initial ascent, as well as the rise in activity predawn. Cushing (1951) Biol Rev 26: 158-192 describes the different phases of a DVM common to a number of marine organisms observed in situ where there is a period of midnight sinking following the initial dusk ascent and a dawn rise prior to dawn descent. Tarling et al (2002) observe midnight sinking pattern in Calanus finmarchicus and consider whether it is a response to feeding satiation or predation avoidance (i.e. exogenous factors). Evidence from the present study indicates that midnight sinking (and potential dawn rise) behaviour could alternatively be under endogenous control to a greater or lesser degree. This is something that should certainly be mentioned in the Discussion, possibly in place of the molecular discussion element mentioned above - possibly adding to the paragraph Lines 303-319.
We would like to thank the Reviewer for pointing this out and agree that it would be interesting to add the idea of an endogenous control of midnight sinking to the discussion. We plan to implement this in a revised version of the manuscript.
(iii) Lines 200-207 - I struggled to follow this argument regarding Piccolin et al identifying a 12 h rhythm whereas the present study indicates a ~24 h rhythm. Is one contradicting the other - please make this clear.
In our study we found that the circadian clock drives a bimodal pattern of swimming activity in krill, meaning it controls two bouts of activity in a 24 h cycle. Piccolin et al. (2020) identified a swimming activity pattern of ~12 h (i.e. two peaks in 24 h) at the group level, which is in line with our findings at the individual level. We will revisit the mentioned section for more clarity in a revised version.
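The correspondence between a bimodal daily pattern and a ~12 h rhythm can be illustrated with an idealized sketch. It builds a synthetic hourly activity trace with two equal peaks exactly 12 h apart and compares the spectral power at 24 h and 12 h periods. The peak times, peak width, and function names are assumptions chosen for illustration; real krill activity peaks need not be equal in height or evenly spaced, in which case a 24 h component would also appear.

```python
import cmath
import math


def bimodal_day(peak_hours=(6.0, 18.0), width_h=1.5):
    """Hourly activity for one 24 h day with Gaussian peaks (synthetic)."""
    day = []
    for hour in range(24):
        value = 0.0
        for peak in peak_hours:
            # Circular distance on a 24 h clock.
            dist = min(abs(hour - peak), 24 - abs(hour - peak))
            value += math.exp(-0.5 * (dist / width_h) ** 2)
        day.append(value)
    return day


def power_at_period(signal, period_h, dt_h=1.0):
    """Power of the Fourier component with the given period (hours)."""
    n = len(signal)
    mean = sum(signal) / n
    freq = 1.0 / period_h  # cycles per hour
    coeff = sum((x - mean) * cmath.exp(-2j * math.pi * freq * k * dt_h)
                for k, x in enumerate(signal))
    return abs(coeff) ** 2 / n


signal = bimodal_day() * 4  # four identical synthetic days
p24 = power_at_period(signal, 24.0)
p12 = power_at_period(signal, 12.0)
print("12 h component dominates:", p12 > p24)
```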
(iv) Although I agree that the hydroacoustic data should be included and is generally supportive of the results, I think that two further aspects should be made clear for context (a) whether there was any groundtruthing that the acoustic marks were indeed krill and not potentially some other group known to perform DVM such as myctophids (b) how representative were these patterns - I have a sense that they were heavily selected to show only ones with prominent DVM as opposed to other parts of the dataset where such a pattern was less clear - I am aware of a lot of krill research where DVM is not such a clear pattern and it is disingenuous to provide these patterns as the definitive way in which krill behaves. I ask this be made clear to the reader (note also that there is a suggestion of midnight sinking in Fig 5b on 28/2).
To clarify the mentioned points concerning the hydroacoustic data:
a) As mentioned in the Methods section, only hydroacoustic data during active fishing was included in the analysis. E. superba occurs in large monospecific aggregations and the fishery is actively targeting E. superba and monitoring their catch and the proportion of non-target species continuously with cameras. Krill fishery bycatch rates are very low (0.1–0.3%, Krafft et al. 2018), and fishing operations would stop if non-target species were being caught in significant proportions at any time. Therefore, and supported by our own observations when we conducted the experiments, we argue that it is a valid assumption that the backscattering signal shown in Figure 5 is predominantly caused by E. superba.
b) We are aware of the fact that DVM patterns of Antarctic krill are highly variable and that normal DVM patterns do not need to be the rule (e.g. see our cited study on the plasticity of krill DVM by Bahlburg et al. 2023). The visualized data were not selected for their DVM pattern but represent the period directly preceding the sampling for behavioral experiments in four different seasons (namely S1-S4), including the day of sampling. These periods were chosen to assess the DVM behavior of krill swarms in the field in the days before and during the sampling for behavioral experiments.
We will include these aspects in the Methods section in a revised version of the manuscript in order to improve understanding.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
The authors' research group had previously demonstrated the release of large multivesicular body-like structures by human colorectal cancer cells. This manuscript expands on their findings, revealing that this phenomenon is not exclusive to colorectal cancer cells but is also observed in various other cell types, including different cultured cell lines, as well as cells in the mouse kidney and liver. Furthermore, the authors argue that these large multivesicular body-like structures originate from intracellular amphisomes, which they term "amphiectosomes." These amphiectosomes release their intraluminal vesicles (ILVs) through a "torn-bag mechanism." Finally, the authors demonstrate that the ILVs of amphiectosomes are either LC3B positive or CD63 positive. This distinction implies that the ILVs either originate from amphisomes or multivesicular bodies, respectively.
Strengths:
The manuscript reports a potential origin of extracellular vesicle (EV) biogenesis. The reported observations are intriguing.
Weaknesses:
It is essential to note that the manuscript has issues with experimental designs and lacks consistency in the presented data. Here is a list of the major concerns:
(1) The authors culture the cells in the presence of fetal bovine serum (FBS) in the culture medium. Given that FBS contains a substantial amount of EVs, this raises a significant issue, as it becomes challenging to differentiate between EVs derived from FBS and those released by the cells. This concern extends to all transmission electron microscopy (TEM) images (Figure 1, 2P-S, S5, Figure 4 P-U) and the quantification of EV numbers in Figure 3. The authors need to use an FBS-free cell culture medium.
Although FBS indeed contains bovine EVs, the very large multivesicular EVs (amphiectosomes) that our manuscript focuses on have never been observed or reported in FBS. For reported size distributions of EVs in FBS, please find a few relevant references below:
PMID: 29410778, PMID: 33532042, PMID: 30940830 and PMID: 37298194
All the above publications show that the number of lEVs > 350-500 nm is negligible in FBS. The average diameter of MV-lEVs (amphiectosomes) described in our manuscript is around 1.00-1.50 micrometer.
Reviewer #1: These papers evaluated the effectiveness of various methods to eliminate EVs from FBS, emphasizing the challenges associated with the presence of EVs in FBS. They also caution against using FBS in EV studies due to these issues. However, I did not find a clear indication regarding the size distributions of EVs in FBS in these papers.
Please provide accurate reference supporting the claim that 'lEVs > 350-500 nm are negligible in FBS.' The papers cited by the authors do not address this specific point.
In the revised manuscript, we addressed this point: because FBS is sterile filtered, it cannot contain large (>0.22 µm) EVs.
Our response to Reviewer #1 point 2. When we presented TEM of isolated EVs, we consistently used serum-free conditioned medium (Fig2 P-S, Fig2-S5 J, O), as described previously (Németh et al. 2021, PMID: 34665280).
Reviewer #1: This is an important point that is not mentioned in the original main text, figure legend or method. Please address.
We agree and we apologize for it. We added this information to the revised manuscript.
Our response to Reviewer #1 point 3. Our TEM images show cells captured in the process of budding and scission of large multivesicular EVs excluding the possibility that these structures could have originated from FBS.
Reviewer #1: These images may also depict the engulfment of EVs in FBS. Hence, it is crucial to utilize EV-free or EV-depleted FBS.
As we mentioned earlier, we added the information to the revised manuscript that sterile filtering of the FBS presumably removed EVs larger than 0.22 µm.
Our response to Reviewer #1 point 4. In addition, in our confocal analysis, we studied Palm-GFP positive, cell-line derived MV-lEVs. Importantly, in these experiments, FBS-derived EVs are non-fluorescent, therefore, the distinction between GFP positive MV-lEVs and FBS-derived EVs was evident.
Reviewer #1: I agree that these fluorescent-labeled assays conclusively indicate that the MV-lEVs originate from the cells. However, the images of concern are the non-fluorescent-labeled images (Figure 1, 2P-S, S5, Figure 4 P-U and Figure 3). The MV-lEVs may derive from both the cells and FBS.
Please see above our response to points 1-3.
Our response to Reviewer #1 point 5. In addition, culturing cells in FBS-free medium (serum starvation) significantly affects autophagy. Given that in our study, we focused on autophagy related amphiectosome secretion, we intentionally chose to use FBS supplemented medium.
Reviewer #1: If this is a concern, the authors should use EV-depleted FBS.
As we discussed above, sterile filtration of FBS removes particles >0.22 µm. In addition, based on our preliminary experiments, EV-depleted serum may affect cell physiology.
Our response to Reviewer #1 point 6. Even though the authors of this manuscript are not familiar with the technological details of how FBS is processed before commercialization, it is reasonable to assume that the samples are subjected to sterile filtration (through a 0.22 micron filter), after which MV-lEVs cannot be present in the commercial FBS samples.
Reviewer #1: This is a fair comment that needs to be included in the manuscript.
As you suggested, this comment is now included in the revised manuscript.
(2) The data presented in Figure 2 are not convincingly supportive of the authors' conclusion. The authors argue that "...CD81 was present in the plasma membrane-derived limiting membrane (Figures 2B, D, F), while CD63 was only found inside the MV-lEVs (Fig. 2A, C, E)." However, in Figure 2G, there is an observable CD63 signal in the limiting membrane (overlapping with the green signals), and in Figure 2J, CD81 also exhibits overlap with MV-lEVs.
Both CD63 and CD81 are tetraspanins known to be present both in the membrane of sEVs and in the plasma membrane of cells (for references, please see the UniProt subcellular location maps: https://www.uniprot.org/uniprotkb/P08962/entry#subcellular_location and https://www.uniprot.org/uniprotkb/P60033/entry#subcellular_location). However, following the reviewer's feedback, for clarity, we will delete the sentence in question from the text.
Reviewer #1: Please also justify the statement questioned in (3), as these arguments are interconnected.
We hope you find our above responses to your comment acceptable.
(3) Following up on the previous concern, the authors argue that CD81 and CD63 are exclusively located on the limiting membrane and MV-lEVs, respectively (Figure 2-A-M). However, in lines 104-106, the authors conclude that "The simultaneous presence of CD63, CD81, TSG101, ALIX, and the autophagosome marker LC3B within the MV-lEVs..." This statement indicates that CD63 and CD81 co-localize to the MV-lEVs. The authors need to address this apparent discrepancy and provide an explanation.
There must be a misunderstanding because we did not claim or imply in the text that “CD81 and CD63 are exclusively located on the limiting membrane and MV-lEVs”. Here we studied co-localization of the above proteins in the case of intraluminal vesicles (ILVs). In Fig 2, we did not show any analysis of limiting membrane co-localization.
Reviewer #1 I have indicated that this statement is found in lines 104-106, where the authors argue, 'The simultaneous presence of CD63, CD81, TSG101, ALIX, and the autophagosome marker LC3B within the MV-lEVs...' If the authors acknowledge the inaccuracy of this statement, please provide a justification for this argument.
For clarity, we modified the description of data shown in Fig2 in the revised manuscript.
(4) The specificity of the antibodies used in Figure 2 should be validated through knockout or knockdown experiments. Several of the antibodies used in this figure detect multiple bands on western blots, raising doubts about their specificity. Verification through additional experimental approaches is essential to ensure the reliability and accuracy of all the immunostaining data in this manuscript.
We will consider this suggestion during the revision of the manuscript.
Reviewer #1: Please do so.
We carefully considered the suggestion, but we realized that it was not feasible for us to perform gene silencing for all the antibodies we used before resubmission of our revised manuscript. However, we repeated the Western blot for mouse anti-CD81 (Invitrogen MAA5-13548) and replaced the previous Western blot with it in the revised manuscript (Fig.2-S4H).
(5) In Figures 2P-R, the morphology of the MV-lEVs does not resemble those shown in Figures 1-A, H, and D, indicating a notable inconsistency in the data.
EM images in Figure2 P-R show sEVs separated from serum-free conditioned media as opposed to MV-lEVs, which were captured in situ in fixed tissue cultures (Fig1). Therefore, the two EV populations necessarily have different sizes and structures. Furthermore, Fig. 1 shows images of ultrathin sections, while in Figure 2P-R, we used negative-positive contrasting of intact sEVs without embedding and sectioning.
(6) There are no loading controls provided for any of the western blot data.
Not even the latest MISEV 2023 guidelines give recommendations for proper loading control for separated EVs in Western blot (MISEV 2023, DOI: 10.1002/jev2.12404, PMID: 38326288). Here we applied our previously developed method (PMID: 37103858), which, in our opinion, is the most reliable approach to be used for sEV Western blotting. For whole cell lysates, we used actin as loading control (Fig3-S2B).
Reviewer #1: The blots referenced here (Fig2-S3; Fig2-S4B; Fig3-S2B) were conducted using total cell lysates, not EV extracts. Only one blot in Fig3-S2B includes an actin control. All remaining blots should incorporate actin controls for consistency.
Fig2-S3 (corresponding to Fig2-S4 in the revised manuscript) only shows reactivity of the used antibodies. This Western blot is not intended to serve as a basis of any quantitative conclusions. Fig2-S4 (corresponding to Fig2-S5 in the revised manuscript) includes the actin control. Fig3-S2B shows the complete membrane, which was cut into 4 pieces, and the immune reactivity of different antibodies was tested. The actin band was included on the anti-LC3B blot. For clarity, we rephrased the figure legend.
Additionally, for Figures 2-S4B, the authors should run the samples from lanes i-iii in a single gel.
Please note that in Figure 2- S4B, we did run a single gel, and the blot was cut into 4 pieces, which were tested by anti-GFP, anti-RFP, anti-LC3A and anti-LC3B antibodies. Full Western blots are shown in Fig.3_S2 B, and lanes “1”, “2” and “3” correspond to “i”, “ii” and “iii” in Fig.2-S4, respectively.
Reviewer #1: In the original Figure 2- S4B, the blots were sectioned into 12 pieces. If lanes "i," "ii," and "iii" were run on the same blot, the authors are advised to eliminate the grids between these lanes.
Grids separating the lanes have been eliminated on Fig.2_S4 (now Fig.2_S5 in the revised manuscript).
(7) In Figure 2-S4, is there co-localization observed between LC3RFP (LC3A?) and other MV-lEV markers? How about LC3B? Does LC3B co-localize with other MV-lEV markers?
In Supplementary Figure 2-S4, we showed successful generation of HEK293T-PalmGFP-LC3RFP cell line. In this case we tested the cells, and not the released MV-lEVs. LC3A co-localized with the RFP signal as expected.
Reviewer #1: Does LC3RFP colocalize with MV-lEV markers in the HEK293T-PalmGFP-LC3RFP cell line? This experiment aims to clarify the conclusion made in lines 104-106, where the authors assert that 'The concurrent existence of CD63, CD81, TSG101, ALIX, and the autophagosome marker LC3B within the MV-lEVs...'
In the case of PalmGFP-LC3RFP cells, LC3-RFP is overexpressed. Simultaneous assessment of this overexpressed protein with non-overexpressed molecules detected by fluorescent antibodies proved to be challenging because of spectral overlaps and inappropriate signal-to-noise ratios. Furthermore, in association with EVs, the number of antibody-detected molecules is substantially lower than in cells. Therefore, even though we tried, we could not successfully perform these experiments.
(8) The TEM images presented in Figure 2-S5, specifically F, G, H, and I, do not closely resemble the images in Figure 2-S5 K, L, M, N, and O. Despite this dissimilarity, the authors argue that these images depict the same structures. The authors should provide an explanation for this observed discrepancy to ensure clarity and consistency in the interpretation of the presented data.
As indicated in Materials and Methods, Fig 2-S5 F, G, H and I are conventional TEM images of samples fixed with 4% glutaraldehyde and 1% OsO4 for 2 h and embedded in Epon resin, with post-contrasting using 3.75% uranyl acetate for 10 min and lead citrate for 12 min. Samples processed this way have very high structure preservation and better image quality; however, they are not suitable for immune detection. In contrast, Fig. 2-S5 K, L, M, N show immunogold labelling of in situ fixed samples. In this case we used milder fixation (4% PFA, 0.1% glutaraldehyde, postfixed with 0.5% OsO4 for 30 min) and LR-White hydrophilic resin embedding. This special resin enables immunogold TEM analysis. The sections were exposed to H2O2 and NaBH4 to render the epitopes accessible in the resin. Because of the different techniques applied, the preservation of the structure is not the same. In the case of Fig.2 J, O, separated sEVs were visualised by negative-positive contrast and immunogold labelling as described previously (PMID: 37103858).
Reviewer #1: Please include this justification in the revised version.
We included this justification in the revised manuscript.
(9) For Figures 3C and 3-S1, the authors should include the images used for EV quantification. Considering the concern regarding potential contamination introduced by FBS (concern 1), it is advisable for the authors to employ an independent method to identify EVs, thereby confirming the reliability of the data presented in these figures.
In our revised manuscript, we will provide all the images used for EV quantification in Figure 3C. Given that Figures 3C and 3-S1 show MV-lEVs released by HEK293T-PalmGFP cells, the possible interference by FBS-derived non-fluorescent EVs can be excluded.
Reviewer #1: Please provide all the images.
Original LASX files are provided (DOI: 10.6019/S-BIAD1456).
Reviewer #1: The images raising concerns regarding the contamination of EVs in FBS primarily consist of transmission electron microscopy (TEM) images, namely, Figure 1, 2P-S, S5, and Figure 4 P-U, along with the quantification of EV numbers in Figure 3. These concerns persist despite the use of fluorescent-labeled experiments. While fluorescent-labeled MV-lEVs are conclusively identified as originating from the cells, the MV-lEVs observed in Figure 1, 2P-S, S5, and Figure 4 P-U and Figure 3 may derive from both the cells and FBS.
Large EVs (with diameter >800 nm) derived from FBS were not present in our experiments, as discussed above.
(10) Do the amphiectosomes released from other cell types as well as cells in mouse kidneys or liver contain LC3B positive and CD63 positive ILVs?
Based on our confocal microscopic analysis, in addition to the HEK293T-PalmGFP cells, HT29 and HepG2 cells also release similar LC3B and CD63 positive MV-lEVs. Preliminary evidence shows MV-lEV secretion by additional cell types.
Reviewer #1: Please show these data in the revised manuscript. Moreover, do cells in mouse kidneys or liver contain LC3B positive and CD63 positive ILVs?
We have added new confocal microscopic images to Fig2-S3 showing amphiectosomes released also by the H9c2 (ATCC) cardiomyoblast cell line. To preserve the ultrastructure of MV-lEVs in complex organs like kidney and liver, fixation with 4% glutaraldehyde with 1% OsO4 appears to be essential. This fixation does not allow for immune detection to assess LC3B and CD63 positive MV-lEVs in the ultrathin sections.
Reviewer #2 (Public Review):
Summary:
The authors had previously identified that a colorectal cancer cell line generates small extracellular vesicles (sEVs) via a mechanism where a larger intracellular compartment containing these sEVs is secreted from the surface of the cell and then tears to release its contents. Previous studies have suggested that intraluminal vesicles (ILVs) inside endosomal multivesicular bodies and amphisomes can be secreted by the fusion of the compartment with the plasma membrane. The 'torn bag mechanism' considered in this manuscript is distinctly different because it involves initial budding off of a plasma membrane-enclosed compartment (called the amphiectosome in this manuscript, or MV-lEV). The authors successfully set out to investigate whether this mechanism is common to many cell types and to determine some of the subcellular processes involved.
The strengths of the study are:
(1) The high-quality imaging approaches used seem to show good examples of the proposed mechanism.
(2) They screen several cell lines for these structures, also search for similar structures in vivo, and show the tearing process by real-time imaging.
(3) Regarding the intracellular mechanisms of ILV production, the authors also try to demonstrate the different stages of amphiectosome production and differently labelled ILVs using immuno-EM.
Several of these techniques are technically challenging to do well, and so these are critical strengths of the manuscript.
The weaknesses are:
(1) Most of the analysis is undertaken with cell lines. In fact, all of the analysis involving the assessment of specific proteins associated with amphiectosomes and ILVs are performed in vitro, so it is unclear whether these processes are really mirrored in vivo. The images shown in vivo only demonstrate putative amphiectosomes in the circulation, which is perhaps surprising if they normally have a short half-life and would need to pass through an endothelium to reach the vessel lumen unless they were secreted by the endothelial cells themselves.
Our previous results analyzing PFA-fixed, paraffin embedded sections of colorectal cancer patients provided direct evidence that MV-lEV secretion also occurs in humans in vivo (PMID: 31007874). Regarding your comment on the presence of amphiectosomes in the circulation despite their short half-lives, we would like to point out that Fig1.X shows a circulating lymphocyte which releases MV-lEV within the vessel lumen. Furthermore, in the revised manuscript, an additional Fig.1-S1 is provided. Here, we show the release of MV-lEVs both by an endothelial and a sub-endothelial cell (Fig.1-S1G). In addition, these images show the simultaneous presence of MV-lEVs and sEVs in the circulation (Fig.1-S1.A,C,D,H and I). The transmission electron micrographs of mouse kidney and liver sections provide additional evidence that the MV-lEVs are released by different types of cells, and the “torn bag release” also takes place in vivo (Fig.1.V).
(2) The analysis of the intracellular formation of compartments involved in the secretion process (Figure 2-S5) relies on immuno-EM, which is generally less convincing than high-/super-resolution fluorescence microscopy because the immuno-labelling is inevitably very sporadic and patchy. High-quality EM is challenging for many labs (and seems to be done very well here), but high-/super-resolution fluorescence microscopy techniques are more commonly employed, and the study already shows that these techniques should be applicable to studying the intracellular trafficking processes.
As you suggested, in the revised manuscript, we present additional super-resolution microscopy (STED) data. The intracellular formation of amphisomes, the fragmentation of LC3B-positive membranes and the formation of LC3B-positive ILVs were captured (Fig. 3B-F).
(3) One aspect of the mechanism, which needs some consideration, is what happens to the amphisome membrane, once it has budded off inside the amphiectosome. In the fluorescence images, it seems to be disrupted, but presumably, this must happen after separation from the cell to avoid the release of ILVs inside the cell. There is an additional part of Figure 1 (Figure 1Y onwards), which does not seem to be discussed in the text (and should be), that alludes to amphiectosomes often having a double membrane.
We agree with your comment regarding the amphisome membrane, and we added a sentence to the Discussion of the revised manuscript. Fig1Y onwards is now discussed in the manuscript. In addition, we labelled the surface of living HEK293 cells with wheat germ agglutinin (WGA), which binds to sialic acid and N-acetyl-D-glucosamine. After removing the unbound WGA by washes, the cells were cultured for an additional 3 hours, and the release of amphiectosomes was studied. The budding amphiectosome had a WGA positive membrane, providing evidence that the external limiting membrane had a plasma membrane origin (Fig.3G).
(4) The real-time analysis of the amphiectosome tearing mechanism seemed relatively slow to me (over three minutes), and if this has been observed multiple times, it would be helpful to know if this is typical or whether there is considerable variation.
Thank you for this comment. In the revised manuscript, we highlight that the first released LC3 positive ILV was detected within as little as 40 sec.
Overall, I think the authors have been successful in identifying amphiectosomes secreted from multiple cell lines and demonstrating that the ILVs inside them have at least two origins (autophagosome membrane and late endosomal multivesicular body) based on the markers that they carry. The analysis of intracellular compartments producing these structures is rather less convincing and it remains unclear what cells release these structures in vivo.
I think there could be a significant impact on the EV field and consequently on our understanding of cell-cell signalling based on these findings. It will flag the importance of investigating the release of amphiectosomes in other studies, and although the authors do not discuss it, the molecular mechanisms involved in this type of 'ectosomal-style' release will be different from multivesicular compartment fusion to the plasma membrane and should be possible to be manipulated independently. Any experiments that demonstrate this would greatly strengthen the manuscript.
We appreciate these comments of the reviewer. Experiments are under way to elucidate the mechanism of the “ectosomal style” exosome release and will be the topic of our next publication.
In general, the EV field has struggled to link up analysis of the subcellular biology of sEV secretion and the biochemical/physical analysis of the sEVs themselves, so from that perspective, the manuscript provides a novel angle on this problem.
Reviewer #3 (Public Review):
Summary:
In this manuscript, the authors describe a novel mode of release of small extracellular vesicles. These small EVs are released via the rupture of the membrane of so-called amphiectosomes that resemble "morphologically" Multivesicular Bodies.
These structures were initially described by the authors as released by colorectal cancer cells (https://doi.org/10.1080/20013078.2019.1596668). In this manuscript, they provide experiments that allow us to generalize this process to other cells. In brief, amphiectosomes are likely released by ectocytosis of amphisomes that are formed by the fusion of multivesicular endosomes with autophagosomes. The authors propose that their model puts forward the hypothesis that LC3 positive vesicles are formed by "curling" of the autophagosomal membrane which then gives rise to an organelle where both CD63 and LC3 positive small EVs co-exist and would be released then by a budding mechanism at the cell surface that appears similar to the budding of microvesicles/ectosomes. Very correctly the authors make the distinction from migrasomes because these structures appear very similar in morphology.
Strengths:
The findings are interesting, although it is unclear what the functional relevance of such a process would be and even how it could be induced. It points to a novel mode of release of extracellular vesicles.
Weaknesses:
This reviewer has comments and concerns concerning the interpretation of the data and the proposed model. In addition, in my opinion, some of the results in particular micrographs and immunoblots (even shown as supplementary data) are not of quality to support the conclusions.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
(1) Highlight MV-lEV, ILV and limiting membrane in Figure-1G, N, and U.
Based on the suggestion, we revised Figure 1.
(2) Figure 1-Y-AF are not mentioned in the text.
In the revised manuscript, we discuss Figure 1Y-AF.
(3) The term "IEVs" in Figure 2-S2 is not defined.
We modified the figure legend: we changed MV-lEV to amphiectosome.
(4) Need to quantify co-localization in Figure 2-S2.
As suggested, we carried out the co-localisation analysis (Fig2-S2I), and Fig2-S2 was re-edited.
Reviewer #2 (Recommendations For The Authors):
I have two recommendations for improving the manuscript through additional experiments:
(1) I think the description of the intracellular processes taking place in order to form amphiectosomes would be much stronger if some super-resolution imaging could be undertaken. This should label the different compartments before and after fusion with specific markers that highlight the protein signature of the different limiting and ILV membranes much more clearly than immuno-EM. It will also help in characterising the double-membrane structure of amphiectosomes at the point of budding and reveal whether the patchy labelling of the inner membrane emerges after amphiectosome release (the schematic model currently suggests that it happens before).
Thank you for your suggestion. STED microscopy was applied, the results are shown in the new Fig3, and the schematic model was modified accordingly.
(2) The implications of the manuscript would be more wide-ranging if the authors could test genetic manipulations that are believed to block exosome or ectosome release, eg. Rab27a or Arrdc1 knockdown. This may allow them to determine whether MV-lEVs can be released independently of the classical exosome release mechanism because they use a different route to be released from the plasma membrane. This experiment is not essential, but I think it would start to address the core regulatory mechanisms involved, and if successful, would easily allow the authors to determine the ratio of CD63-positive sEVs being secreted via classical versus amphiectosome routes.
The suggestion is very valuable for us and these studies are being performed in a separate project.
I think there are several other ways in which the manuscript could be improved to better explain some of the approaches, findings and interpretation:
(1) Include some explanation in the text of certain key tools, particularly:
a. Palm-GFP and whether its expression might alter the properties of the plasma membrane since this is used in a lot of experiments and is the only marker that seems to uniformly label the outer membrane of amphiectosomes. One concern might be that its expression drives amphiectosome secretion.
We also found evidence for amphiectosome release in several different cell types not expressing Palm-GFP. We believe this excludes the possibility that Palm-GFP expression is the inducer of amphiectosome release. By both fluorescence and electron microscopy, cells not expressing Palm-GFP showed very similar MV-lEVs. In addition, we made similar observations in non-transduced HEK293 cells labelled with fluorescent WGA.
b. Lactadherin - does this label the amphiectosomes after their release or does the wash-off step mean that it only labels cells, which subsequently release amphiectosomes?
Lactadherin labels the amphiectosomes after their release and fixation. Living cells cannot be labelled by lactadherin as PS is absent in the external plasma membrane layer of living cells. We used WGA on HEK293 cells to further support the plasma membrane origin of the external membrane of amphiectosomes.
(2) Explain the EM and confocal imaging approaches more clearly. Most importantly, is a 3D reconstruction always involved to confirm that 'separated' amphiectosomes are not joined to cells in another Z-plane?
Thank you for your suggestion. We have modified the manuscript accordingly.
(3) Presenting triple-labelled images with red, green and yellow channels does not allow individual labelling to be determined without single-channel images and even then, it is much more informative to use three distinguishable colours that make a different colour with overlap, eg. CMY? Fig.2_S2D and E do not display individual channels, so definitely need to be changed.
In the case of Fig.2_S2D, we now show the individual channels; the earlier E image has been removed. In the case of the STED images, CMY colors were used, as you suggested.
(4) Please discuss in the text the data in Figure 1Y onwards concerning single/double membranes on MV-lEVs.
In the revised manuscript, we discuss the question of single/double membranes and we refer to Figure 1Y-AF.
(5) On line 162, reword 'intraluminal TSPAN4 only' to 'one in which TSPAN4 is only intraluminal' to make it clear that other proteins are also marking the intraluminal region, not TSPAN4 only.
We modified the text accordingly.
(6) Points for further discussion and further conclusions:
a. In vivo experiments - discuss the limitations of this part of the analysis - it seems that none of the amphiectosome markers have been analysed in this part of the study and the MV-lEVs are only in the circulation.
b. Can the authors give any further indication of the levels of MV-lEVs relative to free sEVs from any of their studies?
Using our current approach, it is not possible to determine the levels of MV-lEVs relative to free sEVs. Without analyzing serial ultrathin sections, determination of the relative ratio of MV-lEVs and sEVs would depend on the actual section plane. In future projects, we will determine the ratio of LC3 positive and negative sEVs by single EV analysis techniques (such as SP-IRIS). In the revised manuscript, additional TEM images are included to provide evidence for the simultaneous presence of sEVs and MV-lEVs, and for the presence of MV-lEVs both inside and outside of the circulation.
c. Please discuss the single versus double membrane issue (relating to experiments proposed above).
We discuss this question in more detail in the revised manuscript.
d. Please point out that the release mechanism (plasma membrane budding) will involve different molecular mechanisms to establish exosome release, and this might provide a route to determine relative importance.
We are currently running a systematic analysis of the release mechanism of amphiectosomes, and this will be the topic of a separate manuscript.
Reviewer #3 (Recommendations For The Authors):
* The model is not supported.
* The data is not of quality.
* The appropriate methods are not exploited.
We are sorry, we cannot respond to these unsupported critiques.
www.biorxiv.org
Author response:
The following is the authors’ response to the previous reviews.
eLife Assessment
This important study showing that sleep deprivation increases functional synapses while depleting silent synapses supports previous findings that excitatory signaling increases during wakefulness. This manuscript focuses in particular on AMPA/NMDA ratios. An interesting, although speculative, aspect of the manuscript is the inclusion of a model for the accumulation of sleep need that is based upon the MEF2C transcription factor but also links to the sleep-regulating SIK3-HDAC4/5 pathway. The authors have clarified some questions raised in the previous review, but the evidence for major claims was still found to be incomplete, requiring additional experimentation.
The major claims of this study are: 1) SD increases the AMPA/NMDA receptor ratio and RS restores it; 2) SD decreases silent synapses compared to CS and RS restores their number after SD; 3) the majority of SD-induced DEGs are found in ExIT cells (glutamate pyramidal neurons projecting within the telencephalon); 4) ExIT SD-induced DEGs are enriched for genes encoding synaptic shaping components and for autism spectrum disorder risk; and 5) these DEGs are also enriched for DEGs induced by Mef2c loss of function restricted to forebrain glutamate neurons (ExIT cells comprise a subset of these) and by over-expression of constitutively nuclear HDAC4 that represses MEF2C transcriptional function. The last claim is consistent with an intracellular signaling model (presented as a hypothesis to be tested, in Figure 4B).
[The above is added to the start of the discussion section.]
The specific claims are supported by solid evidence provided in this manuscript. The statistical support is now more clearly presented, with several changes in response to queries by reviewer 1.
The technical issues raised by reviewer 1 do not detract from the claims, which are thus supported. The rationale for this assessment is expanded below in response to reviewer 1.
Summary:
This manuscript by Vogt et al examines how the synaptic composition of AMPA and NMDA receptors changes over sleep and wake states. The authors perform whole-cell patch clamp recordings to quantify changes in silent synapse number across conditions of spontaneous sleep, sleep deprivation, and recovery sleep after deprivation. They also perform single nucleus RNAseq to identify transcriptional changes related to AMPA/NMDA receptor composition following spontaneous sleep and sleep deprivation. The findings of this study are consistent with a decrease in silent synapse number during wakefulness and an increase during sleep. However, these changes cannot be conclusively linked to sleep/wake states. Measurements were performed in motor cortex, and sleep deprivation was achieved by forced locomotion, raising the possibility that recent patterns of neuronal activity, rather than sleep/wake states, are responsible for the observed results.
Strengths:
This study examines an important question. Glutamatergic synaptic transmission has been a focus of studies in the sleep field, but AMPA receptor function has been the primary target of these studies. Silent synapses, which contain NMDA receptors but lack AMPA receptors, have important functional consequences for the brain. Exploring the role of sleep in regulating silent synapse number is important to understanding the role of sleep in brain function. The electrophysiological approach of measuring the failure rate ratio, supported by AMPA/NMDA ratio measurements, is a rigorous tool to evaluate silent synapse number.
The authors also perform snRNAseq to identify genes differentially expressed in the spontaneous sleep and sleep deprivation groups. This analysis reveals an intriguing pattern of upregulated genes controlled by HDAC4 and Mef2c, along with synaptic shaping component genes and genes associated with autism spectrum disorder, across cell types in the sleep deprivation group. This unbiased approach identifies candidate genes for follow-up studies. The finding that ASD-risk genes are differentially expressed during SD also raises the intriguing possibility that normal sleep function is disrupted in ASD.
Weaknesses:
A major consideration to the interpretation of this study is the use of forced locomotion for sleep deprivation. Measurements are made from motor cortex, and therefore the effects observed could be due to differences in motor activity patterns across groups, rather than lack of sleep per se.
Experimentally induced lack of sleep always involves differences in motor activity. As previously noted in revision 1, motor learning is unlikely to occur in this paradigm, and inspection of the video (in supplementary materials) shows no repetitive motor behavioral sequences during the sleep deprivation, nor can this be considered exercise due to the very slow speed of treadmill movement employed. The obvious major difference between groups is a lack of sleep per se. (See the response to reviewer 1 under “Recommendations for the authors” below for comments on localized wake activity inducing localized sleep-need responses.)
Considering that other groups have failed to find a difference in AMPA/NMDA ratio in mice with different spontaneous sleep/wake histories (Bridi et al., Neuron 2020), confirmation of these findings in a different brain region would greatly strengthen the study.
The study of Bridi et al., Neuron 2020, is not comparable to our study for several important reasons. First, their compared groups were from different circadian phases (180 degrees out of phase), whereas in our study, the circadian times for each group were matched (ZT = 6 hours). Second, their study did not include experimentally induced sleep loss, whereas it was a focus of ours. Third, spontaneous sleep/wake cannot be accurately matched amongst subjects, whereas in our study, sleep loss was matched exactly between groups.
We agree that assessment of AMPA/NMDA ratio and silent synapse number in sleep deprived compared to ad libitum sleep in other areas of the neocortex is of great interest and something we hope to pursue. It would not be surprising to find differences as preliminarily reported by Bahl, et al., Nat Commun. 2024 Jan 26;15(1):779. However, such data would not further strengthen our already well supported evidence for the differences we report in the motor cortex.
The electrophysiological measurements and statistical analyses raise several questions. Input resistance (cutoffs and actual values) are not provided, making it difficult to assess recording quality.
As stated in our first reply, these data were omitted (an admitted oversight on our part) but are now supplied in the methods section as, “Series resistance values for the recording pipette ranged between 8 and 15 MOhm and experiments with changes larger than 25% were not used for further analyses”. We have now also added the Rs/Rm (as a separate column) for each recorded neuron in table 1.
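The exclusion rule quoted above can be sketched as a small filter. This is a minimal illustration only, assuming that the 8 to 15 MOhm range applies to the initial series resistance and that the 25% change is measured relative to that initial value; the function name, parameter names, and the example recordings are hypothetical and are not taken from the manuscript or from table 1.

```python
def passes_series_resistance_criteria(rs_start_mohm, rs_end_mohm,
                                      rs_min=8.0, rs_max=15.0,
                                      max_fractional_change=0.25):
    """Return True if a recording meets the stated series-resistance criteria.

    Assumptions (not confirmed by the source): the range check uses the
    initial Rs, and the change is expressed relative to the initial Rs.
    """
    in_range = rs_min <= rs_start_mohm <= rs_max
    fractional_change = abs(rs_end_mohm - rs_start_mohm) / rs_start_mohm
    return in_range and fractional_change <= max_fractional_change


# Hypothetical recordings: (cell label, Rs at start, Rs at end), in MOhm.
recordings = [
    ("cell_a", 10.0, 11.5),  # in range, 15% change
    ("cell_b", 12.0, 16.0),  # in range, but about 33% change
    ("cell_c", 20.0, 21.0),  # initial Rs outside the 8-15 MOhm range
]
accepted = [name for name, start, end in recordings
            if passes_series_resistance_criteria(start, end)]
```

Under these assumptions only `cell_a` would be retained for further analysis.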
Parametric one-way ANOVAs were used, although the data do not appear to be normally distributed.
We have now removed all the one-way ANOVA tests for clarity (non-parametric tests were previously supplied in addition to the one-way ANOVA tests). Determining significance with the non-parametric Kruskal-Wallis test has not altered the statistical support for our conclusions.
Reviewer 1 correctly points out that we had not tested for normality of our distributions. The distributions are likely to be normal, but the sample size is too small to make this call confidently for the ratio data, which is why we removed the one-way ANOVAs entirely from table 1.
Two-way ANOVAs are used to assess AMPA and NMDA EPSC amplitudes and failure rates (table 1 tab 2&5) across sleep conditions. As now indicated (table 1, tab 2&5), the distributions of AMPA and NMDA amplitudes and FRs passed the D'Agostino & Pearson test for normality, and QQ plots illustrate this.
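For readers less familiar with the Kruskal-Wallis test referred to above, the following is a minimal sketch of the H statistic for three independent groups, as one would have for the CS, SD and RS conditions. It is not the authors' analysis code: the function names and the example values are hypothetical, and the p-value uses the large-sample chi-square approximation, which is only approximate for small samples.

```python
import math


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_three_groups(a, b, c):
    """Return (H, p) for three independent samples, with tie correction."""
    groups = [a, b, c]
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    h = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * h - 3 * (n_total + 1)

    # Tie correction: divide H by 1 - sum(t^3 - t) / (N^3 - N).
    tie_term = sum(t ** 3 - t for t in (pooled.count(v) for v in set(pooled)))
    denom = 1 - tie_term / (n_total ** 3 - n_total)
    if denom == 0:
        raise ValueError("all observations are identical")
    h /= denom

    # With three groups the reference chi-square has 2 degrees of freedom,
    # whose survival function is exp(-H / 2).
    return h, math.exp(-h / 2)


# Hypothetical ratio values, invented for illustration only.
cs = [1.1, 1.3, 1.2]
sd = [2.4, 2.6, 2.5]
rs = [1.0, 0.9, 0.8]
h_stat, p_value = kruskal_wallis_three_groups(cs, sd, rs)
```

Because the test works on ranks rather than raw values, it does not require the normality assumption that motivated removing the one-way ANOVAs.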
In addition, for the AMPA/NMDA and FRR measurements (Figures 1E, F), the SD group (rather than the control sleep group) was used as the control group for post-hoc comparisons, but it is unclear why.
The label of “control group” is arbitrary. CS and RS groups are similar (sleep density for RS>CS as expected). Since this appears to be confusing, we now compare all groups to one another in table 1 with the same statistical outcome (additional comparison of CS to RS).
While the data appear in line with the authors' conclusions, the number of mice (3/group) and cells recorded is low, and adding more would better account for inter-animal variability and increase the robustness of the findings.
Of course, the larger the sample, the better the approximation to the population. Our sample sizes yielded significant differences at the usual p<=0.05 threshold with non-parametric testing. A larger sample size could allow for normality testing of the distributions of the data, but fortunately, this was not necessary to support our conclusions.
The snRNAseq data are intriguing. However, several genes relevant to the AMPA/NMDA ratio are mentioned, but the encoded proteins would be expected to have variable effects on AMPA/NMDA receptor trafficking and function, making the model presented in Figure 4C oversimplified. A more thorough discussion of the candidate genes and pathways that are upregulated during sleep deprivation, the spatiotemporal/posttranslational control of protein expression, and their effects on AMPA/NMDA trafficking vs function is warranted.
We have not studied the candidate genes at this point and do not yet understand their potential role(s) in sleep-related AMPA/NMDA functional ratio, only that their expression levels are altered with sleep condition. We agree with the reviewer that the data are intriguing and in need of further investigation. An important first step that can help direct such studies is the identification and preliminary characterization of good candidate genes with respect to their cell-type specificity, significance and fold change, as we have done. Their potential roles likely depend on “the spatiotemporal/posttranslational control” and other factors, as reviewer 1 notes.
Reviewer #2 (Public review):
Here Vogt et al., provide new insights into the need for sleep and the molecular and physiological response to sleep loss. The authors expand on their previously published work (Bjorness et al., 2020) and draw from recent advances in the field to propose a neuron-centric molecular model for the accumulation and resolution of sleep need and basis of restorative sleep function. While speculative, the proposed model successfully links important observations in the field and provides a framework to stimulate further research and advances on the molecular basis of sleep function. In my review, I highlight the important advances of this current work, the clear merits of the proposed model, and indicate areas of the model that can serve to stimulate further investigation.
Strengths:
Reviewer comment on new data in Vogt et al., 2024
Using classic slice electrophysiology, the authors conclude that wakefulness (sleep deprivation (SD)) drives a potentiation of excitatory glutamate synapses, mediated in large part by "un-silencing" of NMDAR-active synapses to AMPAR-active synapses. Using a modern single nuclear RNAseq approach the authors conclude that SD drives changes in gene expression primarily occurring in glutamatergic neurons. The two experiments combined highlight the accumulation and resolution of sleep need centered on the strength of excitatory synapses onto excitatory neurons. This view is entirely consistent with a large body of extant and emerging literature and provides important direction for future research.
Consistent with prior work, wakefulness/SD drives an LTP-type potentiation of excitatory synaptic strength on principal cortical neurons. It has been proposed that LTP associated with wake leads to the accumulation of sleep need by increasing neuronal excitability, and by the "saturation" of LTP capacity. This saturation subsequently impairs the capacity for further ongoing learning. These new data provide a satisfying mechanism for this saturation phenomenon by introducing the concept of silent synapses. The new data show that in well-rested mice, a substantial number of synapses are "silent", containing an NMDAR component but not AMPARs. Silent synapses provide a type of reservoir for learning in that activity can drive the un-silencing, increasing the number of functional synapses. SD depletes this reservoir of silent synapses to essentially zero, explaining how SD can exhaust learning capacity. Recovery sleep led to restoration of silent synapses, explaining how recovery sleep can renew learning capacity. In their prior work (Bjorness et al., 2020) this group showed that SD drives an increase in mEPSC frequency onto these same cortical neurons, but without a clear change in pre-synaptic release probability, implying a change in the number of functional synapses. This prediction is now borne out in this new dataset.
The new snRNAseq dataset indicates that sleep need is primarily seen (at the transcriptional level) in excitatory neurons, consistent with a number of other studies. First, this conclusion is corroborated by an independent, contemporary snRNAseq analysis recently available as a pre-print (Ford et al., 2023 bioRxiv https://doi.org/10.1101/2023.11.28.569011). A recently published analysis on the effects of SD in Drosophila imaged synapses in every brain region in a cell-type dependent manner (Weiss et al., PNAS 2024), concluding that SD drives brain-wide increases in synaptic strength almost exclusively in excitatory neurons. Further, Kim et al., Nature 2022, heavily cited in this work, show that the newly described SIK3-HDAC4/5 pathway promotes sleep depth via excitatory neurons and not inhibitory neurons.
The new experiments provided in Figs. 1-3 are expertly conducted and presented. This reviewer has no comments of concern regarding the execution and conclusions of these experiments.
Reviewer comment on model in Vogt et al., 2024
To the view of this reviewer the new model proposed by Vogt et al., is an important contribution. The model is not definitively supported by new data, and in this regard should be viewed as a perspective, providing mechanistic links between recent molecular advances, while still leaving areas that need to be addressed in future work. New snRNAseq analysis indicates SD drives expression of synaptic shaping components (SSCs) consistent with the excitatory synapse as a major target for the restorative basis of sleep function. SD induced gene expression is also enriched for autism spectrum disorder (ASD) risk genes. As pointed out by the authors, sleep problems are commonly reported in ASD, but the emphasis has been on sleep amount. This new analysis highlights the need to understand the impact on sleep's functional output (synapses) to fully understand the role of sleep problems in ASD.
Importantly, SD induced gene expression in excitatory neurons overlaps with genes regulated by the transcription factor MEF2C and HDAC4/5 (Fig. 4). In their prior work, the authors show loss of MEF2C in excitatory neurons abolished the SD transcriptional response and the functional recovery of synapses from SD by recovery sleep. Recent advances identified HDAC4/5 as major regulators of sleep depth and duration (in excitatory neurons) downstream of the recently identified sleep promoting kinase SIK3. In Zhou et al., and Kim et al., Nature 2022, both groups propose a model whereby "sleep-need" signals from the synapse activate SIK3, which phosphorylates HDAC4/5, driving cytoplasmic targeting, allowing for the de-repression and transcriptional activation of "sleep genes". Prior work shows that HDAC4/5 are repressors of MEF2C. Therefore, the "sleep genes" derepressed by HDAC4/5 may be the same genes activated in response to SD by MEF2C. The new model thereby extends the signaling of sleep need at synapses (through SIK3-HDAC4/5) to the functional output of synaptic recovery by expression of synaptic/sleep genes by MEF2C. The model thereby links aspects of expression of sleep need with the resolution of sleep need by mediating sleep function: synapse renormalization.
Weaknesses:
Areas for further investigation.
In the discussion section Vogt et al., explore the links between excitatory synapse strength, arguably the major target of "sleep function", and NREM slow-wave activity (SWA), the most established marker of sleep need. SIK3-HDAC4/5 have major effects on the "depth" of sleep by regulating NREM-SWA. The effects of MEF2C loss of function on NREM SWA activity are less obvious, but clearly impact the recovery of glutamatergic synapses from SD. The authors point out how adenosine signaling is well established as a mediator of SWA, but the links with adenosine and glutamatergic strength are far from clear. The mechanistic links between SIK3/HDAC4/5, adenosine signaling, and MEF2C, are far from understood. Therefore, the molecular/mechanistic links between a synaptic basis of sleep need and resolution with NREM-SWA activity require further investigation.
Additional work is also needed to understand the mechanistic links between SIK3-HDAC4/5 signaling and MEF2C activity. The authors point out that constitutively nuclear (cn) HDAC4/5 (acting as a repressor) will mimic MEF2C loss of function. This is reasonable, however, there are notable differences in the reported phenotypes of each. Notably, cnHDAC4/5 suppresses NREM amount and NREM SWA but had no effect on the NREM-SWA increase following SD (Zhou et al., Nature 2022).
We speculate that the effect of cnHDAC4/5 to reduce NREM-SWA together with the reduction of NREM amount may be due to a localized increase in neuronal excitability of arousal centers, which would be expected to mask NREM-SWA. Rebound NREM-SWA may reflect the relative rebound increase of NREM-SWA still present under chronic masking conditions (induced by cnHDAC4/5) of increased arousal system excitability. A similar effect to overcome NREM-SWA masking was reported in a Kcna2 KO mouse (a Shaker homologue) by Douglas, et al. (2007, BMC Biol).
Loss of MEF2C in CaMKII neurons had no effect on NREM amount and suppressed the increase in NREM-SWA following SD (Bjorness et al., 2020). These instances indicate that cnHDAC4/5 and loss of MEF2C do not exactly match suggesting additional factors are relevant in these phenotypes. Likely HDAC4/5 have functionally important interactions with other transcription factors, and likewise for MEF2C, suggesting areas for future analysis.
This is not a surprising outcome, since both MEF2c and HDAC4/5 are transcription factors whose function(s) are determined by multiple other factors, a subset of which are relevant to sleep conditions, while other determining factors are not necessarily relevant to sleep. These factors can include their phosphorylation state, genomic accessibility, and interaction with other transcription factors. All these other factors are known to be both cell type specific and determined by intracellular conditions that, in turn, are affected by extracellular conditions and ligands. We certainly agree that much future analysis is needed.
One emerging theme may be that the SIK3-HDAC4/5 axis is a major regulator of the sleep state, perhaps stabilizing the NREM state once the transition from wakefulness occurs. MEF2C is less involved in regulating sleep per se, and more involved in executing sleep function, by promoting restorative synaptic modifications to resolve sleep need.
A useful way to restate the above might be to distinguish between control of arousal levels determining the behavioral states, wake or sleep (including REM sleep) and control of sleep function. The term, sleep, is typically used to describe the behavioral state of sleep that acts as a permissive gate to sleep function (that resolves sleep need). The sleep state should not be conflated with sleep function. There is abundant evidence that control of arousal can be dissociated from sleep need and sleep function.
Finally, advances in the roles of the respective SIK3-HDAC4/5 and MEF2C pathways point towards transcription of "sleep genes", as clearly indicated in the model of Fig. 4. Clearly more work is needed to understand how the expression of such genes ultimately leads to resolution of sleep need by functional changes at synapses.
We are in full agreement. We also note that the SIK3-HDAC4/5 pathway may have more than one role, i.e., to affect arousal centers to alter behavioral state and, more generally, to control MEF2c’s transcriptional activity, thus controlling the sleep-related glutamate synaptic phenotype.
What are these sleep genes and how do they mechanistically resolve sleep need? Thus, the current work provides a mechanistic framework to stimulate further advances in understanding the molecular basis for sleep need and the restorative basis of sleep function.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Major comments:
(1) I appreciate the authors' thoughtful discussion of the use of forced locomotion for their sleep deprivation technique in their response, as well as the additional information that was provided regarding use of the treadmill in the manuscript. However, given that previous studies have failed to find a difference in AMPA/NMDA ratio following spontaneous sleep vs wake, confirmation of the findings in a non-motor brain region with the same SD technique (or confirmation within motor cortex with a different technique, although the authors correctly point out that other techniques also increase locomotor activity) would greatly strengthen the paper.
Addressed above.
Notably, differences in motor activity patterns, not necessarily overall amount of locomotion, may induce differential synaptic changes between groups. This point at least warrants acknowledgement and discussion, but this has not been incorporated into the text of the manuscript.
We will incorporate the following into the discussion:
There is evidence that learning of a motor task or experience of forced altered motor activity can result in localized increases in NREM (slow wave sleep)-slow wave activity in the motor cortex (Huber R, Ghilardi MF, Massimini M, Tononi G. Local sleep and learning. Nature. 2004;430(6995):78-81; Huber et al., 2006). Since SWS-SWA is considered a marker for sleep homeostasis, the increase in SWS-SWA induced by altered motor activity was considered evidence for sleep-related function. Our earlier work has clearly shown that the treadmill method of SD increases frontal cortical SWS-SWA rebound, indicating a sleep-homeostatic process (Bjorness et al., 2016; Bjorness et al., 2020). Furthermore, we have also shown that this means of experimental SD causes glutamate synaptic changes similar to those observed using other means of SD like gentle handling (Liu, et al., JoNS 2010).
(2) The number of mice and cells used for electrophysiology in this study remains low; more animals should be included to account for inter-animal variability.
For this study, increasing the number of mice and cells will have a p<0.05 chance of altering our conclusions by rejecting the null hypotheses of the electrophysiology findings.
(3) The additional methodological information provided allays some of my concerns regarding the electrophysiological data. However, information about the input resistance (cutoffs used and/or actual values) is still not provided, which is important for assessing recording quality.
We have now supplied the experimentally determined input resistance for each neuron used in this study (a separate column in table 1, tabs marked, “data”).
(4) It is not meaningful to compare raw AMPA or NMDA responses because stimulus electrode placement will differ between cells, potentially activating different numbers of afferents. Presenting these comparisons (Figure 1C) has the potential to mislead the reader.
This is not misleading (it didn’t mislead reviewer 1), as we described the conditions. As expected by reviewer 1, the variability using “raw AMPA or NMDA responses…” was too great, but did indicate an interaction between receptor responses and sleep condition. This provided (as stated in the results section) the rationale to examine, and to draw conclusions only from, the AMPA/NMDA amplitude and FR ratios.
(5) I appreciate clarification on the statistics and the authors' response has answered some of my questions. However, this also raises additional questions. What test was used to determine normality (and therefore whether to perform a parametric vs nonparametric test)?
Described above.
Why was the FRR data analysis changed to a parametric test, when it does not appear that the data are normally distributed?
Showing the parametric test was a mistake on our part. As reviewer 1 correctly suspects, there are not enough samples to conclude that the distributions are normal. However, the non-parametric Kruskal-Wallis tests that we also show in table 1 indicate significant differences between conditions, and the non-parametric, two-stage linear step-up procedure of Benjamini, Krieger and Yekutieli indicates significant differences between CS-SD and RS-SD but not for CS-RS, supporting our conclusions. The (unsupported) parametric tests are now removed from Table 1, leaving only the non-parametric tests.
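The two-stage linear step-up procedure of Benjamini, Krieger and Yekutieli named above can be sketched as follows. This is a minimal illustration of the published procedure as commonly described (a Benjamini-Hochberg pass at level q/(1+q), followed by a second pass at a level adjusted by the estimated number of true nulls); it is not the authors' analysis code. The function names are hypothetical, and the three p-values are invented for illustration and are not the values in table 1.

```python
def bh_reject_count(sorted_p, q):
    """Number of rejections under the Benjamini-Hochberg step-up rule at level q."""
    m = len(sorted_p)
    count = 0
    for i, p in enumerate(sorted_p, start=1):
        if p <= i * q / m:
            count = i  # step-up: keep the largest i that satisfies the bound
    return count


def bky_two_stage(p_values, q=0.05):
    """Return a list of booleans, True where the hypothesis is rejected."""
    m = len(p_values)
    sorted_p = sorted(p_values)
    q1 = q / (1 + q)

    # Stage 1: BH at the reduced level q / (1 + q).
    r1 = bh_reject_count(sorted_p, q1)
    if r1 == 0:
        return [False] * m
    if r1 == m:
        return [True] * m

    # Stage 2: estimate the number of true nulls and rerun BH at the
    # correspondingly relaxed level.
    m0_hat = m - r1
    r2 = bh_reject_count(sorted_p, q1 * m / m0_hat)
    threshold = sorted_p[r2 - 1]
    return [p <= threshold for p in p_values]


# Hypothetical p-values for the three pairwise comparisons (illustration only).
comparisons = ["CS vs SD", "RS vs SD", "CS vs RS"]
p_values = [0.004, 0.010, 0.600]
decisions = dict(zip(comparisons, bky_two_stage(p_values)))
```

With these invented inputs the first two comparisons are flagged and the third is not, which mirrors the qualitative pattern described in the response but says nothing about the actual p-values.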
Why were post-hoc tests chosen to compare to a control group rather than all pairwise comparisons,
We now provide all pairwise post-hoc comparisons using the BKY analysis, which give the same results.
and why was the SD rather than CS group used as the control in Figures 1E and F?
Why were different post-hoc tests chosen for the data in Figures 1E, F?
There was no need for this, and we now show only the statistics used to draw our conclusions for the AMPA/NMDA EPSC ratio data shown in Figure 1E and the Failure Rate Ratio data shown in Figure 1F (the conclusions are supported by the non-parametric post-hoc test and remain unchanged).
(6) Genes in the SSC, ASD, Mef2cKO, and HD4cn categories are almost exclusively upregulated in the SD group compared to the CS group (Figure 4A). As the authors point out in their response, "No claim of mechanism linking the changed expression to altered AMPAR or NMDAR activity can be made at this point," largely due to the fact that we do not know the spatiotemporal or posttranslational modification patterns of the translated proteins, and how they affect receptor trafficking vs function. This is in agreement with my original point: as written (and as illustrated in Figure 4C), the manuscript implies that upregulation during SD increases the AMPA/NMDA ratio via receptor trafficking,
The model indicates a likely (but not necessarily exclusive) role for AMPA/NMDA trafficking to explain the functional electrophysiological data that we do report and which is not in dispute. The SSC-DEGs in ExIT cells are consistent with sleep-altered AMPA/NMDA trafficking, but this remains only a correlation. However, the point is taken and Figure 4C has been revised to reflect only what we have observed electrophysiologically, and the speculated mechanism(s) mediated by observed SSC-DEGs are illustrated with “?’s”.
while in reality the picture is likely much more complicated, and therefore a more thorough discussion is warranted. Some discussion was provided in the authors' response but does not appear to have been incorporated into the text or Figure 4C.
As indicated above, the proposed model is changed in Figure 4C to more explicitly indicate which aspects reflect our electrophysiological data and which aspects reflect only an association of observations.
Minor comments:
(1) Please justify only using male mice
We had to start somewhere with our limited resources. Our intentions are to follow up with similar experiments using female mice, should funding be realized.
(2) The model in Figure 4C is oversimplified and remains problematic, for the reasons stated in comment #6, above.
See responses above.
(3) Figure 4D remains confusing
We agree. The unnecessary addition of adenosine effects on cholinergic arousal centers (experimentally well supported) has been removed from the figure to provide a more focused indication of how SWS-SWA can be related to MEF2c and/or ADORA1 activation through reduction of glutamate synaptic strength. ADORA1 activation elicits reduced glutamate synaptic activity through pre- and postsynaptic inhibition, whereas MEF2c activation is essential for the sleep-elicited reduction of glutamate EPSCs. Reduced glutamate synaptic strength, whatever the cause, is associated with increased SWS-SWA.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Reviewer #1 (Evidence, reproducibility and clarity (Required)):
The study by Aguirre-Botero et al. shows the dynamics of 3D11 anti-CSP monoclonal antibody (mAb) mediated elimination of rodent malaria Plasmodium berghei (Pb) parasites in the liver. The authors show that the anti-CSP mAb could protect against intravenous (i.v.) Pb sporozoite challenge as well as cutaneous challenge, but that this requires a higher concentration of antibody. Importantly, the study shows that the anti-CSP mAb not only affects sporozoite motility, sinusoidal extravasation, and cell invasion but also partially impairs the intracellular development inside the liver parenchyma, indicating a late effect of this antibody during liver stage development. While the study is interesting and conducted well, the only novel yet very important observation made in this manuscript is the effect of the anti-CSP mAb on liver stage development.
Major
This observation is highlighted in the manuscript title but is supported by only limited data. As such it needs to be substantiated and a mechanism should be investigated. The phenomenon of intracellular effects of the anti-CSP mAb should be analyzed in much more detail. For example, can the authors demonstrate uptake of the Ab together with the parasite during hepatocyte invasion? What cellular mechanism leads to elimination?
Lines 234 - 243; 308 - 325: These results are the gist of the entire study and also defined the title of the manuscript. Thus, it would be premature to claim a substantial effect of the 3D11 antibody in late killing of the parasite in the infected hepatocytes just by looking at the decreased GFP fluorescence. The authors need to at least verify the fitness of the liver stages by measuring the size of the developing parasites as well as using different parasite specific markers (UIS4, MSP1, HSP70 etc.) in immunofluorescence assays on the infected liver sections and in vitro infections.
We greatly appreciate the comments. We have taken the suggestions into consideration and deepened the characterization of 3D11's late killing of parasites. We first analyzed the presence of 3D11 in the intracellular parasite after the invasion and compared it with the CSP expression on the surface of control parasites (new Fig. 4F). Next, we tested a potential action of 3D11 added in the cell culture after the invasion (new Fig. 4G). The two new panels and the text accompanying them are shown below.
“Post-invasion labeling of 3D11 bound to the membrane of intracellular parasites revealed a strong staining surrounding the parasite at 2 and 15h, but only punctual traces of 3D11 at 44h (Figure 4F, 3D11, 3D11). Of note, CSP was detected surrounding the control parasites at all time-points indicating that the lack of staining at 44h is not due to a decrease in the CSP amount on the parasite surface (Figure 4F, CSP, Control). To evaluate the potential post-invasion entry of 3D11 into the PV of infected cells and posterior neutralization of intracellular parasites, we incubated invaded cells from 2 to 44 h with 3D11, but no effect on the parasite intracellular development was observed (Figure 4G, 2h p.i.). 3D11 incubated for 2 h with sporozoites and cells elicited, as expected, a dose-dependent inhibition of parasite development. Altogether, our results indicate that the late inhibition of parasite development is already achieved at 15h and likely caused by antibodies dragged inside cells bound to sporozoites before or during the invasion.”
Finally, we better characterized the parasite loss of fitness caused by 3D11 in infected cells by quantifying the parasite size, GFP intensity and the presence and intensity of UIS4, a parasitophorous vacuole membrane developmental marker at 2, 4 and 44h as described below in the new figure 5 and accompanying text.
“To further characterize the killing of intracellular parasites by 3D11 in HepG2 cells, we next evaluated the expression of the parasitophorous vacuole membrane (PVM) marker, UIS4 (ref. 37), to infer the parasite intracellular development at 2, 4 and 44h. HepG2 cells were incubated with Pb-GFP expressing sporozoites in the absence (Control, Figure 5) or presence of 1.25 µg/mL of 3D11 during the first two hours of incubation (3D11, Figure 5). The chosen 3D11 concentration led to ~50% decrease in cell invasion (Figure 4C, 2h) and ~30% decrease in the post-invasion number of EEFs (Figure 4D), leaving enough parasites to be analyzed by microscopy. To distinguish between extracellular and intracellular parasites at 2h, washed and fixed samples were incubated with mouse 3D11 mAb (1µg/mL) and revealed with a fluorescent anti-mouse secondary antibody (Figure 5A, 3D11 in blue). Samples were then permeabilized and incubated with a goat anti-UIS4 polyclonal antibody revealed with a fluorescent anti-goat secondary antibody (Figure 5A, UIS4 in red). DNA was stained with Hoechst (Figure 5A, DNA in white).
Extracellular GFP+ sporozoites were identified by their 3D11+UIS4- phenotype (Figure 5A, 2h, extracellular). Conversely, intracellular parasites were identified by their 3D11- phenotype and stained positive or negative for UIS4 (Figure 5A, 2h and 44h, intracellular). UIS4+ PVM is normally associated with a productive cell infection (ref. 37). However, a small number of EEFs can develop in the absence of UIS4 (ref. 37), likely inside the host cell nucleus (Figure 5A, 44h, intranuclear).
In the control and 3D11-treated groups, the percentage of intracellular UIS4- parasites decreased 2 to 3-fold from 2 to 44h, as expected of a parasite population negative for a marker of productive infection (Figure 5B). However, while at 2h in the control group, this population represented 14% of intracellular parasites, in the 3D11-treated group, it reached 48% (Figure 5B). This ~3-fold increase in the UIS4 negative population could explain the late killing of intracellular sporozoites by 3D11. Whether this population is constituted by intracellular transmigratory sporozoites lacking a PVM or parasites surrounded by a PVM, but incapable of secreting UIS4 still needs to be determined. At 44h, surviving EEFs in the 3D11-treated samples presented an area and UIS4 staining intensity similar to those of control parasites (Figure 5C, D). However, as observed by flow cytometry (Figure 4D), the GFP intensity of 3D11-treated parasites was significantly lower than that of control EEFs, indicating that 3D11 can somehow affect protein expression with undetermined effects in the genesis of red blood cell infecting stages.”
Minor
• Line 44 - 43: The statement is applicable only to the rodent infecting Plasmodium parasites. The authors need to clarify that.
This is an important clarification. We have modified the text that now reads:
“The sporozoite surface is covered by a dense coat of the circumsporozoite protein (CSP), shown to be an immunodominant protective antigen using a rodent malaria model”
• Line 68: Replace the second 'against' after the CSP with 'of'.
It is done.
• Line 141 - 143: The 3D11 mAb does affect the homing and killing in the blood of cutaneous injected sporozoites. The authors need to clearly state that the statement is true only for i.v. injected sporozoites.
Thank you for the comment. Now the text reads:
“Altogether, these data indicate that 3D11 rather than having an early effect on i.v. inoculated sporozoites in the blood circulation, e.g. by inhibiting the homing or killing the parasite in the blood, requires more than 4 h to eliminate most parasites in the liver.”
• Figure 3B: The number of sporozoites detected in the experiment varies from 0 h (line 172) to 2 h (line 184). Therefore, the numbers need to be shown on all the bars of each timepoint.
We have now added the numbers at the top of the graph from Figure 3B.
• Figure 3C: If the authors used flk1-GFP mice, how well were they able to detect the Pb-PfCSP GFP parasites in the vessel vs. the parenchyma by intravital imaging? Representative images for Pb-PfCSP GFP should also be included.
Since 3D11 does not target PbPf parasites, most of them are motile in the movies, making them easily distinguishable from the endothelial cells. In addition, the stronger GFP intensity of sporozoites makes them detectable in the sinusoids. Representative images were added in the new Figure S3.
• It is not mentioned anywhere how the viability of the sporozoites was determined. This has to be described especially in the methods section.
• Also, the flow acquisition and data analysis of the sporozoites and infected HepG2 cells must be described in the method section.
We briefly mentioned it in the results (lines 228-230): “In addition, by comparing the total number of recovered GFP+ sporozoites at 2 h in the two studied conditions, we measured the early lethality (%viable sporozoites, Figure 4B) of the anti-CSP Ab on the extracellular forms of the parasite (Figure 4A).”
A more detailed description has been added in the methods section that now reads:
“After 2 h, the supernatant was collected, and the culture was washed 2x with 0.5 volume of PBS. The cells were subsequently trypsinized. The supernatant plus the washing steps and the trypsinized cells were analyzed by flow cytometry to quantify the amount of GFP+ events inside and outside cells (Figure 3A and Figure S4). Viability was then quantified by the sum of the total number of sporozoites (GFP+ events) in the supernatant, inside and outside the cells. We calculated the percentage of parasite viability by dividing the average of the total number of sporozoites in the treated samples by the average in controls using three technical replicates for each condition. Additionally, we quantified the percentage of infected cells using the total number of GFP+ events in the HepG2 gate (Figure S4). To compare the biological replicates, we further normalized to the control of each experiment. For the samples used to analyze parasite development, the cells were incubated for 15 or 44 h after sporozoite addition, and the medium was changed after 2 and 24 h. The cells were trypsinized and the percentage of intracellular parasites was determined by flow cytometry as described above (Figure S4). The prolonged effect between 2 h and 15/44 h was calculated by normalizing the percentage of infected cells at 15/44 h to that of 2 h. For all flow cytometry measurements, the same volume was acquired.”
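The normalization steps quoted above can be illustrated with a minimal sketch. It assumes hypothetical GFP+ event counts for three technical replicates per condition; the function names and all numbers are illustrative and are not taken from the manuscript.

```python
from statistics import mean


def percent_of_control(treated, control):
    """Mean of the treated replicates as a percentage of the mean of the controls."""
    return 100.0 * mean(treated) / mean(control)


def prolonged_effect(pct_infected_late, pct_infected_2h):
    """Percentage of infected cells at 15/44 h normalized to the value at 2 h."""
    return pct_infected_late / pct_infected_2h


# Hypothetical total GFP+ events per replicate
# (supernatant + outside cells + inside cells), same acquired volume.
control_total = [1000, 1100, 900]
treated_total = [520, 480, 500]

# Percentage of viable sporozoites relative to the control condition.
viability = percent_of_control(treated_total, control_total)

# Hypothetical percentages of infected cells at 44 h and at 2 h.
late_vs_early = prolonged_effect(10.0, 20.0)
```

Normalizing each experiment to its own control, as the quoted methods describe, is what makes biological replicates with different absolute event counts comparable.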
• Figure 4: The flow layouts should be included for at least comparing the 0 vs. 5 μg/ml of 3D11 mAb concentrations.
Flow layouts were added to Supplementary Figure 4.
• Line 651 (Figure S1 legend): Typographical error '14'.
Thank you for noticing. We corrected it.
Reviewer #2 (Evidence, reproducibility and clarity (Required)):
Aguirre-Botero and collaborators report on the dynamics of Plasmodium parasite elimination in the liver using the 3D11 anti-CSP monoclonal antibody (mAb). By using microscopy and bioluminescence imaging in the P. berghei rodent malaria model, the authors first demonstrate that higher antibody concentrations are required for protection against intravenous sporozoite challenge, when compared to cutaneous challenge, which is not surprising. The study also shows that the 3D11 mAb reduces sporozoite motility, impairs hepatic sinusoidal barrier crossing, and more relevantly inhibits intracellular development of liver stages through its cytotoxic activity. These findings highlight the role of this specific monoclonal antibody, 3D11 mAb against CSP, in targeting sporozoites in the liver.
Major Comments
The study provides valuable insights into the mechanisms of protection conferred by the 3D11 anti-CSP monoclonal antibody against P. berghei sporozoites, and these findings allow the field to speculate that other monoclonal antibodies against the CSP of P. falciparum may act similarly. However, an important experiment is missing that would significantly strengthen the conclusions. Specifically, the authors should perform experiments where the monoclonal antibody is added immediately after the sporozoites have completed invasion. This should be done both in vitro and in vivo to show whether the antibody has any effect on intracellular development of liver stages when added after invasion.
While the claims are generally supported by the data presented, to comprehensively conclude the late cytotoxic effects of 3D11, the additional experiment of post-invasion antibody application is relevant. This would help determine if the observed effects are due to the antibody's action during invasion or its continued action post-invasion.
The data and methods are presented in a manner that allows for reproducibility. The use of microscopy and bioluminescence imaging is well-documented. The experiments appear adequately replicated, and statistical analyses are appropriate.
We thank reviewer 2 for these important suggestions. To be sure that the effect might not come from the internalization of the antibodies after sporozoite invasion, we tested the amount of 3D11 bound to the parasite following invasion (new Fig. 4F) and the potential post-invasion neutralizing effect of 3D11 in vitro. The results obtained are presented below.
“Post-invasion labeling of 3D11 bound to the membrane of intracellular parasites revealed a strong staining surrounding the parasite at 2 and 15h, but only punctual traces of 3D11 at 44h (Figure 4F, 3D11, 3D11). Of note, CSP was detected surrounding the control parasites at all time-points indicating that the lack of staining at 44h is not due to a decrease in the CSP amount on the parasite surface (Figure 4F, CSP, Control). To evaluate the potential post-invasion entry of 3D11 into the PV of infected cells and posterior neutralization of intracellular parasites, we incubated invaded cells from 2 to 44 h with 3D11, but no effect on the parasite intracellular development was observed (Figure 4G, 2h p.i.). 3D11 incubated for 2 h with sporozoites and cells elicited, as expected, a dose-dependent inhibition of parasite development. Altogether, our results indicate that the late inhibition of parasite development is already achieved at 15h and likely caused by antibodies dragged inside cells bound to sporozoites before or during the invasion.”
Minor Comments
The text and figures are clear and accurate. Some minor typographical errors should be corrected.
Thank you for the remark; we have verified the text again to remove typographical errors.
Reviewer #3 (Evidence, reproducibility and clarity (Required)):
Aguirre-Botero et al have studied the effect of a potent monoclonal antibody against the circumsporozoite protein, the major surface protein of the malaria sporozoite. This is an elegantly designed, performed, and analyzed study. They have efficiently delineated the mode of action of anti-CSP repeat mAb and confirmed previous in vitro work (not cited) that demonstrated the same intracellular effect.
Specific comments
Line 51: The authors claim a correlation between high antibody levels and protection. However, they did not provide direct proof that these antibodies were responsible for protection, nor did they establish a cut-off level of anti-CSP antibodies that would distinguish between protected and unprotected individuals.
We thank reviewer 3 for the comments. Indeed, we agree with reviewer 3: these are correlative studies in which causality cannot be established. We modified the ensuing sentence to specify the causal link between anti-CSP mAbs and in vivo protection against sporozoite infection. Now the text reads:
“Extensive research has demonstrated a positive correlation between high levels of anti-CSP antibodies (Abs) induced by the RTS,S/AS01 vaccine and efficacy against malaria(11-13). Remarkably, anti-CSP monoclonal Abs (mAbs) have been proven to protect in vivo against malaria in various experimental settings, including, mice(14-21), monkeys(23), and humans(24-26)”
Line 326: The late intrahepatic effect of mAb against the CSP repeat has been previously reported (see Figure 2, Nudelman et al, J Immunol, 1989). The effect was shown to affect the transition from liver trophozoites to liver schizonts. This study should be cited and discussed.
Thank you for this important remark. We included this seminal reference and now the modified text reads:
“Notably, a similar effect has been previously reported using sera from mice immunized with PfCSP or mAb against P. yoelii (Py) CSP. Incubation of Pf or Py sporozoites with the immune sera or mAbs not only affected sporozoite invasion in vitro but continued to affect intracellular forms for several days after invasion(38,39). Additionally, using anti-PfCSP sera, it was also observed that late EEFs from sera-treated sporozoites had abnormal morphology(38). Altogether, it was thus concluded that the anti-CSP Abs present in the sera had a long-term effect on the parasites(38,39).”
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
This manuscript by Kaya et al. studies the effect of food consumption on hippocampal sharp wave ripples (SWRs) in mice. The authors use multiple foods and forms of food delivery to show that the frequency and power of SWRs increase following food intake, and that this effect depends on the caloric content of food. The authors also studied the effects of the administration of various food-intake-related hormones on SWRs during sleep, demonstrating that ghrelin, but not GLP-1, insulin, or leptin, negatively affects SWR rate and power. Finally, the authors use fiber photometry to show that GABAergic neurons in the lateral hypothalamus increase their activity during a SWR event.
Strengths:
The experiments in this study seem to be well performed, and the data are well presented, visually. The data support the main conclusions of the manuscript that food intake enhances hippocampal SWRs. Taken together, this study is likely to be impactful to the study of the impact of feeding on sleep behavior, as well as the phenomena of hippocampal SWRs in metabolism.
Weaknesses:
Details of experiments are missing in the text and figure legends. Additionally, the writing of the manuscript could be improved.
We thank the reviewer for their favorable assessment of the work and its potential impact. We will add all requested details in the text and figure legends and will revise the wording of the manuscript to improve its clarity.
Reviewer #2 (Public review):
Summary:
Kaya et al uncover an intriguing relationship between hippocampal sharp wave-ripple production and peripheral hormone exposure, food intake, and lateral hypothalamic function. These findings significantly expand our understanding of hippocampal function beyond mnemonic processes and point a direction for promising future research.
Strengths:
Some of the relationships observed in this paper are highly significant. In particular, the inverse relationship between GLP1/Leptin and Insulin/Ghrelin is compelling, as it aligns well with opposing hormone functions on satiety.
Weaknesses:
I would be curious if there were any measurable behavioral differences that occur with different hormone manipulations.
We thank the reviewer for their favorable assessment of the work and its contribution to our understanding of non-mnemonic hippocampal function. Whether there are behavioral differences that occur following administration of the different hormones is a great question, yet unfortunately our study design did not include fine behavioral monitoring to the degree that would allow answering it. While some previous studies have partially addressed the behavioral consequences of the delivery of these hormones (we will include a reference to these studies in the revised manuscript), how these changes may interact with the hippocampal and hypothalamic effects we observe is a very interesting next step.
Reviewer #3 (Public review):
Summary:
The manuscript by Kaya et al. explores the effects of feeding on sharp wave-ripples (SWRs) in the hippocampus, which could reveal a better understanding of how metabolism is regulated by neural processes. Expanding on prior work that showed that SWRs trigger a decrease in peripheral glucose levels, the authors further tested the relationship between SWRs and meal consumption by recording LFPs from the dorsal CA1 region of the hippocampus before and after meal consumption. They found an increase in SWR magnitude during sleep after food intake, in both food restricted and ad libitum fed conditions. Using fiber photometry to detect GABAergic neuron activity in the lateral hypothalamus, they found increased activity locked to the onset of SWRs. They conclude that the animal's satiety state modulates the amplitude and rate of SWRs, and that SWRs modulate downstream circuits involved in regulating feeding. These experiments provide an important step forward in understanding how metabolism is regulated in the brain. However, currently, the paper lacks sufficient analyses to control for factors related to sleep quality and duration; adding these analyses would further support the claim that food intake itself, as opposed to sleep quality, is primarily responsible for changes in SWR activity. Adding this, along with some minor clarifications and edits, would lead to a compelling case for SWRs being modulated by a satiety state. The study will likely be of great interest in the field of learning and memory while carrying broader implications for understanding brain-body physiology.
Strengths:
The paper makes an innovative foray into the emerging field of brain-body research, asking how sharp wave-ripples are affected by metabolism and hunger. The authors use a variety of advanced techniques including LFP recordings and fiber photometry to answer this question. Additionally, they perform comprehensive and logical follow-up experiments to the initial food-restricted paradigm to account for deeper sleep following meal times and the difference between consumption of calories versus the experience of eating. These experiments lay the groundwork for future studies in this field, as the authors pose several follow-up questions regarding the role of metabolic hormones and downstream brain regions.
We thank the reviewer for their appreciation and constructive review of the work.
Weaknesses:
Major comments:
(1) The authors conclude that food intake regulates SWR power during sleep beyond the effect of food intake on sleep quality. Specifically, they made an attempt to control for the confounding effect of delta power on SWRs through a mediation analysis. However, a similar analysis is not presented for SWR rate. Moreover, this does not seem to be a sufficient control. One alternative way to address this confound would be to subsample the sleep data from the ad lib and food restricted conditions (or high calorie and low calorie, etc), to match the delta power in each condition. When periods of similar mean delta power (i.e. similar sleep quality) are matched between datasets, the authors can then determine if a significant effect on SWR amplitude and rate remains in the subsampled data.
This is an important point that we believe we addressed in a few complementary ways. First, the mediation analysis we implemented measures the magnitude and significance of the contribution of food on SWR power after accounting for the effects of delta power, showing a highly significant food-SWR contribution. While the objective of subsampling is similar, mediation is a more statistically robust approach as it models the relationship between food, SWR power, and delta power in a way that explicitly accounts for the interdependence of these variables. Further, subsampling introduces the risk of losing statistical power by reducing the sample size, due to exclusion of data that might contain relevant and valuable information. Mediation analysis, on the other hand, uses the full dataset and retains statistical power while modeling the relationships between variables more holistically. However, as we were not satisfied with a purely analytical approach to test this issue, we carried out a new set of experiments in ad-libitum fed mice, where there is no potential issue of food restriction impairing sleep quality in the pre-sleep session. In these conditions food amount also significantly correlated with, and showed significant mediation of, the SWR power change. Finally, we acknowledge and discuss this point in the Discussion, highlighting that given the known relationship between cortical delta and SWRs, it is challenging to fully disentangle these signals.
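To make the mediation logic concrete, the following is a minimal sketch of a single-mediator linear decomposition, with food amount as the predictor, delta power change as the mediator, and SWR power change as the outcome. It is an illustration under stated assumptions, not the analysis used in the manuscript: the data are hypothetical, the helper names are invented for this example, and significance testing (for instance by bootstrapping the indirect effect) is omitted.

```python
def _centered_cross(u, v):
    """Sum of products of mean-centered values (unnormalized covariance)."""
    mu = sum(u) / len(u)
    mv = sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def mediation_paths(x, m, y):
    """Ordinary least squares estimates for a single-mediator model.

    a:        slope of m on x
    b:        slope of y on m, controlling for x
    direct:   slope of y on x, controlling for m
    indirect: a * b
    total:    slope of y on x alone (equals direct + indirect for OLS)
    """
    sxx = _centered_cross(x, x)
    smm = _centered_cross(m, m)
    sxm = _centered_cross(x, m)
    sxy = _centered_cross(x, y)
    smy = _centered_cross(m, y)
    a = sxm / sxx
    total = sxy / sxx
    den = sxx * smm - sxm ** 2  # zero only if x and m are perfectly collinear
    direct = (sxy * smm - smy * sxm) / den
    b = (smy * sxx - sxy * sxm) / den
    return {"a": a, "b": b, "direct": direct, "indirect": a * b, "total": total}


# Hypothetical per-session values (arbitrary units).
food = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
delta_change = [0.2, 0.1, 0.5, 0.4, 0.9, 0.6]
swr_change = [0.3, 0.6, 0.8, 1.3, 1.4, 1.9]

paths = mediation_paths(food, delta_change, swr_change)
```

The `direct` term is the part of the food effect on SWR power that remains after accounting for delta power, which is the quantity the response refers to; all sessions contribute to the estimate, in contrast to a subsampling approach.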
(2) Relatedly, are the animals spending the same amount of time sleeping in the ad lib vs. food restricted conditions? The amount of time spent sleeping could affect the probability of entering certain stages of sleep and thus affect SWR properties. A recent paper (Giri et al., Nature, 2024) demonstrated that sleep deprivation can alter the magnitude and frequency of SWRs. Could the authors quantify sleep quantity and control for the amount of time spent sleeping by subsampling the data, similar to the suggestion above?
We will include a comparison of sleep amount in the revised manuscript.
Additionally, we will add details to the Methods section that were missing in the original submission that are relevant to this point. Specifically, within the sleep sessions, the ongoing sleep states were scored using the AccuSleep toolbox (https://github.com/zekebarger/AccuSleep) using the EEG and EMG signals. NREM periods were detected based on high EEG delta power and low EMG power, REM periods were detected based on high EEG theta power and low EMG power, and Wake periods were detected based on high EMG power. Importantly, only NREM periods were included for subsequent SWR detection, quantification and analyses (in particular, reported SWR rates reflect the number of SWRs per second of NREM sleep).
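As a minimal sketch of the rate definition above (SWRs per second of NREM sleep), the function below counts only ripples that fall inside scored NREM intervals and divides by the total NREM duration. It assumes the sleep states have already been scored and exported as non-overlapping (start, end) intervals in seconds; the function name, the interval format, and the example values are hypothetical and do not reflect the AccuSleep interface.

```python
def swr_rate_in_nrem(swr_times, nrem_intervals):
    """Number of SWRs per second of NREM sleep.

    swr_times:      SWR onset times in seconds.
    nrem_intervals: non-overlapping (start, end) NREM periods in seconds.
    """
    total_nrem = sum(end - start for start, end in nrem_intervals)
    if total_nrem <= 0:
        raise ValueError("no NREM time was scored")
    # Count only the ripples whose onset falls inside an NREM period.
    n_in_nrem = sum(
        any(start <= t < end for start, end in nrem_intervals)
        for t in swr_times
    )
    return n_in_nrem / total_nrem


# Hypothetical example: two NREM periods of 100 s each, seven detected events.
nrem = [(0.0, 100.0), (250.0, 350.0)]
ripples = [5.0, 50.0, 120.0, 260.0, 300.0, 349.0, 400.0]
rate = swr_rate_in_nrem(ripples, nrem)
```

Because the denominator is NREM time rather than session length, two sessions with different amounts of sleep can still be compared on the same scale.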
(3) Plot 5I only reports significance but does not clearly show the underlying quantification of LH GABAergic activity. Upon reading the methods for how this analysis was conducted, it would be informative to see a plot of the pre-SWR and post-SWR integral values used for the paired t-test whose p-values are currently shown. For example, these values could be displayed as individual points overlaid on a pair of box-and-whisker plots of the pre- and post-distribution within the session (perhaps for one example session per mouse with the p-value reported, to supplement a plot of the distribution of p-values across sessions and mice). If these data are non-normal, the authors should also use a non-parametric statistical test.
We will include this quantification and visual representation in the revised manuscript.
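The comparison requested in comment (3) can be sketched as follows. The example computes a paired t statistic on pre- and post-SWR integrals and, as one possible non-parametric alternative, an exact two-sided sign-flip permutation p-value. This is an illustration only: the values are hypothetical, the function names are invented here, and the sign-flip test is a stand-in chosen for the example, not a test named in the manuscript or the review.

```python
from itertools import product
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(pre, post):
    """t statistic of the paired differences (post minus pre)."""
    diffs = [b - a for a, b in zip(pre, post)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def sign_flip_p_value(pre, post):
    """Exact two-sided sign-flip permutation p-value for paired data.

    Enumerates all 2**n sign assignments of the paired differences, so it is
    only practical for a small number of pairs.
    """
    diffs = [b - a for a, b in zip(pre, post)]
    observed = abs(sum(diffs))
    count = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # tolerance for floating-point ties
            count += 1
    return count / total


# Hypothetical pre- and post-SWR integrals of the photometry signal.
pre = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0]
post = [1.5, 1.6, 1.1, 1.7, 1.2, 1.6]

t_stat = paired_t_statistic(pre, post)
p_perm = sign_flip_p_value(pre, post)
```

The permutation test makes no normality assumption about the paired differences, which addresses the reviewer's concern about non-normal data.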
Minor comments:
(4) A brief explanation (perhaps in the discussion) of what each change in SWR property (magnitude, rate, duration) could indicate in the context of the hypothesis may be helpful in bridging the fields of metabolism and memory. For example, by describing the hypothesized mechanistic consequence of each change, could the authors speculate on why ripple rate may not increase in all the instances where ripple power increases after feeding? Why do the authors speculate that ripple duration does not increase, given that prior work (Fernandez-Ruiz et al. 2019) has shown that prolonged ripples support enhanced memory?
We will include a discussion of these points in the revised manuscript.
(5) The authors suggest that "SWRs could modulate peripheral metabolism" as a future implication of their work. However, the lack of clear effects from GLP-1, leptin and insulin complicates this interpretation. It might be informative for readers if the authors expanded their discussion of what specific role they speculate that SWRs could play in regulating metabolism, given these negative results.
While we provided potential explanations for the lack of effects of the hormone administrations, we will further elaborate on this point in the revised manuscript.
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
In this manuscript, the authors identified that
(1) CDK4/6i treatment attenuates the growth of drug-resistant cells by prolongation of the G1 phase;
(2) CDK4/6i treatment results in an ineffective Rb inactivation pathway and suppresses the growth of drug-resistant tumors;
(3) Addition of endocrine therapy augments the efficacy of CDK4/6i maintenance;
(4) Addition of CDK2i with CDK4/6 treatment as second-line treatment can suppress the growth of resistant cells;
(5) The role of cyclin E as a key driver of resistance to CDK4/6 and CDK2 inhibition.
Strengths:
To support their complex proposal, the authors combined several kinds of live-cell markers, timed in situ hybridization, IF, and immunoblotting. The authors clearly recognize the problem of resistance to CDK4/6i + ET therapy and demonstrate how to overcome it.
Weaknesses:
The authors need to clearly distinguish their own results from those achieved by other researchers.
Thank you for your thoughtful review and for highlighting both the strengths and weaknesses of our manuscript. We appreciate your recognition of the methodological rigor and the significance of our findings in addressing resistance to CDK4/6 inhibitors combined with endocrine therapy.
To address your concern regarding the need to delineate our results from those achieved by other researchers, we will incorporate clarifications in the revised manuscript. Specifically, we will:
(1) Clearly distinguish our novel contributions from prior findings in the field.
(2) Explicitly cite and discuss relevant studies to contextualize our work, ensuring that our contributions are appropriately framed within the broader body of knowledge.
These revisions will enhance the transparency and impact of our manuscript, as well as highlight the originality and significance of our findings. Thank you again for your constructive feedback.
Reviewer #2 (Public review):
Summary:
This study elucidated the mechanism underlying drug resistance induced by CDK4/6i as a single agent and proposed a novel and efficacious second-line therapeutic strategy. It highlighted the potential of combining CDK2i with CDK4/6i for the treatment of HR+/HER2- breast cancer.
Strengths:
The study demonstrated that CDK4/6 induces drug resistance by impairing Rb activation, which results in diminished E2F activity and a delay in G1 phase progression. It suggests that the synergistic use of CDK2i and CDK4/6i may represent a promising second-line treatment approach. Addressing critical clinical challenges, this study holds substantial practical implications.
Weaknesses:
(1) Drug-resistant cell lines: Was a drug concentration gradient treatment employed to establish drug-resistant cell lines? If affirmative, this methodology should be detailed in the materials and methods section.
We greatly appreciate the reviewer for raising this important question. In the revised manuscript, we will update the methods section to include a detailed description of how the drug-resistant cell lines were developed. Specifically, we will clarify whether a drug concentration gradient treatment was employed and provide step-by-step details to ensure reproducibility.
(2) What rationale informed the selection of MCF-7 cells for the generation of CDK6 knockout cell lines? Supplementary Figure 3. A indicates that CDK6 expression levels in MCF-7 cells are not notably elevated.
We appreciate the reviewer’s insightful question about the rationale for selecting MCF-7 cells to generate CDK6 knockout cell lines. This choice was guided by prior studies highlighting the significant role of CDK6 in mediating resistance to CDK4/6 inhibitors (1-4). Moreover, we observed a 4.6-fold increase in CDK6 expression in CDK4/6 inhibitor-resistant MCF-7 cells compared to their drug-naïve counterparts (Supplementary Figure 3A). While we did not detect notable differences in CDK4/6 activity between wild-type and CDK6 knockout cells under CDK4/6 inhibitor treatment, these findings point to a potential non-canonical function of CDK6 in conferring resistance to CDK4/6 inhibitors.
(3) For each experiment, particularly those involving mice, the author must specify the number of individuals utilized and the number of replicates conducted, as detailed in the materials and methods section.
We sincerely thank the reviewer for bringing this to our attention. In the revised manuscript, we will provide explicit details regarding the number of replicates and mice used for each experiment. This information will be included in the materials and methods section, figure legends, and relevant text to ensure transparency and clarity.
(4) Could this treatment approach be extended to triple-negative breast cancer?
We greatly appreciate the reviewer’s inquiry about extending our findings to triple-negative breast cancer (TNBC). Based on our data presented in Figure 1 and Supplementary Figure 2, which include the TNBC cell line MDA-MB-231, we anticipate that the benefits of maintaining CDK4/6 inhibitors could indeed be applied to TNBC with an intact Rb/E2F pathway.
Reviewer #3 (Public review):
Summary:
In their manuscript, Armand and colleagues investigate the potential of continuing CDK4/6 inhibitors or combining them with CDK2 inhibitors in the treatment of breast cancer that has developed resistance to initial therapy. Utilizing cellular and animal models, the research examines whether maintaining CDK4/6 inhibition or adding CDK2 inhibitors can effectively control tumor growth after resistance has set in. The key findings from the study indicate that the sustained use of CDK4/6 inhibitors can slow down the proliferation of cancer cells that have become resistant, and the combination of CDK2 inhibitors with CDK4/6 inhibitors can further enhance the suppression of tumor growth. Additionally, the study identifies that high levels of Cyclin E play a significant role in resistance to the combined therapy. These results suggest that continuing CDK4/6 inhibitors along with the strategic use of CDK2 inhibitors could be an effective strategy to overcome treatment resistance in hormone receptor-positive breast cancer.
Strengths:
(1) Continuous CDK4/6 Inhibitor Treatment Significantly Suppresses the Growth of Drug-Resistant HR+ Breast Cancer: The study demonstrates that the continued use of CDK4/6 inhibitors, even after disease progression, can significantly inhibit the growth of drug-resistant breast cancer.
(2) Potential of Combined Use of CDK2 Inhibitors with CDK4/6 Inhibitors: The research highlights the potential of combining CDK2 inhibitors with CDK4/6 inhibitors to effectively suppress CDK2 activity and overcome drug resistance.
(3) Discovery of Cyclin E Overexpression as a Key Driver: The study identifies overexpression of cyclin E as a key driver of resistance to the combination of CDK4/6 and CDK2 inhibitors, providing insights for future cancer treatments.
(4) Consistency of In Vitro and In Vivo Experimental Results: The study obtained supportive results from both in vitro cell experiments and in vivo tumor models, enhancing the reliability of the research.
(5) Validation with Multiple Cell Lines: The research utilized multiple HR+/HER2- breast cancer cell lines (such as MCF-7, T47D, CAMA-1) and triple-negative breast cancer cell lines (such as MDA-MB-231), validating the broad applicability of the results.
Weaknesses:
(1) The manuscript presents intriguing findings on the sustained use of CDK4/6 inhibitors and the potential incorporation of CDK2 inhibitors in breast cancer treatment. However, I would appreciate a more detailed discussion of how these findings could be translated into clinical practice, particularly regarding the management of patients with drug-resistant breast cancer.
We greatly appreciate this opportunity to further contextualize our findings within clinical practice. In the revised manuscript, we will expand the discussion to explore how the identified mechanisms can inform patient stratification and therapeutic combinations. We will also highlight the potential of integrating CDK2 inhibitors with continued CDK4/6 inhibition as a second-line strategy for HR+ breast cancer patients who exhibit resistance to CDK4/6 inhibitors, leveraging insights from current and ongoing clinical trials. This will provide a clearer framework for translating our findings into actionable therapeutic strategies.
(2) While the emergence of resistance is acknowledged, the manuscript could benefit from a deeper exploration of the molecular mechanisms underlying resistance development. A more thorough understanding of how CDK2 inhibitors may overcome this resistance would be valuable.
Thank you for this insightful suggestion. In the revised manuscript, we will delve deeper into the molecular mechanisms by which CDK2 inhibitors counteract resistance to CDK4/6 inhibitors and endocrine therapy. We will emphasize the role of the non-canonical Rb inactivation pathway and upregulated transcriptional activity in reactivating CDK2, which contribute to resistance under CDK4/6 inhibition. Furthermore, we will discuss how dual inhibition of CDK4/6 and CDK2 effectively suppresses this resistance pathway, offering a mechanistic rationale for the therapeutic potential of this combination strategy.
(3) The manuscript supports the continued use of CDK4/6 inhibitors, but it lacks a discussion on the long-term efficacy and safety of this approach. Additional studies or data to support the safety profile of prolonged CDK4/6 inhibitor use would strengthen the manuscript.
We greatly appreciate the reviewer for raising this important point. To address this, we will incorporate a discussion on the long-term safety and efficacy of CDK4/6 inhibitor maintenance therapy. Drawing from clinical trials and retrospective analyses (5-9), we will highlight data supporting the tolerability of prolonged CDK4/6i treatment, particularly in combination with endocrine therapy. We will also discuss its clinical benefits over chemotherapy or endocrine therapy alone, contextualizing these findings with our proposed therapeutic approach (6,8-11).
References:
(1) Yang C, Li Z, Bhatt T, Dickler M, Giri D, Scaltriti M, et al. Acquired CDK6 amplification promotes breast cancer resistance to CDK4/6 inhibitors and loss of ER signaling and dependence. Oncogene 2017;36:2255-64
(2) Li Q, Jiang B, Guo J, Shao H, Del Priore IS, Chang Q, et al. INK4 Tumor Suppressor Proteins Mediate Resistance to CDK4/6 Kinase Inhibitors. Cancer Discov 2022;12:356-71
(3) Ji W, Zhang W, Wang X, Shi Y, Yang F, Xie H, et al. c-myc regulates the sensitivity of breast cancer cells to palbociclib via c-myc/miR-29b-3p/CDK6 axis. Cell Death & Disease 2020;11:760
(4) Wu X, Yang X, Xiong Y, Li R, Ito T, Ahmed TA, et al. Distinct CDK6 complexes determine tumor cell response to CDK4/6 inhibitors and degraders. Nature Cancer 2021;2:429-43
(5) Martin JM, Handorf EA, Montero AJ, Goldstein LJ. Systemic Therapies Following Progression on First-line CDK4/6-inhibitor Treatment: Analysis of Real-world Data. Oncologist 2022;27:441-6
(6) Xi J, Oza A, Thomas S, Ademuyiwa F, Weilbaecher K, Suresh R, et al. Retrospective Analysis of Treatment Patterns and Effectiveness of Palbociclib and Subsequent Regimens in Metastatic Breast Cancer. J Natl Compr Canc Netw 2019;17:141-7
(7) Basile D, Gerratana L, Corvaja C, Pelizzari G, Franceschin G, Bertoli E, et al. First- and second-line treatment strategies for hormone-receptor (HR)-positive HER2-negative metastatic breast cancer: A real-world study. Breast 2021;57:104-12
(8) Kalinsky K, Accordino MK, Chiuzan C, Mundi PS, Sakach E, Sathe C, et al. Randomized Phase II Trial of Endocrine Therapy With or Without Ribociclib After Progression on Cyclin-Dependent Kinase 4/6 Inhibition in Hormone Receptor–Positive, Human Epidermal Growth Factor Receptor 2–Negative Metastatic Breast Cancer: MAINTAIN Trial. Journal of Clinical Oncology;0:JCO.22.02392
(9) Kalinsky K, Bianchini G, Hamilton EP, Graff SL, Park KH, Jeselsohn R, et al. Abemaciclib plus fulvestrant vs fulvestrant alone for HR+, HER2- advanced breast cancer following progression on a prior CDK4/6 inhibitor plus endocrine therapy: Primary outcome of the phase 3 postMONARCH trial. Journal of Clinical Oncology 2024;42:LBA1001-LBA
(10) Mayer EL, Wander SA, Regan MM, DeMichele A, Forero-Torres A, Rimawi MF, et al. Palbociclib after CDK and endocrine therapy (PACE): A randomized phase II study of fulvestrant, palbociclib, and avelumab for endocrine pre-treated ER+/HER2- metastatic breast cancer. Journal of Clinical Oncology 2018;36:TPS1104-TPS
(11) Llombart-Cussac A, Harper-Wynne C, Perello A, Hennequin A, Fernandez A, Colleoni M, et al. Second-line endocrine therapy (ET) with or without palbociclib (P) maintenance in patients (pts) with hormone receptor-positive (HR[+])/human epidermal growth factor receptor 2-negative (HER2[-]) advanced breast cancer (ABC): PALMIRA trial. Journal of Clinical Oncology 2023;41:1001-
-
-
-
Author response:
We appreciate the time and thoughtful reviews of all three reviewers. Ahead of a full revision of the paper, we would like to respond briefly to several points the reviewers have raised, which we plan to address in more detail in our full revision.
(1) The relationship between membrane tension and interfacial tension: The major request by reviewers was for a better explanation of the relationship between measured mechanical parameters and membrane interfacial tension. We plan to include a schematic of the different forces at play in the membrane and to clarify our discussion. Here, we provide a brief explanation.
In our study, we identified a relationship between channel activation pressure and two membrane mechanical properties (area expansion modulus (K<sub>A</sub>) and bending rigidity (K<sub>c</sub>)), though we did not find a correlation between channel activation pressure and a third mechanical property (membrane fluidity). Through further computational analysis of the membranes, we identified an additional property called interfacial tension that helps unify and explain our results. Interfacial tension (γ) is a property akin to surface tension that reflects the chemical composition at the interface of the membrane (between the polar headgroups of the lipids and the hydrophobic acyl chains of the lipids) and balances the repulsive interaction of the nonpolar hydrocarbon chains with the polar headgroup regions of the lipids. In the established polymer brush model, the expansion modulus is proportional to the interfacial tension (W. Rawicz, Biophysical Journal, 2000):
γ = K<sub>A</sub>/C,
where C is a constant. Interfacial tension occurs at the boundary between the lipid bilayer and the external aqueous environment and is different from mechanical tension. While mechanical membrane tension (t) reflects a physical force in the plane of the membrane, interfacial tension reflects the chemical composition at each interface of the membrane. While mechanical membrane tension depends on the size and shape of the membrane, interfacial tension is independent of these features and depends on the molecular composition of the liquid-liquid interface. An expanded discussion on this topic was recently provided (Lipowsky. Faraday Discussions. 2024). While distinct, these two properties can be related to one another via the area expansion modulus (K<sub>A</sub>). Typically, one would imagine that reducing interfacial tension, and correspondingly reducing K<sub>A</sub>, would mean that less energy is needed to stretch the membrane to the same extent, and should therefore reduce the activation pressure (and the corresponding in-plane mechanical tension) required to open an embedded mechanosensitive channel. Interestingly, though, interfacial tension also works to pull the channel open, so a reduction in interfacial tension also means that more energy is required to open the channel. We find that the increased energy required to open embedded channels upon a reduction in interfacial tension outweighs the reduced tension that should be required to stretch the membrane. We plan to explain this tradeoff more clearly in our revision. Overall, our findings identify the exact properties driving mechanosensitive channel behavior in our study. Further, they provide a guide to understanding how and why shifts in mechanosensitive channel activation occur by connecting chemical composition changes to the changes in membrane tension propagation in a given membrane.
(2) Data presentation to support determined area expansion modulus and bending rigidity values: We will show the stress-strain curves used to derive the K<sub>A</sub> and K<sub>c</sub> values.
(3) Address why membrane tension data was not shown for ephys experiments: The micropipette and patch clamp setups are different, and we did not use the same system for both measurements. In fact, limitations in tools that would allow for concurrent tension measurements while conducting channel activation measurements have limited our understanding of the role of membrane tension on mechanosensation to date. While recent studies have attempted to resolve this limitation through the design of new tools that enable concurrent monitoring of mechanosensitive channel activation and membrane tension (Lüchtefeld et al. Nature Methods. 2024), these tools were not available to us during our study or now. Because our study also attempted to connect these two features (membrane tension and channel activation) but we lacked tools to do so simultaneously, we used two sets of measurements to separately uncover membrane mechanical properties and channel activation pressure.
One reason it is difficult to measure membrane tension during a typical patch clamp study is the limitations of the imaging equipment and pipettes used for this assay. The experiment is usually done by looking through the eyepiece, and the pipette angle is around 45 degrees from the plane of the stage, so it would be hard to visualize changes in the patch geometry in the tip of the pipette. In practice, we are able to see the pipette touch the GPMV but cannot resolve the patch moving up the pipette. In response to the reviewer comment that tension = pressure difference times pipette radius divided by 2, we were unable to measure the radius, and changes in the radius, of a patch upon increases in applied pressure because of the above-mentioned imaging constraints. This limitation is why we were unable to directly measure applied tension with our current patch clamp setup.
(4) Interfacial tension is not experimentally measured: Interfacial tension = K<sub>A</sub> /C where C is a constant (typically C=4 for bilayer membranes). The best way to measure interfacial tension is to determine K<sub>A</sub> (the area expansion modulus), which we have done experimentally by generating stress vs strain curves for GPMVs. In the literature, reductions in the interfacial tension of a membrane are typically determined experimentally by measuring a corresponding reduction in the associated K<sub>A</sub> value (e.g., Ly and Longo. Biophys J. 2004). We have similarly followed this approach.
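To make the relations quoted in points (3) and (4) concrete, the following is a minimal sketch, not the analysis code used in the study. It assumes that K<sub>A</sub> is taken as the least-squares slope of a tension vs. areal strain curve, that C = 4, and that the reviewer's relation (tension = pressure difference × radius / 2) applies. All function names and data points are hypothetical placeholders.

```python
# Illustrative sketch only: the data points are made-up placeholders,
# not measurements from the study.

def laplace_tension(delta_p, radius):
    """Tension from a pressure difference and a radius: T = dP * r / 2.

    With delta_p in Pa and radius in m, the result is in N/m.
    """
    return delta_p * radius / 2.0


def fit_slope(strain, tension):
    """Least-squares slope of tension vs. areal strain.

    Assumption: this slope is used as the area expansion modulus K_A.
    """
    n = len(strain)
    mean_x = sum(strain) / n
    mean_y = sum(tension) / n
    sxx = sum((x - mean_x) ** 2 for x in strain)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(strain, tension))
    return sxy / sxx


def interfacial_tension(k_a, c=4.0):
    """gamma = K_A / C, with C = 4 as quoted for bilayer membranes."""
    return k_a / c


# Hypothetical stress-strain points: areal strain (dimensionless), tension (mN/m)
strain = [0.00, 0.01, 0.02, 0.03]
tension = [0.0, 2.0, 4.0, 6.0]

k_a = fit_slope(strain, tension)     # same units as tension (mN/m)
gamma = interfacial_tension(k_a)     # mN/m
```

Because γ is a fixed fraction of K<sub>A</sub> in this model, any change in the fitted slope translates directly into the same relative change in the inferred interfacial tension.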
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Therefore, their tool may be useful for stimulating multiple populations using a blue excitatory opsin in neuron A and their tool for red excitation of neuron B… Yet, there are no data presented that showcases their new tool for this purpose
We agree with the reviewer that in this manuscript we have not experimentally shown the applicability of our system for dual optical stimulation. However, the suppression of blue-light excitation of ZipV/T-IvfChr-expressing neurons strongly suggests that it can be used in experiments exciting separate populations of neurons, as was similarly shown for BiPOLES. We see no theoretical reason why this experiment could not be done if sufficient cell-targeting mechanisms (such as Cre-lox or retroAAV) are used. We have started several projects pursuing these applications in the meantime.
While they do show that red light = excitation and blue light = inhibition, they neither show 1) all-optical on/off modulation of the same cell; nor 2) high-frequency inhibition or excitation (max stim rate of 20hz, which is the same as the BiPOLES paper used for their LC stimulation paradigm; Vierock, as above, Figure 7a-d).
Regarding point 1, we understand that the reviewer asks if we have optically excited (with red light) and inhibited (with blue light) the same neurons. If so, figure 4B1 (optical excitation of ZipT-IvfCh with red light) and figure 5A (optical inhibition of ZipT-IvfCh with blue light) represent largely the same set of neurons.
Regarding point 2, we respectfully disagree with the reviewer’s interpretation of Figure 7a-d in Vierock et al. As we understand it, in this part the authors apply a 20 Hz optical stimulation protocol to the LC neurons in vivo. However, there are no data showing that individual neurons follow this stimulation protocol. To be clear, we are not saying that BiPOLES cannot drive 20 Hz APs. Very likely it can, as it is based on ChrimsonR, which is capable of doing so (Klapoetke et al., Figure 2). Although we have not shown data for optical stimulation above 20 Hz in this manuscript, our system is based on vfChrimson, which is known to drive APs at 100 Hz and above (Mager et al., Figures 2 and 3).
… they must revise the manuscript to show that their approach is both 1) different in some way when compared to BiPOLES (it is my understanding that they did not do this, as per the supplementary alignment of the BiPOLES sequence and the sequence of the BiPOLES-like construct that they did test) and 2) that the properties that the investigators specifically tailored their construct to have confer some sort of experimental advantage when compared to the existing standard.
In the latest version of the manuscript, we have compared our ZipV-IvfChr and the BiPOLES construct adapted with vfChrimson (Fig. 2 Suppl 1). The mean photocurrent amplitude of IvfChr in the ZipV-IvfChr construct is ~2.7 x higher than BiPOLES adapted with vfChrimson (14 randomly selected HEK293 cells in each group) (Fig. 2 Suppl 1B). We conducted this experiment in HEK293 cells to ensure accurate voltage-clamping and less biased cell selection. Even adjusting for the smaller photocurrent of vfChrimson vs ChrimsonR, this would still translate to ~1.6 x greater photocurrent with ZipV-IvfChr compared to the original BiPOLES utilizing ChrimsonR. We believe the increased efficiency of excitation is an important aspect of adapting vfChrimson for red-light excitation of neurons.
Reviewer #2 (Public Review):
(1) In the Introduction or Discussion, the authors could better motivate the need for a red-shifted actuator that lacks blue crosstalk, by giving some specific examples of how the tool could be productively used, e.g. pairing with another blue-shifted excitatory opsin in a different population, or pairing with a GFP-based fluorescent indicator, e.g. GCaMP. The motivation for the current tool is not obvious to non-experts.
In the discussion, we have now provided examples of potential uses of the tool. For example, one key application is the induction of spike-timing-dependent plasticity with two wavelengths of light, in which a blue-light channelrhodopsin such as oChIEF is used to evoke presynaptic release and ZipT-IvfChr is expressed in the postsynaptic neuron. In this situation, the rapid termination of the inhibitory response is critical so that it does not interfere with the induction of LTP or LTD. Other possible experiments are the alternating control of projection neurons and interneurons in cortical areas, and the independent control of neurons of the direct and indirect pathways in the striatum to manipulate behavior.
(2) Simultaneous excitation and inhibition are not the same as non-excitation. The authors mentioned shunting briefly. Another possible issue is changes in osmotic balance. Activation of a Na+ channel and a Cl- channel will lead to net import of NaCl into the cell, possibly changing osmotic pressure. Please discuss.
We agree with the notion that osmotic, ionic and pH changes in small neuronal structures can be disruptive to physiology. This is the reason we developed our approach, in which the fastest channelrhodopsins are used so that we can minimize the channel opening time and the flux of ions through the channels when brief light illuminations are applied. Not only are the fluxes of protons, sodium ions and calcium ions minimized, but the flux of chloride should be minimal as well (as the membrane potential should be close to the chloride reversal potential, hence low ion flow). Our approach should therefore be minimally disruptive compared to most other existing channelrhodopsin-based approaches when short or minimal light pulses are used in conjunction with our tools. This recommendation is included in the updated manuscript.
(3) The authors showed that in ZipT-IvfChr, orange light drives excitation and blue light does not. But what about simultaneous blue and orange light? Can the blue light overwhelm the effect of the orange light? Since the stated goal is to open the blue part of the spectrum for other applications, one is now worried about "negative" crosstalk. Please discuss and, ideally, characterize this phenomenon.
We have now performed this experiment. Simultaneous blue (470 nm) and red (635 nm) light stimulation does not produce APs (Fig. 4 Suppl 1A). This suggests that the inhibitory effect of the ACR is more efficient than the excitatory effect of IvfChr because of its higher conductance. It also re-emphasizes that rapid termination of the ACR effect is critical for minimal disruption of physiological processes in such a pairing strategy.
(3.1) Does the use of the new tool require careful balancing of the expression levels of the ZipT and the IvfChr? Does it require careful balancing of blue and orange light intensities?
As with any optogenetic tool, users should validate the efficacy of the tool in their own system. Our tool relies solely on the balanced expression provided by the 2A system, the efficiency of the two opsins, and their degradation over the time span of expression. These aspects would be better addressed in future versions of the tool or through improvement of the BiPOLES type of tandem expression in subsequent versions. On the instrumentation side, the light intensity and differential penetration depth require careful consideration. However, this holds true for most optogenetic and fluorescence imaging-based approaches as well. In the current update of the manuscript, we have included further discussion of these aspects.
(3.2) Also, many opsins show complex and nonlinear responses to dual-wavelength illumination, so each component should be characterized individually under simultaneous blue + orange light.
We have now performed this experiment (please see our response to point 3).
(3.3) I was expecting to see photocurrents at different holding potentials as a function of illumination wavelength for the coexpressed construct (i.e. to see at what wavelength it switches from being excitatory to inhibitory); and also to see I-V curves of the photocurrent at blue and orange wavelengths for the co-expressed constructs (i.e. to see the reversal potential under blue excitation). Overall, the patch clamp and spectroscopic characterization of the individual constructs was stronger than that of the combined constructs.
We have added the IV curves for the co-expressed construct at different holding potentials for the 470 nm and 635 nm wavelengths. These show the reversal potential for the two wavelengths that are intended for in vitro and in vivo applications. Performing a similar experiment for a variety of wavelengths would not be as valuable, in part because of the enormous amount of data generated. As we have shown in the study, the response of any channelrhodopsin varies with light duration and light intensity in addition to wavelength and holding potential. The results for each recorded cell could include stimulation by different wavelengths, different illumination intensities, and different light durations, in addition to different holding potentials. Not only would the results be highly variable from cell to cell, there would potentially be hundreds or thousands of combinations to be tested per cell (e.g., 5 light intensities @ 1, 2.5, 5, 10 and 20 mW/mm<sup>2</sup>, 8 different wavelengths @ 450nm, 475nm, 500nm, 525nm, 550nm, 575nm, 600nm and 625nm, 7 light durations @ 1ms, 5ms, 10ms, 50ms, 100ms, 500ms and 1s, and 6 holding potentials @ -80mV, -70mV, -60mV, -40mV, -20mV and 0mV would result in 1680 stimulation conditions per recorded cell). Technically, the significant lowering of membrane resistance when both IvfChr and ZipACR variants are activated simultaneously would compromise the quality of voltage-clamping even in HEK293 cells with series resistance compensation. We have yet to see any other study that has included such an ambitious electrophysiology experiment for channelrhodopsin characterization, likely because such an experiment is not feasible.
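The count of stimulation conditions quoted above can be checked with a short enumeration. This is only an illustrative sketch: the variable names are our own, and the values are those listed in the example above.

```python
from itertools import product

# Parameter values taken from the example in the response above.
intensities_mw_per_mm2 = [1, 2.5, 5, 10, 20]
wavelengths_nm = [450, 475, 500, 525, 550, 575, 600, 625]
durations_ms = [1, 5, 10, 50, 100, 500, 1000]
holding_potentials_mv = [-80, -70, -60, -40, -20, 0]

# Full factorial design: every combination of the four parameters.
conditions = list(product(
    intensities_mw_per_mm2,
    wavelengths_nm,
    durations_ms,
    holding_potentials_mv,
))

n_conditions = len(conditions)  # 5 * 8 * 7 * 6
print(n_conditions)
```

The total is the product of the number of levels of each parameter, so adding even one more level to any parameter increases the number of conditions per cell substantially.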
Reviewer #3 (Public Review):
(1) The enhanced vf-Chrimson could potentially be a highlight of the manuscript, serving broader applications. Yet, gauging the overall improvements of ivf-Chrimson in comparison to other Chrimson variants remains intricate due to several reasons. First, photocurrents from ivf-Chrimson seem smaller than those from C-Chrimson (Supplemental Figure 3), and a direct comparison with standard vf-Chrimson is absent.
We appreciate the reviewer’s positive view of our modified variant. We did not emphasize this particular modification because it was identical to our previously published modification and similar to those previously published by others (CsChrimson and C1Chrimson). In all these cases, improved membrane expression was consistently detected. We believe that the expression data and our comparison of C-Chrimson and IvfChr are sufficient to demonstrate the improved membrane expression and function.
Second, while membrane expression of ivf-Chrimson appears enhanced in provided brightfield recordings, the quantitative analysis would necessitate confocal microscopy and a membrane marker (Supplemental Figure)
We have now quantified the results with a membrane-targeted, palmitoylated mCherry using confocal microscopy, shown in Fig. 2 Suppl 1A. We measured the Pearson correlation coefficient of the mCherry signal with the EGFP or Citrine signal for the 6 constructs (vfChrimson, vfChrimson with trafficking sequence, vfChrimson with N-terminal signaling peptide from oChIEF (C-vfChrimson), vfChrimson with trafficking sequence and N-terminal signaling peptide from oChIEF (IvfChr), BiPOLES with EGFP or citrine and vfChrimson), and the results were consistent with the prior results obtained using epifluorescence microscopy.
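For readers unfamiliar with the metric, the Pearson correlation coefficient between two fluorescence channels can be sketched as follows. This is a generic illustration under our own assumptions (paired pixel intensities supplied as flat lists, hypothetical function name and values), not the image-analysis pipeline used for the figure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    In a colocalization setting, xs and ys would be the intensities of the
    same pixels in two channels (e.g. a membrane marker and an opsin tag).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant channel")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical pixel intensities for two channels
marker = [10.0, 20.0, 30.0, 40.0]
opsin = [12.0, 22.0, 29.0, 41.0]
r = pearson(marker, opsin)
```

Values near 1 indicate that the two signals rise and fall together across pixels, which is the behavior expected when the tagged opsin is located at the membrane.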
(2) Finally, other N-terminal modified Chrimson variants, like CsChrimson by Klapoetke et al. in 2014 and C1Chrimson by Oda et al. in 2018, have been generated. Comparing ivf-Chrimson to vf-CsChrimson or vf-C1Chrimson would be important to evaluate the benefits of the applied N-terminal modification.
Our development of IvfChrimson is similar to the approach of vf-CsChrimson and identical to that of vf-C1Chrimson and we do not claim these modifications to be unique or superior. However, we have developed our design independently of these other studies and we have more extensive functional comparison and characterization data of our IvfChrimson variant than the other studies.
(2.1) The action spectra of ZipACR suggest peak absorption of ZipACR WT and its mutant at 525 - 550 nm (Fig. 3). This is even further red-shifted than previously reported by Govorunova et al. Further action spectra recordings differ for all constructs between recordings initiated with blue or red light (Supplementary Fig. 5). This discrepancy is unexpected and should be discussed.
We thank the reviewer for the comment. This was a mistake in the traces used for the figure. The example traces showed the spectral response measured in the 400 nm to 650 nm order instead of the 650 nm to 400 nm order shown in the spectral data. This has now been corrected.
Additionally, the representative photocurrents of Zip(151V) in Fig. 3D1 do not align with the corresponding action spectrum in Fig. 3D2 as they show maximal photocurrents for 400 nm excitation.
Please see the point above.
(3) The authors introduce two different bicistronic expression cassettes-ZipT-IvfChR and ZipV-IvfChR-without providing clear guidelines on their conditions of use. Although the authors assert that ZipT is slower and further red-shifted than ZipV, the differences in the data for both ACR mutants are small and the benefits of the different final constructs should be explained.
In our testing in neurons, ZipT produced fewer ‘escaped’ spikes after the termination of the light pulses. However, this depends on membrane properties of the cells, such as capacitance and resistance. ZipV has a faster termination time and may be necessary in some situations because it causes less disruption of physiological processes.
We have now included this discussion in our updated manuscript.
(4) The ZipT/V-IvfChRs are designed as bicistronic constructs; yet, disparities in membrane trafficking and protein degradation between the two channels could lead to divergences in blue and red light photoresponses. For future applicants, understanding the extent of expression ratio variations across cells using the presented expression cassettes could be of significance and should be discussed.
We now have included this discussion in our responses above.
Reviewer #1 (Recommendations For The Authors):
(1) The Figure 1a mV cartoon traces for chloride are confusing. The chloride currents are depolarizing, not hyperpolarizing. As noted by the authors, these channels largely generate AP blockade through shunting inhibition (division), not hyperpolarization (subtraction).
The figure has been corrected.
(2) Figure 2A does not show where the light is applied. Why are some of the bars blue and some of them not filled?
This has been corrected.
(3) Figure 2C1 does not show where the light is applied. There should be an inset to detail the blue-light-cessation-evoked AP. Also doesn't give the holding potential.
The requested details are added.
(4) Figure 2C2 inset is described as showing that "Light-induced currents with 470 nm illumination were initially outward but turned inward immediately following light offset." Is that correct? It looks to me like the current turns inward about half-way through the light pulse and then becomes even stronger after the light turns off. That is also consistent with the CC traces, which appear to show a transition toward depolarization during the light pulse before the AP initiation at light offset.
Yes, the reviewer's observation is correct. There are blue light-induced outward and inward current peaks at the onset and offset of the light. Accordingly, we have modified the phrasing for Fig. 2C2.
(5) Figure 3D1 shows that Zip(151V) has a peak current at 400nm, with a steady increase in current from red to blue, however, this is not the case in the summary data in 3D2. It's also not shown in Supplementary Figure 5B. What's going on?
We apologize for the prior version of the figure associated with the first submission. The example traces from 400 nm -> 650 nm were incorrectly included in the figure, whereas the 650 nm -> 400 nm example traces should have been included. This has been corrected.
(6) Figure 3D1 has no time scale.
It has now been included.
(7) Figure 3E1 should read "Transduced" and not "Transfected"
This has been corrected.
(8) IvfChr fidelity drops off dramatically at 20hz...down to 50% efficiency of generating APs. This is described in the legend as "high frequency". Maybe the cart came before the horse in this figure...as it looks like in panel C that using less light power density improves fidelity in the dual opsin configuration with red light stimulation...why not use that power for the characterization? Did you try any higher frequencies? Or longer pulse widths? This is an important characterization to inform further use of the tool. This shortcoming isn't a cell-intrinsic limitation, as the 470nm stim with IVfChr was 100% successful at both 10hz and 20hz.
It is known that red but not blue light pulses induce desensitization (optical fatigue) in red-shifted ChR variants. Indeed, one can reinstate the response to red light by giving violet-blue light pulses (Fig. 4 Suppl 2). We think this is the reason that the 470 nm stimulation was more effective in inducing APs in cells expressing IvfChR. Higher light intensities induce greater desensitization but are preferred for faster opening of channels and depolarization of neurons. This can explain why, in some situations, lower light intensities were more effective in producing APs when pulse trains were used. We have recordings from cells firing APs at 40 Hz (not included). All these cells had high expression levels of the opsin.
(9) Figure 4D: why use 100ms pulse width? How do you know that this isn't causing depol block? Or some of the nefarious concerns that are raised in the discussion, such as "...disrupt[ion of] normal neuronal physiology and signal processing that occurs in millisecond time scale"?
We used 100ms pulse duration to follow the published protocol that this experiment is based on (Lin et al., 2013, Nature Neuroscience).
(10) Figure 4E-bottom: What is the blue peak at light onset? Is the tool driving early activation before silencing?
There seems to be an early, sharp and brief activation by blue light. We do not know the definite cause of this, but we speculate that it is driven by blue-light activation of ZipACR and not the IvfChr portion of the construct. The reason is that such a sharp rise is absent when only IvfChr is expressed (Fig. 4E, upper panel). A soma-targeting motif tethered to channelrhodopsins is known to result in preferential expression of channels close to the soma but does not exclude expression in axonal and dendritic compartments, especially when animals are allowed to recover for a long period of time after viral injection. We believe that ZipACR at axonal terminals, where the chloride concentration is high, can still cause blue-light-evoked depolarization and transmitter release. We observed this phenomenon in two mice in their first trial. The data for individual trials for each mouse are included in a supplementary table.
(11) Figure 4G: Earlier in this same figure (B2, C), 470nm light was more effective at stimulating IvfChr than 635nm light. Is it unexpected that 638nm light would in this in vivo context be more effective at driving IvfChr responses than 450 nm light (at least as reflected by the AUC measurements)? Does this reflect fiber placement and light penetration/scattering?
The spectral peaks of Chrimson-based variants, including vfChrimson, are all centered around 600 nm. With 635 / 638 nm light, the amplitude of the photoresponse declines, the channel onset slows, and the channels suffer greater desensitization. In isolated preparations, where light penetration is similar between 635 / 638 nm and 470 nm, 470 nm responses can outperform 635 / 638 nm responses because of their lack of desensitization and higher consistency. This is also a strong reason why we developed our current approach. In the in vivo preparation shown in Fig. 4D-G, the much higher tissue penetration of 638 nm light, due to reduced absorption and reduced scattering, can offset the better performance of IvfChr with 450 nm light.
(12) In the methods, it is noted that different viral batches appear to generate different levels of neuronal toxicity. If that is the case, how did you differentiate between true differences between constructs vs. differential cell health effects?
For figure 4D-F (whisker movement), we determined virus toxicity using NeuN staining. In slice recordings, we used the electrophysiological property of the neurons to assess their health. For this manuscript, we had one batch of virus that produced toxicity. We did not include any data from this batch.
Reviewer #2 (Recommendations For The Authors):
● Define AUC on first use.
It is now defined.
● Figure 3C2: Please explain how the photocurrents were normalized. As presented, it looks like under strong orange light, the ZipACR has higher photocurrent than the ivfChr.
This is because vfChrimson and other Chrimson-based variants do not fully recover in the dark after 590 nm stimulation. We tested IvfChrimson both with and without a 405 nm reconditioning light pulse, and we consistently reached a greater ‘maximal’ response from the same cell after 405 nm reconditioning (see Fig. 4 Suppl 2). We therefore normalized the response to the maximal recorded response of the cell, which was often achieved with 10 or 20 mW/mm<sup>2</sup> 590 nm stimulation after 405 nm reconditioning. We understand this can be confusing and have now replaced the light-intensity response in Fig. 3C2 with the one obtained with 405 nm reconditioning, which is easier for readers to interpret.
● P. 3: "As expected, blue light pulses induce transient membrane suppression..." Unclear what "suppression" means. Shunting? Hyperpolarization?
We rephrased this to “As expected, blue light pulses transiently suppress APs…”
● P. 3: "illumination at 470 nm and 590 nm wavelengths led to similar amounts of courtship song (110.1 ± 12.8 and 78.5 ± 11.6, n = 16-17, respectively)". What are the units of "courtship song"?
The unit for courtship song is the number of pulses per 10 seconds. This has been clarified in the figure.
● P. 5: The quantification of photocurrent in terms of pA/pF/A.U. is non-standard. I understand the impetus to normalize by expression to give something proportional to per-molecule conductance, but a user cares about overall photocurrent. Please also give the real photocurrents, either pA or pA/pF.
We have provided the real photocurrent in pA or pA/pF where scientifically appropriate. To avoid selection and experimenter bias in our data, we did not set criteria for eliminating data from cells with specific fluorescence intensity or photocurrent amplitude. As a result, responses from the same construct can vary by up to 20-fold in many experiments. We do not believe that averaging absolute photocurrent amplitudes would be justified because of the imbalanced weighting of the results. We do acknowledge that not selecting or eliminating data points introduces higher noise in recordings with smaller responses, but this is preferable to the selection or experimenter bias that would likely be introduced otherwise.
● Please quote illumination intensities wherever possible.
● P. 7: why was the red light crosstalk into Zip(151T) tested at 635 nm instead of 590 nm? Isn't the relevant parameter 590 nm, since that will be used for the excitatory opsin?
In all our characterizations of the constructs using slice electrophysiology recordings, we used 635 nm instead of 590 nm. The reason is that, compared to 590 nm, the photocurrent for Zip(151T) and Zip(151V) is significantly reduced at 635 nm (Fig. 3D1,D2).
● P. 10: "we examined the power at which responses to 470 nm and 635 nm lights induce APs in neurons expressing ZipT-IvfChr, ZipV-IvfChr, or IvfChr", but the preceding sentence says you didn't test the ZipT-IvfChr. This is confusing, please clarify.
The previous paragraph refers to the photocurrent recordings in HEK293 cells, where our fast LED-based illumination system is limited to 590 nm light, whereas the subsequent paragraph refers to the brain slice neuronal recordings. We have now emphasized the difference between the experiments in the revised text.
● Fig. 4B1, top: Why don't the blue traces return to the same baseline after the stimulus epochs?
We observed this shift in baseline (~4 mV more depolarized) in cells expressing IvfChR (or vfChR) only with blue light stimulation. This was also observed in neurons recorded in CA1 (data not shown). There was no such change following red light stimulation (Fig. 4B1). Therefore, this should not affect the applicability of our construct. The original paper introducing vfChR did not test the responses of its constructs to blue light. There could be another photocycle state that is activated more strongly by 470 nm than by 590 nm light and has a slow off-rate, but this is only speculation on our part. It must be noted that we did not observe such a phenomenon in cells expressing ChrimsonR (Fig. 1 Suppl 1C).
● Fig. S3B, right: The two colors are barely distinguishable on the graph. Consider more distinct colors and/or different symbols.
It has been changed accordingly.
● P. 15: "However, we do not recommend the use of orange light pulses, as we observed a significant photocurrent in this wavelength." Not clear what this is referring to. Which construct? Under which circumstances shouldn't one use orange light pulses? Where's the data showing this?
This refers to Fig. 3D1,D2 and Fig. 4 Suppl 2, which show a normalized photocurrent of ~40-50% at 590 nm. The text now cites these figures as the supporting data.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
Audio et al. measured cerebral blood volume (CBV) across cortical areas and layers using high-resolution MRI with contrast agents in non-human primates. While the non-invasive CBV MRI methodology is often used to enhance fMRI sensitivity in NHPs, its application for baseline CBV measurement is rare due to the complexities of susceptibility contrast mechanisms. The authors determined the number of large vessels and the areal and laminar variations of CBV in NHP and compared those with various other metrics.
Strengths:
Non-invasive mapping of relative cerebral blood volume is novel for non-human primates. A key finding was the observation of variations in CBV across regions; primary sensory cortices had high CBV, whereas other higher areas had low CBV. The measured CBV values correlated with previously reported neuronal and receptor densities.
Weaknesses:
A weakness of this manuscript is that the quantification of CBV with postprocessing approaches to remove susceptibility effects from pial and penetrating vessels, as well as orientation dependency, is not fully validated, especially on a laminar scale. Further specific comments follow.
We suspect that the comment regarding the lack of validation at the laminar level stems from an error made by the corresponding author in the original bioRxiv submission (v1, May 17th https://www.biorxiv.org/content/10.1101/2024.05.16.594068v1?versioned=true), where Figure 3, which contains the laminar validation, was lost during PDF conversion. After submission to eLife, this mistake was quickly identified, and a corrected manuscript was re-uploaded to bioRxiv (v2, June 5th, https://doi.org/10.1101/2024.05.16.594068). Although we informed the eLife staff about the update, it appears that the revised manuscript may not have reached reviewer #1 in time. We sincerely apologize for any confusion or inconvenience this may have caused.
(1) Baseline CBV indices were determined using contrast agent-enhanced MRI (deltaR2*). Although this approach is suitable for areal comparisons, its application on a laminar scale has not been validated in the literature or in this study. By comparing with histological vascular information of V1, the authors attempted to validate their approach. However, the generalization of their method is questionable. The main issue is whether the large vessel contribution is minimized by processing approaches properly in various cortical areas (such as clusters 1-3 in Figure 5). It would be beneficial to compare deltaR2* with deltaR2 induced by contrast agents in a few selected slices, as deltaR2 is supposed to be sensitive to microvessels, not macrovessels. Please discuss this issue.
The requested validation is presented in Figure 3F, which compares our deltaR2* measurements with previous invasive estimates of large vessel, capillary and cytochrome oxidase (CO) levels in V1 (Weber et al., 2008; doi.org/10.1093/cercor/bhm259). Our deltaR2* values show a stronger correspondence with microvascularity and CO levels than with large vessels. Moreover, Figure 3D illustrates relative differences between V1 and V2, which closely align with the relative vascular volume differences reported by Zheng et al., 1991. It is important to note that Weber and colleagues averaged across V2-V5 due to similar vascularity across these areas. In our material, we also observed similar vascularity in these areas, though V5 (e.g., MT) has slightly denser vascularity, in agreement with reports of CO staining.
Additionally, we report similar GM/WM vascular density, and high vascular density in primary sensory areas. Unfortunately, available ground-truth data on vascularity does not provide further (general) validation data for laminar vasculature in macaques (such as those in cluster 1-3; Fig. 5). That said, we have provided substantial evidence linking whole-brain vascular measures with variations in neuron (for data distribution, see Supp. Fig. 6F) and receptor densities, which we believe provides strong support for our approach.
We would like to clarify that the authors do not assert that gradient-echo MRI is exclusively sensitive to microvessels and not macrovessels. This is not stated anywhere in the manuscript. If any sentence appears misleading, please let us know, and we will consider revising it. It is well-established that large vessels contribute to ΔR2* (Ogawa et al., 1993; Boxerman et al., 1995), and this is clearly stated in the manuscript (introduction, methods, results and discussion) and demonstrated in Figures 2A, B, and Supp. Figs. 2, 3, and 4. The primary concern, as the reviewer also noted, is whether we have sufficiently minimized the contribution of large vessels in our parcellated data analysis.
At the parcellated level, we used the median value to avoid skewness in the data distribution, which primarily arises from large vessels, as regions near these vessels exhibit higher ΔR2*. The skewness of ΔR2* is also visible in Figure 1F, G. While this approach mitigates the large-versus-small vessel issue, it does not entirely resolve it, as a slight linear increase toward the cortical surface remains (in all parcels). This is likely due to our inability to delineate all penetrating vessels, as shown in Figure 2E, and because contrast agent accumulates toward the superficial layers, where blood originates from and returns to the pial surface. To mitigate this issue, we detrended the parcellated profiles across layers, obtaining results similar to the ground-truth measures of vascularity in V1-V5 and CO histology in V1.
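For readers who want to see the two steps concretely, the following is a minimal sketch of a per-parcel median followed by a linear detrend across layers. It is an illustration under stated assumptions, not the analysis code used in the study: the function names, the array layout (vertices × layers), and the choice to remove only the slope while keeping each parcel's mean are all assumptions made here.

```python
import numpy as np

def parcel_layer_profiles(delta_r2s, parcel_ids):
    """Median value per parcel and layer (hypothetical helper).

    delta_r2s  : array of shape (n_vertices, n_layers)
    parcel_ids : integer array of shape (n_vertices,)
    Returns (labels, profiles), profiles of shape (n_parcels, n_layers).
    """
    delta_r2s = np.asarray(delta_r2s, dtype=float)
    parcel_ids = np.asarray(parcel_ids)
    labels = np.unique(parcel_ids)
    # The median is robust to the long right tail caused by large vessels.
    profiles = np.stack(
        [np.median(delta_r2s[parcel_ids == p], axis=0) for p in labels]
    )
    return labels, profiles

def detrend_across_layers(profiles):
    """Remove a straight-line trend over layer index from each profile.

    Assumption: only the slope is removed, so each parcel keeps its mean.
    """
    profiles = np.asarray(profiles, dtype=float)
    depth = np.arange(profiles.shape[1], dtype=float)
    depth_c = depth - depth.mean()                      # centered layer index
    slope = (profiles @ depth_c) / (depth_c @ depth_c)  # least-squares slope
    return profiles - np.outer(slope, depth_c)
```

Keeping the parcel mean preserves between-area differences while removing the shared superficial-to-deep ramp; a different convention (for example, removing the intercept as well) would change only the offset of each profile.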
(2) High-resolution MRI with a critical sampling frequency estimated from previous studies (Weber 2008, Zheng 1991) was performed to separate penetrating vessels, which is considered one of the major advancements in this study. However, this approach is still insufficient to accurately identify the number of vessels due to the blooming effects of susceptibility and insufficient spatial resolution. There was no detailed description of the detection criteria. More importantly, the number of observable penetrating vessels is dependent on imaging parameters and the dose of the contrast agent. If imaging slices were obtained in parallel to the cortex with higher in-plane resolution, it would likely improve the detection of penetrating vessels. Using higher-field MRI would further enhance the detection of penetrating vessels. Therefore, the reported value is only applicable to the experimental and processing conditions used in this study. Detailed selection criteria should be mentioned, and all potential pitfalls should be discussed.
We believe that Figure 2 represents a significant conceptual and data analysis advancement in the field of vascular imaging. To the best of our knowledge, this is the first MRI study attempting to assess vessel density across cortical layers and compare the number of vessels to the known ground-truth. While we do not claim to have achieved a perfect solution (as shown in Figure 2), we offer a robust challenge to the imaging community by introducing this novel benchmarking approach. Our hope is that this conceptual framework will inspire the MR imaging community to tackle this challenge.
Regarding imaging parameters, TE did not have much effect on our results, with a slight effect observed in the superficial layers due to the presence of large pial vessels (blooming effect; Fig. 2C). This also suggests that similar results could be achieved by changing the contrast agent dose, though there are, of course, CNR requirements and limitations at either end of the spectrum.
We completely agree with the reviewer that spatial resolution is critical in resolving the arterio-venous networks, and we have dedicated significant attention to this topic in the introduction, results and discussion sections. We also agree with the reviewer that if imaging slices were obtained in parallel to the cortex with higher in-plane resolution, it would improve the detection of vessels. However, while this approach is ideal for counting vessels in a single plane and isolated region of cortex, it is less suited to the surface mapping of vessels, which is the focus of our study.
Regarding the exclusion of vessels, based on visual comparison of vessels in volume space, Frangi-filter detection of vessels in volume space, and surface detection of vessels, we found no evidence to develop additional exclusion criteria (Supp. Fig. 3). On the contrary, we identified a number of false negatives in both the surface maps and volume maps. Notable exceptions to this rule seemed to occur at premotor areas F2 and F3 (Matelli et al., 1984; Patterns of cytochrome oxidase activity in the frontal agranular cortex of the macaque monkey). In these regions, we observed peculiar “pockets” of signal drop-out in equivolumetric layers 4-5. It is unclear what these signal voids represent, but it is interesting to note that cortical areas F1-F5 were originally delineated by distinct CO-positive large cells (Matelli et al., 1984).
(3) Attempts to obtain pial vascular structures were made (Figure 2). As mentioned in this manuscript, the blooming effect of susceptibility contrasts is problematic. In the MRI community, T1-based Gd contrast agents have been used for mapping large vasculature, which is a better approach for obtaining pial vascular structures. Alternatively, computer tomography with a blood contrast agent can be used for mapping blood vasculature noninvasively. This issue should be discussed.
We agree with the reviewer that T1-based contrast agents may offer more precise direct localization of large vessels in pial vasculature. However, the primary focus of our study was not on visualizing pial vascular structures, but rather on measuring vascular volume across cortical layers. For this purpose, we opted to use ferumoxytol, which provides superior T2*-contrast and about ten times longer plasma half-life compared to gadolinium. While we anticipated artifacts from the pial network, we developed a novel method to indirectly map these long-distance susceptibility artifacts arising from large vessels onto the cortical surface (Fig. 2A). If the goal were to specifically visualize pial vessels, we applaud the high-resolution TOF angiography developed for direct vessel visualization (Bollmann et al., 2022; https://doi.org/10.7554/eLife.71186).
Changes in text:
“4.1 Methodological considerations - vessel density informed MRI
While the pial vessels can be directly visualized using high-resolution time-of-flight MRI (Bollmann et al., 2022), and computed tomography (Starosolski et al., 2015), imaging of the dense vascularity within the large and highly convoluted primate gray matter presents other formidable challenges. Here, we used a combination of ferumoxytol contrast agent and cortical layer resolution 3D gradient-echo MRI to map cerebrovascular architecture in macaque monkeys. These methods allowed us to indirectly delineate large vessels and indirectly estimate translaminar variations in cortical microvasculature.”
(4) Since baseline R2* is related to baseline R2, vascular volume, iron content, and susceptibility gradients, it is difficult to correlate it with physiological parameters. Baseline R2* is also sensitive to imaging parameters; higher spatial resolution tends to result in lower R2* values (closer to the R2 value). Therefore, baseline R2* findings need to be emphasized.
We agree with the reviewer's comment on the complexity of correlating baseline R2* with vasculature, given its sensitivity to multiple factors such as venous oxygenation, iron content, and imaging parameters such as image resolution. While our study focuses on vascular measurements, one could also highlight iron’s role in brain energy metabolism. Deoxygenated blood affects R2*, iron in oligodendrocytes supports myelination and neuronal signaling, and iron’s role in cytochrome c oxidase during electron transport impacts mitochondrial energy production. These metabolic factors collectively affect baseline R2* and link it to vasculature. Though quantitative susceptibility mapping (QSM) could help differentiate these different factors, it is beyond the scope of this study.
(5) CBV-weighted deltaR2* is correlated with various other metrics (cytoarchitectural parcellation, myelin/receptor density, cortical thickness, CO, cell-type specificity, etc.). While testing the correlation between deltaR2* and these other metrics may be acceptable as an exploratory analysis, it is challenging for readers to discern a causal relationship between them. A critical question is whether CBV-weighted deltaR2* can provide insights into other metrics in diseased or abnormal brain states. If this is the case, then high-resolution deltaR2* will be useful. Please comment on this possibility.
We agree with the reviewer that correlating deltaR2* with other metrics, such as myelin and cortical thickness, receptors and interneuron types, remains exploratory. Establishing causal relationships requires advanced multivariate analysis across cortical layers, but mapping histological stains to cortical layers is still under development. While this exploratory approach is promising, the ability to apply these insights to diseased or abnormal brain states is not yet clear. Layer-specific analysis of vasculature and function in disease is a future goal, and ongoing work aims to expand this line of inquiry. For now, while high-resolution deltaR2* may indeed offer diagnostic potential, we prefer to refrain from overstating its clinical utility at this stage. We agree that multimodal studies integrating neuroanatomy, function, and vascular metrics will be valuable for deeper insights into brain abnormalities.
Changes in text:
“4.3 The vascular network architecture is intricately connected to the neuroanatomical organization within cerebral cortex
…To comprehensively understand the factors contributing to the vascular organization of the brain, experimental disentanglement through multivariate analysis of laminar cell types and receptor densities is needed (Hayashi et al., 2021, Froudist-Walsh et al., 2023).”
(6) There is no discussion about the deltaR2* difference across subcortical areas (Figure 1). This finding is intriguing and warrants a thorough discussion in the context of the cortical findings.
We thank the reviewer for this comment. We have expanded discussion on subcortical structures:
Section 4.3, 1st paragraph:
“In the cerebral cortex, neurons account for a significant portion (≈80-90%) of energy demand, with most of this energy allocated to signaling (≈80%) and maintaining membrane resting potentials (≈20%) (Attwell and Laughlin, 2001; Howarth et al., 2012). Since firing frequency is modulatory and the neural networks utilize distributed coding, the maintenance of resting-state membrane potential determines the minimal energy budget and the lower-limit for cerebral perfusion. Based on neuronal variability and energy dedicated to maintaining surface potential, this suggests an approximate (4 × 20% ≈) 80% variation in CBF and a resultant 25% variation in CBV across the cortex, in line with Grubb's law (CBV = 0.80 × CBF<sup>0.38</sup>) (Grubb et al., 1974). In the cerebellar cortex, neuron density is higher, and the resting potentials are thought to account for more than 50% of energy usage (Howarth et al., 2012), aligning with its higher vascular volume compared to the cerebral cortex (Fig. 1F). However, this is a simplified estimation, and a more comprehensive assessment would need to consider an aggregate of biophysical factors such as…”
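The step from an 80% variation in CBF to a 25% variation in CBV follows from the power-law form quoted above, and can be checked with a short calculation. The sketch below is only an arithmetic illustration; the function names are hypothetical, and the coefficients 0.80 and 0.38 are taken from the relation as quoted.

```python
def grubb_cbv(cbf, scale=0.80, exponent=0.38):
    """CBV predicted from CBF by the power law CBV = scale * CBF**exponent."""
    return scale * cbf ** exponent

def relative_cbv_change(relative_cbf_change, exponent=0.38):
    """Fractional CBV change for a given fractional CBF change.

    The scale factor cancels in the ratio, so only the exponent matters.
    """
    return (1.0 + relative_cbf_change) ** exponent - 1.0

# An 80% increase in CBF corresponds to roughly a 25% increase in CBV.
print(f"{relative_cbv_change(0.80):.2f}")
```

Because the exponent is well below 1, large differences in flow translate into much smaller differences in volume, which is why the expected CBV range across cortex is comparatively narrow.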
Section 4.3, 4th paragraph:
“When viewed in terms of information flow, CBV appears to decrease along the canonical circuit pathway (e.g., L4→L2/3→L5) in the primary visual cortex (Douglas and Martin, 2007) and as one ascends the hierarchy (e.g., V1→V2→V3&4→MT→7A) from primary sensory areas (Fig. 3F, Supp. Fig. 8) (Felleman and Van Essen, 1991; Markov et al., 2014). A similar pattern is observed in the auditory hierarchy, where the inferior colliculus, an early processing hub, exhibits the highest vascular volume, followed by a gradual reduction along cortical auditory ‘where’ and ‘what’ pathways (Fig. 1F, Fig. 3B).”
(7) Figure 3 is missing. Several statements in the manuscript require statistics (e.g., bimodality in Figure 2D, Figure 3F).
We apologize to the reviewer for the absence of Figure 3 in the initial submission.
As for statistical testing of bimodality, we respectfully disagree and feel that this would not add much value to the manuscript. We think a descriptive, rather than rigorous, approach is sufficient in this context.
Reviewer #2 (Public review):
Summary:
This manuscript presents a new approach for non-invasive, MRI-based measurements of cerebral blood volume (CBV). Here, the authors use ferumoxytol, a high-contrast agent, and apply specific sequences to infer CBV. The authors then move to statistically compare measured regional CBV with the known distribution of different types of neurons, markers of metabolic load, and others. While the presented methodology captures an estimated 30% of the vasculature, the authors corroborated previous findings regarding the lack of vascular compartmentalization around functional neuronal units in the primary visual cortex.
Strengths:
Non-invasive methodology geared to map vascular properties in vivo.
Implementation of a highly sensitive approach for measuring blood volume.
Ability to map vascular structural and functional vascular metrics to other types of published data.
Weaknesses:
The key issue here is the underlying assumption about the appropriate spatial sampling frequency needed to capture the architecture of the brain vasculature. Namely, ~7 penetrating vessels / mm<sup>2</sup> as derived from Weber et al 2008 (Cer Cor). The cited work begins by characterizing the spacing of penetrating arteries and ascending veins using a vascular cast of 7 monkeys (Macaca mulatta, same as in the current paper). The ~7 penetrating vessels / mm<sup>2</sup> are computed by dividing the total number of identified vessels by the area imaged. The problem here is that all measurements were made in a "non-volumetric" manner and only in V1. Extrapolating from here to the entire brain seems like an over-assumption, particularly given the region-dependent heterogeneity that the current paper reports.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
- For broader readership, it would be beneficial to provide a guide on how to interpret baseline R2* versus ΔR2*.
The text was edited as follows:
“…For quantitative assessment, R<sub>2</sub>* values were estimated from multi-echo gradient-echo images acquired both before and after the administration of ferumoxytol contrast agent (Table 1). Subsequently, the baseline R<sub>2</sub>* and ΔR<sub>2</sub>*, an indirect proxy measure of CBV (Boxerman et al., 1995), volume maps for each subject were mapped onto the twelve native equivolumetric layers (ELs) (Fig. 1C). Each vertex was then corrected for the orientation of the cortical normal relative to the B<sub>0</sub> direction (Supp. Fig. 1). Surface maps for each subject were registered onto a Mac25Rhesus average surface using cortical curvature landmarks and then averaged across the subjects (Fig. 1D, E). Around cortical midthickness, the distribution of R<sub>2</sub>*, an aggregate measure for ferritin-bound iron, myelin content and venous oxygenation levels (Langkammer et al., 2012), resembled the spatial pattern of ΔR<sub>2</sub>* vascular volume. However, across cortical layers, these measures exhibited reversed patterns: R<sub>2</sub>* increased toward the white matter surface, whereas ΔR<sub>2</sub>* decreased (Fig. 1E, G).”
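To make the distinction between baseline R<sub>2</sub>* and ΔR<sub>2</sub>* concrete for non-MRI readers, the sketch below estimates R<sub>2</sub>* from multi-echo magnitudes and takes the post-minus-pre difference. It assumes a mono-exponential decay fitted log-linearly, which is a common choice but is not stated in the text as the fitting method used; the function names and echo times are hypothetical.

```python
import numpy as np

def fit_r2star(signal, echo_times_s):
    """Log-linear fit of S(TE) = S0 * exp(-TE * R2*).

    signal       : positive magnitudes, shape (..., n_echoes)
    echo_times_s : echo times in seconds, shape (n_echoes,)
    Returns R2* in 1/s with shape (...).
    """
    te = np.asarray(echo_times_s, dtype=float)
    log_s = np.log(np.asarray(signal, dtype=float))
    te_c = te - te.mean()  # centering makes the slope formula intercept-free
    slope = (log_s * te_c).sum(axis=-1) / (te_c @ te_c)
    return -slope

def delta_r2star(pre_signal, post_signal, echo_times_s):
    """Contrast-induced change: R2* after contrast minus baseline R2*."""
    return fit_r2star(post_signal, echo_times_s) - fit_r2star(pre_signal, echo_times_s)
```

In this picture, baseline R<sub>2</sub>* reflects everything that dephases the signal before contrast (iron, myelin, deoxygenated blood), whereas ΔR<sub>2</sub>* isolates the additional decay caused by the intravascular agent and therefore scales with blood volume.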
- The legends in Figure 1 describe green/cyan arrows, which are not visible in the figure itself.
We thank the reviewer for noting this discrepancy. The reference to green/cyan arrows was removed from the Figure 1 legend.
- There are typos in Section 3.3: "(Figure 4A, E)" and "(cluster 3; Figure 3)" should be corrected to Figure 5.
We thank the reviewer for noting this error. The references to the Figures were corrected.
Reviewer #2 (Recommendations for the authors):
The work is elegantly presented and very easy to follow. The figures and the data presented there are compelling and well-organized. I have enjoyed reading the paper, despite my disagreement with the validity of the methodology presented.
Validation against MRA methods (high resolution needed here, Bolan et al 2006, cited also by the authors). Certainly, that work used a much higher magnetic field. This could be done through collaboration if such a magnet is not available. In my humble opinion, the current arguments provided in the paper as validation fall short in convincing future readers. Other TOF approaches might be better suited (in combination with line scanning or single plane sequences) for the 3T used in this work.
We appreciate the reviewer’s suggestion regarding time-of-flight (TOF) angiography at ultra-high magnetic fields, such as 9.4T, for improved visualization of fast-flowing blood in arterial vessels, as elegantly demonstrated in Bolan et al., 2006. However, our focus was on mapping vasculature across cortical layers, and TOF is not optimal for imaging slow capillary blood inflow. To enhance CNR at the capillary level as well, we used the ferumoxytol contrast agent to create quantitative CBV-weighted cortical layer maps (Boxerman et al., 1995).
We are open to collaborative opportunities to revisit this work using ultra-high magnetic field strengths and more detailed neuroanatomical ground-truth measures. However, the recommended line scanning or single-plane sequences, at least on first impression, seem inadequate for whole-brain coverage and cortical surface mapping.
Some of the methodology can be made more accessible to non-MRI readers. For example, a more elaborate explanation of R2* and ΔR2 could benefit future readers.
Elaborated as requested (see above reply).
A more detailed discussion of the limitations of the methodology could also be beneficial here. Explain the potential implications of under-sampling denser vascular areas (i.e. with potentially more than 7 penetrating vessels per mm<sup>2</sup>).
V1, with its highest neuronal density, likely also has the highest feeding/draining vessel density. Based on this, we hypothesized that a 0.23 mm isotropic image resolution would sufficiently capture cortical arterio-venous networks, but we did not achieve the expected detection of 7 penetrating vessels per mm<sup>2</sup>. Consequently, we refrained from quantifying vessel density in other areas, albeit we did report the total vessel count.
This under-sampling likely biases our ΔR2* estimates, skewing them toward larger vessels. To address this, we used median parcel values to avoid over-representing large vessels (the long tail in the distribution of ΔR2* parcel data represents large vessels) and corrected for the cortical surface bias where blood originates from and returns to the pial network. These steps helped mitigate large vessel bias, as described in the methods, results and discussion (see also our response to Reviewer #1, question #1).
To improve clarity for readers, we further clarified:
Methods:
“The effect of blood accumulation in large feeding arteries and draining veins toward the superficial layers was estimated using a linear model and regressed out from the parcellated ΔR<sub>2</sub>* maps.”
Results:
“To mitigate bias resulting from undersampling the large-caliber vessels (Fig. 2A, B), median parcel values were obtained and M132 parcellated ΔR2* profiles were then detrended across ELs in each subject and then averaged.”
Discussion:
“This methodology, however, has known limitations. First, gradient-echo imaging is more sensitized toward large pial vessels running along the cortical surface and large penetrating vessels, which could differentially bias the estimation of ΔR<sub>2</sub>* across cortical layers (Fig. 2A, 2B) (Boxerman et al., 1995; Zhao et al., 2006). Additionally, vessel orientation relative to the B<sub>0</sub> direction introduces strong layer-specific biases in quantitative ΔR<sub>2</sub>* measurements (Supp. Fig. 1C) (Ogawa et al., 1993; Viessmann et al., 2019; Lauwers et al., 2008). To address these concerns, we conducted necessary corrections for B<sub>0</sub>-orientation, obtained parcel median values and regressed out the linear trend, thereby mitigating the effect of undersampling large-caliber vessels across ELs (Fig. 2C, Supp. Fig. 1).”
Please note, we are currently unable to create BALSA links to the figures due to maintenance issues at the data repository. As a result, we have opted to remove the links:
-
-
osf.io osf.io
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1:
(1) You claim transdiagnostic phenotypes are temporally stable -- since they're relatively new constructs, do we know how stable? In what order?
This is an important question. We have added two recent references to support this claim on page 1 and cite these studies in the references on pages 25 and 28:
“Using factor analysis, temporally stable (see Fox et al., 2023a; Sookud, Martin, Gillan, & Wise, 2024), transdiagnostic phenotypes can be extracted from extensive symptom datasets (Wise, Robinson, & Gillan, 2023).”
Fox, C. A., McDonogh, A., Donegan, K. R., Teckentrup, V., Crossen, R. J., Hanlon, A. K., … Gillan, C. M. (2024). Reliable, rapid, and remote measurement of metacognitive bias. Scientific Reports, 14(1), 14941. https://doi.org/10.1038/s41598-024-64900-0
Sookud, S., Martin, I., Gillan, C., & Wise, T. (2024, September 5). Impaired goal-directed planning in transdiagnostic compulsivity is explained by uncertainty about learned task structure. https://doi.org/10.31234/osf.io/zp6vk
More specifically, Sookud and colleagues found the intraclass correlation coefficient (ICC) for both factors to be high after a 3- or 12-month period (ICC<sub>AD_3</sub> = 0.87; ICC<sub>AD_12</sub> = 0.87; ICC<sub>CIT_3</sub> = 0.81; ICC<sub>CIT_12</sub> = 0.76; see Tables S41 and S50 in Sookud et al., 2024).
(2) On hypotheses of the study:
I didn't understand the logic behind the hypothesis relating TDx Compulsivity -> Metacognition -> Reminder-setting
It seems that (a) Compulsivity relates to overconfidence which should predict less reminder-setting
(b) Compulsivity has an impaired link between metacognition and action, breaking the B->C link in the mediation described above in (a). What would this then imply about how Compulsivity is related to reminder-setting?
"In the context of our study, a Metacognitive Control Mechanism would be reflected in a disrupted relationship between confidence levels and their tendency to set reminders." What exactly does this predict - a lack of a correlation between confidence and reminder-setting, specifically in high-compulsive subjects?
"Lastly, there could be a direct link between compulsivity and reminder-usage, independent of any metacognitive influence. We refer to this as the Direct Mechanism." Why, though, would this theoretically be the case?
"We initially hypothesised to find support for the Metacognitive Control Mechanism and that highly compulsive individuals would offload more".
The latter part here, "highly compulsive individuals would offload more" is I think the exact opposite prediction of the Metacognitive control mechanism hypothesis (compulsive individuals offload less). How could you possibly have tried to find support, then, for both?
Is the hypothesis that compulsivity positively predicts reminder setting the "direct mechanism" - if so, please clarify that, and if not, it should be added as a distinct mechanism, and additionally, the direct mechanism should be specified.
There's more delineation of specific hypotheses (8 with caveats) in Methods.
"We furthermore also tested this hypothesis but predicted raw confidence (percentage of circles participants predicted they would remember; H6b and H8b respectively)," What is the reference of "this hypothesis" given that right before this sentence two hypotheses are mentioned? To keep this all organized, it would be good to simply have a table with hypotheses listed clearly.
We agree with the reviewer that there is room to improve the clarity of how our hypotheses are presented. The confusion likely arises from the fact that, since we first planned and preregistered our study, several new pieces of work have emerged, which might have led us to question some of our initial hypotheses. We have taken great care to present the hypotheses as they were preregistered, while also considering the current state of the literature and organizing them in a logical flow to make them more digestible for the reader. We have clarified this point on page 4:
“Back when we preregistered our hypotheses only a limited number of studies about confidence and transdiagnostic CIT were available. This resulted in us hypothesising to find support for the Metacognitive Control Mechanism and that highly compulsive individuals would offload more due to an increased need for checkpoints.”
The biggest improvement we believe comes from our new Table 1, which we have included in the Methods section in response to the reviewer’s suggestion (pp. 21-22):
“We preregistered 8 hypotheses (see Table 1), half of which were sanity checks (H1-H4) aimed to establish whether our task would generally lead to the same patterns as previous studies using a similar task (as reviewed in Gilbert et al., 2023).”
We furthermore foreshadowed more explicitly how we would test the Metacognitive Control Mechanism in the Introduction section on page 4, as requested by the reviewer:
“In the context of our study, a Metacognitive Control Mechanism would be reflected in a disrupted relationship between confidence levels and their tendency to set reminders (i.e., the interaction between the bias to be over- or underconfident and transdiagnostic CIT in a regression model predicting a bias to set reminders).”
To avoid any confusion regarding the term ‘direct’ in the ‘Direct Mechanism’, we now explicitly clarify on page 4 that it refers to any non-metacognitive influences. Additionally, we had already emphasized in the Discussion section the need for future studies to specify these influences more directly.
Page 4: “We refer to this as the Direct Mechanism and it constitutes any possible influences that affect reminder setting in highly-compulsive CIT participants outside of metacognitive mechanisms, such as perfectionism and the wish to control the task without external aids.”
The reviewer was correct in pointing out that, in the Methods section, we incorrectly referred to ‘this hypothesis’ when we actually meant both of the previously mentioned hypotheses. We have corrected this on page 23:
“We furthermore also tested these hypotheses but predicted raw confidence (percentage of circles participants predicted they would remember; H6b and H8b respectively), as well as extending the main model with the scores from the cognitive ability test (ICAR5) as an additional covariate (H6c and H8c respectively).”
Finally, upon revisiting our Results section, we noticed that we had not made it sufficiently clear that hypothesis H6a was preregistered as non-directional. We have now clarified this on page 9:
“We predicted that the metacognitive bias would correlate negatively with AD (Hypothesis 8a; more anxious-depressed individuals tend to be underconfident). For CIT, we preregistered a non-directional, significant link with metacognitive bias (Hypothesis H6a). We found support for both hypotheses, both for AD, β = -0.22, SE = 0.04, t = -5.00, p < 0.001, as well as CIT, β = 0.15, SE = 0.05, t = 3.30, p = 0.001, controlling for age, gender, and educational attainment (Figure 3; see also Table S1). Note that for CIT this effect was positive, more compulsive individuals tend to be overconfident.”
(3) You say special circles are red, blue, or pink. Then, in the figure, the colors are cyan, orange, and magenta. These should be homogenized.
Apologies, this was not clear on our screens. We have corrected this now but used the labels “blue”, “orange” and “magenta” as our shade of blue is much darker than cyan:
Page 16: “These circles flashed in a colour (blue, orange, or magenta) when they first appear on screen before fading to yellow.”
(4) The task is not clearly described with respect to forced choice. From my understanding, "forced choice" was implicitly delivered by a "computer choosing for them". You should indicate in the graphic that this is what forced choice means in the graphic and description more clearly.
This is an excellent point. On pages 17 and 18 we now include a slightly changed Figure 6, which includes improved table row names and cell shading to indicate the choice people gave. Hopefully this clarifies what “forced choice” means.
(5) If I have point (4) right, then a potential issue arises in your design. Namely, if a participant has a bias to use or not use reminders, they will experience more or less prediction errors during their forced choice. This kind of prediction error could introduce different mood impacts on subsequent performance, altering their accuracy. This will have an asymmetric effect on the different forced phases (ie forced reminders or not). For this reason, I think it would be worthwhile to run a version of the experiment, if feasible, where you simply remove choice prior to revealing the condition. For example, have a block of choices where people can "see how well you do with reminders" -- this removes expectation and PE effects.
[See also this point from the weaknesses listed in the public comments:]
Although I think this design and study are very helpful for the field, I felt that a feature of the design might reduce the task's sensitivity to measuring dispositional tendencies to engage cognitive offloading. In particular, the design introduces prediction errors, that could induce learning and interfere with natural tendencies to deploy reminder-setting behavior. These PEs comprise whether a given selected strategy will be or not be allowed to be engaged. We know individuals with compulsivity can learn even when instructed not to learn (e.g., Sharp, Dolan, and Eldar, 2021, Psychological Medicine), and that more generally, they have trouble with structure knowledge (eg Seow et al; Fradkin et al), and thus might be sensitive to these PEs. Thus, a dispositional tendency to set reminders might be differentially impacted for those with compulsivity after an NPE, where they want to set a reminder, but aren't allowed to. After such an NPE, they may avoid more so the tendency to set reminders. Those with compulsivity likely have superstitious beliefs about how checking behaviors leads to a resolution of catastrophes, which might in part originate from inferring structure in the presence of noise or from purely irrelevant sources of information for a given decision problem.
It would be good to know if such learning effects exist if they're modulated by PE (you can imagine PEs are higher if you are more incentivized - e.g., 9 points as opposed to only 3 points - to use reminders, and you are told you cannot use them), and if this learning effect confounds the relationship between compulsivity and reminder-setting.
We would like to thank the reviewer for providing this interesting perspective on our task. If we understand correctly, the situation most at risk for such effects occurs when participants choose to use a reminder. Not receiving a reminder in the following trial can be seen as a negative prediction error (PE), whereas receiving one would represent the control condition (zero PE). Therefore, we focused on these two conditions in our analysis.
We indeed found that participants had a slightly higher tendency to choose reminders again after trials where they successfully requested them compared to after trials where they were not allowed reminders (difference = 4.4%). This effect was statistically significant, t(465) = 2.3, p = 0.024. However, it is important to note that other studies from our lab have reported a general, non-specific response ‘stickiness,’ where participants often simply repeat the same strategy in the next trial (Scarampi & Gilbert, 2020), which could have contributed to this pattern.
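As a minimal sketch of how such a within-participant comparison could be computed, the snippet below derives a paired t statistic as the mean difference divided by its standard error, with n - 1 degrees of freedom. The function name and the five example values are hypothetical and are not our data; with the 466 participants included in the analysis, the degrees of freedom would be 465, as in the test reported above.

```python
import math
import statistics

def paired_t(after_granted, after_denied):
    """Return (t, df) for a paired t-test on two equal-length lists."""
    if len(after_granted) != len(after_denied):
        raise ValueError("lists must have the same length")
    # One difference score per participant
    diffs = [g - d for g, d in zip(after_granted, after_denied)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample SD / sqrt(n))
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / se, n - 1

# Hypothetical per-participant proportions of reminder choices
# after a granted request vs. after a denied request
granted = [0.60, 0.55, 0.70, 0.50, 0.65]
denied = [0.50, 0.55, 0.60, 0.45, 0.60]
t, df = paired_t(granted, denied)
print(f"t({df}) = {t:.2f}")
```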
When we used CIT to predict this effect in a simple linear regression model, we did not find a significant effect (β = -0.05, SE = 0.05, t = -1.13, p = 0.26).
To further investigate this and potentially uncover an effect masked by the influence of the points participants could win in a given trial, we re-ran the model using a logistic mixed-effects regression model. This model predicted the upcoming trial’s choice (reminder or no reminder) from the presence of a negative prediction error in the current trial (dummy variable), the z-transformed number of points on offer, and the z-transformed CIT score (between-subject covariate), as well as the interaction of CIT and negative PE. In this model, we replicated the previous ‘stickiness’ effect, with a negative influence of a negative PE on the upcoming choice, β = -0.24, SE = 0.07, z = -3.44, p < 0.001. In other words, when a negative PE was encountered in the current trial, participants were less likely to choose reminders in the next trial. Additionally, there was a significant negative influence of points offered on the upcoming choice, β = -0.28, SE = 0.03, z = -8.82, p < 0.001. While this might seem counterintuitive, it could be due to a contrast effect: after being offered high rewards with reminders, participants might be deterred from using the reminder strategy in consecutive trials where lower rewards are likely to be offered, simply due to the bounded reward scale. CIT showed a small negative effect on upcoming reminder choice, β = -0.06, SE = 0.04, z = -1.69, p = 0.09, indicating that participants scoring higher on the CIT factor tended to be less likely to choose reminders, thus replicating one of the central findings of our study. It is unclear why this effect was not statistically significant, but this is likely due to the limited data on which the model was based (see below). Finally, and most importantly, the interaction between the current trial’s condition (negative PE or zero PE) and CIT was not significant, contrary to the reviewer’s hypothesis, β = 0.04, SE = 0.07, z = 0.57, p = 0.57.
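The trial-level structure of this model can be sketched as follows. This only illustrates how the lagged predictors might be assembled, not the model fit itself, which requires a dedicated mixed-model package. The dictionary keys, function names and example trials are hypothetical, and we assume here that the points entering the model are those on offer in the current trial.

```python
import statistics

def zscore(values):
    """Standardise values to mean 0 and sample SD 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def build_rows(trials):
    """Build one row per trial on which a reminder was requested.

    `trials` is one participant's ordered list of dicts with the
    hypothetical keys 'chose_reminder', 'got_reminder' and 'points'.
    """
    rows = []
    for current, upcoming in zip(trials, trials[1:]):
        if not current["chose_reminder"]:
            continue  # only requested-reminder trials enter the analysis
        rows.append({
            # 1 = reminder requested but denied (negative PE), 0 = granted
            "negative_pe": int(not current["got_reminder"]),
            "points": current["points"],
            # Outcome: did the participant request a reminder on the next trial?
            "next_choice": int(upcoming["chose_reminder"]),
        })
    return rows

# Hypothetical trial sequence for a single participant
trials = [
    {"chose_reminder": True, "got_reminder": False, "points": 7},
    {"chose_reminder": False, "got_reminder": False, "points": 4},
    {"chose_reminder": True, "got_reminder": True, "points": 8},
    {"chose_reminder": True, "got_reminder": True, "points": 3},
]
rows = build_rows(trials)
points_z = zscore([row["points"] for row in rows])
for row, z in zip(rows, points_z):
    row["points_z"] = z
print(rows)
```

In a full analysis the z-transformation would be applied across the whole data set rather than within one participant, and the z-transformed CIT score would be added as a between-subject covariate together with its interaction with the negative-PE dummy.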
It should also be noted that this exploratory analysis is based on a limited number of data points: on average, participants had 2.5 trials (min = 0; max = 4) with a negative PE and 6.7 trials (min = 0; max = 12) with zero PE. There were more zero-PE trials because, to maximise the number of trials included in this analysis, we included each participant’s 8 choice-only trials, on which the participant always got what they requested (the trial then ended prematurely). Because not all cells in the analysed design were filled, only 466 out of 600 participants could be included in the analysis. This may have caused the fit of the mixed model to be singular.
In summary, given that these results are based on a limited number of data points, some models did not fit without issues, and no evidence was found to support the hypotheses, we suggest not including this exploratory analysis in the manuscript. However, if we have misunderstood the reviewer and should conduct a different analysis, we are happy to reconsider.
Unfortunately, conducting an additional study without the forced-choice element is not feasible, as this would create imbalances in trial numbers for the design. The advantage of the current, condensed task is the result of several careful pilot studies that have optimized the task’s psychometric properties.
Scarampi, C., & Gilbert, S. J. (2020). The effect of recent reminder setting on subsequent strategy and performance in a prospective memory task. Memory, 28(5), 677–691. https://doi.org/10.1080/09658211.2020.1764974
(6) One can imagine that a process goes on in this task where a person must estimate their own efficacy in each condition. Thus, individuals with more forced-choice experience prior to choosing for themselves might have more informed choice. Presumably, this is handled by your large N and randomization, but could be worth looking into.
We would like to thank the reviewer for pointing this out, as we had not previously considered this aspect of our task. However, we believe it is not the experience with forced trials per se, but rather the frequency with which participants experience both strategies (reminder vs. no reminder), that could influence their ability to make more informed choices. To address this, we calculated the proportion of reminder trials during the first half of the task (excluding choice-only trials, where the reminder strategy was not actually experienced). We hypothesized that the absolute distance of this ‘informedness’ parameter should correlate positively with the absolute reminder bias at the end of the task, with participants who experienced both conditions equally by the midpoint of the task being less biased towards or away from reminders. However, this was not the case, r = 0.05, p = 0.21.
Given the lengthy and complex nature of our preregistered analysis, we prefer not to include this exploratory analysis in the manuscript.
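For illustration, the ‘informedness’ check described above could be sketched as below. We assume here that the absolute distance is taken from an even split of 0.5 between reminder and no-reminder trials. The function names and all example values are hypothetical.

```python
import math

def informedness_distance(first_half_used_reminder):
    """Absolute distance of the reminder-trial proportion from an even split (0.5)."""
    proportion = sum(first_half_used_reminder) / len(first_half_used_reminder)
    return abs(proportion - 0.5)

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical first-half trials for three participants (1 = reminder trial)
first_halves = [[1, 0, 1, 0], [1, 1, 1, 0], [0, 0, 0, 0]]
distances = [informedness_distance(trials) for trials in first_halves]

# Hypothetical absolute reminder bias at the end of the task
abs_bias = [0.5, 1.0, 2.0]
r = pearson_r(distances, abs_bias)
print(distances, round(r, 2))
```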
(7) Is the Actual indifference calculated from all choices? I believe so, given they don't know only till after their choice whether it's forced or not, but good to make this clear.
Indeed, we use all available choice data to calculate the AIP. We now make this clear in two places in the main text:
Page 5: “The ‘actual indifference point’ was the point at which they were actually indifferent, based on all of their decisions.”
Page 6: “Please note that all choices were used to calculate the AIP, as participants only found out whether or not they would use a reminder after the decision was made.”
(8) Related to 7, I believe this implies that the objective and actual indifference points are not entirely independent, given the latter contains the former.
Yes, the OIP and AIP were indeed calculated in part from events that happened within the same trials. However, since these events are non-overlapping (e.g., the choice from trial 6 contributes to the AIP but the accuracy measured several seconds later from that trial contributes to the OIP) and since our design dictates whether or not reminders can be used on those trials in question (by randomly assigning them to the forced internal/forced external condition) this could not induce circularity.
(9) I thought perfectionism might be a trait that could explain findings and it was nice to see convergence in thinking once I reached the conclusion. Along these lines, I was thinking that perhaps perfectionism has a curvilinear relationship with compulsivity (this is an intuition I'm not sure if it's backed up empirically). If it's really perfectionism, do you see that, at the extreme end of compulsivity, there's more reminder-setting? Ie did you try to model this relationship using a nonlinear function? You might get clues simply by visual inspection.
It is interesting to note that the reviewer reached a similar interpretation of our results. We considered this question during our analysis and conducted an additional exploratory analysis to examine how CIT quantile relates to reminder bias (see Author response image 1). Each circle reflects a participant. As shown, no clear nonlinearities are evident, which challenges this interpretation. We believe that adding this to the already lengthy manuscript may not be necessary, but we are of course happy to reconsider if Reviewer 1 disagrees.
Author response image 1.
(10) [From the weaknesses listed in the public comments.] A more subtle point, I think this study can be more said to be an exploration than a deductive test of a particular model -> hypothesis > experiment. Typically, when we test a hypothesis, we contrast it with competing models. Here, the tests were two-sided because multiple models, with mutually exclusive predictions (over-use or under-use of reminders) were tested. Moreover, it's unclear exactly how to make sense of what is called the direct mechanism, which is supported by partial (as opposed to complete) mediation.
The reviewer’s observation is accurate; some aspects of our study did take on a more exploratory nature, despite having preregistered hypotheses. This was partly due to the novelty of our research questions. We appreciate this feedback and will use it to refine our approach in future studies, aiming for more deductive testing.
Reviewer #2:
(1) Regarding the lack of relationship between AD and reminder setting, this result is in line with a recent study by Mohr et al (2023:https://osf.io/preprints/psyarxiv/vc7ye) investigating relationships between the same transdiagnostic symptom dimensions, confidence bias and another confidence-related behaviour: information seeking. Despite showing trial-by-trial under-confidence on a perceptual decision task, participants high in AD did not seek information any more than low AD participants. Hence, the under-confidence in AD had no knock-on effect on downstream information-seeking behaviour. I think it is interesting that converging evidence from your study and the Mohr et al (2023) study suggest that high AD participants do not use the opportunity to increase their confidence (i.e., through reminder setting or information seeking). This may be because they do not believe that doing so will be effective or because they lack the motivation (i.e., through anhedonia and/or apathy) to do so.
This is indeed an interesting parallel and we would like to thank the reviewer for pointing out this recently published study, which we unfortunately have missed. We included it in the Discussion section, extending our sub-section on the missing downstream effects of the AD factor, as well as listing it in the references on page 27.
Page 14: “Our findings align with those reported in a recent study by Mohr, Ince, and Benwell (2024). The authors observed that while high-AD participants were underconfident in a perceptual task, this underconfidence did not lead to increased information-seeking behaviour. Future research should explore whether this is due to their pessimism regarding the effectiveness of confidence-modulated strategies (i.e., setting reminders or seeking information) or whether it stems from apathy. Another possibility is that the relevant downstream effects of anxiety were not measured in our study and instead may lie in reminder-checking behaviours.”
Mohr, G., Ince, R.A.A. & Benwell, C.S.Y. Information search under uncertainty across transdiagnostic psychopathology and healthy ageing. Transl Psychiatry 14, 353 (2024). https://doi.org/10.1038/s41398-024-03065-w
(2) Fox et al 2023 are cited twice at the same point in the second paragraph of the intro. Not sure if this is a typo or if these are two separate studies?
Those are indeed two different studies and should have been formatted as such. We have corrected this mistake in the following places and furthermore also corrected one of the references as the study has recently been published:
P. 2 (top): “Previous research links transdiagnostic compulsivity to impairments in metacognition, defined as thinking about one’s own thoughts, encompassing a broad spectrum of self-reflective signals, such as feelings of confidence (e.g., Rouault, Seow, Gillan & Fleming, 2018; Seow & Gillan, 2020; Benwell, Mohr, Wallberg, Kouadio, & Ince, 2022; Fox et al., 2023a; Fox et al., 2023b; Hoven, Luigjes, Denys, Rouault, van Holst, 2023a).”
P. 2 (bottom): “More specifically, individuals characterized by transdiagnostic compulsivity have been consistently found to exhibit overconfidence (Rouault, Seow, Gillan & Fleming, 2018; Seow & Gillan, 2020; Benwell, Mohr, Wallberg, Kouadio, & Ince, 2022; Fox et al., 2023a; Fox et al., 2023b; Hoven et al., 2023a).”
P. 4: “Prior evidence exists for overconfidence in compulsivity (Rouault et al., 2018; Seow & Gillan, 2020; Benwell et al., 2022; Fox et al., 2023a; Fox et al., 2023b; Hoven et al., 2023a), which would therefore result in fewer reminders.”
P. 23: “Though we did not preregister a direction for this effect, in the light of recent findings it has now become clear that compulsivity would most likely be linked to overconfidence (Rouault et al., 2018; Seow & Gillan, 2020; Benwell et al., 2022; Fox et al., 2023a; Fox et al., 2023b; Hoven et al., 2023a).”
P. 24: “Fox, C. A., Lee, C. T., Hanlon, A. K., Seow, T. X. F., Lynch, K., Harty, S., … Gillan, C. M. (2023a). An observational treatment study of metacognition in anxious-depression. ELife, 12, 1–17. https://doi.org/10.7554/eLife.87193”
P. 24: “Fox, C. A., McDonogh, A., Donegan, K. R., Teckentrup, V., Crossen, R. J., Hanlon, A. K., … Gillan, C. M. (2024). Reliable, rapid, and remote measurement of metacognitive bias. Scientific Reports, 14(1), 14941. https://doi.org/10.1038/s41598-024-64900-0”
(3) Typo in the Figure 1 caption: "The preregistered exclusion criteria for the for the accuracies with....".
Thank you so much for pointing this out. We have changed the sentence in the caption of Figure 1 to read “The preregistered exclusion criteria for the accuracies with or without reminder are indicated as horizontal dotted lines (10% and 70% respectively).”
Typo in the Figure 5 caption: "Standardised regression coefficients are given for each pat".
Thank you so much for pointing this out to us. We have corrected the typo, and the sentence in the caption of Figure 5 now reads “Standardised regression coefficients are given for each path.”
[From the weaknesses listed in the public comments.] Participants only performed a single task so it remains unclear if the observed effects would generalise to reminder-setting in other cognitive domains.
We appreciate the reviewer’s concern regarding the use of a single cognitive task in our study, which is indeed a common limitation in many cognitive neuroscience studies. The cognitive factors underlying offloading decisions are still under active debate. Notably, a previous study found that intention fulfilment in an earlier version of our task correlates with real-world behaviour, lending validity to our paradigm by linking it to realistic outcomes (Gilbert, 2015). Additionally, recent unpublished work (Grinschgl, 2024) has shown a correlation between offloading across two lab tasks, though a null effect was reported in another study with a smaller sample size by the same team (Meyerhoff et al., 2021), likely due to insufficient power. In summary, we agree that future research should replicate these findings with alternative tasks to enhance robustness.
Gilbert, S. J. (2015). Strategic offloading of delayed intentions into the external environment. Quarterly Journal of Experimental Psychology, 68(5), 971–992. https://doi.org/10.1080/17470218.2014.972963
Grinschgl, S. (2024). Cognitive Offloading in the lab and in daily life. 2nd Cognitive Offloading Meeting. [Talk]
Meyerhoff, H. S., Grinschgl, S., Papenmeier, F., & Gilbert, S. J. (2021). Individual differences in cognitive offloading: a comparison of intention offloading, pattern copy, and short-term memory capacity. Cognitive Research: Principles and Implications, 6(1), 34. https://doi.org/10.1186/s41235-021-00298-x
(6) [From the weaknesses listed in the public comments.] The sample consisted of participants recruited from the general population. Future studies should investigate whether the effects observed extend to individuals with the highest levels of symptoms (including clinical samples).
We agree that transdiagnostic research should ideally include clinical samples to determine, for instance, whether the subclinical variation commonly studied in transdiagnostic work differs qualitatively from clinical presentations. However, this approach poses challenges, as transdiagnostic studies typically require large sample sizes, and recruiting clinical participants can be more difficult. With advancements in online sampling platforms, such as Prolific, achieving better availability and targeting may make this more feasible in the future. We intend to monitor these developments closely and contribute to such studies whenever possible.
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Strengths:
Overall there are some very interesting results that make an important contribution to the field. Notably, the results seem to point to differential recruitment of the PL-DMS pathway in goal-tracking vs sign-tracking behaviors.
Thank you.
Weaknesses:
There is a lot of missing information and data that should be reported/presented to allow a complete understanding of the findings and what was done. The writing of the manuscript was mostly quite clear, however, there are some specific leaps in logic that require more elaboration, and the focus at the start and end on cholinergic neurons and Parkinson's disease are, at the moment, confusing and require more justification.
In the revised paper, we provide additional graphs and information in support of results, and we further clarify procedures and findings. Furthermore, we expanded the description of the proposed interpretational framework that suggests that the contrasts between the cortical-striatal processing of movement cues in sign- versus goal trackers are related to previously established contrasts between the capacity for the cortical cholinergic detection of attention-demanding cues.
Reviewer #2 (Public review):
Strengths:
The power of the sign- and goal-tracking model to account for neurobiological and behavioral variability is critically important to the field's understanding of the heterogeneity of the brain in health and disease. The approach and methodology are sound in their contribution to this important effort.
The authors establish behavioral differences, measure a neurobiological correlate of relevance, and then manipulate that correlate in a broader circuitry and show a causal role in behavior that is consistent with neurobiological measurements and phenotypic differences.
Sophisticated analyses provide a compelling description of the authors' observations.
Thank you.
Weaknesses:
It is challenging to assess what is considered the "n" in each analysis (trial, session, rat, trace (averaged across a session or single trial)). Representative glutamate traces (n = 5 traces (out of hundreds of recorded traces)) are used to illustrate a central finding, while more conventional trial-averaged population activity traces are not presented or analyzed. The latter would provide much-needed support for the reported findings and conclusions. Digging deeper into the methods, results, and figure legends, provides some answers to the reader, but much can be done to clarify what each data point represents and, in particular, how each rat contributes to a reported finding (ie. single trial-averaged trace per session for multiple sessions, or dozens of single traces across multiple sessions).
Representative traces should in theory be consistent with population averages within phenotype, and if not, discussion of such inconsistencies would enrich the conclusions drawn from the study. In particular, population traces of the phasic cue response in GT may resemble the representative peak examples, while smaller irregular peaks of ST may be missed in a population average (averaged prolonged elevation) and could serve as a rationale for more sophisticated analyses of peak probability presented subsequently.
We have added two new Tables to clarify the number of rats per phenotype and sex used for each experiment described in the paper (Table 1), and the number of glutamate traces (range, median and total number) extracted for each analysis of performance-associated glutamate levels and the impact of CNO-mediated inhibition of fronto-striatal glutamate (Table 3).
As the timing of glutamate peaks varies between individual traces and subjects, relative to turn and stop cue onset or reward delivery, subject- and trial-averaged glutamate traces would “wash out” the essential findings of phenotype- and task event-dependent patterns of glutamate peaks. In the detailed responses to the reviewers, we illustrate the results of an analysis of averaged traces to substantiate this view. Furthermore, as detailed in the section on statistical methods, and as mentioned by the reviewer under Strengths, we used advanced statistical methods to ensure that data from individual animals contribute equally to the overall result, and to minimize the possibility that an inordinate number of trials obtained from just one or a couple of rats biased the overall analysis.
Reviewer #3 (Public review):
Strengths:
Overall these studies are interesting and are of general relevance to a number of research questions in neurology and psychiatry. The assessment of the intersection of individual differences in cue-related learning strategies with movement-related questions - in this case, cued turning behavior - is an interesting and understudied question. The link between this work and growing notions of corticostriatal control of action selection makes it timely.
Thank you.
Weaknesses:
The clarity of the manuscript could be improved in several places, including in the graphical visualization of data. It is sometimes difficult to interpret the glutamate results, as presented, in the context of specific behavior, for example.
We appreciate the reviewer’s concerns about the complexity of some of the graphics, particularly the results from the arguably innovative analysis illustrated in Figure 6. Figure 6 illustrates that the likelihood of a cued turn can be predicted based on single and combined glutamate peak characteristics. The revised legend for this figure provides additional information and examples to ease the readers’ access to this figure. In addition, as already mentioned above, we have added several graphs to further illustrate our findings.
(Recommendations for the authors)
Reviewer #1 (Recommendations for the authors):
(1) The differences in behavioral phenotype according to vendor (Figure 1c) are slightly concerning, could the authors please elaborate on why they believe this difference is? Are there any other differences in these stocks- i.e. weight, appearance, other types of behaviors?
Differences in PCA behavior across vendors or specific breeding colonies were documented previously and may reflect the impact of environmental, developmental and genetic factors (references added in the revised manuscript). We included animals from both vendors to increase phenotypic variability and due to animal procurement constraints during COVID-related restrictions.
(2) Possibly related to the above, the rats in Figure 1a and Figure 2 are different strains. Please clarify.
In the revised legend of Figure 2 we clarify that the rat shown in the photographs is a Long-Evans rat that was not part of the experiments described in this paper. This rat was used to generate these photos as the black-spotted fur provided better contrast against the white treadmill belt.
(3) Figure 3c, the pairwise comparison showing a significant increase from Day 1 to Day 3 is hard to understand unless this is a lasting change. Is this increase preserved at Day 4? Examination of either a linear trend across days or a simple comparison of either Day 1 & 2 against Day 3 & 4 or, minimally Day 1 against Day 4 would communicate this message. Otherwise, there doesn't seem to be much of a case for improvement across test sessions, which would also be fine in my view.
As the analysis of post-criterion performance also revealed an effect of DAY, we felt compelled to report and illustrate the results of pairwise comparisons in Fig. 3c. In agreement with the reviewer’s point, we did not further comment on this finding in the manuscript.
(4) Figure 4e. I find it extremely unlikely that every included electrode was located exactly at anterior 0.5mm. Please indicate the range - most anterior and most posterior of the included electrodes in the study.
The schematic section shown in Fig. 4e depicted a single AP level, onto which all placements were collapsed. As detailed in Methods, electrode placements needed to be within the following stereotaxic space: AP: -0.3 to 0.6 mm, ML: 2 to 2.5 mm, and DV: -4.2 to -5 mm. To clarify this issue, the text in Results and the figure legend were modified, and the 0.5 mm label was removed from Fig. 4e.
(5) The paper generally is quite data light and there are a lot of extra results reported that aren't shown in the figures. There are 17 instances of the phrase "not shown", some are certainly justified, but a lot of results are missing…
We followed the reviewer’s suggestion and added several graphs. The revised Figure 5 includes the new graph 5d that shows the number of glutamate traces with just 1, 2 or 3 peaks occurring during cue presentation period. Likewise, the revised Figure 7 includes the new graph 7h that shows the number of glutamate traces with just 1, 2 or 3 peaks following the administration of CNO or its vehicle. In both cases, we also revised the analysis of peak number data, by counting the number of cases (or traces) with just 1, 2 or 3 peaks and using Chi-squared tests to determine the impact of phenotype and, in the latter case, of CNO. In addition, the revised Figure 7 now includes a graph showing the main effects of phenotype and CNO in reward delivery-locked glutamate maximum peak concentrations (Fig. 7k). In revising these sections, we also removed the prior statement about glutamate current rise times as this isolated observation had no impact on subsequent analyses or the discussion.
Concerning the reviewer’s point 5d (DMS eGFP transfection correlations Figure 8), the manuscript clarifies that the absence of such a correlation was expected given that eGFP expression in the DMS does not accurately reproduce the prelimbic-DMS projection space that was inhibited by CNO. In contrast, the correlations between the efficacy of CNO and DREADD expression measures in prelimbic cortex were significant and are graphed (Figs. 8g and 8j).
(6) Please clarify the exact number of animals in each experiment. The caption of Figure 3 seems to suggest there are 29 GTs and 22 STs in the initial experiment, but the caption of Figure 5b seems to suggest there are N=30 total rats being analyzed (leaving 21 un-accounted for), or is this just the number of GTs (meaning there is one extra)?
We have added Table 1 to clarify the number of animals used across different experiments and stages. Additionally, we have included a new Table 3 that identifies, for each graph showing results from the analyses of glutamate concentrations, the number of rats from which recordings were obtained and the number of traces per rat (range, median, and total).
(7) Relatedly, in Figures 5c-f and Figures 7g-i, the data seem to be analyzed by trial rather than subject-averaged, please clarify and what is the justification for this?
As detailed in Experimental design and statistical analyses, we employed linear mixed-effects modeling to analyze the amperometric data that generated Figures 5 and 7 to minimize the risk of bias due to an excessive number of trials obtained from specific rats. LMMs were chosen to analyze these repeated (non-independent) data to address issues that may be present with subject-averaged data. For clarity, throughout the results for these figures, the numerator in the F-ratio reflects the degrees of freedom from the fixed effects (phenotype/sex) and the denominator reflects the error term influenced by the number of subjects and the within-subject variance.
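The concern about rats that contribute many trials can be made concrete with a toy example. The sketch below is not a linear mixed-effects model; it only contrasts a pooled trial-level mean with a mean of per-rat means, using made-up numbers, to show how one heavily sampled rat can dominate a pooled estimate. All values and names are hypothetical.

```python
from statistics import mean

# Hypothetical peak concentrations (µM) per rat; rat "A" contributes far more trials.
trials_by_rat = {
    "A": [6.0] * 27,  # heavily sampled rat with high values
    "B": [2.0] * 3,
    "C": [2.0] * 3,
}

def pooled_mean(data):
    """Mean over all trials, ignoring which rat each trial came from."""
    return mean(value for values in data.values() for value in values)

def mean_of_subject_means(data):
    """Mean of per-rat means, so every rat carries equal weight."""
    return mean(mean(values) for values in data.values())

print(f"pooled over trials: {pooled_mean(trials_by_rat):.2f}")
print(f"mean of rat means:  {mean_of_subject_means(trials_by_rat):.2f}")
```

A model with a random subject intercept generally falls between these two extremes, because it uses every trial while still attributing part of the variance to the rat.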
Concerning the illustration and analysis of trial- or subject-averaged glutamate traces please see reviewer 2, point 1 and the graph in that section. Within a response bin, such as the 2-s period following turn cues, glutamate peaks – as defined in Methods - occur at variable times relative to cue onset. Averaging traces over a population of rats or trials would “wash-out” the phenotype- and task event-dependent patterns of glutamate concentration peaks, yielding, for example, a single, nearly 2-s long plateau for cue-locked glutamate recordings from STs (see Figure 5b versus the graph shown in response to reviewer 2, point 1).
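The "wash-out" argument can be sketched with simulated traces. In the example below, every trace contains exactly one peak of the same height, but the peak time varies from trace to trace; the bin-by-bin average then shows a low, broad elevation rather than a peak. Peak height, peak shape, trace count, and the uniform jitter are all assumptions chosen for illustration and do not come from the study.

```python
import random

N_BINS = 10    # 2-s cue period sampled at 5 Hz (200-ms bins)
PEAK_UM = 5.0  # hypothetical peak height in µM

def single_peak_trace(peak_bin):
    """Trace with one triangular peak centered on peak_bin, zero elsewhere."""
    trace = [0.0] * N_BINS
    trace[peak_bin] = PEAK_UM
    for neighbor in (peak_bin - 1, peak_bin + 1):
        if 0 <= neighbor < N_BINS:
            trace[neighbor] = PEAK_UM / 2
    return trace

rng = random.Random(0)  # fixed seed so the example is reproducible
traces = [single_peak_trace(rng.randrange(N_BINS)) for _ in range(100)]

# Bin-by-bin average across traces, as in a conventional population trace.
averaged = [sum(t[i] for t in traces) / len(traces) for i in range(N_BINS)]

mean_of_maxima = sum(max(t) for t in traces) / len(traces)
print("mean of single-trace maxima:", mean_of_maxima)
print("maximum of averaged trace:  ", round(max(averaged), 2))
```

The per-trace maximum is preserved when peak features are extracted trace by trace, which is the reason given above for analyzing peak characteristics instead of averaged traces.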
(8) Likewise on page 22, the number of animals from which these trials were taken should be stated "The characteristics of glutamate traces (maximum peak concentration, number of peaks, and time to peak) were extracted from 548 recordings of turn cue trials, 364 of which yielded a turn (GTs: 206, STs: 158) and 184 a miss (GTs: 112, STs: 72).".
The number of animals is now included in the text and listed in Table 3.
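For readers who want to check the trial counts quoted in point (8), the short sketch below tabulates them and confirms that the subtotals add up. The proportions it prints describe only the subset of trials with glutamate recordings; they are simple arithmetic on the quoted counts and are not a result reported in the manuscript.

```python
# Trial counts quoted in point (8): turn-cue trials with glutamate recordings.
counts = {
    "GT": {"turn": 206, "miss": 112},
    "ST": {"turn": 158, "miss": 72},
}

total_turns = sum(c["turn"] for c in counts.values())
total_misses = sum(c["miss"] for c in counts.values())
total_trials = total_turns + total_misses

def turn_proportion(phenotype):
    """Fraction of recorded turn-cue trials that ended in a turn."""
    c = counts[phenotype]
    return c["turn"] / (c["turn"] + c["miss"])

for phenotype in counts:
    print(f"{phenotype}: {turn_proportion(phenotype):.3f}")
print("totals:", total_turns, total_misses, total_trials)
```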
(9) The control group for Figure 7 given the mCherry fluorophore - given the known off-target effects of CNO, this is a very important control. Minimally, this data should be shown, but it is troubling that the ST group has n=2, I don't really understand how any sort of sensible stats can be conducted with a group this size, and obviously it's too small to find any significant differences if they were there.
As discussed on p. 14-15 in the manuscript under the section Clozapine N-Oxide, the conversion rate of CNO to clozapine suggests that approximately 50-100 times the dose of clozapine (compared to our 5.0 mg/kg CNO dosage) would be required to produce effects on rodent behavior (references on p. 14-15).
Regarding evidence from control rats expressing the empty construct, the revised manuscript clarifies that no effects of CNO on cued turns were found in 5 GTs expressing the empty control vector. Although CNO had no effects in STs expressing the DREADD, we also tested the effects of CNO in 2 STs expressing the empty control vector (individual turn rates following vehicle and CNO are reported for these 2 STs). Moreover, we extracted turn cue-locked glutamate traces (vehicle: 18 traces; CNO: 16 traces) from an empty vector-expressing GT and found that administration of CNO reduced neither maximum glutamate peak concentrations nor the proportion of traces with just one peak. The absence of effects of CNO on cued turning performance and on turn cue-locked glutamate dynamics is consistent with prior studies showing no effects of 5.0 mg/kg CNO in rats not expressing the DREADD vector (references in manuscript).
(10) Figure 8b - the green circle indicated by 1 is definitely not the DMS, this is the DLS, and animals with virus placement in this region should be excluded.
The reviewer of course is correct and that exactly was the point of that illustration, as such a transfection space would have received the lowest possible rating (as indicated by the “1” in the green space). Fig. 8b was intended to illustrate expression efficacy ratings and does not indicate actual viral transfection spaces. Because the results described in the manuscript did not include data from a brain with a striatal transfection space as was illustrated in green in the original Fig. 8b, we removed that illustration of an off-target transfection space.
(11) Figure 8j, the correlation specifically counts double-labeled PL hM4Di + eGFP neurons. Separating dual-labeled cells from all mCherry-labeled cells seems very strange given the nature of the viral approach. There seems to be an assumption that there are some neurons that express the mCherry-hM4Di that don't also have the AAV-Cre (eGFP). Obviously, if that were true this poses a huge problem for your viral approach and would mean that you're inhibiting a non-selective population of neurons. More likely, the AAV-Cre (eGFP) is present in all of your mCherry-hM4Di cells, just not at levels visible without GFP antibody amplification. Ideally, staining should be done to show that all cells with mCherry also have eGFP, but minimally this correlation should include all cells expressing mCherry with the assumption that they must also have the AAV-Cre.
As noted on page 15 in the Visualization and Quantification of eGFP/mCherry-Expressing Neurons section, eGFP expression in our viral approach was notably bright and did not necessitate signal enhancement. Furthermore, given the topographic organization of prelimbic-DMS projections on the one hand, and the variable transfection spaces in cortex and striatum on the other hand, the speculation that AAV-Cre may have been present in all mCherry cells is without basis. Second, there certainly are mCherry-positive cells that do not also express the retrogradely transported AAV-Cre, and that therefore were not affected by CNO. Third, the entire point of this dual vector strategy was to selectively inhibit prelimbic-striatal projections, and the strong correlation between double-labeled neuron numbers and cued turn scores substantiates the usefulness of this approach.
(12) Discussion, a bit more interpretation of the results would be good. Specifically - does the PL-DMS inhibition convert GTs to STs? There were several instances where the behavior and glutamate signals seemed to be pushed to look like STs but also a lot of missing data so it is hard to say. One would assume this kind of thing if, as I think is being said (please clarify), the ST phenotype is being driven by glutamatergic drive either locally or from sources other than PL cell bodies, presumably silencing the PL cell body inputs in GTs also leaves other glutamatergic inputs as the primary sources?
We agree with the reviewer that one could say, perhaps somewhat colloquially, that PL-DMS inhibition turns GTs to STs, in terms of turning performance and associated glutamate peak dynamics. The newly added data graphs are consistent with this notion. However, there are of course numerous other neurobiological characteristics which differ between GTs and STs and are revealed in the context of other behavioral or physiological functions. In the Discussion, and as noted by the reviewer, we discuss alternative sources of glutamatergic control in STs and the functional implications of bottom-up mechanisms. In the revised manuscript, we have updated references and made minor revisions to improve this perspective.
(13) I found the abstract really detailed and very dense, it is pretty hard to understand in its current form for someone who hasn't yet read the paper. At this level, I would recommend more emphasis on what the results mean rather than listing the specific findings, given that the task is still quite opaque to the reader.
We revised the abstract, in part by deleting two rather dense but non-essential statements of results and by adding a more accessible conclusion statement.
(14) There are a lot of abbreviations: CTTT, PD, PCA, GT, ST, MEA, GO, LMM, EMMs, PL, DMS. Some of these are only mentioned a few times: MEA, LMM, and EMMs are all mentioned less than 5 times. To reduce mental load for the reader, you could spell these ones out, or include a table somewhere with all of the abbreviations.
We added a list of Abbreviations and Acronyms and eliminated abbreviations that were used infrequently.
(15) Generally, the logic that cortico-striatal connections contribute to GT vs ST seems easy to justify, however, the provided justification is missing a line of connection: "As such biases of GTs and STs were previously shown to be mediated in part via contrasting cholinergic capacities for the detection of cues (Paolone et al., 2013; Koshy Cherian et al., 2017; Pitchers et al., 2017a; Pitchers et al., 2017b), we hypothesized that contrasts in the cortico-striatal processing of movement cues contribute to the expression of these opponent biases." Please elaborate on why specifically cholinergic involvement suggests corticostriatal involvement. I think there are probably more direct reasons for the current hypothesis.
Done – see p. 4-5.
(16) Along the same line, paragraph 3 of the intro about Parkinson's disease and cholinergics seems slightly out of place. This is because the specific or hypothesized link between these things and corticostriatal glutamate has not been made clear. Consider streamlining the message specifically to corticostriatal projections in the context of the function you are investigating.
Done – see p. 4-5.
(17) Page 8, paragraph 2. There is a heading or preceding sentence missing from the start of this paragraph: "Contrary to the acclimation training phase, during which experimenters manually controlled the treadmill, this phase was controlled entirely by custom scripts using Med-PC software and interface (MedAssociates).".
Revised and clarified.
(18) Page 13 "We utilized a pathway-specific dual-vector chemogenetic strategy (e.g., Sherafat et al., 2020) to selectively inhibit the activity of fronto-cortical projections to the DMS". The Hart et al (2018) reference seems more appropriate being both the same pathway and viral combination approach.
Yes, thank you, we’ve updated the citation.
(19) Pages 20-21: "Maximum glutamate peak concentrations recorded during the cue period were significantly higher in GTs than in STs (phenotype: F(1,28.85)= 8.85, P=0.006, ηp 2=0.23; Fig. 5c). In contrast, maximum peak amplitudes locked to other task events all were significantly higher in STs." The wording here is misleading, both Figures 5c and 5d report glutamate peaks during the turn cue, the difference is what the animal does. So, it should be something like "Maximum glutamate peak concentrations recorded during the cue period were significantly higher in GTs than in STs when the animal correctly made a turn (stats) but this pattern reversed on missed trials when the animal failed to turn (stats)..." or something similar.
Yes, thank you. We have revised this section accordingly.
(20) Same paragraph: "Contingency tables were used to compare phenotype and outcome-specific proportions and to compute the probability for turns in GTs relative to STs." What is an outcome-specific proportion?
This has been clarified.
(21) Page 22 typo: "GTs were only 0.74 times as likely as GTs to turn".
Fixed.
(22) The hypothesis for the DREADDs experiment isn't made clear enough. Page 23 "In contrast, in STs, more slowly rising, multiple glutamate release events, as well as the presence of relatively greater reward delivery-locked glutamate release, may have reflected the impact of intra-striatal circuitry and ascending, including dopaminergic, inputs on the excitability of glutamatergic terminals of corticostriatal projections" As far as I can understand, the claim seems to be that glutamate release might be locally modulated in the case of ST, on account of the profile of glutamate release- more slowly rising, multiple events, and reward-locked. Please clarify why these properties would preferentially suggest local modulation.
We have revised and expanded this section to clarify the basis for this hypothesis.
(23) The subheadings for the section related to Figure 7 "CNO disrupts..." "CNO attenuates..." presumably you mean fronto-striatal inhibition disrupts/attenuates. As it stands, it reads like the CNO per se is having these effects, off-target.
Fixed.
(24) The comparison of the results in the discussion against a "hypothetical" results section had the animals not been phenotyped behaviorally is unnecessary and overly speculative, given that 30-40% of rats don't fall into either of these two categories. I think the point here is to emphasize the importance of taking phenotype into account. This point can surely be made directly in its own sentence, probably somewhere towards the end of the discussion).
We have partly followed the reviewer’s advice and separated the discussion of the hypothetical results from the summary of main findings. However, we did not move this discussion toward the end of the Discussion section as we believe that it justifies the guiding focus of the discussion on the impact of phenotype.
(25) The discussion, like the introduction, talks a lot about cholinergic activity. As noted, this link is unclear - particularly how it links with the present results, please clarify or remove. Likewise high-frequency oscillations.
We have revised relevant sections in the Introduction (see above) and Discussion sections. However, given the considerable literature indicating contrasts between the cortical cholinergic-attentional capacities of GTs and STs, the interpretation of the current findings in that larger context is justified.
(26) Typo DSM in the discussion x 2.
Thanks, fixed.
Reviewer #2 (Recommendations for the authors):
(1) As mentioned in the Public Review, it is challenging to assess what is considered the "n" in each analysis, particularly for the glutamate signal analysis (trial, session, rat, trace (averaged across session or single trial)). Representative glutamate traces are used to illustrate a central finding, while more conventional trial-averaged population activity traces are not presented or analyzed. For example, n = 5 traces, out of hundreds of recorded traces, with each rat contributing 1-27 traces across multiple sessions suggests ~1-2% of the data are shown as time-resolved traces. Representative traces should in theory be consistent with population averages within phenotype, and if not, discussion of such inconsistencies would enrich the conclusions drawn from the study. In particular, population traces of the phasic cue response in GT may resemble the representative peak examples, while smaller irregular peaks of ST may be missed in a population average (averaged prolonged elevation in signal) and could serve as rationale for more sophisticated analyses of peak probability presented subsequently (and relevant to opening paragraph of discussion where hypothetical data rationale is presented).
We have added the new Table 1 to provide a complete account of the number of rats, per phenotype and sex, for each component of the experiments. In addition, the new Table 3 provides the range, median and total number of glutamate traces that were analyzed and formed the foundation of the individual data graphs depicting the results of glutamate concentration analyses.
We chose not to present trial- or subject-averaged traces, as glutamate peaks occur at variable times relative to the onset of turn and stop cues and reward delivery, and therefore averaging across a population of rats or trials would obscure phenotype- and task event-dependent patterns of glutamate peaks. The attached graph (Author response image 1) serves to illustrate this issue. The graph shows turn cue-locked glutamate concentrations (M, SD) from trials that yielded turns, averaged over all traces used for the analysis of the data shown in Fig. 5d (see also Table 3, top row). Because of the variability of peak times, trial- and subject-averaging of traces from STs yielded a nearly 2-s long elevated plateau of glutamate concentrations (red triangles), contrasting with the presence of single and multiple peaks in STs as illustrated in Figs. 5b and 5e. Furthermore, averaging of traces from GTs obscured the presence of primarily single turn cue-locked peaks. Because of the relatively large variances of averaged data points, again reflecting the variability of peak times, analysis of glutamate levels during the cue period did not indicate an effect of phenotype (F(1,190)=1.65, P=0.16). Together, subject- or trial-averaged traces would not convey the glutamate dynamics that form the essence of the amperometric findings obtained from our study. We recognize, as inferred by the reviewer, that smaller irregular peaks in STs may have been missed given the definition of a glutamate peak (see Methods). It is in part for that reason that we conducted a prospective analysis of the probability for turns given a combination of peak characteristics (maximum peak concentration and peak numbers; Fig. 6).
(2) To this latter point, the relationship between the likelihood to turn and the size of glutamate peak is focused on the GT phenotype, which limits understanding of how smaller multiple peaks relate to variables of interest in ST (missed turns, stops, reward). If it were possible to determine the likelihood for each phenotype, without a direct contrast of one phenotype relative to the other, this would be a more straightforward description of how signal frequency and amplitude relate to relevant behaviors in each group. Depending on the results, this could be done in addition to or instead of the current analysis in Figure 6.
We considered the reviewer’s suggestion but could not see how attempts to analyze the role of maximum glutamate concentrations and number of peaks within a single phenotype would provide any significant insights beyond the current description of results. Moreover, as stressed in the 2nd paragraph of the Discussion (see Reviewer 1, point 24), the removal of the phenotype comparison would nearly completely abolish the relationships between glutamate dynamics and behavior from the current data set.
Author response image 1.
(3) If Figure 6 is kept, a point made in the text is that GT is 1.002x more likely than ST to turn at a given magnitude of Glu signal. 1.002 x more likely is easily (perhaps mistakenly) interpreted as nearly identical likelihood. Looking closely at the data, perhaps what is meant is @ >4uM the difference between top-line labeled {b} and bottom-line labeled {d,e} is 1.002? If not, there may be a better way to describe the difference as 1x could be interpreted as the same/similar.
Concerning the potential for misinterpretation, the original manuscript stated (key phrase marked here in red font): Comparing the relative turn probabilities at maximum peak concentrations >4 µM, GTs were 1.002 times more likely (or nearly exactly twice as likely) as STs to turn if the number of cue-evoked glutamate peaks was limited to one (rhombi in Fig. 6a) when compared to the presence of 2 or 3 peaks (triangles in Fig. 6a). However, we appreciate the reviewer’s concern about the complexity of this statement and, as it merely re-emphasized a result already described, it was deleted.
(4) For Figure 7e, the phenotype x day interaction is reported, but posthocs are looking within phenotype (GT) at treatment effects. Is there a phenotype x day x treatment, or simply phenotype x treatment (day collapsed) to justify within-group treatment posthocs?
We have revised the analysis and illustration of the data shown in Figs 7e and 7f, by averaging the test scores from the two tests, per animal, of the effects of vehicle and CNO, to be able to conduct a simpler 2-way analysis of the effects of phenotype and treatment.
(5) Ideally, viral control is included as a factor in this analysis as well. The separate analysis for viral controls was likely done due to low n, however negative findings from an ANOVA in which an n=2 (ST) should be interpreted with extreme caution. The authors already have treatment control (veh, CNO) and may consider dropping the viral controls completely due to the lack of power to perform appropriate analyses.
This issue has been clarified – see reviewer 1, point 9.
Minor:
(1) In the task description, it could be clearer how reward delivery relates to turns and stops. For example, does the turn cue indicate the rat will be rewarded at the port behind it? Does the stop cue indicate that the rat will be rewarded at the port in front of it? This makes logical sense, but the current text does not describe the task in this way, instead focusing on what is the correct action (seemingly but unlikely independent of reinforcement).
We have updated the task description in Methods and the legend of Figure 2 to indicate the location of reward delivery following turns and stops.
(2) For the peak analysis, what is the bin size for determining peaks? It is indicated that the value before and after the peak is >1 SD below the peak value, so it is helpful to know the temporal bin resolution for this definition.
As detailed on p 11-12 under Amperometry Data Processing and Analysis of Glutamate Peaks, we analyzed glutamate concentrations recorded at a frequency of 5 Hz (200 ms bins) throughout the 2-second-long presentation of turn and stop cues and for a 2-second period following reward delivery.
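To make the temporal resolution concrete, the sketch below applies a simple peak rule to a trace sampled at 5 Hz. It assumes that a peak is a sample whose two immediate neighbors are both more than 1 SD lower, with the SD taken over the whole trace; this is one reading of the criterion paraphrased by the reviewer, and the definition in the manuscript's Methods may differ. The trace values and the function name are hypothetical.

```python
from statistics import pstdev

SAMPLE_RATE_HZ = 5               # from the response: 5 Hz sampling
BIN_MS = 1000 // SAMPLE_RATE_HZ  # 200-ms bins

def find_peaks(trace):
    """Return indices of samples whose two neighbors are both > 1 SD lower.

    Assumption: SD is the population SD of the whole trace; the study's
    Methods may define the reference SD differently.
    """
    sd = pstdev(trace)
    peaks = []
    for i in range(1, len(trace) - 1):
        if trace[i] - trace[i - 1] > sd and trace[i] - trace[i + 1] > sd:
            peaks.append(i)
    return peaks

# Hypothetical 2-s trace (10 samples, µM); not data from the study.
trace = [0.5, 0.6, 4.2, 0.7, 0.5, 0.6, 3.9, 0.8, 0.5, 0.6]
peak_bins = find_peaks(trace)
peak_times_ms = [i * BIN_MS for i in peak_bins]  # bin onset relative to cue
print(peak_bins, peak_times_ms)
```

With 200-ms bins, two peaks can be resolved only if at least one lower sample separates them, so the shortest detectable inter-peak interval under this rule is 400 ms.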
(3) Long Evans rats are pictured in Figure 2 (presumably contrast with a white background is better here), while SD rats are pictured in Figure 1. Perhaps stating why LE rats are pictured would help clear up any ambiguity about the strains used, as a quick look gives the impression two strains are used in two different tasks.
Yes, see reviewer 1, point 2.
(4) In Figure 7e, the ST and GT difference in turns/turn cue does not seem to replicate prior findings for tracking differences for this measure (Figure 3b). ST from the chemogenetic cohort seems to perform better than rats whose behavior was examined prior to glutamate sensor insertion. What accounts for this difference? Training and testing conditions/parameters?
The reviewer is correct. The absence of a significant difference between vehicle-treated GTs and vehicle-treated STs in Fig. 7e reflects a relatively lower turn rate in GTs than was seen in the analysis of baseline behavior (Fig. 3b; note the different ordinates of the two figures, needed to show the impact of CNO in Fig. 7e). Notably, the data in Fig. 7e are based on fewer rats (12 versus 29 GTs and 10 versus 22 STs; Table 1) and on rats which at this point had undergone additional surgeries to infuse the DREADD construct and implant electrode arrays. We can only speculate that these surgeries had greater detrimental effects in GTs, perhaps consistent with evidence suggesting that immune challenges trigger a relatively greater activation of their innate immune system (Carmen et al., 2023). We acknowledged this issue in the revised Results.
(5) The authors are encouraged to revise for grammar (are vs. is, sentence ending with a preposition, "not only" clause standing alone) and word choice (i.e. in introduction: insert, import, auditorily). Consider revising the opening sentence on page 5 for clarity.
We have revised the entire text to improve grammar and word choice.
(6) Do PD fallers refer to rats or humans? if the latter, this may be a somewhat stigmatizing word choice.
We have replaced such phrases using more neutral descriptions, such as referring to people with PD who frequently experience falls.
(7) Page 27 What does "non-instrumental" behavior mean?
We have re-phrased this statement without using this term.
(8) The opening paragraph of the discussion is focused on comparing reported results (with phenotype as a factor) to a hypothetical description of results (without phenotype as a factor) that were not presented in the results section. There is one reference to a correlation analysis on collapsed data, but otherwise, no reporting of data over all rats without phenotype as a factor. If this is a main focus, including these analyses in the results would be warranted. If this is only a minor point leading to discussion, authors could consider omitting the hypothetical comparison.
We have revised this section - see reviewer 1 point 24.
Reviewer #3 (Recommendations for the authors):
(1) These are really interesting studies. I think there are issues in data presentation/analysis that make it difficult to parse what exactly is happening in the glutamate signals, and when. Overall the paper is just a bit of a difficult read. A generally standard approach for showing neural recording data of many kinds, including, for example, subject-averaged traces, peri-event histograms, heatmaps, etc summarizing and quantifying the results - would be helpful. Beyond the examples in Figure 5, I would suggest including averaged traces of the glutamate signals and quantification of those traces.
We have addressed these issues in multiple ways, see the response to several points of reviewers 1 and 2, particularly reviewer 2, point 1.
(2) Figure 6 (and the description in the response letter) is also very non-intuitive. It's unclear how the examples shown relate to the reported significance indicators/labels/colors etc in the figure. I would suggest rethinking this figure overall, and if there is a more direct quantitative way to connect signal features with behavior. Again, drawing from standard visualization approaches for neural data could be one approach.
See also reviewer 2 points 1 and 3. Furthermore, we have revised the text in Results and the legend to improve the accessibility of Fig. 6.
(3) As far as I can tell, all of the glutamate sensor conclusions reflect analysis collapsed across 100s of trials. Do any of the patterns hold for a subjects-wise analysis? How variable are individual subjects?
We employed linear mixed-effects model analyses and added a random subject intercept to account for subject variability outside fixed effects (phenotype and treatment). The variance of the intercept ranged from 0.01 to 1.71 SEM across outcomes (cued turns/cued stops/misses). See also reviewer 1, point 7 and reviewer 2, point 1.
-
-
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
This paper investigates the effects of the explicit recognition of statistical structure and sleep consolidation on the transfer of learned structure to novel stimuli. The results show a striking dissociation in transfer ability between explicit and implicit learning of structure, finding that only explicit learners transfer structure immediately. Implicit learners, on the other hand, show an intriguing immediate structural interference effect (better learning of novel structure) followed by successful transfer only after a period of sleep.
Strengths:
This paper is very well written and motivated, and the data are presented clearly with a logical flow. There are several replications and control experiments and analyses that make the pattern of results very compelling. The results are novel and intriguing, providing important constraints on theories of consolidation. The discussion of relevant literature is thorough. In summary, this work makes an exciting and important contribution to the literature.
Weaknesses:
There have been several recent papers that have identified issues with alternative forced choice (AFC) tests as a method of assessing statistical learning (e.g. Isbilen et al. 2020, Cognitive Science). A key argument is that while statistical learning is typically implicit, AFC involves explicit deliberation and therefore does not match the learning process well. The use of AFC in this study thus leaves open the question of whether the AFC measure benefits the explicit learners in particular, given the congruence between knowledge and testing format, and whether, more generally, the results would have been different had the method of assessing generalization been implicit. Prior work has shown that explicit and implicit measures of statistical learning do not always produce the same results (eg. Kiai & Melloni, 2021, bioRxiv; Liu et al. 2023, Cognition).
We agree that numerous papers in the Statistical Learning literature discuss how different test measures can lead to different results and, in principle, using a different measure could have led to varying results in our study. In addition, we believe there are numerous additional factors relevant to this issue, including the dichotomous vs. continuous nature of implicit vs. explicit learning and the complexity of the interactions between the (degree of) explicitness of the participants' knowledge and the applied test method, that transcend a simple labeling of tests as implicit or explicit and that strongly constrain the type of variations the results of different tests would produce. Therefore, running the same experiments with different learning measures in future studies could provide additional interesting data with potentially different results.
However, the most important aspect of our reply concerning the reviewer's comment is that although quantitative differences between the learning rate of explicit and implicit learners are reported in our study, they are not of central importance to our interpretations. What is central are the different qualitative patterns of performance shown by the explicit and the implicit learners, i.e., the opposite directions of learning differences for “novel” and “same” structure pairs, which are seen in comparisons within the explicit group vs. within the implicit group and in the reported interaction. Following the reviewer's concern, any advantage an explicit participant might have in responding to 2AFC trials using “novel” structure pairs should also be present in the replies of 2AFC trials using the “same” structure pairs and this effect, at best, could modulate the overall magnitude of the across groups (Expl/Impl.) effect but not the relative magnitudes within one group. Therefore, we see no parsimonious reason to believe that any additional interaction between the explicitness level of participants and the chosen test type would impede our results and their interpretation.
Given that the explicit/implicit classification was based on an exit survey, it is unclear when participants who are labeled "explicit" gained that explicit knowledge. This might have occurred during or after either of the sessions, which could impact the interpretation of the effects.
We agree that this is a shortcoming of the current design, and obtaining the information about participants’ learning immediately after Phase 1 would have been preferred. However, we made this choice deliberately as the disadvantage of assessing the level of learning at the end of the experiment is far less damaging than the alternative of exposing the participants to the exit survey question earlier and thereby letting them achieve explicitness or influence their mindset otherwise through contemplating the survey questions before Phase 2. Our Experiment 5 shows how realistic this danger of unwanted influence is: with a single sentence alluding to pairs in the instructions of Exp 5, we could completely change participants' quantitative performance and qualitative response pattern. Unfortunately, there is no implicit assessment of explicitness we could use in our experimental setup. We also note that given the cumulative nature of statistical learning, we expect that the effect of using an exit survey for this assessment only shifts absolute magnitudes (i.e. the fraction of people who would fall into the explicit vs. implicit groups) but not aspects of the results that would influence our conclusions.
Reviewer #2 (Public Review):
Summary:
Sleep has not only been shown to support the strengthening of memory traces but also their transformation. A special form of such transformation is the abstraction of general rules from the presentation of individual exemplars. The current work used large online experiments with hundreds of participants to shed further light on this question. In the training phase, participants saw composite items (scenes) that were made up of pairs of spatially coupled (i.e., they were next to each other) abstract shapes. In the initial training, they saw scenes made up of six horizontally structured pairs, and in the second training phase, which took place after a retention phase (2 min awake, 12 h incl. sleep, 12 h only wake, 24 h incl. sleep), they saw pairs that were horizontally or vertically coupled. After the second training phase, a two-alternative forced-choice (2-AFC) paradigm, where participants had to identify true pairs versus randomly assembled foils, was used to measure the performance of all pairs. Finally, participants were asked five questions to identify whether they had insight into the pair structure, and post-hoc groups were assigned based on this. Mainly, the authors find that participants in the 2-minute retention experiment without explicit knowledge of the task structure were at chance level performance for the same structure in the second training phase, but had above chance performance for the vertical structure. The opposite was true for both sleep conditions. In the 12 h wake condition, these participants showed no ability to discriminate the pairs from the second training phase at all.
Strengths:
All in all, the study was performed to a high standard and the sample size in the implicit condition was large enough to draw robust conclusions. The authors make several important statistical comparisons and also report an interesting resampling approach. There is also a lot of supplemental data regarding robustness.
Weaknesses:
My main concern regards the small sample size in the explicit group and the lack of experimental control.
The sample sizes of the explicit participants in our experiments are, indeed, much smaller than those of the implicit participants due to the process by which we obtain the members of the two groups. However, the sample sizes of the explicit groups are not small compared to typical experiments reported in Visual Statistical Learning studies; rather, they tend to be average to large. It is the sizes of the implicit subgroups that are unusually high due to the aforementioned data collection process. Moreover, the explicit subgroups have significantly larger effect sizes than the implicit subgroups, bolstering the achieved power. This is also confirmed by the reported Bayes Factors, which support the “effect” or the “no effect” conclusions in the various tests with values ranging from substantial to very strong. Based on these statistical measures, we think the sample sizes of the explicit participants in our studies are adequate.
As for the lack of experimental control, indeed, we could not fully randomize consolidation condition assignment. Instead, the assignment was a product of when the study was made available on the online platform Prolific. This method could, in theory, lead to an unobserved covariate, such as morningness, being unbalanced between conditions. We do not have any reasons to believe that such a condition would critically alter the effects reported in our study, but as it follows from the nature of unobserved variables, we obviously cannot state this with certainty. Therefore, we added an explicit discussion of these potential pitfalls in the revised version of the manuscript.
Reviewer #3 (Public Review):
In this project, Garber and Fiser examined how the structure of incidentally learned regularities influences subsequent learning of regularities, that either have the same structure or a different one. Over a series of six online experiments, it was found that the structure (spatial arrangement) of the first set of regularities affected the learning of the second set, indicating that it has indeed been abstracted away from the specific items that have been learned. The effect was found to depend on the explicitness of the original learning: Participants who noticed regularities in the stimuli were better at learning subsequent regularities of the same structure than of a different one. On the other hand, participants whose learning was only implicit had an opposite pattern: they were better in learning regularities of a novel structure than of the same one. This opposite effect was reversed and came to match the pattern of the explicit group when an overnight sleep separated the first and second learning phases, suggesting that the abstraction and transfer in the implicit case were aided by memory consolidation.
These results are interesting and can bridge several open gaps between different areas of study in learning and memory. However, I feel that a few issues in the manuscript need addressing for the results to be completely convincing:
(1) The reported studies have a wonderful and complex design. The complexity is warranted, as it aims to address several questions at once, and the data is robust enough to support such an endeavor. However, this work would benefit from more statistical rigor. First, the authors base their results on multiple t-tests conducted on different variables in the data. Analysis of a complex design should begin with a large model incorporating all variables of interest. Only then, significant findings would warrant further follow-up investigation into simple effects (e.g., first find an interaction effect between group and novelty, and only then dive into what drives that interaction). Furthermore, regardless of the statistical strategy used, a correction for multiple comparisons is needed here. Otherwise, it is hard to be convinced that none of these effects are spurious. Last, there is considerable variation in sample size between experiments. As the authors have conducted a power analysis, it would be good to report that information per each experiment, so readers know what power to expect in each.
Answering the questions we were interested in required us to investigate two related but separate types of effects within our data: general above-chance performance in learning, and within- and across-group differences.
Above-chance performance: As is typical in SL studies, we needed to assess whether learning happened at all and which types of items were learned. For this, a comparison to the chance level is crucial and, therefore, the one-sample t-test is the statistical test of choice. Note that all our t-tests were subject to experiment-wise correction for multiple comparisons using the Holm-Bonferroni procedure, as reported in the Supplementary Materials.
Within- and across-group differences: To obtain our results regarding group and pair-type differences and their interactions, we used mixed ANOVAs and appropriate post-hoc tests as the reviewer suggested. These results are reported in the method section.
Concerning power analysis, in the revised version of the manuscript we added analysis of achieved power for the statistical tests most critical to our arguments.
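The Holm-Bonferroni procedure mentioned in this reply can be sketched as follows. This is a minimal illustration of the general step-down method, not the authors' analysis code; the function name `holm_bonferroni` and the example p-values are hypothetical and chosen only for demonstration.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # Step-down threshold: alpha / m for the smallest p-value,
        # alpha / (m - 1) for the next one, and so on.
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            # Once one test is retained, all tests with larger p-values are too.
            break
    return reject


# Hypothetical p-values from four tests within one experiment.
example_p = [0.01, 0.04, 0.03, 0.005]
decisions = holm_bonferroni(example_p)
```

With these illustrative values, the two smallest p-values pass their thresholds, while 0.03 exceeds 0.05 / 2, so it and every larger p-value are retained.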
(2) Some methodological details in this manuscript I found murky, which makes it hard to interpret results. For example, the secondary results section of Exp1 (under Methods) states that phase 2 foils for one structure were made of items of the other structure. This is an important detail, as it may make testing in phase 2 easier, and tie learning of one structure to the other. As a result, the authors infer a "consistency effect", and only 8 test trials are said to be used in all subsequent analyses of all experiments. I found the details, interpretation, and decision in this paragraph to lack sufficient detail, justification, and visibility. I could not find either of these important design and analysis decisions reflected in the main text of the manuscript or in the design figure. I would also expect to see a report of results when using all the data as originally planned.
We thank the reviewer for pointing out these critical open questions in our manuscript that need further clarification. The inferred “consistency effect” is based on patterns found in the data, which show an increase in negative correlation between test types during the test phase. As this is apparently an effect of the design of the test phase and not an effect of the training phase, which we were interested in, we decided to minimize this effect as far as possible by focusing on the early test trials. For the revised version of the manuscript, we revamped and expanded the discussion of how this issue was handled and also added a short comment in the main text, mentioning the use of only a subset of test trials and pointing the interested reader to the details.
Similarly, the matched sample analysis is a great addition, but details are missing. Most importantly, it was not clear to me why the same matching method should be used for all experiments instead of choosing the best matching subgroup (regardless of how it was arrived at), and why the nearest-neighbor method with replacement was chosen, as it is not evident from the numbers in Supplementary Table 1 that it was indeed the best-performing method overall. Such omissions hinder interpreting the work.
Since our approach provided four different balanced metrics (see Supp. Tables 1-4) for each matching method, it is not completely straightforward to make a principled decision across the methods. In addition, selecting the best method for each experiment separately carries the suspicion of cherry-picking the most suitable results for our purposes. For the revised version, we expanded on our description of the matching and decision process and added supplementary descriptive plots showing what our data looks like under each matching method for each experiment. These plots highlight that the matching techniques produce qualitatively roughly identical results and picking one of them over the other does not alter the conclusions of the test. The plots give the interested reader all the necessary information to assess the extent to which our design decisions influence our results.
(3) To me, the most surprising result in this work relates to the performance of implicit participants when phase 2 followed phase 1 almost immediately (Experiment 1 and Supplementary Experiment 1). These participants had a deficit in learning the same structure but a benefit in learning the novel one. The first part is easier to reconcile, as primacy effects have been reported in statistical learning literature, and so new learning in this second phase could be expected to be worse. However, a simultaneous benefit in learning pairs of a new structure ("structural novelty effect") is harder to explain, and I could not find a satisfactory explanation in the manuscript.
Although we might not have worded it clearly, we do not claim that our "structural novelty effect" comes from a “benefit” in learning pairs of the novel structure. Rather, we used the term “interference” and lack of this interference. In other words, we believe that one possible explanation is that there is no actual benefit for learning pairs of the novel structure but simply unhindered learning for pairs of the novel structure and simultaneous interference for learning pairs of the same structure. Stronger interference for the same compared to the novel structure items seems a reasonable interpretation, as similarity-based interference is well established in the general (not SL-specific) literature under the label of proactive interference.
After possible design and statistical confounds (my previous comments) are ruled out, a deeper treatment of this finding would be warranted, both empirically (e.g., do explicit participants collapsed across Experiments 1 and Supplementary Experiment 1 show the same effect?) and theoretically (e.g., why would this phenomenon be unique only to implicit learning, and why would it dissipate after a long awake break?).
Across all experiments, the explicit participants showed the same pattern of results but no significant difference between pair types, probably due to the insufficient sample sizes available. We already included in the main text the collapsed explicit results across Experiments 1-4 and Supplementary Experiment 1 (p. 16). This analysis confirmed that, indeed, there was a significant generalization for explicit participants across the two learning phases. We could re-run the same analysis for only Experiment 1 and Supplementary Experiment 1, but due to the small sample of N=12 in Suppl. Exp. 1, this test would likely be severely underpowered. Obtaining a sufficient sample size for this one test would require an excessive number (several hundred) of new participants.
In terms of theoretical treatment, we already presented our interpretation of our results in the discussion section, which we expanded on in the revised manuscript.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
(1) It would be very useful to add individual data points (and/or another depiction of the distribution) to the bar plots. If not in the main figures, as added figures in the supplement.
We added violin plots for all results in the Supplementary.
(2) It would be helpful to include in the supplement some examples of responses that led to the 'explicit' or 'implicit' classification. Specifically, what kind of response was considered to contain a partial recognition of the underlying structure vs. no recognition?
We added example responses used for classification in the Supplementary.
(3) It would be useful to show the results of Experiment 5 as well as the diagonal version as supplemental figures.
We added the requested figures in the Supplementary.
Typos: page 10: "in in the tests", page 15: "rerun"
Fixed.
Reviewer #2 (Recommendations For The Authors):
(1) My strongest reservation relates to the small sample size in the explicit group. The authors do report stats for all experiments together in one analysis and I think this is the only robust finding for this group. I would suggest removing any comparisons between this smaller group and the larger implicit group since they do not make a lot of sense due to the imbalance in sample size in my opinion. If they do want to report the explicit group individually for each experiment, they should at least test for differences between the experiments also for this group using ANOVA.
We do agree that the unbalanced nature of the sample sizes can be problematic for the between-group comparisons. The t-tests reported for between-group comparisons are in fact Welch’s t-tests, which are better suited for unequal sample sizes and variances. Previously, we failed to report that these were Welch’s t-tests, which we fixed in the revised version.
In the Supplementary, we previously reported an ANOVA including all explicit participants from all experiments. This showed a significant main effect of Experiment and test type, but no significant interaction. We take this as evidence that although specific levels of learning vary by experimental condition, the overall pattern of learning (i.e. which pairs are learned better) is the same across all experiments.
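For readers unfamiliar with Welch's t-test referred to in this reply, the sketch below shows how its statistic and approximate degrees of freedom are computed for two groups of unequal size and variance. It is a generic illustration using only the Python standard library, not the authors' analysis code; the function name `welch_t` and the two small samples are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    na, nb = len(a), len(b)
    # Squared standard errors of each group mean (sample variance / n).
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Hypothetical scores for a small group and a larger, more variable group.
small_group = [1, 2, 3, 4, 5]
large_group = [2, 4, 6, 8, 10, 12]
t_stat, dof = welch_t(small_group, large_group)
```

Unlike Student's t-test, no pooled variance is computed, so the group with the larger variance or smaller size is not forced to share an error estimate with the other group. The degrees of freedom are generally non-integer and shrink as the imbalance grows.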
(2) Moreover, the explicit group does not only differ in the explicitness of their memory but also regarding learning performance per se (as evidenced by performance differences for the first training). This important confound needs to be acknowledged and discussed more thoroughly!
We agree that this topic is important, which is why the subsection “The Type of Transfer Depends on Quality of Knowledge, Not Quantity of Knowledge” deals exclusively with this issue. See our reply to the next point.
(3) The resampling approach is somewhat interesting to solve the issue raised in 2. However, I doubt that the authors actually achieve what they are claiming. Since we have a 2-AFC task the possibility must be considered that participants who chose correctly in the implicit group did so by chance. This means that the assumption that the matched pairs actually have the same amount of memory for the first training period as the explicit group is likely false. Therefore, this analysis is still comparing apples and oranges.
We address this idea in detail in the supplementary materials. First, we point out that the matched results showed the same pattern as the full results, suggesting that Phase 1 and Phase 2 results are independent for this group. Second, we argue that a randomly selected subset of participants should not show a significant deviation from null performance in the Same vs. Novel comparison in Phase 2.
(4) One important issue, when conducting online experiments is assuring random allocation of participants. How did the authors recruit participants to ensure they did not select participants for the different experiments that differed regarding their preference for wake vs. sleep retention intervals? If no care was taken in this regard, I would suggest reporting this and maybe briefly discussing it.
This shortcoming is now reported and addressed in the discussion section of the revised manuscript.
(5) I could not find any information about the exact questions that were asked about the task rules. Also, there was no information on how the answers were used to assign groups. Both should be added.
The exact questions were added to the revised Supplementary.
(6) I think that the literature on sleep and rule extraction is well-represented in the manuscript. However, I think also referring more thoroughly to the literature on how sleep leads to gist extraction, schemas, and insight would help understand the relevance of the present research.
We subsumed references to the mentioned areas of research under the labels of abstraction and generalization. In the revised section, we listed the appropriate labels along with the already used references to make the connection to a vast literature treating generalization in related but distinct ways more explicit.
(7) It is unclear to me why the items learned in the first learning phase interfere with those learned in the second learning phase (without sleep) and not vice versa. What is the author's explanation for this?
We added a paragraph on this to our revised discussion section. In short, there may also be retroactive interference. However, we would need yet another variation of the paradigm to properly measure it, and this was outside the scope of the current work.
(8) As far as I can tell the study lacks all of the usual control tasks that are used in the field of sleep and memory (especially subjective sleepiness and objective vigilance). In addition, this research has the circadian confound, and therefore additional controls would have been warranted, e.g., morningness-eveningness, retrieval capabilities. Also, performance immediately after training phase 1 was not tested, which would serve as an important control for circadian differences in initial learning of the rule.
The study uses a number of the control measures established in the sleep and memory literature, such as habitual sleep quality and sleep quality during the night of and the night before the experiment. However, there are, of course, more potentially interesting measures, such as the ones named by the reviewer.
Testing performance right after training phase 1 would have been very interesting indeed. However, due to the nature of statistical learning tasks, this would have completely confounded the implicitness of learning by presenting participants with segmented input; i.e. isolated pairs. Therefore, we opted for the lesser of two evils in our design decision.
(9) As far as I can tell, there is no effect of sleep on correctly identifying pairs from training phase 1. This would be expected and thus should be discussed.
As noted and referenced in the discussion section, the effect of sleep on statistical learning per se is a subject of controversy in the literature, where some studies apparently find effects, while others find no effect on statistical learning whatsoever.
(10) The manuscript should explicitly mention if the study was preregistered.
It was not.
Reviewer #3 (Recommendations For The Authors):
The topic of this project is close to my heart, and I commend the authors for conducting numerous variations of the experiment with large sample sizes. I have some suggestions I feel will make the paper stronger, and a few minor comments that caught my eye during reading:
(1) First and foremost, I found the paper's structure cumbersome. For instance, different aspects of Experiment 1 results are reported in (1) the main text, (2) under methods, and (3) in Supplementary. This makes reading unnecessarily difficult. This relates not only to the analysis results - the sample size is reported as 226 in the main text, 226+3 in Methods, and 226+3+19 in Supplementary. I strongly suggest removing all results from the Methods section and merging the supplementary results with the main results.
We overhauled the structure of the paper, moving much more information into the proper method section and out of the Supplementary.
(2) "Attention checks" and "response bias" appear first in Supplementary Experiment 1 but are explained only later under Experiment 1. The same thing for the experimental procedure. I therefore suggest placing Experiment 1 before Supplementary Experiment 1, but related to my previous comment - have one paragraph dedicated to Subject Exclusion of all experiments.
The new structure of the Method sections solves this.
(3) Figure 4 is mentioned but does not appear in the manuscript.
This has been fixed. The paragraph in question now references the correct supplementary figure.
(4) OSF project includes only data with no README file on how to understand the data. The work would also benefit from sharing the experimental and analysis codes.
A README file was added.
(5) This sentence is repeated in relation to four experiments: "Bayes Factors from Bayesian t-tests for implicit participants reported for experiments 1, 2, and 3 used an r-scale parameter of 0.5 instead of the default √2/2, reflecting that Experiment 1 found small effect sizes for this group". First, it is missing an explanation of what the r-scale means. Second, it sounds as if this was a product of the procedure, but in fact it was a decision by the researcher if I am correct. If so, it is missing a description of how and why this choice was made.
This was indeed a decision by the researchers, in line with a Bayesian logic of evidence accumulation. We made the explanation in the paper clearer.
(6) Did I understand correctly that each pair was tested 4 times? Was it against the same foil? Did you make sure not to repeat the same pair in back-to-back trials? These details, in addition to what I noted in the public review, are needed.
Each pair was tested 4 times. Each time against a different foil pair. Details have been added to the Method section.
(7) Also in relation to my public review, I could not understand why the sample size was overshot by so much in Experiment 1 (229 instead of 198.15)?
The calculated sample size of 198.15 was for the implicit subgroup alone, while 229 included explicit and implicit participants.
(8) The correlation between phase 1 and phase 2 is only tested in explicit participants. Why is that? A test in implicit participants is needed for completeness.
Correlations for implicit participants have been added.
(9) There is a known asymmetry between the horizontal and vertical planes in our visual system (with preference for horizontal stimuli). I was missing a comparison between learning in the two structures, and a report of how many participants received either in Phase 1.
The allocation of participants to horizontal and vertical conditions was balanced. In the Method section we already report an ANOVA testing for a potential effect of orientation condition, which was not significant.
Minor/aesthetic comments:
(1) "In Phase 2, explicit participants performed above chance for learning pairs that shared their higher level orientation structure with that of pairs in Phase 1". This sounds as if there was a separate test following the two learning phases. Perhaps reword to "for phase 2 pairs".
Fixed.
(2) "the two asleep-consolidation groups (Exp. 3 and 4)" - I think you mean Exp. 2 and 4.
Fixed.
(3) "acquiring explicitness in Experiment 5 as compared to 1" I think you mean Supplementary Experiment 1 as compared to 1.
Fixed.
(4) "without such a redescription, the previously learned patterns in Phase 1 interfere with new ones in Phase 2, when redescription occurs..." The comma should be a dot.
Fixed.
(5) In Experiment 4, did 168 or 169 participants survive exclusion? Both accounts exist, and so do reports of degrees of freedom that allow both 23 and 24 explicit participants.
Fixed.
(6) "Implicit learners also performed above chance.." in Experiment 2 is missing (n=XX).
Fixed.
www.biorxiv.org
Author response:
The following is the authors’ response to the original reviews.
Public reviews:
We are grateful to the reviewers and the editorial team for their feedback and thorough revisions of our paper. We also appreciate their acknowledgement that this study represents a significant advancement in the field of reproductive neuroendocrinology and offers insights on the contribution of obesity vs melanocortin signaling in women’s fertility. In the revised version, we will provide a more detailed clarification of the data and methodology and adhere to the reviewers’ suggestions.
Please find below our answers to specific concerns in the public review:
Given the fact that mice lacking MC4R in Kiss1 neurons remained fertile despite some reproductive irregularities, the overall tone and some of the conclusions of the manuscript (e.g., from the abstract: "... Mc4r expressed in Kiss1 neurons is required for fertility in females") were overstated. Perhaps this can be described as a contributing pathway, but other mechanisms must also be involved in conveying metabolic information to the reproductive system.
We will tone down these statements throughout the manuscript to indicate that MC4R in Kiss1 neurons plays a role in the metabolic control of fertility (rather than “…is required for fertility”).
The mechanistic studies evaluating melanocortin signalling in Kiss1 neurons were all completed in ovariectomised animals (with and without exogenous hormones) that do not experience cyclical hormone changes. Such cyclical changes are fundamental to how these neurons function in vivo and may dynamically alter the way they respond to neuropeptides. Therefore, eliminating this variable makes interpretation difficult.
Mice lack true follicular and luteal phases and therefore it is impossible to separate estrogen-mediated changes from progesterone-mediated changes (e.g., in a proestrous female). Therefore, we use an ovariectomized female model in which we can generate an LH surge with an E2-replacement regimen [1]. This model enables us to focus on estrogen effects, exclude progesterone effects, and minimize variability. Inclusion of cycling females would make interpretation much more difficult.
(1) Bosch et al., 2013 Mol & Cell Endo; https://doi.org/10.1016/j.mce.2012.12.021
Use of the POMC-Cre to target optogenetic inputs to Kiss1 neurons might have targeted a wider population of cells than intended.
POMC is transiently expressed during embryonic development in a portion of cells fated to be Kiss1 or NPY/AgRP neurons [1-2]. Therefore, this is a valid concern when crossing with a floxed mouse. However, use of AAVs in adult animals avoids this issue and leads to specific expression in POMC neurons [3]. This POMC-Cre mouse has been used extensively with AAVs to drive specific expression in POMC neurons by other laboratories [4-7]. Therefore, we are confident that our optogenetic studies have narrowly targeted POMC inputs.
(1) Padilla et al., 2010 Nat Med; https://doi.org/10.1038/nm.2126
(2) Lam et al., 2017 Mol Metab; https://doi.org/10.1016/j.molmet.2017.02.007
(3) Stincic et al., 2018 eNeuro; https://doi.org/10.1523/eneuro.0103-18.2018
(4) Fenselau et al., 2017 Nat Neuro; https://doi.org/10.1038/nn.4442
(5) Rau & Hentges, 2019 J Neuro; https://doi.org/10.1523/jneurosci.3193-18.2019
(6) Fortin et al., 2021 Nutrients; https://doi.org/10.3390/nu13051642
(7) Villa et al., 2024 J Neuro; https://doi.org/10.1523/jneurosci.0222-24.2024
Recommendations for Authors
We thank the reviewers and the editorial team for their comments and thorough revisions of our paper. We have now addressed their comments and edited the manuscript accordingly:
Reviewer #1 (Recommendations For The Authors):
L80 -This is an awkward sentence; it isn't an inverse agonist of the AgRP; this may read better just to say that the inverse agonist, AgRP.
Thank you for this comment. This has now been changed in the text (L80).
L86 - This text reads as if mice have an inherent obesity issue.
This has also now been addressed in the text (L86).
L131 - The numbers of digits past the decimal point should match for both mean and SEM.
This has also now been addressed throughout the text.
Figure 1D: Revise the bar graphs with distinct SEM bars, as these data are not generated within the same mice.
The graphs are now changed, and they include distinct SEM and individual data points.
Figure 2I-L - An n of 3 for controls is pretty minimal, though the clustering of data points is tight.
We thank the reviewer for this comment. While we agree that n=3 for controls is minimal, the mRNA levels in this group are close to one another, so the data points cluster tightly. We are happy to provide the raw data values for these groups if the reviewer wishes.
L159 - The role of reduced dynorphin mRNA is pretty speculative with regard to basal levels of LH, especially since no other indices of LH secretion were affected. It should also be recognized that mRNA levels do not always equate to activity.
We agree with the reviewer that our explanation of the role of reduced dynorphin in the elevated basal LH is speculative. However, we only report that the higher LH levels correlate with lower Pdyn gene expression, which is in line with the well-documented role of dynorphin in inhibiting LH secretion. We also recognize that mRNA levels do not necessarily reflect activity. We have now added this statement to the text (L159).
L164 - Given the ovary data, it seems that the increase seen in KO mice isn't quite sufficient, but is it known how much of a surge is necessary for ovulation in mice?
We agree with the reviewer’s comment that the LH surge in the Kiss1MC4RKO group is not enough to consistently induce ovulation, which is supported by the decrease in the number of corpora lutea (Figure 2O).
According to the literature, an LH surge in female mice is defined by an LH value >4 ng/ml (Bahougne et al., 2020). By this criterion, our data show that only two of six females had an LH surge in the KO group, while four of five females had an LH surge in the control group.
L211 - According to the figure, LH pulses were not recovered and remained similar to KO levels. Looking at the LH secretory patterns presented, it seems like the pulse frequency data should be interpreted with some caution, given that some of the pulses identified are tenuous at best.
We agree that the LH pulses identified by our software (criteria described in the methods) are variable in shape (LH pulses are difficult to detect clearly in gonad-intact females) and did not differ in number between groups. However, the reinsertion of Mc4r within Kiss1 neurons restored basal LH levels, amplitude, and total secretory mass, which are clear indicators of a significant improvement in the ability of these mice to release LH.
L218 - Is there a reason why the surge was not looked at in these groups?
Ovarian histology is the best indicator of ovulation. In these mice, corpora lutea were absent, indicating impaired ovulation; thus, we did not consider an LH surge protocol necessary.
L244 - This would also fit with previous findings in sheep that not all Kiss neurons express MC receptors
We agree with this comment.
L329 - Given the rapidity of its actions, how would this membrane ER function during a normal surge?
Rapid estrogen signaling can act to ease transitions between states. Membrane delimited E2 actions can quickly attenuate or enhance coupling between receptors and signaling cascades. These effects will precede E2-driven changes in gene expression that produce more stable alterations in signaling. This combination of mechanisms will reduce any lag between rises in serum E2 and physiological effects. Considering the abbreviated mouse reproductive cycle, parallel mechanisms acting on different timescales are particularly important.
L365 - I'm a little confused as to how this particular work sheds light on a role for MC3R. Is the relative distribution of the two isoforms within Kiss neurons known?
In the present study, we report that hypothalamic Mc3r expression decreases leading up to the age of puberty onset (p30), in line with the profile of expression of Mc4r and a recent publication involving Mc3r in puberty onset (Lam et al., 2021), suggesting that both receptors may be involved in the control of reproductive function, potentially through the direct regulation of Kiss1 neurons as characterized in our present study.
L422 - While I understand the nature of this statement, the receptor may simply reflect the activity of what binds to it, i.e., AgRP vs. alpha-MSH, suggesting that maybe the prepubertal period is more AgRP-dominated.
We agree with this statement, and this needs to be further investigated.
L495 - Reinsertion of Mc4R in Kiss1 neurons
Thank you for this comment. This is now corrected in the text (L501).
L524 - Bilateral ovariectomy of 6-month
Thank you for this comment. This is now corrected in the text (L530).
L538 - Is it known what stage of the cycle these mice were in when samples were collected?
Yes, the samples were collected in diestrus. This is now mentioned in the text (L548).
L556 - Pulse amplitude is usually measured relative to the preceding nadir.
The method that we have consistently used in our lab is the average of the 4 highest LH values in the sample collection period for each animal. We have found this to be consistent and representative of the overall amplitude (McCarthy et al., 2021; Talbi et al., 2021).
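As a minimal sketch of this amplitude definition (the mean of the 4 highest LH values per animal), assuming a plain list of LH measurements; the function name and the example series are hypothetical and are not taken from the study:

```python
def lh_pulse_amplitude(lh_series, n_highest=4):
    """Mean of the n_highest LH values in one animal's sampling period."""
    if len(lh_series) < n_highest:
        raise ValueError("need at least n_highest samples")
    top_values = sorted(lh_series, reverse=True)[:n_highest]
    return sum(top_values) / n_highest


# Hypothetical LH series (ng/ml) for one animal, for illustration only.
example_series = [0.5, 1.0, 3.0, 0.75, 2.0, 0.5, 4.0, 1.0, 3.0]
example_amplitude = lh_pulse_amplitude(example_series)
print(example_amplitude)
```

Unlike a nadir-referenced amplitude, this measure does not depend on which samples are classified as pulses, which is the design trade-off discussed in the exchange above.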
L594 - This is a little confusing - the whole MBH would contain the ARH, but only the ARH was collected from the KO mice. If the whole MBH, dynorphin and Tac3, and Tac3 are expressed outside of the ARC, making interpretation of changes specifically within the ARH is difficult.
Here (L592), we describe two different experiments, as mentioned by i) and ii).
For experiment 1 (i): MBH was used in the WT mice at ages P10, P15, P22 and P30 to investigate the expression of the melanocortin genes (Agrp, Pomc, Mc3r and Mc4r).
For experiment 2 (ii): In both KO and control groups, only the micro-dissected ARH was used to investigate the expression of Pdyn, Kiss1, Tac2 and Tacr3.
Reviewer #2 (Recommendations For The Authors):
The validation experiments for the various manipulations are currently presented in the supplementary data. Still, in my opinion, these are critically important for interpreting the data, and it should be considered to present these more comprehensively in the main body of the manuscript. In Figure S1, it seems that the exposure of the two images is not the same, with a higher background in the control. Has this image been adjusted to highlight the staining, while the other has not? It looks like there remains a low level of expression still present in at least some of the KO cells - this may reflect difficulties using RNAscope (with its extreme amplification) to detect the absence of a signal, or it could also be that the knockout is incomplete. A percentage of cells still express MC4R. I think this should be acknowledged or discussed.
We thank the reviewer for the feedback. While we agree that the validation of the mouse model is critical, we would like to keep it in the supplemental data.
We also agree that the exposure looks different between the KO and WT controls, and we thank the reviewer for this comment. The quality of the photograph decreased when transferring to the manuscript. This has now been improved in the revised figure.
As for the MC4R expression in some of the KO cells, we believe that MC4R is expressed in non-Kiss1 cells, as shown in the merged figure. Therefore, we believe that the knockout of Mc4r in Kiss1 neurons is complete in these mice.
The clear difference from the PVN's lack of effect is convincing and indicates that a specific knockout has been achieved. Is equivalent data also available for the AVPV population of cells that are examined later in the manuscript? Do those Kiss1 neurons also express the MC4R? The same question applies to the knock-in experiment: Was the expression of MC4R also driven in the AVPV population using this approach
Yes, Kiss1 neurons in the AVPV also express MC4R as indicated in this study, and thus Mc4r is removed/reinserted in the AVPV as well in this mouse model.
The quantitative RT-qPCR data on developmental changes in metabolic signaling molecules are really peripheral to the paper's main question. Relative to the validation experiments (as discussed above), I think these are less important data and could be placed into a supplementary figure. The discussion of these data becomes problematic, e.g., on line 359, the changes are described as "a low melanocortin tone..." but this seems problematic when referring to reduced expression of AgRP, an inverse agonist at the MC4R. If you are going to present these data, individual data points should be shown. Similarly, the question about whether this is a PCOS-like phenotype is perhaps worth asking. Still, the simple assessment of T and AMH could also be reported in a sentence without necessarily showing the data (or placing it in a supplementary figure). Better to focus on the key question - which is the role of MC4R signaling in Kiss1 neurons.
We understand this reviewer’s concerns; however, due to the impact of MC4R signaling (particularly in the context of AgRP) on puberty, we strongly believe that the reader will benefit from the expression profile across ages, so we respectfully disagree and will keep these data in the main figure.
Per this reviewer’s comment, we have now added individual data points to Figure 1D.
We also agree with the reviewer that the T and AMH data are not in the main scope of the paper, but since we uncovered a PCOS-like phenotype in female mice with specific deletion of Mc4r from Kiss1 neurons, it is important to keep these data in the main figure to show that the phenotype does not fully resemble a PCOS model.
Having praised the experimental design, I think it is fair to acknowledge that the reproductive data from these experiments remain difficult to interpret. I understand that it is difficult to illustrate estrous cycles, but the "quantitative" data on percentages of time spent in any one stage are not as informative as seeing the actual individual patterns in Figure 2B. Were all of the animals consistently like the one illustrated, with persistent diestrus and only occasional evidence of ovulation?
We agree that Figure 2C may be difficult to interpret, but it is the best way to capture all the data points for each group.
All 5 Kiss1MC4RKO females had persistent diestrus phases with only one or two estrus phases over 15 days (except for one female that had 4 days of estrus), compared to control females, which had 7 to 9 days of estrus, as shown in the graph (except for one female that had 5 days of estrus over the 15-day period).
Given that LH pulses appear to be normal, does this, in fact, suggest an ovarian problem? Is that possible? Are MC4R and Kiss1 co-expressed in the ovary? Or do you think this suggests an ovulation problem, perhaps driven by the impaired LH surge?
This reviewer is correct in that our findings suggest a central defect in ovulation based on the deficit observed in the preovulatory LH surge. Thus, it is possible to have normal LH pulses, which are driven by one population of Kiss1 neurons (ARH), together with an impaired LH surge, which is driven by a distinct population of Kiss1 neurons (AVPV).
Similarly, the response to the "LH surge induction protocol" is impaired (why not look at endogenous LH surges?). It seems that ovulation should be an all-or-none phenomenon in that if the LH surge is sufficient to induce ovulation, then all available follicles would be ovulated. If it is not, then no follicles will be ovulated. Why fewer follicles are ovulated in the gene-targeted animals seems more likely to be due to impaired follicular development rather than a subthreshold LH surge. So, this again points back to the ovary. Or perhaps we need a more thorough assessment of the pattern of LH pulses throughout the cycles in these animals.
An LH surge induction protocol allows us to submit all female mice to the same conditions and expect a similar response, which is then optimal to compare with animals with an expected ovulation deficit, as it eliminates external factors. We disagree that ovulation is an all-or-none phenomenon: in mice, numerous follicles mature at the same time, and thus a decrease in the number of ovulated oocytes may be significant between groups even if the animals are not completely infertile.
Collectively, my assessment of these data is that there are effects on reproduction, but they are actually relatively subtle. There were abnormal cycles and impaired LH surge in response to exogenous estrogen. But the animals are not actually infertile, so can ovulate and express normal reproductive behavior. So while there is a role for MC4R signalling in Kiss1 neurons, it may be a contributing modulatory role rather than a major regulatory mechanism. I think the tone of the descriptions should reflect this. I like the way it is framed in some parts of the discussion ("reproductive impairments...mediated by MC4R in Kiss1 neurons and not by their obese phenotype"), but the overall significance of this is overstated in some places, such as the abstract and in other parts of the discussion ("this population is tightly controlled by melanocortins").
As mentioned in previous responses, ovulation in mice is not all-or-nothing. So, while the mice can reproduce, the disruption of the central mechanisms that control ovulation and the irregular estrous cycles are a significant advancement in the field, with strong translational potential to species in which only one oocyte is usually ovulated, such as humans, where reproductive disorders in MC4R patients had been attributed to the obesity phenotype rather than to a central action of MC4R (as the reviewer captured in their comment). This is one of the main findings of this study.
The overstatement has now been addressed throughout the text.
For in vitro studies, all mice were ovariectomized and given estradiol "replacement." What was the rationale for this? Wouldn't this suppress the basal activity of these neurons? Then it appears that some of the animals were studied as ovariectomised (for an unspecified time but apparently ">7 days", without hormone replacement. The basal activity of these cells would be dramatically different. I think these artificial manipulations make these data quite difficult to interpret. How does this reflect the situation in a normal (or abnormal) estrous cycle? My understanding is that the brain slice approach already compromises the ability of this population of cells to function as a coordinated network (i.e., coordinated episodes of activity that are seen in vivo have not been observed in vitro in brain slices). Ovariectomizing and providing exogenous hormones also removes the additional regulatory elements of the cyclical changes in hormone inputs, so the cells may or may not behave like they would in vivo. Perhaps the authors could justify their choice of experimental model.
We have clarified that the mice were ovariectomized for 7-10 days. A group of 3 mice is OVXed at once and then used on subsequent days a week later. This delay is both for the recovery of the animal and to allow for “washout” of endogenous ovarian hormones. For optogenetic studies, we were not measuring basal activity. Rather, we prioritized the ability to detect a postsynaptic response. While E2 decreases the networked activity of Kiss1<sup>ARH</sup> neurons, the Hcn channels, calcium channels, and Vglut2 expression are all increased. This leads to increased excitability and more glutamate release. Mice lack true follicular and luteal phases, and therefore it is impossible to separate estrogen-mediated changes from progesterone-mediated changes (e.g., in a proestrous female). Therefore, we use an ovariectomized female model in which we can generate an LH surge with an E2-replacement regimen (Bosch et al., J Mol Cell Endocrinology 2013). This model enables us to focus on estrogen effects, exclude progesterone effects, and minimize variability. Finally, we have documented that Kiss1<sup>ARH</sup> neurons retain the synchronization of their neuronal firing in the hypothalamic slice preparation (Qiu et al., eLife 2016).
Figure 4E shows neurons' staining after expressing a Cre-dependent channel rhodopsin vector into POMC-Cre mice. The number of labelled cells looks markedly larger than expected for adult POMC neurons. Was the specificity of this approach to neurons expressing POMC checked? I understand that the POMC-Cre mice have been criticised for ectopic expression of Cre during development in other populations of neurons in the arcuate nucleus that does not express POMC, such as the AgRP neurons (e.g., PMID: 22166984). Is it possible that this is not a problem in adult animals? Has that been validated in these animals? The description of the method suggests that it is acknowledged that some of the expression driven in these animals might be in AgRP neurons. Still, optogenetic activation of these cells will include all cells expressing Cre at the time of AAV administration.
POMC is transiently expressed during embryonic development in a portion of cells fated to be Kiss1 or NPY/AgRP neurons. Therefore, this is a valid concern when crossing with a floxed mouse. However, use of AAVs in adult animals avoids this issue and leads to specific expression in POMC neurons. This POMC-Cre mouse has been used extensively with AAVs to drive specific expression in POMC neurons by other laboratories (Padilla et al., Nat Med 2010; Lam et al., Mol Metab 2017; Stincic et al., eNeuro 2018; Fenselau et al., Nat Neuro 2017). We have previously shown that AAV-driven mCherry expression is limited to cells labeled with a beta-endorphin antibody (Stincic et al., eNeuro 2018). Therefore, we are confident that our optogenetic studies have narrowly targeted POMC inputs.
Some additional explanation of the electrophysiology result may be required. For example, on Line 292, I'm confused by Fig 4M. Why is the response to 20Hz stimulation different in this cell (compared to the one in 4L) before administering naloxone? What proportion of cells showed this opposite response? On line 307: Is 5 cells sufficient for testing the POMC inputs onto AVPV and PeN Kiss1 neurons? How many slices/animals are included in collecting these 5 cells? The rapid action of STX illustrates the ability to modulate the response to MTII, but I am struggling to understand the implications of this in a physiological context. Suppose this response is desensitized by longer-term treatment with E2 (as indicated in the manuscript). Is it relevant to normal regulation during the cycle (particularly in the AVPV, where the key regulatory step seems to be the prolonged exposure to high estradiol as part of the preovulatory signals leading up to the LH surge)?
As stated in the text, E2 has been shown to increase POMC expression and β-endorphin immunostaining. We do not know the effects of E2 on αMSH expression and release. E2 also tends to attenuate the coupling between inhibitory postsynaptic metabotropic (Gi,o-coupled) receptors and signaling cascades. So, there is likely a combination of pre- and post-synaptic mechanisms contributing to these responses. However, the focus of the current studies was on the predominant melanocortin signaling and, as such, we chose to eliminate the influence of opioid signaling. We have added two more cells to this group, both of which were successfully rescued, for a total of 5 of 6 cells (6 slices, 5 animals). Between the labeling of β-endorphin fibers and the high rate of rescue, we do believe that this is sufficient evidence to support a direct POMC input to Kiss1<sup>AVPV/PeN</sup> neurons.
Line 52: "Here, we show that Mc4r expressed in Kiss1 neurons is required for fertility in females." The knockout animals remain fertile, so this conclusion needs to be re-worded.
Thank you for this comment. This has now been changed (L52).
Line 80: "The melanocortin 4 receptor (MC4R) binds α-melanocyte stimulating hormone (αMSH), an agonist product of the pro-opiomelanocortin (Pomc) gene, and the inverse agonist of the agouti-related peptide (AgRP) to regulate food intake and energy expenditure" Is this the correct wording? I think it should be stated that AgRP is an inverse agonist at the MC4R, not that αMSH is the inverse agonist of AgRP. Re-work this sentence.
Thank you for this comment. This has now been changed (L79-80).
Line 88: "... however, conflicting reports exist". Describe what these conflicting reports show. Many MC4 variants ("mutations") are expressed in humans, but few will fully inactivate signalling like the mouse knockout.
We thank the reviewer for this comment. By conflicting data, we refer to the studies that report no reproductive impairments in women with MC4R mutations. This may be either because the metabolic impairments (obesity, hyperphagia, hyperinsulinemia, hyperleptinemia, etc.) are so strong that the focus is skewed toward these issues, without a full reproductive assessment in these women, or simply because, as the reviewer mentioned, not all MC4R mutations fully inactivate its signaling in humans, as opposed to mouse models, where reproductive disruption has been described previously in full-body MC4RKOs.
Line 91: "...that largely affects females". Is this a genuine sex difference, or are reproductive deficits simply more overt in female rodents? I think the Coss paper (reference 19 in the manuscript) showed a greater effect of diet-induced obesity in males than in females.
We believe that sex differences exist with regards to the role of MC4R in the regulation of fertility, as we show that most of this effect is mediated by MC4R signaling in Kiss1 AVPV neurons, a neuronal population that is specific to the female brain.
As far as we can tell, the Coss paper (Villa et al., 2024) tested only males, not females. Moreover, they investigated the effect of diet-induced obesity on fertility in mice (specifically LH secretion), while in this study we are specifically looking at the deletion of MC4R from Kiss1 neurons, and these mice were not obese (Figure 2A). While both of these conditions impair fertility, the mechanisms and signaling pathways are different (our mice lack MC4R signaling, while the obese mice have a decrease in MC4R expression but the signaling is still functional).
Line 392: also Hessler et al. PMID: 32337804.
This reference is now added to the text (Line 393).
Line 433. The discussion of how advanced puberty onset (seen in the Kiss1-specific KO animals) might be caused by MC4R signalling in AVPV Kiss1 neurons, which are sexually dimorphic, which might explain sex differences in puberty timing in mammals seems extremely speculative and based on limited data. More targeted experiments would be needed to address this, and I think this speculation should be removed here.
This speculation has now been removed from the text.
Line 438: "Furthermore, our findings suggest that metabolic cues, through the regulation of the melanocortin output onto Kiss1AVPV/PeN neurons, are essential for the timing and magnitude of the GnRH/LH surge." Again, I think this is overstating the present data, which has only looked at an artificial hormone administration regime. The animals are fertile and, thus, must be able to mount a sufficient LH surge. The major effect, in fact, seems to be on their cycle, perhaps leading to impaired follicular development. Please acknowledge that this will be one of the multiple pathways by which metabolic information is fed into the HPG axis.
In addition to the effect on their cycles mentioned by the reviewer, the Kiss1MC4RKO females also display impaired fertility (Figure 2, S-T) and fewer corpora lutea, which is in line with the impaired mounting of the LH surge (Figure 2, M). Even though the LH surge is induced by the hormone administration protocol, it reflects the natural ability of the HPG axis to mount the surge, as this regimen is only there to mimic, in a controlled manner, the endogenous hormonal changes leading to the LH surge and therefore ovulation. Nonetheless, we agree with this reviewer that this is not the sole mechanism by which metabolism regulates reproductive function, and this has been emphasized in the paper (line 443).
Reviewer #3 (Recommendations For The Authors):
The decreased melanocortin tone drives puberty onset (Figure 1D), and this is correlative. The transgenic animals' hypothalamic expression of Agrp, Pomc, Mc4r, and Mc3r can be measured to strengthen the claim. Hprt expression should be demonstrated, as this housekeeping gene was used as a common denominator.
We thank the reviewer for this comment. While we think that measuring Agrp, Pomc, Mc4r, and Mc3r gene expression in the transgenic mice would indeed strengthen our claim and give more insight into the melanocortin tone during pubertal maturation, this is unfortunately not feasible, as it would involve generating a lot of mice at different ages throughout puberty (at least n=40 pups for an n=5/group, KO and control littermates, females only, which would require setting up many breeding pairs).
As for the gene expression of Hprt: because we have 6 mice per age and 4 ages in total, every gene (Agrp, Pomc, Mc4r, Mc3r) was run on a separate plate with Hprt as its own housekeeping gene. Samples were run in duplicate for both Hprt and the melanocortin gene in a 96-well plate (48 wells for Hprt and 48 wells for the melanocortin gene). Therefore, it is not possible to present a single Hprt expression value for all four genes; however, every gene was normalized to the Hprt expression that was run on the same plate.
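The plate arithmetic and same-plate normalization can be sketched as follows. The well counts and ages (P10, P15, P22, P30) follow the response; the 2^-ΔCt formula is an assumption made for illustration, since the response states only that each gene was normalized to Hprt run on the same plate, and the function name and Ct values are hypothetical.

```python
from statistics import mean

MICE_PER_AGE = 6
AGES = ["P10", "P15", "P22", "P30"]
REPLICATES = 2  # samples run in duplicate

# Wells needed for one gene, and for one target gene plus Hprt on a plate.
wells_per_gene = MICE_PER_AGE * len(AGES) * REPLICATES
wells_per_plate = 2 * wells_per_gene


def relative_expression(target_cts, hprt_cts):
    """Assumed 2^-dCt normalization using replicate means from one plate."""
    delta_ct = mean(target_cts) - mean(hprt_cts)
    return 2.0 ** (-delta_ct)


# Hypothetical duplicate Ct values for one sample, for illustration only.
example_ratio = relative_expression([24.0, 24.0], [22.0, 22.0])
print(wells_per_gene, wells_per_plate, example_ratio)
```

Because each target gene is paired with its own Hprt wells, the Hprt values are plate-specific, which is why a single shared Hprt panel cannot be shown for all four genes.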
In Figures 4 and 5, dot plots can be used (as opposed to the bar graphs) to better reflect the individual data points.
Figures 4 and 5 have been revised to include individual data points.
The electrophysiology experiment requires more details in the method section. In addition to the publication cited, a brief recap of the methodology used in this paper, such as the focal application of MTII (Figure 4B), is also needed.
We have added more details to the Methods.
www.biorxiv.org
Author response:
The following is the authors’ response to the previous reviews.
Reviewer #1 (Public review):
Summary:
In the manuscript, the authors describe a new pipeline to measure changes in vasculature diameter upon optogenetic stimulation of neurons. The work is useful to better understand the hemodynamic response on a network/graph level.
Strengths:
The manuscript provides a pipeline that allows to detect changes in the vessel diameter as well as simultaneously allows to locate the neurons driven by stimulation.
The resulting data could provide interesting insights into the graph level mechanisms of regulating activity dependent blood flow.
Weaknesses:
(1) The manuscript contains (new) wrong statements and (still) wrong mathematical formulas.
The symbols in these formulas have been updated to disambiguate them, and the accompanying statements have been adjusted for clarity.
(2) The manuscript does not compare results to existing pipelines for vasculature segmentation (open-source or commercial). Comparing performance of the pipeline to a random forest classifier (ilastik) on images that are not preprocessed (i.e. corrected for background etc.) seems not a particularly useful comparison.
We have now included comparisons to Imaris (commercial) for segmentation and VesselVio (open-source) for graph extraction.
For the ilastik comparison, the images were preprocessed prior to ilastik segmentation, specifically by doing intensity normalization.
Example segmentations utilizing Imaris have now been included. Imaris leaves gaps and discontinuities in the segmentation masks, as shown in Supplementary Figure 10. The Imaris segmentation masks also tend to be more circular in cross-section despite irregularities on the surface of the vessels observable in the raw data and identified in manual segmentation. This approach also requires days to months to generate per image stack.
“Comparison with commercial and open-source vascular analysis pipelines
To compare our results with those achievable on these data with other pipelines for segmentation and graph network extraction, we compared segmentation results qualitatively with Imaris version 9.2.1 (Bitplane) and vascular graph extraction with VesselVio [1]. For the Imaris comparison, three small volumes were annotated by hand to label vessels. Example slices of the segmentation results are shown in Supplementary Figure 10. Imaris tended to either over- or under-segment vessels, disregard fine details of the vascular boundaries, and produce jagged edges in the vascular segmentation masks. In addition to these issues with segmentation mask quality, manual segmentation of a single volume took days for a rater to annotate. To compare to VesselVio, binary segmentation masks (one before and one after photostimulation) generated with our deep learning models were loaded into VesselVio for graph extraction, as VesselVio does not have its own method for generating segmentation masks. This also facilitates a direct comparison of the benefits of our graph extraction pipeline to VesselVio. Visualizations of the two graphs are shown in Supplementary Figure 11. VesselVio produced many hairs at both time points, and the total number of segments varied considerably between the two sequential stacks: while the baseline scan resulted in 546 vessel segments, the second scan had 642 vessel segments. These discrepancies are difficult to resolve in post-processing and preclude a direct comparison of individual vessel segments across time. As the segmentation masks we used in graph extraction derive from the union of multiple time points, we could better trace the vasculature and identify more connections in our extracted graph.
Furthermore, VesselVio relies on the distance transform of the user-supplied segmentation mask to estimate vascular radii; consequently, these estimates are highly susceptible to variations in the input segmentation masks. We repeatedly saw slight variations between boundary placements of all of the models we utilized (ilastik, UNet, and UNETR) and those produced by raters. Our pipeline mitigates this segmentation method bias by using intensity gradient-based boundary detection from centerlines in the image (as opposed to using the distance transform of the segmentation mask, as in VesselVio).”
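The sensitivity of distance-transform radii to mask boundary placement can be illustrated with a toy 2D example. This is a brute-force sketch written for illustration, not VesselVio's implementation or the authors' pipeline; the helper names and the synthetic "vessel" band are hypothetical.

```python
import math


def edt_bruteforce(mask):
    """Euclidean distance from each foreground pixel to the nearest background pixel."""
    rows, cols = len(mask), len(mask[0])
    background = [(r, c) for r in range(rows) for c in range(cols) if not mask[r][c]]
    dist = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                dist[r][c] = min(math.hypot(r - br, c - bc) for br, bc in background)
    return dist


def band_mask(n_rows, n_cols, half_width):
    """Binary mask of a horizontal 'vessel' centered on the middle row."""
    center = n_rows // 2
    return [[abs(r - center) <= half_width for _ in range(n_cols)] for r in range(n_rows)]


# Same synthetic vessel, segmented with boundaries one pixel apart on each side.
center_row, center_col = 4, 3
radius_tight = edt_bruteforce(band_mask(9, 7, 1))[center_row][center_col]
radius_loose = edt_bruteforce(band_mask(9, 7, 2))[center_row][center_col]
print(radius_tight, radius_loose)
```

In this toy case a one-pixel shift of the mask boundary changes the centerline radius estimate from 2 to 3 pixels, so any rater- or model-dependent boundary bias propagates directly into the radius.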
(3) The manuscript does not clearly visualize performance of the segmentation pipeline (e.g. via 2d sections, highlighting also errors etc.). Thus, it is unclear how good the pipeline is, under what conditions it fails or what kind of errors to expect.
In response to the reviewer’s comment, 2D slices have been added in Supplementary Figure 4.
(4) The pipeline is not fully open-source due to use of matlab. Also, the pipeline code was not made available during review contrary to the authors claims (the provided link did not lead to a repository). Thus, the utility of the pipeline was difficult to judge.
All code has been uploaded to Github and is available at the following location: https://github.com/AICONSlab/novas3d
The Matlab code for skeletonization is better at preserving centerline integrity during the pruning of hairs from centerlines than the currently available open-source methods.
- Generalizability: The authors addressed the point of generalizability by applying the pipeline to other data sets. This demonstrates that their pipeline can be applied to other data sets and makes it more useful. However, from the visualizations it is difficult to see the performance of the pipeline, where the pipeline fails, etc. The 3D visualizations are not particularly helpful in this respect. In addition, the Dice measure seems quite low, indicating roughly 20-40% of voxels do not overlap between inferred and ground truth. I did not notice this high discrepancy earlier. A thorough discussion of the errors appearing in the segmentation pipeline would be necessary in my view to better assess the quality of the pipeline.
2D slices from the additional datasets have been added in Supplementary Figure 13 to aid in visualizing the models’ ability to generalize to other datasets.
The Dice range we report (0.7-0.8) is good when compared to those (0.56-0.86) of 3D segmentations of large datasets in microscopy [2], [3], [4], [5], [6]. Furthermore, we had two additional raters segment three images from the original training set. We found that the raters had a mean inter-class correlation of 0.73 [7]. Our model outperformed this Dice score on unseen data: Dice scores from our generalizability tests on C57 mice and Fischer rats were on par with or higher than this baseline.
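For reference, the Dice coefficient discussed here is 2|A∩B| / (|A| + |B|) for a predicted mask A and a ground-truth mask B. A minimal sketch on flattened binary masks follows; the function name and the toy masks are hypothetical and unrelated to the study data.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two equal-length binary masks (flattened voxels)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * intersection / total


# Toy masks: 4 foreground voxels each, 3 of them shared.
pred_mask = [1, 1, 1, 1, 0, 0, 0, 0]
true_mask = [0, 1, 1, 1, 1, 0, 0, 0]
example_dice = dice_coefficient(pred_mask, true_mask)
print(example_dice)
```

For thin structures such as capillaries, a one-voxel boundary disagreement removes a large fraction of the overlap, which is one reason Dice values for vascular segmentations tend to sit well below 1 even when the centerlines agree.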
Reviewer #2 (Public review):
The authors have addressed most of my concerns sufficiently. There are still a few serious concerns I have. Primarily, the temporal resolution of the technique still makes me dubious about nearly all of the biological results. It is good that the authors have added some vessel diameter time courses generated by their model. But I still maintain that data sampling every 42 seconds - or even 21 seconds - is problematic. First, the evidence for long vascular responses is lacking. The authors cite several papers:
Alarcon-Martinez et al. 2020 show and explicitly state that their responses (stimulus-evoked) returned to baseline within 30 seconds. The responses to ischemia are long lasting but this is irrelevant to the current study using activated local neurons to drive vessel signals.
Mester et al. 2019 show responses that all seem to return to baseline by around 50 seconds post-stimulus.
In Mester et al. 2019, diffuse stimulations with blue light showed a return to baseline around 50 seconds post-stimulus (cf. Figure 1E,2C,2D). However, focal stimulations, where the stimulation light is raster scanned over a small region in the field of view, show longer-lasting responses (cf. Figure 4) that have not returned to baseline by 70 seconds post-stimulus [8]. Alarcon-Martinez et al. do report that their responses return to baseline within 30 seconds; however, their physiological stimulation may lead to different neuronal and vessel response kinetics than those elicited by the optogenetic stimulation used in the current work.
O'Herron et al. 2022 and Hartmann et al. 2021 use opsins expressed in vessel walls (not neurons as in the current study) and directly constrict vessels with light. So this is unrelated to neuronal activity-induced vascular signals in the current study.
We agree that optogenetic activation of vessel-associated cells is distinct from optogenetic activation of neurons, but we do expect the effects of such perturbations on the vasculature to have some commonalities.
There are other papers including Vazquez et al 2014 (PMID: 23761666) and Uhlirova et al 2016 (PMID: 27244241) and many others showing optogenetically-evoked neural activity drives vascular responses that return back to baseline within 30 seconds. The stimulation time and the cell types labeled may be different across these studies which can make a difference. But vascular responses lasting 300 seconds or more after a stimulus of a few seconds are just not common in the literature and so are very suspect - likely at least in part due to the limitations of the algorithm.
Vazquez et al. 2014 used diffuse photostimulation with a fiberoptic probe, similar to the diffuse stimulation in Mester et al. 2019, as opposed to the raster-scanned focal stimulation we used in this study and in Mester et al. 2019, where we observed focal photostimulation to elicit vascular responses lasting longer than a minute. Uhlirova et al. 2016 used photostimulation powers between 0.7 and 2.8 mW, likely lower than our 4.3 mW/mm2 photostimulation. Further, even with focal photostimulation, we do see light intensity dependence of the duration of the vascular responses. Indeed, in Supplementary Figure 2, 1.1 mW/mm2 photostimulation leads to briefer dilations/constrictions than does 4.3 mW/mm2; the 1.1 mW/mm2 responses are in line, duration-wise, with those in Uhlirova et al. 2016.
Critically, as per Supplementary Figure 2, the analysis of the experimental recordings acquired at 3-second temporal resolution did likewise show responses in many vessels lasting for tens of seconds and even hundreds of seconds in some vessels.
Another major issue is that the time courses provided show that the same vessel constricts at certain points and dilates later. So where in the time course the data is sampled will have a major effect on the direction and amplitude of the vascular response. In fact, I could not find how the "response" window is calculated. Is it from the first volume collected after the stimulation - or an average of some number of volumes? But clearly down-sampling the provided data to 42 or even 21 second sampling will lead to problems. If the major benefit to the field is the full volume over large regions that the model can capture and describe, there needs to be a better way to capture the vessel diameter in a meaningful way.
In the main experiment (i.e. excluding the additional experiments presented in the Supplementary Figure 2 that were collected over a limited FOV at 3s per stack), we have collected one stack every 42 seconds. The first slice of the volume starts following the photostimulation, and the last slice finishes at 42 seconds. Each slice takes ~0.44 seconds to acquire. The data analysis pipeline (as demonstrated by the Supplementary Figure 2) is not in any way limited to data acquired at this temporal resolution and - provided reasonable signal-to-noise ratio (cf. Figure 5) - is applicable, as is, to data acquired at much higher sampling rates.
It still seems possible that if responses are bi-phasic, then depth dependencies of constrictors vs dilators may just be due to where in the response the data are being captured - maybe the constriction phase is captured in deeper planes of the volume and the dilation phase more superficially. This may also explain why nearly a third of vessels are not consistent across trials - if the direction the volume was acquired is different across trials, different phases of the response might be captured.
Alternatively, like neuronal responses to physiological stimuli, the vascular responses elicited by increases in neuronal activity may themselves be variable in both space and time.
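The sampling concern in this exchange can be made concrete with a minimal sketch; it is not the authors' analysis. It assumes a purely hypothetical biphasic waveform (`biphasic_response`, with made-up amplitudes and time constants) and uses only the stated acquisition timing of roughly 0.44 s per slice to convert a slice index into a post-stimulus time.

```python
import numpy as np

SECONDS_PER_SLICE = 0.44  # from the response above: each slice takes ~0.44 s

def biphasic_response(t):
    """Hypothetical diameter change (%): early constriction, then dilation.

    Amplitudes and time constants are illustrative assumptions only.
    """
    t = np.asarray(t, dtype=float)
    constriction = -4.0 * np.exp(-((t - 10.0) / 6.0) ** 2)
    dilation = 6.0 * np.exp(-((t - 45.0) / 15.0) ** 2)
    return constriction + dilation

def response_seen_in_slice(slice_index):
    """Value of the hypothetical response at the time a given slice is acquired."""
    return float(biphasic_response(slice_index * SECONDS_PER_SLICE))

early_slice = response_seen_in_slice(23)  # acquired ~10 s after stimulation
late_slice = response_seen_in_slice(91)   # acquired ~40 s after stimulation
print(f"slice 23: {early_slice:+.2f}%  slice 91: {late_slice:+.2f}%")
```

In this toy case the same waveform would be scored as a constriction or a dilation depending only on when the slice was acquired, which is the scenario the reviewer raises. Whether real vessels behave this way is an empirical question that the 3-second recordings in Supplementary Figure 2 are meant to address.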
I still have concerns about other aspects of the responses but these are less strong. Particularly, these bi-phasic responses are not something typically seen and I still maintain that constrictions are not common. The authors are right that some papers do show constriction. Leaving out the direct optogenetic constriction of vessels (O'Herron 2022 & Hartmann 2021), the Alarcon-Martinez et al. 2020 paper and others such as Gonzales et al 2020 (PMID: 33051294) show different capillary branches dilating and constricting. However, these are typically found either with spontaneous fluctuations or due to highly localized application of vasoactive compounds. I am not familiar with data showing activation of a large region of tissue - as in the current study - coupled with vessel constrictions in the same region. But as the authors point out, typically only a few vessels at a time are monitored so it is possible - even if this reviewer thinks it unlikely - that this effect is real and just hasn't been seen.
Uhlirova et al. 2016 (PMID: 27244241) observed biphasic responses in the same vessel with optogenetic stimulation in anesthetized and unanesthetized animals (cf Fig 1b and Fig 2, and section “OG stimulation of INs reproduces the biphasic arteriolar response”). Devor et al. (2007) and Lindvere et al. (2013) also reported on constrictions and dilations being elicited by sensory stimuli.
I also have concerns about the spatial resolution of the data. It looks like the data in Figure 7 and Supplementary Figure 7 have a resolution of about 1 micron/pixel. It isn't stated so I may be wrong. But detecting changes of less than 1 micron, especially given the noise of an in vivo prep (brain movement and so on), might just be noise in the model. This could also explain constrictions as just spurious outputs in the model's diameter estimation. The high variability in adjacent vessel segments seen in Figure 6C could also be explained the same way, since these also seem biologically and even physically unlikely.
Thank you for your comment. To address this important issue, we performed an additional validation experiment: we ordered fluorescent beads with a known diameter of 7.32 ± 0.27 μm, imaged them following our imaging protocol, and then used our pipeline to estimate their diameter. Our analysis converged on the manufacturer-specified diameter, estimating it to be 7.34 ± 0.32 μm. The manuscript has been updated to detail this experiment, as below:
Methods section insert
“Second, our boundary detection algorithm was used to estimate the diameters of fluorescent beads of a known radius imaged under similar acquisition parameters. Polystyrene microspheres labelled with Flash Red (Bangs Laboratories, Inc., CAT# FSFR007) with a nominal diameter of 7.32 μm and a specified range of 7.32 ± 0.27 μm, as determined by the manufacturer using a Coulter counter, were imaged on the same multiphoton fluorescence microscope set-up used in the experiment (identical light path, resonant scanner, objective, detector, excitation wavelength and nominal lateral and axial resolutions, with 5x averaging). The images of the beads had a higher SNR than our images of the vasculature, so Gaussian noise was added to the images to degrade the SNR to the same level as that of the blood vessels. The images of the beads were segmented with a threshold, centroids calculated for individual spheres, and planes with a random normal vector extracted from each bead and used to estimate the diameter of the beads. The same smoothing and PSF deconvolution steps were applied in this task. We then reported the mean and standard deviation of the distribution of the diameter estimates. A variety of planes were used to estimate the diameters.”
Results Section Insert
“Our boundary detection algorithm successfully estimated the radius of precisely specified fluorescent beads. The bead images had a signal-to-noise ratio of 6.79 ± 0.16 (about 35% higher than our in vivo images): to match their SNR to that of in vivo vessel data, following deconvolution, we added Gaussian noise with a standard deviation of 85 SU to the images, bringing the SNR down to 5.05 ± 0.15. The data processing pipeline was kept unaltered except for the bead segmentation, performed via image thresholding instead of our deep learning model (trained on vessel data). The bead boundary was computed following the same algorithm used on vessel data: i.e., by the average of the minimum intensity gradients computed along 36 radial spokes emanating from the centreline vertex in the orthogonal plane. To demonstrate an averaging-induced decrease in the uncertainty of the bead radius estimates on a scale that is finer than the nominal resolution of the imaging configuration, we tested four averaging levels in 289 beads. Three of these averaging levels were lower than that used on the vessels, and one matched that used on the vessels (36 spokes per orthogonal plane and a minimum of 10 orthogonal planes per vessel). As the amount of averaging increased, the uncertainty on the diameter of the beads decreased, and our estimate of the bead's diameter converged upon the manufacturer's Coulter counter-based specifications (7.32 ± 0.27 μm), as tabulated in Table 1.”
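The spoke-based boundary estimate described in the insert above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: it works on a synthetic, lightly blurred 2D disk, omits deconvolution and noise, and the function name `radius_from_spokes` and its parameters (`max_r`, `step`) are hypothetical. Only the idea of locating the minimum intensity gradient along 36 radial spokes and averaging the result is taken from the text.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def radius_from_spokes(plane, center, n_spokes=36, max_r=30.0, step=0.25):
    """Mean distance from `center` to the steepest intensity drop on each spoke."""
    radii = np.arange(0.0, max_r, step)
    angles = np.linspace(0.0, 2.0 * np.pi, n_spokes, endpoint=False)
    estimates = []
    for theta in angles:
        rows = center[0] + radii * np.sin(theta)
        cols = center[1] + radii * np.cos(theta)
        # Sample the intensity profile along the spoke with linear interpolation
        profile = map_coordinates(plane, [rows, cols], order=1, mode="nearest")
        gradient = np.gradient(profile, step)
        # The boundary is taken where the intensity falls off fastest
        estimates.append(radii[np.argmin(gradient)])
    return float(np.mean(estimates))

# Synthetic bright disk of radius 10 px, lightly blurred (illustrative only)
yy, xx = np.mgrid[0:81, 0:81]
disk = (np.hypot(yy - 40, xx - 40) <= 10.0).astype(float)
plane = gaussian_filter(disk, sigma=1.5)
estimated_radius = radius_from_spokes(plane, center=(40.0, 40.0))
print(f"estimated radius: {estimated_radius:.2f} px")
```

Because each spoke is interpolated at sub-pixel steps and the 36 estimates are averaged, the result can land between pixel positions, which is the sense in which averaging allows estimates finer than the nominal pixel size.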
Reviewer #1 (Recommendations for the authors):
Comments to the authors' replies to the reviews:
- Supplementary Figure 13:
As indicated before, the 3D images + scale make it impossible to judge the quality of the outputs.
As aforementioned, 2D slices have been added to the Supplementary Figure 13.
- Supplementary Table 3:
There is a significant increase in the Hausdorff and Mean Surface Distance measures for the new data. Why?
A single vessel being missed by either the rater or the model would significantly affect the Hausdorff distance (HD) and by extension Mean Surface Distance: this is particularly pertinent in the LSFM image with its much larger FOV and thus a potential for much larger max distances to result from missed vessels in the prediction or ground truth data. Large Hausdorff distances may indicate a vessel was missed in either the ground truth or the segmentation mask.
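The sensitivity of the Hausdorff distance to a single missed vessel can be illustrated with a small sketch. The point sets here are synthetic and hypothetical (straight lines of voxels), and `hausdorff` is a helper defined for this example on top of SciPy's `directed_hausdorff`; it is not taken from the authors' code.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(a, b)[0], directed_hausdorff(b, a)[0])

z = np.arange(0.0, 100.0)
# One straight "vessel" along z, and a prediction shifted by one voxel in x
vessel = np.column_stack([np.zeros_like(z), np.zeros_like(z), z])
prediction = vessel + np.array([1.0, 0.0, 0.0])
# A second vessel, 200 voxels away, present in the ground truth only
missed = np.column_stack([np.full_like(z, 200.0), np.zeros_like(z), z])

hd_boundary_only = hausdorff(vessel, prediction)
hd_with_missed = hausdorff(np.vstack([vessel, missed]), prediction)
print(hd_boundary_only, hd_with_missed)
```

A one-voxel boundary disagreement yields a distance of one voxel, whereas a single vessel absent from one mask sets the distance to the gap between that vessel and the nearest segmented structure, which grows with the field of view.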
Of note, a different rater annotated these additional datasets from the raters labeling the ground truth data. There is a high variability in boundary placements between raters. On a test where three raters segmented the same three images from the original dataset, we computed an ICC of 0.73 across their segmentations. Our model's Dice scores on predictions in out-of-distribution data sets were on par with the inter-rater ICC on the Thy1ChR2 2PFM data.
- Supplementary Figure 2: The authors provide useful data on the time responses. However, looking at those figures, it is puzzling why certain vessels were selected as responding as there seems almost no change after stimulation. In addition, some of the responses seem to actually start several tens of seconds before the actual stimulus (particularly in A).
Only some traces in C and D (dark blue) seem to be actually responding vessels.
This is not discussed and unclear.
Supplementary Figure 2 displays the time courses of vessel calibre for all vessels in the FOV, not just those deemed responders.
The aforementioned effects are due to the loess smoothing filter having been applied to the time courses for the preliminary response, which has been rectified in the updated figures. In particular, Supplementary Figure 2 has been updated with separate loess smoothing before and after photostimulation. The (pre-stimulation) effect is gone once the loess smoothing has been separated.
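The effect described here, a smoother spreading a post-stimulus change into the pre-stimulus period, can be reproduced in a toy setting. The sketch below is an assumption-laden illustration: `loess` is a bare-bones local-linear smoother with tricube weights written for this example (no robustness iterations, not the authors' filter or its settings), and the step-like time course is synthetic.

```python
import numpy as np

def loess(t, y, frac=0.5):
    """Minimal local-linear smoother with tricube weights (illustrative only)."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    k = max(4, int(np.ceil(frac * len(t))))  # number of neighbours per local fit
    smoothed = np.empty_like(y)
    for i, t0 in enumerate(t):
        d = np.abs(t - t0)
        h = np.sort(d)[k - 1]                # bandwidth: distance to k-th neighbour
        w = np.clip(1.0 - (d / h) ** 3, 0.0, None) ** 3
        # np.polyfit weights multiply the residuals, hence the square root
        slope, intercept = np.polyfit(t, y, 1, w=np.sqrt(w))
        smoothed[i] = slope * t0 + intercept
    return smoothed

t = np.arange(-60.0, 120.0, 3.0)   # 3 s sampling, stimulus at t = 0
y = np.where(t >= 0.0, 5.0, 0.0)   # hypothetical step-like dilation (%)
pre = t < 0.0

joint = loess(t, y)                # one smoother across the stimulus
split = np.concatenate([loess(t[pre], y[pre]), loess(t[~pre], y[~pre])])

leak_joint = np.max(np.abs(joint[pre]))  # apparent pre-stimulus change
leak_split = np.max(np.abs(split[pre]))
print(f"joint: {leak_joint:.2f}%  split: {leak_split:.2e}%")
```

Smoothing across the stimulus makes the trace appear to change before stimulation; smoothing the two periods separately removes that artefact, at the cost of a discontinuity at the stimulus time.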
- R Point 7: As indicated before and in agreement with the alternative reviewer, the quality of the results in 3d is difficult to judge. No 2d sections that compare 'ground truth' with inferred results are shown in the current manuscript which would enable a much better judgment. The provided video is still 3d and not a video going through 2d slices. Also, in the video the overlap of vasculature and raw data seems to be very good and near 100%, why is the dice measure reported earlier so low ? Is this a particularly good example ?
Some examples, indicating where the pipeline fails (and why) would be helpful to see, to judge its performance better (ideally in 2d slices).
As discussed in the public comments, the 2D slices are now included in Suppl. Fig. 4 and suppl. Fig 13 to facilitate visual assessment. The vessels are long and thin so that slight dilations or constrictions impact the Dice scores without being easily visualizable.
- Author response images 6 and 7. From the presented data the constrictions measured in the smaller vessels may be a result (at least partly) of noise. This seems to be particularly the case in Author response image 7 left top and bottom for example. It would be helpful to show the actual estimates of the vessel radii overlaid on the (raw) images. In some of the pictures the noise level seems to reach higher values than the 10-20% of noise used in the tests by the authors in the revision.
The vessel radii are estimated as averages across all vertices of the individual vessels: it is thus not possible to overlay them meaningfully in 2D slices: in Figure 2B, we do show a rendering of sample vessel-wise radii estimates.
- "We tested the centerline detection in Python, scipy (1.9.3) and Matlab. We found that the Matlab implementation performed better due to its inclusion of a branch length parameter for the identification of terminal branches, which greatly reduced the number of false branches; the Python implementation does not include this feature (in any version) and its output had many more such "hair" artifacts. Clearmap skeletonization uses an algorithm by Palagyi & Kuba(1999) to thin segmentation masks, which does not include hair removal. Vesselvio uses a parallelized version of the scipy implementation of Lee et al. (1994) algorithm which does not do hair removal based on a terminal branch length filter; instead, Vesselvio performs a threshold-based hair removal that is frequently overly aggressive (it removes true positive vessel branches), as highlighted by the authors."
This statement is wrong. The removal of small branches in skeletons is algorithmically independent of the skeletonization algorithm itself. The authors cite a reference concerned with the algorithm they are currently employing for the skeletonization. Careful assessment of that reference shows that this algorithm removes small length branches after skeletonization is performed. This feature is available in open-source packages as well, or could be easily implemented.
We appreciate that skeletonization is distinct from hair removal and have reworded this paragraph for clarity. We are currently working with SciPy developers to implement hair removal in their image processing pipeline so as to render our pipeline fully open-source.
The removal of hairs after skeletonization with length based thresholding leads to the possibility of removing parts of centerlines in the main part of vessels after branch points with hairs. The Matlab implementation does not do this and leaves the main branches intact.
This text has been updated to:
“Hair” segments shorter than 20 μm and terminal on one end were iteratively removed, starting with the shortest hairs and merging the longest hairs at junctions with 2 terminal branches with the main vessel branch to reduce false positive vascular branches and minimize the amount of centerlines removed. This iterative hair removal functionality of the skeletonization algorithm is currently unavailable in Python, but is available in Matlab [9].
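One possible reading of the iterative procedure in the updated text is sketched below. It is a hedged illustration rather than the Matlab implementation the authors use: the graph is a plain dictionary of edge lengths, the function name `prune_hairs` is hypothetical, and the sketch assumes no self-loops or parallel edges. It removes the shortest sub-threshold terminal branch first, then merges the two edges left at a junction so that a surviving longer hair becomes part of the main branch.

```python
def prune_hairs(edges, min_length=20.0):
    """Iteratively remove short terminal branches from {(u, v): length}."""
    edges = {frozenset(e): float(length) for e, length in edges.items()}

    def degree(node):
        return sum(node in e for e in edges)

    while True:
        # A hair is shorter than the threshold and terminal on exactly one end
        hairs = [(length, e) for e, length in edges.items()
                 if length < min_length and sum(degree(n) == 1 for n in e) == 1]
        if not hairs:
            return edges
        _, shortest = min(hairs, key=lambda h: h[0])
        junction = next(n for n in shortest if degree(n) > 1)
        del edges[shortest]
        # If the junction is now a pass-through node, merge its two edges
        incident = [e for e in edges if junction in e]
        if len(incident) == 2:
            (a,) = incident[0] - {junction}
            (b,) = incident[1] - {junction}
            merged = frozenset((a, b))
            if a != b and merged not in edges:
                edges[merged] = edges.pop(incident[0]) + edges.pop(incident[1])

# Main vessel A-J-B with a side hair at J and two hairs at its end B (lengths in um)
skeleton = {("A", "J"): 30.0, ("J", "H"): 4.0, ("J", "B"): 30.0,
            ("B", "C"): 5.0, ("B", "D"): 8.0}
pruned = prune_hairs(skeleton)
print(pruned)
```

In this example the side hair and the shorter of the two end hairs are removed, while the longer end hair is absorbed into the main branch, so no part of the main centerline is lost.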
- "On the reviewer's comment, we did try inputting normalized images into Ilastik, but this did not improve its results." This is surprising. Reasonable standard preprocessing (e.g. background removal, equalization, and vessel enhancement) would probably restore most of illastik's performance in the indicated panel.
While an improvement may be present in a particular set of images, the generalizability of such an improvement to other patches is often poor in our experience, as reflected by the aforementioned results and the widespread uptake of DL approaches to image segmentation. The in vivo datasets also contain artifacts, arising e.g. from bleeding into the FOV, to which ilastik is highly sensitive. This is an example of noise that is not easily removed by standard preprocessing.
- "Typical pre-processing/standard computer vision techniques with parameter tuning do not generalize on out-of-distribution data with different image characteristics, motivating the shift to DL-based approaches."
I disagree with this statement. DL approaches can typically generalize when trained with a sufficient amount of diverse data. However, DL approaches can also fail with new out-of-distribution data. In that situation they can only be 'rescued' via new, time-intensive data generation and retraining. Simple standard image pre-processing steps (e.g. to remove background or boost vessel structures) have well-defined parameters that can be easily adapted to new out-of-distribution data, as clear interpretations are available. The time to adapt those parameters is typically much smaller than that needed to retrain DL frameworks.
We find that the standard image processing approaches with parameter tuning work robustly only if fine-tuned on individual images; i.e., the fine-tuning does not generalize across datasets. This approach thus does not scale to high-throughput experiments or experiments yielding large images. While DL models may not generalize to out-of-distribution data, fine-tuning DL models with a small subset of labels generally produces models that are superior to parameter tuning and can be applied to entire studies. Moreover, DL fine-tuning is typically an efficient process because very limited labelling and training time are required.
- It is still unclear how the authors' pipeline performs compared with other (open source or commercially) available pipelines. As indicated before, comparing to ilastik, particularly when feeding non-preprocessed data, does not seem to be a particularly high bar.
This question has also been raised by the other reviewer who asked to compare to commercially available pipelines.
This question was not answered by the authors, and instead the authors reply by claiming to provide an open source pipeline. In fact, the use of Matlab in their pipeline does not make it fully open-source either. Moreover, as mentioned before, open-source pipelines for comparisons do exist.
As discussed above, the manuscript now includes comparisons to Imaris for segmentation and VesselVio for graph extraction. The pipeline is on GitHub.
-"We agree with the review that this question is interesting; however, it is not addressable using present data: activated neuronal firing will have effects on their postsynaptic neighbors, yet we have no means of measuring the spread of activation using the current experimental model."
Distances to the closest neuron in the manuscript are measured without checking if it's active. Thus, distances to the first set of n neurons could be measured in the same way, ignoring activation effects.
Shorter distances to an entire ensemble of neurons would still be (more) informative of metabolic demands.
This could indeed be done within the existing framework. The connected-components-3d can be used to extract individual occurrences of neurons in the FOV from the neuron segmentation mask. Each neuron could then have its distance calculated to each point on the vessel centerlines.
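A minimal sketch of the distance calculation outlined above follows. It makes two substitutions that should be read as assumptions: `scipy.ndimage.label` stands in for the connected-components-3d package named in the response, and each neuron is reduced to its centroid. The function name and `n_neighbors` parameter are hypothetical, distances are in voxel units (anisotropic voxel spacing is ignored), and the mask is assumed to contain at least one neuron.

```python
import numpy as np
from scipy import ndimage
from scipy.spatial import cKDTree

def mean_distance_to_nearest_neurons(neuron_mask, centerline_points, n_neighbors=3):
    """Mean distance from each centerline point to its nearest neuron centroids."""
    mask = np.asarray(neuron_mask, dtype=float)
    # Split the binary mask into individual neurons (connected components)
    labels, n_neurons = ndimage.label(mask)
    centroids = np.array(
        ndimage.center_of_mass(mask, labels, np.arange(1, n_neurons + 1))
    )
    k = min(n_neighbors, n_neurons)
    points = np.asarray(centerline_points, dtype=float)
    distances, _ = cKDTree(centroids).query(points, k=k)
    # query returns a 1-D array when k == 1; normalise to (n_points, k)
    distances = np.asarray(distances, dtype=float).reshape(len(points), k)
    return distances.mean(axis=1)

# Three single-voxel "neurons" and one centerline vertex (synthetic example)
neuron_mask = np.zeros((20, 20, 20), dtype=bool)
for voxel in [(5, 5, 5), (5, 5, 15), (15, 5, 5)]:
    neuron_mask[voxel] = True
centerline = np.array([[5.0, 5.0, 10.0]])
ensemble_distance = mean_distance_to_nearest_neurons(neuron_mask, centerline, n_neighbors=2)
print(ensemble_distance)
```

Averaging over the `n_neighbors` closest neurons, rather than taking only the single closest one, is one way to express the reviewer's suggestion of a distance to an ensemble of neurons.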
- model architecture:
It is unclear from the description if any positional encoding was used for the image patches.
It is unclear whether the architecture / pipeline can handle arbitrary volume sizes or is trained on fixed volume shapes. In the latter case, how is the pipeline applied?
The model includes positional encoding, as described in Hatamizadeh et al. 2021.
The model can be applied to images of any size, as demonstrated on larger images in Supplementary Figure 9 and on smaller images in Supplementary Figure 2. The pipeline is applied in the same way. It will read in the size of an input image and output an image of the same size.
- Transformer models often show better results when using a learning rate scheduler that adjusts the learning rate (typically with up and down ramps). Did the authors test such approaches?
We did not use a learning rate scheduler, as we found we were getting good results without using one.
- formula (4): The 95% percentile of two numbers is the max, and thus (5) is certainly not what the HD95 metric is. The formula is simply wrong as displayed.
Thank you. The formula has been updated.
- formula (5): formula 5 is certainly wrong: n_X, n_y are either integer numbers as indicated by the sum indices or sets when used in the distances, but can't be both at the same time.
Thank you for your comment. The Formula has been updated.
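For reference, commonly used definitions of the two metrics discussed in these comments are given below. They are stated here as an assumption about what a corrected formulation might look like; the updated manuscript formulas are not reproduced in this response and may differ (for example, some implementations take the percentile over the pooled distances instead of per direction). Here $X$ and $Y$ are the sets of surface points of the two segmentations, $n_X = |X|$, $n_Y = |Y|$, and $P_{95}$ denotes the 95th percentile over a set of distances.

```latex
\begin{align}
d(x, Y) &= \min_{y \in Y} \lVert x - y \rVert \\
\mathrm{HD}_{95}(X, Y) &= \max\!\left( P_{95}\{\, d(x, Y) : x \in X \,\},\;
                                       P_{95}\{\, d(y, X) : y \in Y \,\} \right) \\
\mathrm{MSD}(X, Y) &= \frac{1}{n_X + n_Y}
    \left( \sum_{x \in X} d(x, Y) + \sum_{y \in Y} d(y, X) \right)
\end{align}
```

Written this way, the percentile is taken over all point-to-surface distances in each direction instead of over two numbers, and $n_X$, $n_Y$ are used only as counts while $X$, $Y$ denote the sets.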
- The statement:
"this functionality of the skeletonization algorithm is currently unavailable in any python implementation, but is available in Matlab [56]."
is not correct (see reply above)
Please see the response above. This text has been updated to:
“Hair” segments shorter than 20 μm and terminal on one end were iteratively removed, starting with the shortest hairs and merging the longest hairs at junctions with 2 terminal branches with the main vessel branch to reduce false positive vascular branches and minimize the amount of centerlines removed. This iterative hair removal functionality of the skeletonization algorithm is currently unavailable in Python, but is available in Matlab [9].
- the centerline extraction is performed after taking the union of smoothed masks. The union operation can induce novel 'irregular' boundaries that degrade skeletonization performance. I would expect to apply smoothing after the union?
Indeed the images were smoothed via dilation after taking the union, as described in the previous set of responses to private comments.
- "The radius estimate defined the size of the Gaussian kernel that was convolved with the image to smooth the vessel: smaller vessels were thus convolved with narrower kernels."
It is unclear which images were filtered.
We have updated this text for clarity:
The radius estimate defined the size of the Gaussian kernel that was convolved with the 2D image slice to smooth the vessel: smaller vessels were thus convolved with narrower kernels.
- Was deconvolution on the raw images applied or after Gaussian filtering ?
The deconvolution was applied before Gaussian filtering.
- ",we extracted image intensities in the orthogonal plane from the deconvolved raw registered image. A 2D Gaussian kernel with sigma equal to 80% of the estimated vessel-wise radius was used to low-pass filter the extracted orthogonal plane image and find the local signal intensity maximum searching, in 2D, from the center of the image to the radius of 10 pixels from the center."
Would it not be better to filter the 3d image before extracting a 2d plane and filter then ?
That could be done, but would incur a significant computational speed penalty. 2D convolutions are faster, and produced excellent accuracy when estimating radii in our bead experiment.
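The 2D filtering and local-maximum search quoted above can be sketched as follows. The sigma of 80% of the radius and the 10-pixel search radius come from the quoted text; everything else (the function name `refine_center`, the synthetic disk, treating the array midpoint as the image center) is an assumption made for illustration.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def refine_center(plane, radius_px, search_px=10):
    """Find the smoothed intensity maximum within `search_px` of the image center."""
    # Low-pass filter with sigma equal to 80% of the estimated vessel radius
    smoothed = gaussian_filter(plane, sigma=0.8 * radius_px)
    center = np.array(plane.shape) // 2
    rr, cc = np.indices(plane.shape)
    within = np.hypot(rr - center[0], cc - center[1]) <= search_px
    # Ignore everything outside the search disk when locating the maximum
    masked = np.where(within, smoothed, -np.inf)
    return np.unravel_index(np.argmax(masked), plane.shape)

# Synthetic vessel cross-section: disk of radius 4 px, offset from the image center
yy, xx = np.mgrid[0:41, 0:41]
plane = (np.hypot(yy - 23, xx - 18) <= 4.0).astype(float)
refined = refine_center(plane, radius_px=4.0)
print(refined)
```

Because the kernel width scales with the radius estimate, a small vessel is smoothed with a narrow kernel and a large vessel with a wide one, which matches the behaviour described in the updated text.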
What algorithm was used to obtain the 2D images?
The 2d images were obtained using scipy.ndimage.map_coordinates.
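A hedged sketch of how `scipy.ndimage.map_coordinates` can be used to resample a plane orthogonal to a vessel is shown below. Only the use of `map_coordinates` is taken from the response; the construction of the in-plane basis, the function name `orthogonal_plane`, and the `half_width` parameter are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def orthogonal_plane(volume, center, tangent, half_width=10):
    """Resample the plane through `center` that is orthogonal to `tangent`."""
    tangent = np.array(tangent, dtype=float)
    tangent /= np.linalg.norm(tangent)
    # Pick a helper axis that is not parallel to the tangent, then build
    # two orthonormal in-plane vectors with cross products
    helper = np.eye(3)[np.argmin(np.abs(tangent))]
    u = np.cross(tangent, helper)
    u /= np.linalg.norm(u)
    v = np.cross(tangent, u)
    offsets = np.arange(-half_width, half_width + 1, dtype=float)
    a, b = np.meshgrid(offsets, offsets, indexing="ij")
    center = np.asarray(center, dtype=float)
    coords = (center[:, None, None]
              + u[:, None, None] * a
              + v[:, None, None] * b)
    # Trilinear interpolation at the plane's sample positions
    return map_coordinates(volume, coords, order=1, mode="nearest")

# Synthetic cylinder of radius 4 voxels running along the first axis
zz, yy, xx = np.mgrid[0:32, 0:32, 0:32]
volume = (np.hypot(yy - 16, xx - 16) <= 4.0).astype(float)
plane = orthogonal_plane(volume, center=(16, 16, 16), tangent=(1, 0, 0))
print(plane.shape, plane[10, 10], plane[0, 0])
```

The returned array is a 2D image centred on the vertex, which is the kind of input the 2D smoothing and spoke-based boundary steps operate on.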
- Figure 2: H is this the filtered image or the raw data ?
Panel H is raw data.
- It would be good to see a few examples of the raw data overlaid with the radial estimates to evaluate the approach (beyond the example in K).
Additional examples are shown in Figure 5.
- Figure 2 K: Why are boundary points greater than 2 standard deviations away from the mean excluded ?
They are excluded to account for irregularities as vessels approach junctions [10], [11].
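The exclusion rule can be written in a few lines. This is a minimal sketch assuming the rule is applied to the per-spoke boundary distances of one orthogonal plane; the function name `robust_mean_radius` and the synthetic values are hypothetical.

```python
import numpy as np

def robust_mean_radius(spoke_radii, n_sd=2.0):
    """Average the boundary distances after dropping points beyond n_sd SDs."""
    r = np.asarray(spoke_radii, dtype=float)
    keep = np.abs(r - r.mean()) <= n_sd * r.std()
    return float(r[keep].mean())

# 35 spokes hit the vessel wall at 5 px; one spoke runs into a joining branch
spoke_radii = [5.0] * 35 + [15.0]
naive = float(np.mean(spoke_radii))
robust = robust_mean_radius(spoke_radii)
print(f"naive: {naive:.2f} px  robust: {robust:.2f} px")
```

A single spoke that escapes into an adjoining branch near a junction inflates the plain mean, while the filtered mean stays at the wall distance seen by the remaining spokes.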
- Figure 2 L: what exactly is plotted here ? What are vertex wise changes, is that the difference between the minimum and maximum of all the detected radii for a single vertex? Why do some vessels (red) show high values consistently throughout the vessel ?
Figure 2L displays change in the radius of vertices - in this FOV- following photostimulation in relation to baseline.
- Assortativity: to calculate the assortativity, are radius changes binned in any form to account for the fact that otherwise, $e_{xy}$ and related measures will be likely be based on single data points?
Assortativity is not calculated from single data points. It can be calculated by either binning into categories or computing it on scalars i.e. average radius across a vessel segment:
See here for info on calculating assortativity from binned categories (ie classifying a vessel as a constrictor, dilator or non-responder):
And see here for calculating assortativity from scalar values:
We calculated the assortativity using scalar values.
In both cases, one uses all nodes and calculates the correlation between each node and its neighbours with an attribute that can be binned or is a scalar. Binning the value on a given node would not affect the number of nodes in a graph.
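The scalar form of assortativity described here amounts to a Pearson correlation, taken over edges, between the attribute values at the two ends of each edge. The sketch below illustrates that idea with NumPy only; the function name, the toy graph, and the attribute values (hypothetical radius changes) are assumptions, and it is not the authors' implementation.

```python
import numpy as np

def scalar_assortativity(edges, values):
    """Pearson correlation of a scalar node attribute across undirected edges."""
    # Count each undirected edge in both directions so the measure is symmetric
    src = [values[u] for u, v in edges] + [values[v] for u, v in edges]
    dst = [values[v] for u, v in edges] + [values[u] for u, v in edges]
    return float(np.corrcoef(np.array(src, float), np.array(dst, float))[0, 1])

# Toy chain of four vessels with hypothetical radius changes (um)
edges = [("a", "b"), ("b", "c"), ("c", "d")]
radius_change = {"a": 2.0, "b": 1.5, "c": -1.0, "d": -1.5}
r = scalar_assortativity(edges, radius_change)
print(f"assortativity: {r:.3f}")
```

Every edge contributes one pair of values, so the coefficient is based on as many data points as there are edges regardless of whether the attribute is binned; a positive value indicates that neighbouring vessels tend to change in the same direction.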
- "Ilastik tended to over-segment vessels, i.e. the model returned numerous false positives, having a high recall (0.89 ± 0.19) but low precision (0.37 ± 0.33) (Figure 3, Supplementary Table 3)."
As indicated before, and looking at Figure 4, over segmentation seems due to too high background. A suggested preprocessing step on the raw images to remove background could have avoided this.
The images were normalized in preprocessing.
- Figure 4: The 3d panels are not much easier to read in the revised version. As suggested by other reviewers, 2d sections indicating the differences and errors would be much more helpful to judge the pipelines quality more appropriately.
As discussed above, 2D sections are now available in a supplementary figure.
- Figure 3: What would be the Dice score (and other measures) between two ground truths extracted from annotations by two humans (assisted e.g. by ilastik)?
Two additional human raters annotated the images. We observed an ICC of 0.73 across a total of three raters on the three images.
- Figure 5: The authors only provide the absolute value of SU for the sigma noise levels. This only has some meaning when compared to the mean or median SU of the images. In the text the maximal intensity of 1023 SU is mentioned, but what are those values in images with weaker / smaller vessels (as provided in the constriction examples in the revision)?
I am unclear why this validation figure should be part of the main manuscript while generalization performance is left out.
The manuscript has been updated with the mean SNR value of 5.05 ± 0.15 to provide context for the quality of our images.
Bibliography
(1) J. R. Bumgarner and R. J. Nelson, “Open-source analysis and visualization of segmented vasculature datasets with VesselVio,” Cell Rep. Methods, vol. 2, no. 4, Apr. 2022, doi: 10.1016/j.crmeth.2022.100189.
(2) G. Tetteh et al., “DeepVesselNet: Vessel Segmentation, Centerline Prediction, and Bifurcation Detection in 3-D Angiographic Volumes,” Front. Neurosci., vol. 14, Dec. 2020, doi: 10.3389/fnins.2020.592352.
(3) N. Holroyd, Z. Li, C. Walsh, E. Brown, R. Shipley, and S. Walker-Samuel, “tUbe net: a generalisable deep learning tool for 3D vessel segmentation,” Jul. 24, 2023, bioRxiv. doi: 10.1101/2023.07.24.550334.
(4) W. Tahir et al., “Anatomical Modeling of Brain Vasculature in Two-Photon Microscopy by Generalizable Deep Learning,” BME Front., vol. 2020, p. 8620932, Dec. 2020, doi: 10.34133/2020/8620932.
(5) R. Damseh, P. Delafontaine-Martel, P. Pouliot, F. Cheriet, and F. Lesage, “Laplacian Flow Dynamics on Geometric Graphs for Anatomical Modeling of Cerebrovascular Networks,” ArXiv191210003 Cs Eess Q-Bio, Dec. 2019, Accessed: Dec. 09, 2020. [Online]. Available: http://arxiv.org/abs/1912.10003
(6) T. Jerman, F. Pernuš, B. Likar, and Ž. Špiclin, “Enhancement of Vascular Structures in 3D and 2D Angiographic Images,” IEEE Trans. Med. Imaging, vol. 35, no. 9, pp. 2107–2118, Sep. 2016, doi: 10.1109/TMI.2016.2550102.
(7) T. B. Smith and N. Smith, “Agreement and reliability statistics for shapes,” PLOS ONE, vol. 13, no. 8, p. e0202087, Aug. 2018, doi: 10.1371/journal.pone.0202087.
(8) J. R. Mester et al., “In vivo neurovascular response to focused photoactivation of Channelrhodopsin-2,” NeuroImage, vol. 192, pp. 135–144, May 2019, doi: 10.1016/j.neuroimage.2019.01.036.
(9) T. C. Lee, R. L. Kashyap, and C. N. Chu, “Building Skeleton Models via 3-D Medial Surface Axis Thinning Algorithms,” CVGIP Graph. Models Image Process., vol. 56, no. 6, pp. 462–478, Nov. 1994, doi: 10.1006/cgip.1994.1042.
(10) M. Y. Rennie et al., “Vessel tortuousity and reduced vascularization in the fetoplacental arterial tree after maternal exposure to polycyclic aromatic hydrocarbons,” Am. J. Physiol.-Heart Circ. Physiol., vol. 300, no. 2, pp. H675–H684, Feb. 2011, doi: 10.1152/ajpheart.00510.2010.
(11) J. Steinman, M. M. Koletar, B. Stefanovic, and J. G. Sled, “3D morphological analysis of the mouse cerebral vasculature: Comparison of in vivo and ex vivo methods,” PLOS ONE, vol. 12, no. 10, p. e0186676, Oct. 2017, doi: 10.1371/journal.pone.0186676.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Reviewer 1:
The authors explain that an action potential that reaches an axon terminal emits a small electrical field as it "annihilates". This happens at chemical synapses, even though there is no gap junction. The generated electrical field is simulated to show that it can affect a nearby, disconnected target membrane by tens of microvolts for tenths of a microsecond. Longer effects are simulated for target locations a few microns away.
To simulate action potentials (APs), the paper does not use the standard Hodgkin-Huxley formalism because it fails to explain AP collision. Instead it uses the Tasaki and Matsumoto (TM) model, which is simplified to model APs with only three parameters, as a membrane transition between two states, resting and excited. The authors expand the strictly binary, discrete TM method to a Relaxing Tasaki Model (RTM) that models the relaxation of the membrane potential after an AP. They find that the membrane leak can be neglected in determining AP propagation and that the capacitive currents dominate the process.
The strength of the work is that the authors identified an important interaction between neurons that is neglected by the standard models. A weakness of the proposed approach is the assumptions that it makes. For instance, the external medium is modeled as a homogeneous conductive medium, which may be further explored to properly account for biological processes. To the authors’ credit, the external medium can vary widely and could reasonably be left out of the general model, to be modeled only in specific instances.
The authors provide convincing evidence by performing experiments to record action potential propagation and collision properties and then developing a theoretical framework to simulate the effect of their annihilation on nearby membranes. They provide both experimental evidence and rigorous mathematical and computer simulation findings to support their claims. The work has the potential to explain significant electrical interaction between nerve centers that are connected via a large number of parallel fibers.
Comments on revisions:
The authors responded to all of my previous concerns and significantly improved the manuscript.
We thank the reviewer for his comments and are pleased that we were able to adequately address all of his previous concerns. As a small comment to the remark of the reviewer “potential of explaining ... interaction ... via a large number of parallel fibers” we would like to add: The ephaptic coupling is prominent when APs annihilate at axon terminals, as we illustrate in Figure 4 and 5. Across parallel fibers, the impact of propagating APs is much lower but still may result in synchronization of APs.
Reviewer 2:
In this study, the authors measured extracellular electrical features of colliding APs travelling in different directions down an isolated earthworm axon. They then used these features to build a model of the potential ephaptic effects of AP annihilation, i.e. the electrical signals produced by colliding/annihilating APs that may influence neighbouring tissue. The model was then applied to some different hypothetical scenarios involving synaptic connections. In a revised version of the manuscript, it was also applied, with success, to published experimental data on the cerebellar basket cell-to-Purkinje cell pinceau connection. The conclusion is that an annihilating AP at a presynaptic terminal can ephaptically influence the voltage of a postsynaptic cell (this is, presumably, the ’electrical coupling between neurons’ of the title), and that the nature of this influence depends on the physical configuration of the synapse.
As an experimental neuroscientist who has never used computational approaches, I am unable to comment on the rigour of the analytical approaches that form the bulk of this paper. The experimental approaches appear very well carried out, and the data showing equal conduction velocity of anti- and orthodromically propagating APs in every preparation is now convincing.
The conclusions drawn from the synaptic modelling have been considerably strengthened by the new Figure 5. Here, the authors’ model - including AP annihilation at a synaptic terminal - is used to predict the amplitude and direction of experimentally observed effects at the cerebellar basket cell-to-Purkinje cell synapse (Blot & Barbour 2014). One particular form of the model (RTM with tau=0.5ms and realistic non-excitability of the terminal) matches the experimental data extremely well. This is a much more convincing demonstration that the authors’ model of ephaptic effects can quantitatively explain key features of experimental data pertaining to synaptic function. As such, the implications for the relevance of ephaptic coupling at different synaptic contacts may be widespread and important.
However, it appears that all of the models in the new Fig5 involve annihilating APs, yet only one fits the data closely. A key question, which should be addressed if at all possible, is what happens to the predictive power of the best-fitting model in Fig5 if the annihilation, and only the annihilation, is removed? In other words, can the authors show that it is specifically the ephaptic effects of AP annihilation, rather than other ephaptic effects of, say AP waveform/amplitude/propagation, that explain the synaptic effects measured in Blot & Barbour (2014)? This would appear to be a necessary demonstration to fully support the claims of the title.
Reviewer 2 (Recommendations for the authors):
Can you clarify whether all models shown in Fig5 involve an annihilating AP? Is it possible to plot the predicted effects of the most successful model (RTM 0.5ms in B) with *only* the annihilation selectively removed?
We are grateful for the reviewer’s comments and the specific suggestion for improvement (’...can the authors show that it is specifically the ephaptic effects of AP annihilation, rather than other ephaptic effects...’). For illustrating the importance of annihilation, we added the results of our calculation when no annihilation occurs, i.e. for propagating APs in the source neuron (Figure 5A) and we modified the geometry of the source neuron in Figure 5B such that only the annihilation takes place. Together with the source neuron with similar properties to the Basket cell (Figure 5C), we now show the effect of annihilation and the effect of Basket cell specific geometry and physiology. We added and edited in the main text the following 4 sentences:
ll 271: In our two models (TM and RTM), the modulation of not terminating but propagating APs along the source axon on the AP rate of the target cell is minute (Figure 5A). Note that this geometry does not correspond to the Purkinje cell-Basket cell connectivity. For annihilating APs at the axon terminal, with excitable segments up to the very end, our models reveal a moderate modulation, and only about half of what was reported for the Purkinje cell by Blot and Barbour (2014). This illustrates the importance of AP annihilation for ephaptic coupling (Figure 5B). We added and edited the figure legend:
Figure 5. ... (A) excluding the annihilation of an AP at the source neuron, i.e. a propagating AP, causes only minute modulation of the predicted AP rate in the target neuron. Note that this example does not represent the Basket cell terminal with annihilating APs. (B) annihilation of an AP at the terminal of the source neuron, with all segments being excitable in our calculation, causes moderate modulation. (C) source neuron with similar properties to the Basket cell, i.e. a bouton and last segments non-excitable (corresponding to 15 µm with no switch from resting state to excited state), causes inhibition and rebound very similar to that described by Blot and Barbour (2014).
In the discussion, we extended one sentence to refer to Figure 5:
ll 346: This may cause synchronization of APs and our proposed model also can be used to study the observed phenomena of synchronization due to ephaptic coupling, even in the case of zero discharge (see Figure 4A, and local impact on the target, integrated on timescales >1 ms in Figure 5).
Author response:
The following is the authors’ response to the previous reviews.
We sincerely appreciate the time and effort you and the reviewers have invested in evaluating our work.
We are grateful for the constructive criticism of the reviewers. Building on their feedback, we have made additions to the reviewed preprint. Specifically, we have added information to the supplementary materials to give additional context on the impact of the fixed experimental design on infants’ looking behavior. Further, we have adapted the text throughout the manuscript to incorporate a thorough discussion of the impact of the experimental design.
We believe that these revisions and the inclusion of supplementary analyses provide a clearer understanding of our findings.
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
The authors observed a decline in autophagy and proteasome activity in the context of Milton knockdown. Through proteomic analysis, they identified an increase in the protein levels of eIF2β, subsequently pinpointing a novel interaction within eIF subunits where eIF2β contributes to the reduction of eIF2α phosphorylation levels. Furthermore, they demonstrated that overexpression of eIF2β suppresses autophagy and leads to diminished motor function. It was also shown that in a heterozygous mutant background of eIF2β, Milton knockdown could be rescued. This work represents a novel and significant contribution to the field, revealing for the first time that the loss of mitochondria from axons can lead to impaired autophagy function via eIF2β, potentially influencing the acceleration of aging. To further support the authors' claims, several improvements are necessary, particularly in the methods of quantification and the points that should be demonstrated quantitatively. It is crucial to investigate the correlation between aging and the proteins eIF2β and eIF2α.
Thank you so much for your review and comments. We included analyses of protein levels of eIF2α, eIF2β, and eIF2γ at 7 days and 21 days (Figure 4D). The manuscript was revised as below;
Lines 242-245 ‘As for the other subunits of eIF2 complex, proteome analysis did not detect a significant difference in the protein levels of eIF2α and eIF2γ between milton knockdown and control flies at 7 and 21 days (Figure 4D).’
Reviewer #2 (Public Review):
In the manuscript, the authors aimed to elucidate the molecular mechanism that explains neurodegeneration caused by the depletion of axonal mitochondria. In Drosophila, starting with siRNA depletion of Milton and Miro, the authors attempted to demonstrate that the depletion of axonal mitochondria induces the defect in autophagy. From proteome analyses, the authors hypothesized that autophagy is impacted by the abundance of eIF2β and the phosphorylation of eIF2α. The authors followed up the proteome analyses by testing the effects of eIF2β overexpression and depletion on autophagy. With the results from those experiments, the authors proposed a novel role of eIF2β in proteostasis that underlies neurodegeneration derived from the depletion of axonal mitochondria.
The manuscript has several weaknesses. The reader should take extra care while reading this manuscript and when acknowledging the findings and the model in this manuscript.
The defect in autophagy by the depletion of axonal mitochondria is one of the main claims in the paper. The authors should work more on describing their results of LC3-II/LC3-I ratio, as there are multiple ways to interpret the LC3 blotting for the autophagy assessment. Lysosomal defects result in the accumulation of LC3-II thus the LC3-II/LC3-I ratio gets higher. On the other hand, the defect in the early steps of autophagosome formation could result in a lower LC3-II/LC3-I ratio. From the results of the actual blotting, the LC3-I abundance is the source of the major difference for all conditions (Milton RNAi and eIF2β overexpression and depletion). In the text, the authors simply state the observation of their LC3 blotting. The manuscript lacks an explanation of how to evaluate the LC3-II/LC3-I ratio. Also, the manuscript lacks an elaboration on what the results of the LC3 blotting indicate about the state of autophagy by the depletion of axonal mitochondria.
Thank you for pointing it out, and we apologize for an insufficient description of the result. We included quantitation of the levels of LC3-I and LC3-II in Figure 2A, 2D, 3D, 6B and 7B. As the reviewer pointed out, changes in the LC3-II/LC3-I ratio do not necessarily indicate autophagy defects. However, since p62 accumulation was also observed (Figure 2B, 2E, 3E, 6C, 7C in the original manuscript), these results collectively suggest that autophagy is lowered. We revised the manuscript to include this discussion as below:
Lines 174-186 ‘During autophagy progression, LC3 is conjugated with phosphatidylethanolamine to form LC3-II, which localizes to isolation membranes and autophagosomes. LC3-I accumulation occurs when autophagosome formation is impaired, and LC3-II accumulation is associated with lysosomal defects(31,32). p62 is an autophagy substrate, and its accumulation suggests autophagic defects(31,32). We found that milton knockdown increased LC3-I, and the LC3-II/LC3-I ratio was lower in milton knockdown flies than in control flies at 14 days old (Figure 2A). We also analyzed p62 levels in head lysates sequentially extracted using detergents with different stringencies (1% Triton X-100 and 2% SDS). Western blotting revealed that p62 levels were increased in the brains of 14-day-old milton knockdown flies (Figure 2B). The increase in the p62 level was significant in the Triton X-100-soluble fraction but not in the SDS-soluble fraction (Figure 2B), suggesting that depletion of axonal mitochondria impairs the degradation of less-aggregated proteins.’
Line 189-190 : ‘At 30 day-old, LC3-I was still higher, and the LC3-II/LC3-I ratio was lower, in milton knockdown compared to the control (Figure 2D).’
Line 199-201: ‘However, in contrast with milton knockdown, Pfk knockdown did not affect the levels of LC3-I, LC3-II or the LC3-II/LC3-I ratio (Figure 3D).’
Line 275-281: ‘Neuronal overexpression of eIF2β increased LC3-II, while the LC3-II/LC3-I ratio was not significantly different (Figure 6A and B). Overexpression of eIF2β significantly increased the p62 level in the Triton X-100-soluble fraction (Figure 6C, 4-fold vs. control, p < 0.005 (1% Triton X-100)) but not in the SDS-soluble fraction (Figure 6C, 2-fold vs. control, p = 0.062 (2% SDS)), as observed in brains of milton knockdown flies (Figure 2B). These data suggest that neuronal overexpression of eIF2β accumulates autophagic substrates.’
Line 307-315: ‘Neuronal knockdown of milton causes accumulation of autophagic substrate p62 in the Triton X-100-soluble fraction (Figure 2B), and we tested if lowering eIF2β ameliorates it. We found that eIF2β heterozygosity caused a mild increase in LC3-I levels and decreases in LC3-II levels, resulting in a significantly lower LC3-II/LC3-I ratio in milton knockdown flies (Figure 7B). eIF2β heterozygosity decreased the p62 level in the Triton X-100-soluble fraction in the brains of milton knockdown flies (Figure 7C). The p62 level in the SDS-soluble fraction, which is not sensitive to milton knockdown (Figure 2B), was not affected (Figure 7C). These results suggest that suppression of eIF2β ameliorates the impairment of autophagy caused by milton knockdown.’
Another main point of the paper is that the up-regulation of eIF2β by depleting the axonal mitochondria leads to the proteostasis crisis. This claim is formed by the findings from the proteome analyses. The authors should have presented their proteomic data with a much more thorough presentation and explanation. As in the experiment scheme shown in Figure 4A, the authors did two proteome analyses: one from the 7-day-old sample and the other from the 21-day-old sample. The manuscript only shows a plot of the result from the 7-day-old sample, but not that of the result from the 21-day-old sample. For the 21-day-old sample, the authors only provided data in the supplemental table, in which the abundance ratio of eIF2β from the 21-day-old sample is 0.753, meaning eIF2β is depleted in the 21-day-old sample. The authors should have explained the impact of the eIF2β depletion in the 21-day-old sample, so the reader could fully understand the authors' interpretation of the role of eIF2β on proteostasis.
Thank you for pointing it out. We included plots of the results of 21-day-old proteome as a part of the main figure (Figure 4C). As the reviewer pointed out, eIF2β protein levels are reduced at the 21-day-old. Since a reduction in the eIF2β ameliorated milton knockdown-induced locomotor defects in aged flies (Figure 7D), the reduction in eIF2β observed in the 21-day-old milton knockdown flies is not likely to negatively contribute to milton knockdown-induced defects. We included this discussion in the manuscript as below:
Lines 337-341:‘eIF2β protein levels are reduced at the 21-day-old; however, since a reduction in the eIF2β ameliorated milton knockdown-induced locomotor defects in aged flies (Figure 7), the reduction in eIF2β observed in the 21-day-old is not likely to negatively contribute to milton knockdown-induced defects.’
The manuscript consists of several weaknesses in its data and explanation regarding translation.
(1) The authors are likely misunderstanding the effect of phosphorylation of eIF2α on translation. The P-eIF2α is inhibitory for translation initiation. However, the authors seem to be mistaken that the down-regulation of P-eIF2α inhibits translation.
We are sorry for our insufficient explanation in the previous version. As the reviewer pointed out, it is well known that the phosphorylated form of eIF2α inhibits translation initiation. Neuronal knockdown of milton caused a reduction in p-eIF2α (Figure 4J and K), and it also lowered translation (Figure 5); the relationship between these two events is currently unclear. We do not think that a reduction in the p-eIF2α suppressed translation; rather, we propose that the imbalance in expression levels of the components of eIF2 complexes negatively affects translation. We revised the discussion section to describe our interpretation in more detail as below:
Line 368-378: ‘eIF2β is a component of eIF2, which mediates translational regulation and ISR initiation. When ISR is activated, phosphorylated eIF2α suppresses global translation and induces translation of ATF4, which mediates transcription of autophagy-related genes(39,40). Since ISR can positively regulate autophagy, we suspected that suppression of ISR underlies a reduction in autophagic protein degradation. We found neuronal knockdown of milton reduced phosphorylated eIF2α, suggesting that ISR is reduced (Figure 4). However, we also found that global translation was reduced (Figure 5). It may be possible that increased levels of eIF2β disrupt the eIF2 complex or alter its functions. The stoichiometric mismatch caused by an imbalance of eIF2 components may inhibit ISR induction. Supporting this model, we found that eIF2β upregulation reduced the levels of p-eIF2α (Figure 6).’
We have revised the graphical abstract and removed the eIF2 complex since its role in the loss of proteostasis caused by milton knockdown has not been elucidated yet.
(2) The result of polysome profiling in Figure 4H is implausible. By 10%-25% sucrose density gradient, polysomes are not expected to be observed. The authors should have used a gradient with much denser sucrose, such as 10-50%.
Thank you for pointing it out. The stated 10-25% gradient was a mistake; the gradient used was 10-50%. We apologize for the oversight. It was corrected (Figure 5).
(3) Also on the polysome profiling, as in the method section, the authors seemed to fractionate ultra-centrifuged samples from top to bottom and then measured A260 by a plate reader. In that case, the authors should have provided a line plot with individual data points, not the smoothly connected ones in the manuscript.
Thank you for pointing it out. We revised the graph (Figure 5).
(4) For both the results from polysome profiling and puromycin incorporation (Figure 4H and I), the difference between control siRNA and Milton siRNA are subtle, if not nonexistent. This might arise from the lack of spatial resolution in their experiment as the authors used head lysate for these data but the ratio of Phospho-eIF2α/eIF2α only changes in the axons, based on their results in Figure 4E-G. The authors could have attempted to capture the spatial resolution for the axonal translation to see the difference between control siRNA and Milton siRNA.
Thank you for your comment. We agree that it would be an interesting experiment, but it will take a considerable amount of time to analyze axonal translation with spatial resolution. We will try to include such analyses in the future. For this manuscript, we revised the discussion section to include the reviewer's suggestion as below;
Lines 351-353: ‘Further analyses to dissect the effects of milton knockdown on proteostasis and translation in the cell body and axon by experiments with spatial resolution would be needed.’
Recommendations for the authors:
From the Reviewing Editor:
As the Reviewing Editor, I have read your manuscript and the associated peer reviews. I have concerns about publishing this work in its current form. I think that your manuscript cannot claim to have found a novel function of eIF2beta because of technical uncertainties and conceptual problems that should be addressed.
Thank you so much for your review and comments. We addressed all the concerns raised by the reviewers. Point-by-point responses are listed below.
First, your manuscript is based partly on what appears to be a mistaken understanding of the mechanistic basis of the ISR. Specifically, eIF2 is a heterotrimeric complex of alpha, beta, and gamma subunits. When eIF2a is phosphorylated, the heterotrimer adopts a new conformation. This conformation directly binds and inhibits eIF2B, the decameric GEF that exchanges the GDP bound to the gamma subunit of the eIF2 complex for GTP. Unless I misunderstood your paper, you seem to propose that decreasing levels of phospho-eIF2a will inhibit translation, but this is backward from what we know about the ISR.
Thank you for your insightful comment, and we are sorry for the confusion. We did not mean to propose that decreasing levels of phospho-eIF2a inhibit translation. We apologize for our insufficient explanation, which might have caused a misunderstanding (Lines 312-318 in the original version). We agree with the reviewer that ‘mismatch due to elevated eIF2-beta could change the behavior of the ISR’. We revised the text in the result section as follows:
Lines 259-264 (in the Result section) ‘Phosphorylation of eIF2α induces conformational changes in the eIF2 complex and inhibits global translation(36). To analyze the effects of milton knockdown on translation, we performed polysome gradient centrifugation to examine the level of ribosome binding to mRNA. Since p-eIF2α was downregulated, we hypothesized that milton knockdown would enhance translation. However, unexpectedly, we found that milton knockdown significantly reduced the level of mRNAs associated with polysomes (Figure 5A and B).’
Lines 368-378 (in the Discussion section): ‘eIF2β is a component of eIF2, which mediates translational regulation and ISR initiation. When ISR is activated, phosphorylated eIF2α suppresses global translation and induces translation of ATF4, which mediates transcription of autophagy-related genes(39,40). Since ISR can positively regulate autophagy, we suspected that suppression of ISR underlies a reduction in autophagic protein degradation. We found neuronal knockdown of milton reduced phosphorylated eIF2α, suggesting that ISR is reduced (Figure 4). However, we also found that global translation was reduced (Figure 5). It may be possible that increased levels of eIF2β disrupt the eIF2 complex or alter its functions. The stoichiometric mismatch caused by an imbalance of eIF2 components may inhibit ISR induction. Supporting this model, we found that eIF2β upregulation reduced the levels of p-eIF2α (Figure 6).’
It may be possible that a stoichiometric mismatch due to elevated eIF2-beta could change the behavior of the ISR, but your paper doesn't adequately address the expression levels of all three eIF2 subunits: alpha, beta, and gamma. The proteomic data shown in Fig 4B is unconvincing on its own because the changes in the beta subunit are subtle. The Western blot in Figure 4C suggests that the KD changes the mass or mobility of the beta subunit, and most importantly, there are no Western blots measuring the levels of eIF2a, eIF2a-phospho, or eIF2-gamma.
We appreciate the reviewer’s comment and agree that the stoichiometric mismatch due to elevated eIF2β may interfere with ISR. We found overexpression of eIF2β lowered p-eIF2 alpha (Figure S2 in V1), which supports this model. We included this data in the main figure in the revised manuscript (Figure 6D) and revised the text as below:
Lines 279-281: ‘Since milton knockdown reduced the p-eIF2α level (Figure 4K), we asked whether an increase in eIF2β affects p-eIF2α. Neuronal overexpression of eIF2β did not affect the eIF2α level but significantly decreased the p-eIF2α level (Figure 6D, E).’
Expression data for eIF2α and eIF2γ have been extracted from the proteome analyses and included as a table (Figure 4D). Western blots of phospho-eIF2a (Figure S1 in V1) have been moved to the main figure (Figure 4G). The result section was revised as below;
Lines 242-245: ‘As for the other subunits of eIF2 complex, proteome analysis did not detect a significant difference in the protein levels of eIF2α and eIF2γ between milton knockdown and control flies at 7 and 21 days (Figure 4D).’
Reviewer #1 (Recommendations For The Authors):
L125-128: In this section, while the efficiency of Milton knockdown is referenced from a previous publication, it is necessary to also mention that the Miro knockdown has been similarly reported in the literature. Additionally, the Methods section lacks details on the Miro RNAi line used, and Table 2 does not include the genotype for Miro RNAi. This information should be included for clarity and completeness.
Thank you for pointing it out. Knockdown efficiency with this strain has been reported (Iijima-Ando et al., PLoS Genet, 2012). We revised the text to include citation and knockdown efficiency as follows:
Lines 139-147: ‘There was no significant increase in ubiquitinated proteins in milton knockdown flies at 1-day old, suggesting that the accumulation of ubiquitinated proteins caused by milton knockdown is age-dependent (Figure S1). We also analyzed the effect of the neuronal knockdown of Miro, a partner of milton, on the accumulation of ubiquitin-positive proteins. Since severe knockdown of Miro in neurons causes lethality, we used UAS-Miro RNAi strain with low knockdown efficiency, whose expression driven by elav-GAL4 caused 30% reduction of Miro mRNA in head extract(24). Although there was a tendency for increased ubiquitin-positive puncta in Miro knockdown brains, the difference was not significant (Figure 1B, p>0.05 between control RNAi and Miro RNAi). These data suggest that the depletion of axonal mitochondria induced by milton knockdown leads to the accumulation of ubiquitinated proteins before neurodegeneration occurs.’
L132-L136: The current phrasing in this section suggests an increase in ubiquitinated proteins for both Milton and Miro knockdowns. However, since there is no significant difference noted for Miro, it is incorrect to state an increase in ubiquitin-positive puncta. Furthermore, combining the results of Milton knockdown to claim an increase in ubiquitinated proteins prior to neurodegeneration is misleading. At the very least, the expression here needs to be moderated to accurately reflect the findings.
Thank you for pointing it out. We revised the text as above.
L137-L141: Results in Figure 1 indicate that Milton knockdown leads to an increase in ubiquitinated proteins at 14 days, while Miro knockdown shows no difference from the control at either 14 or 30 days. Conversely, both the control and Miro exhibit an increase in ubiquitinated proteins with aging, but this trend does not seem to apply to Milton knockdown. This observation suggests that Milton KD may not affect the changes in protein quality control associated with aging. It implies that Milton's function might be more related to protein homeostasis in younger cells, or that changes due to aging might overshadow the effects of Milton knockdown. These interpretations should be included in the Results or Discussion sections for a more comprehensive analysis.
Thank you for your insightful comment. We revised the text to include those points as follows:
Lines 152-153: ‘These results suggest that depletion of axonal mitochondria may have more impact on proteostasis in young neurons than in old neurons.’
Lines 355-362: ‘The depletion of axonal mitochondria and accumulation of abnormal proteins are both characteristics of aged brains(37,38). Our results suggest that the loss of axonal mitochondria is an event upstream of proteostasis collapse during aging. Neuronal knockdown of milton had more impact on proteostasis in young neurons than the old neurons (Figure 1). Proteome analyses also showed that age-related pathways, such as immune responses, are enhanced in young flies with milton knockdown (Table 2). The reduction in axonal transport of mitochondria may be one of the triggering events of age-related changes and accelerates the onset of aging in the brain.’
L143 : Please remove the erroneously included quotation mark.
Thank you for pointing it out. We corrected it.
L145-L147:
- While it is understood that Milton knockdown results in a reduction of mitochondria in axons, as reported previously and seemingly indicated in Figure 1E, this paper repeatedly refers to axonal depletion of mitochondria. Therefore, it would be beneficial to quantitatively assess the number of mitochondria in the axonal terminals located in the lamina via electron microscopy. Such quantification would robustly reinforce the argument that mitochondrial absence in axons is a consequence of Milton knockdown.
Thank you for pointing it out. We included quantitation of the number of mitochondria in the synaptic terminals (Figure 1E).
The text and figure legend were revised accordingly:
Lines 156-157: ‘As previously reported(24), the number of mitochondria in presynaptic terminals decreased in milton knockdown (Figure 1E).’
- The knockdown of Milton is known to reduce mitochondrial transport from an early stage, but what about swelling? By observing swelling at 1 day and 14 days, it may be possible to confirm the onset of swelling and discuss its correlation with the accumulation of ubiquitinated proteins.
Quantitation of axonal swelling has also been included (Figure 1F).
We appreciate the reviewer’s comments on the correlation between the accumulation of ubiquitinated proteins and axonal swelling. Axonal swelling was not observed at 3-days-old (Iijima-Ando et al., PLoS Genetics, 2012), indicating that axonal swelling is an age-dependent event. Dense materials are found in swollen axons more often than in normal axons, suggesting a positive correlation between disruption of proteostasis and axonal damage. It would be interesting to analyze the time course of events further; however, we feel it is beyond the scope of this manuscript. We revised the text as below to include this discussion:
Lines 157-159: ‘The swelling of presynaptic terminals, characterized by the enlargement and roundness, was not reported at 3-day-old(24) but observed at this age with about 4% of total presynaptic terminals (Figure 1F, asterisks).’
Lines 162-167: ‘Dense materials are rarely found in age-matched control neurons, indicating that milton knockdown induces abnormal protein accumulation in the presynaptic terminals (Figure 1G and H). In milton knockdown neurons, dense materials are found in swollen presynaptic terminals more often than in presynaptic terminals without swelling, suggesting a positive correlation between the disruption of proteostasis and axonal damage (Figure 1G).’
Lines 362-365: ‘Disruption of proteostasis is expected to contribute to neurodegeneration(38), and it would be interesting to analyze the sequence of protein accumulation and axonal degeneration in milton knockdown ((24,29) and Figure 1) in detail with higher time resolution.’
L147-L151: Though Figures 1F and 1G provide qualitative representations, it is advisable to quantitatively assess whether dense materials significantly accumulate. Such quantitative analysis would be required to verify the accumulation of dense materials in the context of the study.
Thank you for pointing it out. We included quantitation of the number of neurons with dense material (Figure 1G). We revised the manuscript as follows:
Line 161-163: ‘Dense materials are rarely found in age-matched control neurons, indicating that milton knockdown induces abnormal protein accumulation in the presynaptic terminals (Figure 1G and H).’
Regarding Figure 1B, C:
- Even though the count of puncta in the whole brain appears to be fewer than 400, the magnification of the optic lobe suggests a substantial presence of puncta. Please clarify in the Methods section what constitutes a puncta and whether the quantification in the whole brain is based on a 2D or 3D analysis. Detail the methodology used for quantification.
Thank you for your comment. We revised the method section to include more details as below:
Lines 434-437: ‘Quantitative analysis was performed using ImageJ (National Institutes of Health) with maximum projection images derived from Z-stack images acquired with the same settings. Puncta were identified with mean intensity and area using ImageJ.’
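For readers unfamiliar with this kind of quantification, the following is a minimal sketch of a maximum-intensity projection followed by puncta filtering on area and mean intensity. It is not the authors' ImageJ procedure: the function names, the 4-connected flood fill, and the threshold parameters (`pixel_threshold`, `min_area`, `min_mean_intensity`) are assumptions made purely for illustration.

```python
from collections import deque


def max_projection(stack):
    """Collapse a Z-stack (list of equally sized 2D lists) to one 2D image
    by taking the per-pixel maximum across planes."""
    rows, cols = len(stack[0]), len(stack[0][0])
    return [[max(plane[r][c] for plane in stack) for c in range(cols)]
            for r in range(rows)]


def find_puncta(image, pixel_threshold, min_area, min_mean_intensity):
    """Return (area, mean_intensity) for each 4-connected region of pixels
    at or above pixel_threshold that also passes the area and mean filters.
    All three thresholds are hypothetical illustration parameters."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    puncta = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < pixel_threshold:
                continue
            # Breadth-first flood fill collects one connected bright region.
            queue = deque([(r, c)])
            seen[r][c] = True
            values = []
            while queue:
                y, x = queue.popleft()
                values.append(image[y][x])
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= pixel_threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            area = len(values)
            mean = sum(values) / area
            if area >= min_area and mean >= min_mean_intensity:
                puncta.append((area, mean))
    return puncta


# Toy two-plane stack; values are made up for illustration only.
stack = [
    [[0, 0, 0, 0], [0, 9, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 5, 7, 0], [0, 0, 0, 6]],
]
projected = max_projection(stack)
puncta = find_puncta(projected, pixel_threshold=5, min_area=2,
                     min_mean_intensity=6)
```

Because the projection is 2D, puncta that overlap along the Z axis merge into one region, which is one reason a reviewer might ask whether counts come from a 2D or 3D analysis.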
- What about 1-day-old specimens? Does Milton knockdown already show an increase in ubiquitinated protein accumulation at this early stage? Investigating whether ubiquitin-protein accumulation is involved in aging promotion or is already prevalent during developmental stages is a necessary experiment.
Thank you for your comment. We carried out immunostaining with an anti-ubiquitin antibody in the brains at 1-day-old. No significant difference was detected between the control and milton knockdown. This result has been included as Figure S1 in the revised manuscript. The result section was revised as below:
Line 136-139 ‘There was no significant increase in ubiquitinated proteins in milton knockdown flies at 1-day old, suggesting that the accumulation of ubiquitinated proteins caused by milton knockdown is age-dependent (Figure S1).’
For Figure 1E: In the Electron Microscopy section of the Methods, define how swollen axons were identified and describe the quantification methodology used.
Thank you for your comment. Swollen axons are, unlike normal axons, round in shape and enlarged. We revised the text as below;
Lines 157-160: ‘The swelling of presynaptic terminals, characterized by the enlargement and roundness, was not reported at 3-day-old(24) but observed at this age with about 4% of total presynaptic terminals (Figure 1F, asterisks).’
Lines 683-684, Figure 1 legend: ‘Swollen presynaptic terminals (asterisks in (F)), characterized by the enlargement and higher circularity, were found more frequently in milton knockdown neurons.’
L218-L219: Throughout the text, the expression 'eIF2β is "upregulated" in response to Milton knockdown' is frequently used. However, considering the presented results, it might be more accurate to interpret that under the condition of Milton knockdown, eIF2β is not undergoing degradation but rather remains stable.
Thank you for pointing it out. We replaced ‘upregulated’ with ‘increased’ throughout the text.
L234-L235: On what basis is the conclusion drawn that there is a reduction? Given that three experiments have been conducted, it would be possible and more convincing to quantify the results to determine if there is a significant decrease.
Thank you for pointing it out. We quantified the area under the curve (AUC) of the polysome fraction and carried out statistical analysis. There is a significant decrease in polysomes in milton knockdown, and this result has been included in Figure 5B. We revised the figure and the legend accordingly.
L236: 5H-> 4H
Thank you for pointing it out, and we are sorry for the confusion. We corrected it.
L238-L239: Since there is no significant difference observed, it may not be accurate to interpret a reduction in puromycin incorporation.
Thank you for pointing it out. As described above, quantification of polysome fractions showed that milton knockdown significantly reduces polysomes (Figure 5B). We revised the manuscript as below:
Lines 263-264: ‘However, unexpectedly, we found that milton knockdown significantly reduced the level of mRNAs associated with polysomes (Figure 5A and B).’
Figure 5D and Figure 6D: Climbing assays have been conducted, but I believe experiments should also be performed to examine whether overexpression or heterozygous mutants of eIF2β induce or suppress degeneration.
Thank you for pointing it out. We analyzed the eyes with eIF2β overexpression for neurodegeneration. Although there was a tendency toward elevated neurodegeneration in the retina with eIF2β overexpression, the difference between control and eIF2β overexpression did not reach statistical significance. This result has been included as Figure S2 in the revised manuscript, and the following sentences have been included in the text:
Lines 288-293: ‘We asked if eIF2β overexpression causes neurodegeneration, as depletion of axonal mitochondria in the photoreceptor neurons causes axon degeneration in an age-dependent manner(24). eIF2β overexpression in photoreceptor neurons tends to increase neurodegeneration in aged flies, while it was not statistically significant (p>0.05, Figure S2).’
L271-L272: The results in Figure 6B are surprising. I anticipated a greater increase compared to the Milton knockdown alone. While p62 appears to be reduced, it is not clear why these results lead to the conclusion that lowering eIF2β rescues autophagic impairment. Please add a discussion section to address this point.
Thank you for pointing it out. We apologize for the unclear description of the result. Milton knockdown flies show p62 accumulation (Figure 2), and deleting one copy of eIF2β in the milton knockdown background reduced p62 accumulation (Figure 7C). We revised the text as below:
Lines 307-315: ‘Neuronal knockdown of milton causes accumulation of autophagic substrate p62 in the Triton X-100-soluble fraction (Figure 2B), and we tested if lowering eIF2β ameliorates it. We found that eIF2β heterozygosity caused a mild increase in LC3-I levels and decreases in LC3-II levels, resulting in a significantly lower LC3-II/LC3-I ratio in milton knockdown flies (Figure 7B). eIF2β heterozygosity decreased the p62 level in the Triton X-100-soluble fraction in the brains of milton knockdown flies (Figure 7C). The p62 level in the SDS-soluble fraction, which is not sensitive to milton knockdown (Figure 2B), was not affected (Figure 7C). These results suggest that suppression of eIF2β ameliorates the impairment of autophagy caused by milton knockdown.’
L369: Please specify the source of the anti-ubiquitin antibody used.
Thank you for pointing it out. We included the antibody information in the method section.
Figure 7: While the relationship between Milton knockdown and the eIF2β and eIF2α proteins has been elucidated through the authors' efforts, I would like to see an investigation into whether eIF2β is upregulated and eIF2α phosphorylation is reduced in simply aged Drosophila. This would help us understand the correlation between aging and eIF2 protein dynamics.
Thank you for your comment. We agree that it is an important question, and we are working on it. However, we feel that it is beyond the scope of the current manuscript.
L645-L646: If the mushroom body is identified using mito-GFP, then include mito-GFP in the genotype listed in Supplementary Table 2.
We are sorry for the oversight. We corrected it in Supplementary Table 2.
Additionally, while it is presumed that the mito-GFP signal decreases in axons with Milton RNAi, how was the lobe tips area accurately selected for analysis? Please include these details along with a comprehensive description of the quantification methodology in the Methods section.
Thank you for your comment. Although the mito-GFP signal in the axon is weak in the milton knockdown neurons, it is sufficient to distinguish the mushroom body structure from the background. We revised the method section to include this information:
Line 437-438: ‘For eIF2α and p-eIF2α immunostaining, the mushroom body was detected by mitoGFP expression.’
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Point-by-point response to the public review:
General Comment: “Using computational modeling, this manuscript explores the effect of growth feedback on the performance of gene networks capable of adaptation. The authors selected 425 hypothetical synthetic circuits that were shown to achieve nearly perfect adaptation in two earlier computational studies (see Ma et al. 2009, and Shi et al. 2017). They examined the effects of cell growth feedback by introducing additional terms to the ordinary differential equation-based models, and performed numerical simulations to check the retainment and the loss of the adaptation responses of the circuits in the presence of growth feedback. The authors show that growth feedback can disrupt the gene network adaptation dynamics in different ways, and report some exceptional core motifs which allow for robust performance in the presence of growth feedback. They also used a metric to establish a scaling law between a circuit robustness measure and the strength of growth feedback. These results have important implications in the field of synthetic biology, where unforeseen interactions between designed gene circuits and the host often disrupt the desired behavior. The paper’s conclusions are supported by their simulation results, although these are presented in their summary formats and it would be useful for the community if the detailed results for each topology were available as a supplementary file or through the authors’ GitHub repository.”
We are grateful for the referee’s positive evaluation of our work. We have updated our GitHub and OSF repositories with detailed results for each topology. Additionally, we have included other simulation codes, result data, and detailed explanations in these two repositories that may be of interest to our readers.
Strength 1: “This work included a detailed investigation of the reasons for adaptation failure upon introducing cell growth to the systems. The comprehensiveness of the analysis makes the work stand out among studies of functional screening of network topologies of gene regulation.”
We are grateful for the referee’s positive assessment of our work, notably the recognition of the ‘detailed investigation’ we conducted, and the ‘comprehensiveness of the analysis’ we provided.
Strength 2: “The authors’ approaches for assessment of robustness, such as the survival ratio Q, can be useful for a wide range of topologies beyond adaptation. The scaling law obtained with those approaches is interesting.”
We are grateful for the referee’s positive evaluation of our defined factors for assessing circuit robustness. We also appreciate the acknowledgment of the “interesting” nature of the scaling law we discovered using the assessment factor R.
Weaknesses 1: “The title suggests that the work investigates the ’effects of growth feedback on gene circuits’. However, the performance of ’nearly perfect adaptation’ was chosen for the majority of the work, leaving the question of whether the authors’ conclusion regarding the effects of growth feedback is applicable to other functional networks.”
We agree that our previous title was too broad, and we have changed it from “Effects of growth feedback on gene circuits: A dynamical understanding” to “Effects of growth feedback on adaptive gene circuits: A dynamical understanding”. Although we have some brief results and discussions on gene circuits with bistability, we admit that most of our results and discussions are focused on circuits that have adaptation.
The new title is more specific and should be a more appropriate summary of the paper.
Weaknesses 2: “This work relies extensively on an earlier study, evaluating only a selected set of 425 topologies that were shown to give adaptive responses (Shi et al., 2017). This limited selection has two potential issues. First, as the authors mentioned in the introduction, growth feedback can also induce emerging dynamics even without existing function-enabling gene circuits, as an example of the ”effects of growth feedback on gene circuits”. Limiting the investigation to only successful circuits for adaptation makes it unclear whether growth feedback can turn the circuits that failed to produce adaptation by themselves into adaptation-enabling circuits. Secondly, as the Shi et al. (2017) study also used numerical experiments to achieve their conclusions about successful topologies, it is unclear whether the numerical experiments in the present study are compatible with the earlier work regarding the choice of equation forms and ranges of parameter values. The authors also assumed that all readers have sufficient understanding of the 425 topologies and their derivation before reading this paper.”
We agree with the reviewer that several issues need to be clarified in our new manuscript. We have added new discussions for all of them.
We agree with the reviewer that growth feedback could turn non-adaptive circuits into adaptation-enabling circuits, and this indeed presents a compelling topic for future research. We have added the following discussion of a related matter to our paper. In our simulated dataset, there are cases where a higher degree of growth feedback can restore the adaptation that has been lost in a circuit. However, as we discuss in this new paragraph, a comprehensive study in the direction of turning non-adaptive circuits into adaptation-enabling circuits will “require entirely different approaches for sampling circuit parameters and selecting candidate network topologies, demanding significantly high computational costs.” Given that this topic extends beyond the scope of the current paper, we leave this matter to future research.
“Although the primary focus of this paper is on how growth feedback can undermine an originally adaptive circuit and how to design circuits that are robust against such feedback, our simulated dataset reveals instances where growth feedback can benefit the circuit within certain ranges. Specifically, we identified 2,092 circuits across 306 different topologies where adaption, lost at an intermediate level of growth feedback, is restored at higher levels. This is 1.4% of all circuits tested. We anticipate that additional circuits exhibiting this loss-and-recovery behavior exist, as our sampling of six discrete levels of k<sub>g</sub> (0,0.2,0.4,0.6,0.8,1.0) might have overlooked numerous cases. This result again suggests the possible advantages of growth feedback in gene circuits (Tan et al., 2009; Nevozhay et al., 2012; Deris et al., 2013; Feng et al., 2014; Melendez-Alvarez and Tian, 2022). A comprehensive study into how growth feedback can endow or enhance adaption in circuits would require entirely different approaches for sampling circuit parameters and selecting candidate network topologies, demanding significantly high computational costs. Given that this topic extends beyond the scope of the current paper, we leave this matter to future research.”
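The loss-and-recovery pattern described in the added paragraph can be checked mechanically once each circuit has a pass/fail adaptation flag at every sampled k<sub>g</sub>. The following is a minimal sketch under stated assumptions: the per-circuit list of six booleans and the function name are hypothetical and are not taken from the authors' repository.

```python
K_G_LEVELS = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)  # the six sampled levels quoted above


def is_loss_and_recovery(adaptive):
    """Return True if adaptation is lost at some k_g level and restored at a higher one.

    `adaptive` is a hypothetical sequence of booleans, one per level in
    K_G_LEVELS, starting with the growth-free case (k_g = 0).
    """
    if len(adaptive) != len(K_G_LEVELS):
        raise ValueError("expected one flag per sampled k_g level")
    if not adaptive[0]:
        return False  # only circuits that adapt without growth feedback qualify
    seen_loss = False
    for flag in adaptive[1:]:
        if not flag:
            seen_loss = True
        elif seen_loss:
            return True  # adaptive again after an earlier failure
    return False
```

As the quoted paragraph notes, a grid this coarse can miss circuits whose loss and recovery both fall between two sampled levels, so a count obtained this way would be a lower bound.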
We have added the following discussions about the reasoning behind using the 425 network topologies selected from the study Shi et al. (2017).
“We use these 425 network topologies from the study (Shi et al., 2017), avoiding redundancy with established results. Due to the unique focus of our research on the effects of growth feedback and the need to evaluate quantitative ratios of robust circuits among all functional ones, we have chosen to use a 20-fold increase in the number of random parameter sets for each network topology compared to the simulations in (Shi et al., 2017). This approach makes it computationally prohibitive to scan all possible 16,038 three-node circuits. We carefully follow the settings in (Shi et al., 2017), which also analyzed TRNs with the AND logic as in this paper. Detailed descriptions of our simulation experiments are provided in the Methods section. To make our results more convincing, we have adopted a set of adaptation criteria that are stricter than those used in (Shi et al., 2017). Consequently, the ratio of adaptive circuits is somewhat lower in our study, with 4 out of the 425 network topologies not demonstrating adaptation.”
Other than the more strict adaptation criteria and much larger sampling sizes, as we mentioned in this paragraph, we have carefully followed the simulation details of the study Shi et al. (2017). This includes but is not limited to: the dynamical equations (when k<sub>g</sub> = 0), the input signals, the scales and ranges of the circuit parameters to be randomly sampled, and the sampling method (Latin hypercube sampling). One of the authors of the current paper was also the first author of the study Shi et al. (2017), who helped us verify the details of simulations (among many other contributions). These identical settings justify our usage of the established results with the 425 network topologies.
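For readers unfamiliar with Latin hypercube sampling, the idea can be sketched in a few lines of standard-library Python. This is an illustration only: the log-uniform mapping and the range 10^-3 to 10^3 in the usage note are assumptions made for the example, not the scales and ranges used in our simulations or in Shi et al. (2017).

```python
import math
import random


def latin_hypercube(n_samples, n_params, rng):
    """Return n_samples points in [0, 1)^n_params, one point per stratum in every dimension."""
    columns = []
    for _ in range(n_params):
        # One uniform draw inside each of the n_samples equal-width strata.
        strata = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(strata)  # decouple the stratum order across dimensions
        columns.append(strata)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


def to_log_range(u, low, high):
    """Map u in [0, 1) to [low, high) uniformly in log10 space (assumed scaling)."""
    lo, hi = math.log10(low), math.log10(high)
    return 10.0 ** (lo + u * (hi - lo))
```

A call such as `[to_log_range(u, 1e-3, 1e3) for u in point]` would turn each unit-cube point into one hypothetical parameter set. Compared with independent uniform draws, the stratification guarantees that every slice of each parameter's range is visited exactly once.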
To provide more information about these 425 network topologies, we have added the following introduction. It introduces the structural features of the networks, especially the shared core motifs for adaptation. In our GitHub and OSF repositories, we have also provided relevant data about the 425 topologies, including the topology structures and the parameter sets we scanned.
“These topologies can be classified into two families based on the core topology: networks with a negative feedback loop (NFBL) and networks with an incoherent feed-forward loop (IFFL) (Shi et al., 2017). More specifically, there are 206 network topologies in the NFBL family. All of these NFBL topologies have a negative feedback loop for node B. This negative feedback loop can be formed by the loop from node B to A and back to B (such as the circuit shown in Fig. 1 (a)), by node B to C and back to B, or by a longer route, from node B to A and then to C and back to B. There is always a self-activation link from B to B in all these 206 NFBL networks. There are 219 network topologies in the IFFL family. All of them have two feed-forward pathways from the input node A to the output node C. One pathway goes from node A to C directly, while the other involves node B in the middle. One of the pathways is activating while the other one is inhibitory.”
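The two motif families described above can be expressed as simple checks on a signed edge list. The sketch below is illustrative only: the dictionary encoding (`+1` for activation, `-1` for inhibition), the function names, and the example circuits are assumptions made here, and the negative-feedback check follows only the three routes through node B that are listed in the quoted paragraph.

```python
def _path_sign(edges, path):
    """Product of edge signs along `path`; 0 if any edge on the path is absent."""
    sign = 1
    for edge in path:
        sign *= edges.get(edge, 0)
    return sign


def has_negative_feedback_on_b(edges):
    """True if one of the three listed loops through node B has overall negative sign."""
    routes = [
        [("B", "A"), ("A", "B")],
        [("B", "C"), ("C", "B")],
        [("B", "A"), ("A", "C"), ("C", "B")],
    ]
    return any(_path_sign(edges, route) == -1 for route in routes)


def is_incoherent_ffl(edges):
    """True if the direct A->C path and the A->B->C path both exist with opposite signs."""
    direct = _path_sign(edges, [("A", "C")])
    indirect = _path_sign(edges, [("A", "B"), ("B", "C")])
    return direct * indirect == -1
```

For example, a hypothetical circuit `{("A", "B"): 1, ("B", "A"): -1, ("B", "B"): 1, ("B", "C"): 1}` passes the first check, while `{("A", "C"): 1, ("A", "B"): 1, ("B", "C"): -1}` passes the second.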
Weaknesses 3: “The authors’ model does not describe the impact of growth via a biological mechanism: they model growth as an additional dilution rate and calculate growth rate based on a phenomenological description with growth rate occurring at a maximum (k<sub>g</sub>) scaled by the circuit ’burden’ b(t). Therefore, the authors’ model does not capture potential growth rate changes in parameter values (e.g., synthetic protein production falls with increasing growth rate; see Scott & Hwa, 2023).”
In our paper, we consider dilution due to cell growth as the dominant factor of growth feedback. Here we compared the adaptive circuits under no-growth conditions and their ability to maintain their adaptive behaviors after dilution into a fresh medium, which mediated a significant dilution to the circuits. This is based on our previous work, Zhang et al., Nature Chemical Biology 16.6 (2020): 695-701. We agree that an increased growth rate can change synthetic protein production. However, the dynamic roles of the dilution and the growth-affected production rate should be analogous, given that they both act as inhibitory factors arising from cell growth, as mentioned by the reviewer. We agree that taking the growth effect on the production rate into account would provide a more comprehensive study, but it is beyond the scope of the present work. We have added the following paragraph in the Discussion section of our paper.
“In our paper, we consider dilution due to cell growth as the dominant factor of growth feedback. Here we compared the adaptive circuits under no-growth conditions and their ability to maintain their adaptive behaviors after dilution into a fresh medium, which mediated a significant dilution to the circuits. This is based on our previous work (Zhang et al. (2020)). However, growth feedback is inherently complex (Klumpp et al. (2009)). For instance, an increased growth rate can change protein synthesis rate (Hintsche and Klumpp (2013); Scott and Hwa (2023)), and cell growth rates can affect the distribution of protein expression in cell populations (Gouda et al. (2019)). In our paper, we concentrate on a simplified model with dilution, which we consider to have captured the dominant factor. The dynamic roles of the dilution and growth-affected production rate should be analogous, given that they both act as inhibitory factors arising from cell growth. Incorporating the impact of growth rate on protein synthesis into our model would offer a more comprehensive analysis, a task beyond the scope of this paper but presenting an intriguing opportunity for future research to address the complexities of growth feedback.”
Weaknesses 4: “The authors made several claims about the bifurcations (infinite-period, saddle-node, etc) underlying the abrupt changes leading to failures of adaptations. There is a lack of evidence supporting these claims. Both local and global bifurcations can be demonstrated with semi-analytic approaches such as numerical continuation along with investigations of eigenvalues of the Jacobian matrix. The claims based on ODE solutions alone are not sound.”
After our further simulations and verification, we found that most of the bifurcation-induced failures we mentioned in type-V and type-VI failures should be categorized as bistability or multistability-induced failures. They are still abrupt switching between adaptive and non-adaptive states, as we described in the previous version of the manuscript. However, they are actually still far away from the bifurcation points at the critical k<sub>g</sub>. We have corrected all relevant descriptions and figures, including panel Fig. 4 (c) and its captions. We have added the following paragraph in the paper to explain this issue.
“One might expect bifurcations to play an important role in many type-V and type-VI failures. However, in our simulations, failures precisely at the bifurcation point are not observed. This is because the bifurcation points under consideration, such as fold bifurcations, are where one of the attraction basins diminishes to zero. For a failure to occur exactly at the bifurcation point, the initial condition would need to coincide precisely with the infinitesimally small basin just before it vanishes. More realistically, failures almost always largely precede the exact bifurcation point. They happen while the basin is still contracting and the basin boundary crosses the initial condition or O<sub>1</sub>. An example is shown in Fig. 4(b), where bistability persists, yet the lighter orange basin with a larger O<sub>1</sub>(C) cannot be reached as the boundary shifts away from the initial condition A<sub>0</sub> and B<sub>0</sub>. As another example, in Fig. 4 (c) from a different circuit, the higher O<sub>2</sub>(C) state disappears at k<sub>g</sub> ≈ 0.012 and switches to a lower O<sub>2</sub>(C), but this point is not a bifurcation.
It is the point where the stable O<sub>1</sub> continuously crosses the basin boundary of O<sub>2</sub>.”
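The distinction drawn in this paragraph, an abrupt switch caused by a moving basin boundary rather than by a bifurcation, can be illustrated with a one-variable toy system. The sketch below is not one of the circuits in the paper: the equation dx/dt = -x(x - u)(x - 1), the parameter `u`, and the initial condition are assumptions chosen so that both stable states (x = 0 and x = 1) exist for every 0 < u < 1, while the unstable point x = u acts as the basin boundary.

```python
def final_state(u, x0=0.5, dt=0.01, steps=5000):
    """Euler-integrate dx/dt = -x (x - u) (x - 1) from x0 and return the end point."""
    x = x0
    for _ in range(steps):
        x += dt * (-x * (x - u) * (x - 1.0))
    return x


# Both stable states persist for u = 0.4 and u = 0.6, yet the outcome from the
# same initial condition switches once the boundary u crosses x0 = 0.5.
high = final_state(0.4)  # x0 lies above the boundary, so x approaches 1
low = final_state(0.6)   # x0 lies below the boundary, so x approaches 0
```

Here the switch happens at u = 0.5, which is not a bifurcation point of the toy system; the fold bifurcations would only occur when u reaches 0 or 1.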
Our further simulations have verified the existence of the oscillation-related bifurcations. We have added a new appendix discussing the phenomena associated with them in more detail.
Weaknesses 5: “The impact of biochemical noise is not evaluated in this work; the author’s analysis is only carried out in a deterministic regime.”
In this paper, we have not taken into account biochemical noise as we focus solely on scenarios where all protein concentrations are high. In these circumstances, the influence of noise is relatively minor. Incorporating biochemical noise, which originates from various sources and possesses diverse characteristics, would significantly complicate the analysis beyond the scope of our current work. However, exploring this aspect could be an intriguing avenue for future research. We have included the following discussions in our paper.
“Our study focuses on scenarios where random noises are ignored. Realistically, gene circuits are subjected to diverse types of noise, which can complicate their predictable behavior and design. These noises can originate externally from a noisy input signal I, or intrinsically, directly affecting the circuit components. Further, these noises can be classified based on various mechanisms that cause them (Colin et al. (2017); Sartori and Tu (2011)). And with different mechanisms, each type of noise can be characterized by different attributes such as frequency, amplitude, and noise color. These variances can lead to different impacts on the circuits, potentially necessitating unique mechanisms or designs for the attenuation of each category (Sartori and Tu (2011); Qiao et al. (2019)). Given the extensive complexity and the need for thorough investigation, these noise-related challenges are beyond the scope of this paper and require a series of future studies.”
Point-by-point response to the recommendations for the authors:
Comment 1: - The authors’ github repository, detailed in their code availability statement, is currently unavailable and likely contains some of the answers to the queries here.
We have updated our GitHub and OSF repositories with simulation codes, result data, and detailed explanations. The link to our GitHub repository in the previous version of the manuscript contained a format error, making it inaccessible to the referees. We apologize for this mistake and have corrected it.
Comment 2: - At present, it is not clear how the 425 topologies are created from the system of equations (Eq. 6-8) or from the circuit diagram in Fig 1a. This could do with being explicitly stated for the reader.
We have added the following paragraph to discuss how the 425 topologies are selected and which common motifs and connections they share.
“Previous research identified 425 different three-node TRN network topologies that can achieve adaptation in the absence of growth feedback (Shi et al., 2017), providing the base of our computational study. These topologies can be classified into two families based on the core topology: networks with a negative feedback loop (NFBL) and networks with an incoherent feed-forward loop (IFFL) (Shi et al., 2017). More specifically, there are 206 network topologies in the NFBL family. All of these NFBL topologies have a negative feedback loop for node B. This negative feedback loop can be formed by the loop from node B to A and back to B (such as the circuit shown in Fig. 1 (a)), by node B to C and back to B, or by a longer route, from node B to A and then to C and back to B. There is always a self-activation link from B to B in all these 206 NFBL networks. There are 219 network topologies in the IFFL family. All of them have two feed-forward pathways from the input node A to the output node C. One pathway goes from node A to C directly, while the other involves node B in the middle. One of the pathways is activating while the other one is inhibitory. We use these 425 network topologies from the study (Shi et al., 2017), avoiding redundancy with established results. Due to the unique focus of our research on the effects of growth feedback and the need to evaluate quantitative ratios of robust circuits among all functional ones, we have chosen to use a 20-fold increase in the number of random parameter sets for each network topology compared to the simulations in (Shi et al., 2017). This approach makes it computationally prohibitive to scan all possible 16,038 three-node circuits. We carefully follow the settings in (Shi et al., 2017), which also analyzed TRNs with the AND logic as in this paper. Detailed descriptions of our simulation experiments are provided in the Methods section. 
To make our results more convincing, we have adopted a set of adaptation criteria that are stricter than those used in (Shi et al., 2017). Consequently, the ratio of adaptive circuits is somewhat lower in our study, with 4 out of the 425 network topologies not demonstrating adaptation.”
Comment 3: - In the main text, the authors mentioned that they chose 425 network topologies for this study, whereas the number is 435 in the abstract. Please correct the error.
The number 435 in our previous abstract referred to the 10 four-node circuits that we studied in the appendix, in addition to the 425 three-node network topologies. To avoid confusion and potential misunderstandings among readers, we have revised this expression of “435 distinct topological structures” to “more than four hundred topological structures”.
Comment 4: - Please can the authors include the topologies they have studied in an appendix or as supplementary material. The impact of this work would increase significantly if for each topology the authors could include a pie chart similar to the one shown in Fig 2 so that others can use these results.
We fully acknowledge the potential benefits of providing simulation results for each topology. However, including over four hundred more figures in this paper is not feasible. Moreover, we expect that many readers may also be interested in results not only for individual topologies but also for subsets sharing specific motifs or regulatory connections. Therefore, we have provided all the necessary data and codes in our GitHub repository to make these pie charts. We have included a detailed guide on how to generate these pie charts in the GitHub Readme file. These allow readers to plot the pie chart and extract distributions for any individual topology or use conditions to filter any subset of topologies as required. We believe this approach offers greater flexibility for our readers. We have also added the following explanation in the Methods section.
“The codes implementing these criteria are available in our GitHub repository, with the link provided in the ”Code Availability” section. The failure type results for all circuits tested are available in our OSF repository, with the link provided in the ”Data Availability” section. An additional note is provided in the README file of our GitHub repository for further guidance on generating pie charts similar to Fig. 2 for any network topology or subset of topologies.”
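The README in our GitHub repository remains the authoritative guide for producing these charts. Purely to illustrate the kind of filtering described above, the following sketch computes the pie-chart fractions for an arbitrary subset of topologies. The record layout (a list of `(topology_id, outcome_label)` pairs) and the labels in the usage note are hypothetical and do not reflect the actual file format in the OSF repository.

```python
from collections import Counter


def outcome_fractions(records, topology_ids):
    """Fraction of each outcome label among records whose topology is in `topology_ids`.

    `records` is a hypothetical iterable of (topology_id, outcome_label) pairs,
    one per tested circuit.
    """
    counts = Counter(label for topo, label in records if topo in topology_ids)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}
```

For instance, `outcome_fractions(records, {98})` would give the slice sizes for a single topology, and passing a larger set of identifiers would pool all topologies that share a motif of interest.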
Comment 5: - At present, the authors have not given sufficient detail for their numerical methods (e.g. to identify bistability or oscillations) to enable the work to be repeated. I would appreciate it if the authors could expand their Methods section or provide a description of their method as an appendix. Additionally, the authors must clarify how many parameter sets per topology showed successful adaptation.
In response to this comment, we have reorganized and expanded our Methods section, especially the new “Numerical simulations of circuit dynamics” and “Numerical criteria for functional adaptation and failure types” subsections. We added details on how we define and evaluate a “relatively steady state”, how to determine if there is an oscillation, how to determine the critical k<sub>g</sub> value, and how to determine if a failure is continuous or abrupt. Readers can also find the corresponding codes in our GitHub repository, where we provide a README file to help the readers locate the script file they need.
The number of parameter sets per topology that showed successful adaptation is precisely our definition of the Q-value. Q-values of most of the circuits we tested are shown in multiple figures in the paper. A complete table of Q-values with different topologies and different k<sub>growth</sub> values can be found in our OSF repository.
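In code form, the definition amounts to a count per growth-feedback level. The sketch below assumes a hypothetical layout in which, for one topology, each growth-feedback strength maps to a list of pass/fail adaptation flags (one per sampled parameter set); the function name is illustrative.

```python
def q_values(flags_by_strength):
    """Count the parameter sets that adapt successfully at each growth-feedback strength.

    `flags_by_strength` is a hypothetical mapping from a strength value to a
    list of booleans, one per parameter set of a single topology.
    """
    return {k: sum(1 for ok in flags if ok) for k, flags in flags_by_strength.items()}
```

Under this layout, `q_values({0.0: [True, True, False], 0.2: [True, False, False]})` evaluates to `{0.0: 2, 0.2: 1}`.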
Comment 6: - Looking at the Model Description, there seem to be multiple issues, as follows. The model should be rewritten and all simulations redone with the model corrected as described below:
(a) The ”strength of growth feedback” is modeled by the maximal growth parameter k<sub>g</sub> in Equation (12). However, this rate does not represent growth feedback. In fact, this parameter must be present also for the system without growth feedback, Equations (6 - 8), because those cells grow as well! So Equation (12) with b(t)=0 should also be added to Equations (6 - 8), in addition to the dilution terms in each equation.
(b) The dilution due to growth (dN/dt)*(B/N) is only added to Equations (9 - 11). This is wrong - growth affects (dilutes) all protein concentrations, even without growth feedback, so similar terms must be added even to equations without growth feedback, i.e., to Equations (6 - 8).
(c) The term representing growth feedback is actually the fraction 1/(1+b(t)). To adjust the strength of growth feedback, some parameters should be introduced into this term. Specifically, the term currently has a Hill form with Hill coefficient = 1 and sensitivity = 1. The term should be converted into a general Hill function, and the parameters of that function should be altered to represent growth feedback. This Hill function is called a cellular (phenotypic) fitness landscape, see Nevozhay et al., 2012.
Equations (6-8) describe only one part of the entire model we are studying. We present these equations solely to avoid overwhelming readers with a large number of parameters that are defined for the first time. They are not used in our simulations and serve only to explain the meaning of the parameters. In our simulations throughout the paper, we only used Eqs. (9-13) (with various topologies). We have revised the text to make this point clear. We have added the following descriptions in the section Model Description:
“In order not to overwhelm readers with too many terms and parameters, we first describe a partial model (an isolated circuit without growth feedback) before introducing the complete model that we study in this work.”
“Equations (9) to (13) are the dynamical equations we actually use for simulating the circuit dynamics.”
Additionally, in the newly added subsection “Numerical simulations of circuit dynamics” in the Methods, we explicitly mention that:
“The dynamical equations we use are similar to Eqs. (9-13) but with different topologies.”
We consider dilution due to cell growth as the dominant factor of growth feedback. In fact, we study the adaptive circuits without growth and their ability to maintain their adaptive behaviors after dilution into a fresh medium, based on a recent work [Zhang, et al., Nature Chemical Biology 16.6 (2020): 695-701]. The dynamic roles of the dilution and growth-affected production rate should be analogous, given that they both act as inhibitory factors arising from cell growth. The term mentioned in the comment is about how the burden of the circuit affects cell growth. We agree that it can be interesting to have a more comprehensive study on how different degrees of nonlinearity of this term can have different effects on the overall robustness towards the growth feedback problem, but this is not part of our primary focus and is beyond the scope of this paper. In this paper, we are mostly concerned with the variability of the strength of the growth feedback/dilution, controlled by the parameter k<sub>g</sub>, instead of the different types of nonlinearity.
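To make the role of dilution concrete, the following toy simulation follows a single protein after dilution into fresh medium. It is a sketch under stated assumptions and is not Eqs. (9-13), which are not reproduced in this response: the one-protein circuit, the logistic saturation of growth, the burden b = x / `burden_scale`, and all parameter values are illustrative choices. Only the dilution term, (dN/dt)/N times the concentration, and the growth factor k<sub>g</sub>/(1 + b(t)) follow the description in the comment above.

```python
def simulate(k_g, t_end=100.0, dt=0.01):
    """Euler-integrate a one-protein toy circuit with growth-mediated dilution.

    All parameter values and functional forms below are illustrative assumptions.
    Returns the list of protein concentrations x(t), including the initial value.
    """
    k_prod, k_deg = 1.0, 0.1   # production and degradation rates (assumed)
    burden_scale = 10.0        # concentration at which the burden b equals 1 (assumed)
    n_max = 1.0                # carrying capacity of the culture (assumed)
    x = k_prod / k_deg         # start at the growth-free steady state
    n = 0.01                   # cell density right after dilution into fresh medium
    xs = [x]
    for _ in range(int(round(t_end / dt))):
        b = x / burden_scale                      # circuit burden b(t)
        mu = k_g / (1.0 + b) * (1.0 - n / n_max)  # per-capita growth rate, (dN/dt)/N
        dx = k_prod - k_deg * x - mu * x          # last term: dilution by growth
        dn = mu * n
        x += dt * dx
        n += dt * dn
        xs.append(x)
    return xs
```

With `k_g = 0` the concentration stays at its steady state. With `k_g = 1` it dips while the culture grows and recovers once growth saturates, which is the kind of transient perturbation an adaptive circuit has to withstand.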
Comment 7: - On the right side of Equation (7), the first term should be inhibitory, right?
This is indeed an error. We accidentally reversed the regulation from A to B and B to A when inputting the formula. We have corrected both terms.
Comment 8: - It seems to me that a better transition from Figs 6 and 7 to Fig 8 can be made. Did the authors choose the three circuits in Fig 8 based on the three distinct groups shown in Fig 6 and 7? The rationale for choosing the three topologies given the clusters identified earlier can be explained more clearly.
We agree more explanation can be provided here. We have added the following descriptions, in the caption of Fig.8:
“The other three curves represent circuits with different robustness levels: high (Circuit No. 98), moderate (Circuit No. 3), and low (Circuit No. 28) values of R, to demonstrate that this scaling behavior is generic. Each of these three circuit topologies is selected from one of the three groups illustrated in Fig. 6 and Fig. 7, and they have the highest Q(k<sub>g</sub> = 0) value within their respective groups.”
and in the main text:
“The three other curves represent circuit topologies that have a relatively high, moderate, and low value of R among the 425 topologies tested, to demonstrate that this scaling behavior is generic. (These three topologies are the highest Q(k<sub>g</sub> = 0) topology in each of the three groups shown in Fig. 6 and Fig. 7.)”
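The selection rule described in these two passages (take, within each group, the topology with the highest Q at k<sub>g</sub> = 0) can be sketched as follows. The function name, the tuple layout, and the toy values are hypothetical and are not data from the paper.

```python
def pick_representatives(circuits):
    """Return, for each group, the circuit with the highest Q at k_g = 0.

    `circuits` is an iterable of (circuit_id, group_label, q_at_zero_growth).
    """
    best = {}
    for circuit_id, group, q0 in circuits:
        if group not in best or q0 > best[group][1]:
            best[group] = (circuit_id, q0)
    return {group: circuit_id for group, (circuit_id, _) in best.items()}


# Hypothetical toy values for illustration only.
toy = [
    (1, "high", 0.10), (2, "high", 0.30),
    (3, "moderate", 0.25), (4, "moderate", 0.05),
    (5, "low", 0.02), (6, "low", 0.08),
]
representatives = pick_representatives(toy)
```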
Comment 9: - The insights from the neural network model seem to be very limited. It would be interesting to see if the model can predict the performance of network topologies that have not been exposed to the model during training.
Machine learning is not a focus of this paper. For the section the comment was referring to, the main research question is on the relationship between circuit robustness and topology, and the point we are trying to make is that the robustness dependency varies across different connections — some connections are critical, while others are less impactful. The neural-network-based analysis was only used to provide further support to this point by demonstrating that through optimization, neural networks automatically assign different levels of weights to different connections in the circuits.
We agree that it would be interesting to study how machine learning can help us design functional and robust circuits, as discussed in the final paragraph of the Discussion section. However, such an investigation would require a series of more comprehensive and carefully designed simulation experiments to validate whether “neural networks can predict the performance of network topologies that have not been exposed to the model during training”. One point that requires extra care is that many of the network topologies we study are very similar to one another, with shared motifs and links. These considerations extend beyond the scope of this paper.
Other potential improvements or future work
Comment 10: - The growth feedback examined in this paper comes from the effect of protein levels on the cell division rate (growth rate). However, the opposite effect can also occur; cell growth rates can affect the distribution of protein expression in cell populations. A good reference is Kheir Gouda et al., which is already on the list of references. These opposite effects should be described and discussed.
We agree that growth feedback is inherently complex and has many biological effects, and in our paper, we are using a simplified model to study the dominant factor of growth feedback. We have added the following paragraph in the Discussion section, which involves the opposite effect mentioned in the comment.
“In our paper, we consider dilution due to cell growth as the dominant factor of growth feedback. Here we compared the adaptive circuits under no-growth conditions and their ability to maintain their adaptive behaviors after dilution into a fresh medium, which mediated a significant dilution to the circuits. This is based on our previous work (Zhang et al. (2020)). However, growth feedback is inherently complex (Klumpp et al. (2009)). For instance, an increased growth rate can change protein synthesis rate (Hintsche and Klumpp (2013); Scott and Hwa (2023)), and cell growth rates can affect the distribution of protein expression in cell populations (Gouda et al. (2019)). In our paper, we concentrate on a simplified model with dilution, which we consider to have captured the dominant factor. The dynamic roles of the dilution and growth-affected production rate should be analogous, given that they both act as inhibitory factors arising from cell growth. Incorporating the impact of growth rate on protein synthesis into our model would offer a more comprehensive analysis, a task beyond the scope of this paper but presenting an intriguing opportunity for future research to address the complexities of growth feedback.”
Comment 11: - It may be worth mentioning that growth feedback can lead to persistence, see PMID:27010473.
We have included this research as a citation.
Comment 12: - While some other networks (two-node) are discussed, it would be worth doing this analysis for all one- and two-node networks, perhaps controlled by small molecules added externally. If not here, then as a future plan.
We agree that this is an interesting idea for future studies.
Comment 13: - The manuscript analyzes the deterministic dynamics of a set of gene networks. However, gene expression is always stochastic, and gene circuits have been designed to control stochastic gene expression. For example, gene expression distributions can be reshaped, or even new peaks can appear, which would be worth mentioning, PMID: 30341217. The effect of growth feedback on stochastic gene expression and future perspectives of systematically studying this should be discussed.
We have added the following paragraph in the Discussion section to discuss the effects of noise and stochasticity. The research mentioned in the comment is also cited.
“Our study focuses on scenarios where random noises are ignored. Realistically, gene circuits are subjected to diverse types of noise, which can complicate their predictable behavior and design. These noises can originate externally from a noisy input signal I, or intrinsically, directly affecting the circuit components. Further, these noises can be classified based on various mechanisms that cause them (Colin et al. (2017); Sartori and Tu (2011)). And with different mechanisms, each type of noise can be characterized by different attributes such as frequency, amplitude, and noise color. These variances can lead to different impacts on the circuits, potentially necessitating unique mechanisms or designs for the attenuation of each category (Sartori and Tu (2011); Qiao et al. (2019)). Given the extensive complexity and the need for thorough investigation, these noise-related challenges are beyond the scope of this paper and require a series of future studies.”
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
In this work, the authors present a cornucopia of data generated using deep mutational scanning (DMS) of variants in MET kinase, a protein target implicated in many different forms of cancer. The authors conducted a heroic amount of deep mutational scanning, using computational structural models to augment the interpretation of their DMS findings.
Strengths:
This powerful combination of computational models, experimental structures in the literature, dose-response curves, and DMS enables them to identify resistance and sensitizing mutations in the MET kinase domain, as well as consider inhibitors in the context of the clinically relevant exon-14 deletion. They then try to use the existing language model ESM1b augmented by an XGBoost regressor to identify key biophysical drivers of fitness. The authors provide an incredible study that has a treasure trove of data on a clinically relevant target that will appeal to many.
We thank Reviewer 1 for their generous assessment of our manuscript!
Weaknesses:
However, the authors do not equally consider alternative possible mechanisms of resistance or sensitivity beyond the impact of mutation on binding, even though the measure used to discuss resistance and sensitivity is ultimately a resistance score derived from the increase or decrease of the presence of a variant during cell growth.
For this resistance screen, Ba/F3 was a carefully chosen cellular selection system due to its addiction to exogenously provided IL-3, undetected expression of endogenous RTKs (including MET), and dependence on kinase transgenes to promote signaling and growth under IL-3 withdrawal. Together this allows for the readout of variants that alter kinase-driven proliferation without the caveat of bypass resistance. In our previous phenotypic screen (Estevam et al., 2024, eLife), we also carefully examined the impact of all possible MET kinase domain mutations both in the presence and absence of IL-3 withdrawal, but no inhibitors. There, we identified a small group of mutations that were associated with gain-of-function behavior located at conserved regulatory motifs outside of the catalytic site, yet these mutations were largely sensitive to inhibitors within this screen.
Here, the majority of resistance mutations were located at or near the ATP-binding pocket, suggesting an impact on resistance through direct drug interactions. However, there was also a small population of distal mutations that met our statistical definitions of resistance. Within the crizotinib selection, sites such as T1293, L1272, T1261, amongst others, demonstrated resistance profiles but were located in C-lobe away from the catalytic site. While we did not experimentally validate these specific mutations, it is possible that non-direct drug binders instead promote resistance through allosteric or conformational mechanisms which preserve kinase activity and signaling. Indeed, our ML framework explicitly included conformational and stability effects as significant in improving predictions.
We would be happy to further discuss any specific alternative resistance mechanisms Reviewer 1 has in mind! Thank you for highlighting this!
There are also points of discussion and interpretation that rely heavily on docked models of kinase-inhibitor pairs without considering alternative binding modes or providing any validation of the docked pose. Lastly, the use of ESM1b is powerful but constrained heavily by the limited structural training data provided, which can lead to misleading interpretations without considering alternative conformations or poses.
The majority of our interpretations are grounded in the X-ray structures of WT MET bound to the inhibitors studied (or close analogs). The use of docked models (note - to mutant structures predicted by UMol, not ESM, that can have conformational changes) is primarily in the ML part of the manuscript. Indeed, in our models, conformational and binding mode changes are taken into account as features (see Ligand RMSD, Residue RMSD). There are certainly improved methods (AF3 variants) emerging that might have even more power to model these changes, but they come with greater computational costs and are something we will be evaluating in the future.
We added to the results section: “While our features can account for some changes in MET-mutant conformation and altered inhibitor binding pose, the prediction of these aspects can likely be improved with new methods.”
Reviewer #2 (Public review):
Summary:
This manuscript provides a comprehensive overview of potential resistance mutations within MET Receptor Tyrosine Kinase and defines how specific mutations affect different inhibitors and modes of target engagement. The goal is to identify inhibitor combinations with the lowest overlap in their sensitivity to resistant mutations and determine if certain resistance mutations/mechanisms are more prevalent for specific modes of ATP-binding site engagement. To achieve this, the authors measured the ability of ~6000 single mutants of MET's kinase domain (in the context of a cytosolic TPR fusion) to drive IL-3-independent proliferation (used as a proxy for activity) of Ba/F3 cells (deep mutational profiling) in the presence of 11 different inhibitors. The authors then used co-crystal and docked structures of inhibitor-bound MET complexes to define the mechanistic basis of resistance and applied a protein language model to develop a predictive model of inhibitor sensitivity/resistance.
Strengths:
The major strengths of this manuscript are the comprehensive nature of the study and the rigorous methods used to measure the sensitivity of ~6000 MET mutants in a pooled format. The dataset generated will be a valuable resource for researchers interested in understanding kinase inhibitor sensitivity and, more broadly, small molecule ligand/protein interactions. The structural analyses are systematic and comprehensive, providing interesting insights into resistance mechanisms. Furthermore, the use of machine learning to define inhibitor-specific fitness landscapes is a valuable addition to the narrative. Although the ESM1b protein language model is only moderately successful in identifying the underlying mechanistic basis of resistance, the authors' attempt to integrate systematic sequence/function datasets with machine learning serves as a foundation for future efforts.
We thank Reviewer 2 for their thoughtful assessment of our manuscript!
Weaknesses:
The main limitation of this study is that the authors' efforts to define general mechanisms between inhibitor classes were only moderately successful due to the challenge of uncoupling inhibitor-specific interaction effects from more general mechanisms related to the mode of ATP-binding site engagement. However, this is a minor limitation that only minimally detracts from the impressive overall scope of the study.
We agree. We have added to the discussion: “A full landscape of mutational effects can help to predict drug response and guide small molecule design to counteract acquired resistance. The ability to define molecular mechanisms towards that goal will likely require more purposefully chosen chemical inhibitors and combinatorial mutational libraries to be maximally informative.”
Reviewer #3 (Public review):
Summary:
In the manuscript 'Mapping kinase domain resistance mechanisms for the MET receptor tyrosine kinase via deep mutational scanning' by Estevam et al, deep mutational scanning is used to assess the impact of ~5,764 mutants in the MET kinase domain on the binding of 11 inhibitors. Analyses were divided by individual inhibitor and kinase inhibitor subtypes (I, II, I 1/2, and III). While a number of mutants were consistent with previous clinical reports, novel potential resistance mutants were also described. This study has implications for the development of combination therapies, namely which combination of inhibitors to avoid based on overlapping resistance mutant profiles. While one suggested pair of inhibitors with the least overlapping resistance mutation profiles was suggested, this manuscript presents a proof of concept toward a more systematic approach for improved selection of combination therapeutics. Furthermore, in a final part of this manuscript the data was used to train a machine learning model, the ESM-1b protein language model augmented with an XG Boost Regressor framework, and found that they could improve predictions of resistance mutations above the initial ESM-1b model.
Strengths:
Overall this paper is a tour-de-force of data collection and analysis to establish a more systematic approach for the design of combination therapies, especially in targeting MET and other kinases, a family of proteins significant to therapeutic intervention for a variety of diseases. The presentation of the work is mostly concise and clear with thousands of data points presented neatly and clearly. The discovery of novel resistance mutants for individual MET inhibitors, kinase inhibitor subtypes within the context of MET, and all resistance mutants across inhibitor subtypes for MET has clinical relevance. However, probably the most promising outcome of this paper is the proposal of the inhibitor combination of Crizotinib and Cabozantinib as Type I and Type II inhibitors, respectively, with the least overlapping resistance mutation profiles and therefore potentially the most successful combination therapy for MET. While this specific combination is not necessarily the point, it illustrates a compelling systematic approach for deciding how to proceed in developing combination therapy schedules for kinases. In an insightful final section of this paper, the authors approach using their data to train a machine learning model, perhaps understanding that performing these experiments for every kinase for every inhibitor could be prohibitive to applying this method in practice.
We thank Reviewer 3 for their assessment of our manuscript (we are very happy to have it described as a tour-de-force!)
Weaknesses:
This paper presents a clear set of experiments with a compelling justification. The content of the paper is overall of high quality. The points below mostly concern clarifications in presentation.
Two places could use more computational experiments and analysis, however. Both are presented as suggestions, but at least a discussion of these topics would improve the overall relevance of this work. In the first case it seems that while the analyses conducted on this dataset were chosen with care to be the most relevant to human health, further analyses of these results and their implications for our understanding of allosteric interactions and their effects on inhibitor binding would be a relevant addition. For example, for any given residue type found to be a resistance mutant, are there consistent amino acid substitutions for which a large or small effect is found? For example, is a mutation from alanine to phenylalanine always deleterious? One can assume, though, that the exact location of a residue matters significantly. Some of this analysis is done in dividing resistance mutants by those that are near the inhibitor binding site and those that aren't, but more of these types of analyses could help the reader understand the large amount of data presented here. A mention at least of the existing literature in this area and the lack or presence of trends would be worthwhile. For example, is there any correlation with a simpler metric like the Grantham score to predict effects of mutations (in a way the ESM-1b model is a better version of this, so this is somewhat implicitly discussed)?
Indeed we experimented with including these types of features in the XGBoost scheme (particularly residue volume change and distance) to augment the predictive power of the ESM model - see Figure 8 - figure supplement 1; however, we didn’t find them as significant. Therefore, the signal is likely very small and/or incorporated into the baseline ESM model.
Indeed, this discussion relates to the second point this manuscript could improve upon: the machine learning section. The main actionable item here is that this results section seems the least polished and could do a better job describing what was done. In the figure it looks like results for certain inhibitors were held out as test data - was this all mutants for a single inhibitor, or some other scheme? Overall I think the implications of this section could be fleshed out, potentially with more experiments.
Figure 8A and the methods section contain a very detailed explanation of test data. We have thought about it and do not have any easy path to improve the description, which we reproduce here:
“Experimental fitness scores of MET variants in the presence of DMSO and AMG458 were ignored in model training and testing since having just one set of data for a type I ½ inhibitor and DMSO leads to learning by simply memorizing the inhibitor type, without generalizability. The remaining dataset was split into training and test sets to further avoid overfitting (Figure 8A). The following data points were held out for testing - (a) all mutations in the presence of one type I (crizotinib) and one type II (glesatinib analog) inhibitor, (b) 20% of randomly chosen positions (columns) and (c) all mutations in two randomly selected amino acids (rows) (e.g. all mutations to Phe, Ser). After splitting the dataset into train and test sets, the train set was used for XGBoost hyperparameter tuning and cross-validation. For tuning the hyperparameters of each of the XGBoost models, we held out 20% of randomly sampled data points in the training set and used the remaining 80% data for Bayesian hyperparameter optimization of the models with Optuna (Akiba et al., 2019), with an objective to minimize the mean squared error between the fitness predictions on 20% held out split and the corresponding experimental fitness scores. The following hyperparameters were sampled and tuned: type of booster (booster - gbtree or dart), maximum tree depth (max_depth), number of trees (n_estimators), learning rate (eta), minimum leaf split loss (gamma), subsample ratio of columns when constructing each tree (colsample_bytree), L1 and L2 regularization terms (alpha and beta) and tree growth policy (grow_policy - depthwise or lossguide). After identifying the best combination of hyperparameters for each of the models, we performed 10-fold cross validation (with re-sampling) of the models on the full training set. The training set consists of data points corresponding to 230 positions and 18 amino acids. 
We split these into 10 parts such that each part corresponds to data from 23 positions and 2 amino acids. Then, at each of 10 iterations of cross-validation, models were trained on 9 of 10 parts (207 positions and 16 amino acids) and evaluated on the 1 held out part (23 positions and 2 amino acids). Through this protocol we ensure that we evaluate performance of the models with different subsets of positions and amino acids. The average Pearson correlation and mean squared error of the models from these 10 iterations were calculated and the best performing model out of 8192 models was chosen as the one with the highest cross-validation correlation. The final XGBoost models were obtained by training on the full training set and also used to obtain the fitness score predictions for the validation and test sets. These predictions were used to calculate the inhibitor-wise correlations shown in Figure 8B.”
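The blocked cross-validation in the quoted methods can be sketched roughly as below. This is a minimal illustration under stated assumptions, not our actual pipeline: the function name, the random seed, and the placeholder position and amino-acid labels are hypothetical, and we assume here that "with re-sampling" means the two held-out amino acids are drawn afresh for each fold (ten folds of two cannot partition 18 amino acids without reuse).

```python
import random


def blocked_folds(positions, amino_acids, n_folds=10, aa_per_fold=2, seed=0):
    """Yield (train, held_out) pairs, each a (position set, amino-acid set).

    Positions are partitioned into n_folds disjoint blocks. Amino acids are
    drawn afresh for every fold (assumed reading of "with re-sampling").
    A data point is used for training only if both its position and its
    amino acid are in the training sets, and for evaluation only if both
    are in the held-out sets.
    """
    rng = random.Random(seed)
    shuffled = list(positions)
    rng.shuffle(shuffled)
    block = len(shuffled) // n_folds  # 230 // 10 = 23 positions per fold
    for k in range(n_folds):
        test_pos = set(shuffled[k * block:(k + 1) * block])
        test_aa = set(rng.sample(list(amino_acids), aa_per_fold))
        train_pos = set(shuffled) - test_pos
        train_aa = set(amino_acids) - test_aa
        yield (train_pos, train_aa), (test_pos, test_aa)


positions = range(230)              # placeholder position indices
amino_acids = "ACDEGHIKLMNPQRTVWY"  # 18 placeholder residue types
folds = list(blocked_folds(positions, amino_acids))
```

Holding out whole positions and whole amino acids together, rather than random individual data points, is what lets each fold probe generalization to unseen rows and columns of the mutational landscape.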
As mentioned in the 'Strengths' section, one of the appealing aspects of this paper is indeed its potential wide applicability across kinases -- could you use this ML model to predict resistance mutants for an entirely different kinase? This doesn't seem far-fetched, and would be an extremely compelling addition to this paper to prove the value of this approach.
This is exactly where we want to go next! But as we see here, it will be hard and will require a more purposeful selection of chemicals, and likely combinatorial mutations, to be maximally informative (see also our response to Reviewer 2, where we have added text).
Another area in which this paper could improve its clarity is in the description of caveats of the assay. The exact math used to define resistance mutants and its dependence on the DMSO control is interesting, and it is worth discussing what the failure modes of this procedure might be. Could it be that the resistance mutants identified in this assay would differ significantly from those found in patients? That results here are consistent with those seen in the clinic is promising, but discrepancies could remain.
Thank you for pointing this out. The greatest trade-off of probing the intracellular MET kinase (juxtamembrane, kinase domain, c-tail) in the constitutively active TPR system is that, while we gain cytoplasmic expression, constitutive oligomerization, and HGF-independent activation, other features such as membrane-proximal effects are lost, and the translatability of some mutations in non-proliferative conditions may also be limited. Nevertheless, because Ba/F3 cells have undetectable expression of endogenous RTKs and are addicted to exogenous interleukin-3 (IL-3), IL-3 withdrawal serves as an effective readout of transgenic kinase variant effects.
In our previous study, we were also interested in comparing the phenotypic results to available patient populations in cBioPortal. We observed that our DMS captured known oncogenic MET kinase variants, in addition to a population of gain-of-function variants within clinical residue positions that have not been clinically reported. Interestingly, the population of possible novel gain-of-function mutant codons was more distant in genetic space (2-3 Hamming distance) from wild type than the clinically reported variant codons (1-2 Hamming distance).
For this inhibitor screen, we also carefully compared previously reported and validated resistance mutations across referenced publications with those from our inhibitor screen, and observed large agreement as noted in-text. While discrepancies could definitely remain, there is precedent for consistency.
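For readers less familiar with the term, the codon-level Hamming distance used in this comparison counts the nucleotide positions at which a mutant codon differs from the wild-type codon. A minimal sketch follows; the function names are hypothetical and the example codons are illustrative, not taken from the MET sequence.

```python
def hamming(codon_a, codon_b):
    """Number of nucleotide positions at which two codons differ."""
    if len(codon_a) != len(codon_b):
        raise ValueError("codons must have equal length")
    return sum(a != b for a, b in zip(codon_a, codon_b))


def min_distance_to_residue(wt_codon, target_codons):
    """Fewest nucleotide changes needed to reach any codon of a residue."""
    return min(hamming(wt_codon, codon) for codon in target_codons)


# Illustrative example: from an Asp codon (GAC), His (CAT or CAC) is one
# nucleotide change away, whereas Trp (TGG) requires all three to change.
asp_to_his = min_distance_to_residue("GAC", ["CAT", "CAC"])
asp_to_trp = min_distance_to_residue("GAC", ["TGG"])
```

Substitutions at distance 1 are reachable by a single nucleotide change, which is one plausible reason they are seen in patients more often than substitutions at distance 2 or 3.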
Furthermore a more in depth discussion of the MetdelEx14 results is warranted. For example, why is the DMSO signature in Figure 1 - supplement 4 so different from that of Figure 1?
In our previous study (Estevam et al., 2024), we more directly compared MET and METΔExon14, and while we observed several differences, especially at conserved regulatory motifs, the TPR expression system did not provide a robust differential. Therefore, we hypothesize that a membrane-bound context is likely necessary to obtain a differential that captures juxtamembrane regulatory effects for these two isoforms. For that reason, we did not place heavy emphasis on the differences between MET and METΔExon14 in this study. Nevertheless, we performed parallel analysis of the METΔExon14 inhibitor DMS and provided all source and analyzed data in our GitHub repository (https://github.com/fraser-lab/MET_kinase_Inhibitor_DMS).
In our analysis of resistance, we used Rosace to score and compare DMSO and inhibitor landscapes. We present the full distribution of raw scores in Figure 1 for each condition. However, to visually highlight resistance mutations as a heatmap, we subtracted the scores of each variant in each inhibitor condition from the raw DMSO score, making the heatmaps in Figure 1 - supplement 4 appear more “blue.”
And finally, there is a lot of emphasis put on the unexpected results of this assay for the tivantinib "type III" inhibitor - could this in fact be because the molecule "is highly selective for the inactive or unphosphorylated form of c-Met" according to Eathiraj et al JBC 2011?
The work presented by Eathiraj et al JBC 2011 is a key study we reference and is foundational to tivantinib. While the point brought up about tivantinib’s selective preference for an inactive conformation is valid, this is also true for type II kinase inhibitors. In our study, regardless of inhibitor conformational preference, tivantinib was the only one with a nearly identical landscape to DMSO and exhibited selection even in the absence of Ba/F3 MET-addiction (Figure 1E). This result is in closer agreement with MET agnostic behavior reported by Basilico et al., 2013 and Katayama et al., 2013.
While this paper is crisply written with beautiful figures, the complexity of the data warrants a bit more clarity in how the results are visualized. Namely, clearly highlighting mutants that have previously reported and those identified by this study across all figures could help significantly in understanding the more novel findings of the work.
To better compare and contrast the novel mutations identified in this study with those reported by others, we compiled a list of reported resistance mutations from recent clinical and experimental studies (Pecci et al., 2024; Yao et al., 2023; Bahcall et al., 2022; Recondo et al., 2020; Rotow et al., 2020; Fujino et al., 2019), since, to the best of our knowledge, no database with direct resistance annotations exists for MET. In total, this amounted to 31 annotated resistance mutations across crizotinib, capmatinib, tepotinib, savolitinib, cabozantinib, merestinib, and glesatinib, which we have now tabulated in a new figure (Figure 4) and discuss in new commentary in the main text:
To assess the agreement between our DMS and previously annotated resistance mutations, we compiled a list of reported resistance mutations from recent clinical and experimental studies (Pecci et al., 2024; Yao et al., 2023; Bahcall et al., 2022; Recondo et al., 2020; Rotow et al., 2020; Fujino et al., 2019) (Figure 4A,B). Overall, previously discovered mutations are strongly shifted to a GOF distribution for the drugs where resistance is reported from treatment or experiment; in contrast, the distribution is centered around neutral for those sites for other drugs not reported in the literature (Figure 4C). However, even in cases such as L1195V, we observe GOF DMS scores indicative of resistance to previously reported inhibitors. Given this overall strong concordance with prior literature and clinical results, we can also provide hypotheses to clarify the role of mutations that are observed in combination with others. For example, H1094Y is a reported driver mutation that has been linked to resistance in METΔEx14 for glesatinib with either the secondary L1195V mutation or in isolation (Recondo et al., 2020). However, in our assay H1094Y demonstrated slight sensitivity to glesatinib, suggesting that resistance is linked to either the exon 14 deletion isoform, the L1195V mutation, or a cellular factor not modeled well by the Ba/F3 system.
Finally, the potential impacts and follow-ups of this excellent study could be communicated better - it is recommended that they advertise better this paper as a resource for the community both as a dataset and as a proof of concept. In this realm I would encourage the authors to emphasize the multiple potential uses of this dataset by others to provide answers and insights on a variety of problems.
Please see below
Related to this, the decision to include the MetdelEx14 results, but not discuss them at all is interesting, do the authors expect future analyses to lead to useful insights? Is it surprising that trends are broadly the same to the data discussed?
Our previous paper suggests that Ba/F3 isn’t a great model for measuring the differences between MET and METΔEx14, so we haven’t emphasized them other than to point to our previous paper. We include the full analysis here nonetheless as a resource. The greatest differences between resistance mutant behaviors would potentially be observed in the full-length, membrane-bound MET and METΔEx14 receptor isoforms. While outside of the scope of this study, there is great potential to use the resistance mutations identified in this study as a filtered group to test and map differential inhibitor sensitivities between receptor isoforms.
And finally it could be valuable to have a small addition of introspection from the authors on how this approach could be altered and/or improved in the future to facilitate the general application of this approach for combination therapies for other targets.
See also reviewer 2 response where we have added text.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Major points of revision:
(1) It seems like much of the structural interpretation of the inhibitor binding mode, outside of crizotinib binding, appears to come from docked models of the inhibitor to the MET kinase domain. Given the potential variability of the docked structure to the kinase domain, it would be useful for the authors to consider alternative possible binding modes that their docking pipeline may have suggested. It could also be useful to provide some degree of validation or contextualization of their docking models.
All individual figures were very carefully inspected based on either existing crystal structures of the inhibitor or closely related inhibitors (ATP, 3DKC; crizotinib, 2WGJ; tepotinib, 4R1V; tivantinib, 3RHK; AMG-458, 5T3Q; NVP-BVU972, 3QTI; merestinib, 4EEV; savolitinib, 6SDE). In total, four structural interpretations were the result of docking onto reference experimental structures (capmatinib, cabozantinib, glumetinib, glesatinib). As we wrote above, different conformations and binding modes are possible in predicted mutant structures (as we did here at scale) and are already included in the ML analysis.
(2) In the first section, the authors classify an inhibitor as Type Ia on docking models, but mention the conflicting literature describing it as type Ib - it would be helpful to provide a contextualization of why this distinction between Ia and Ib matters, and what difference it might make. It would also be useful to know if their docking score only suggested poses compatible with Ia or if other poses were provided as well. Validation using other method might be beneficial, especially since they acknowledge the conflicting literature for classification. Or at least recontextualization that more evidence would be needed.
We base the classifications in this study on several canonical structural definitions of kinase inhibitors. Specifically, type I inhibitors are classified in MET by interactions with Y1230, D1228, and K1110, in addition to their conformation in the ATP-binding site. Type I inhibitors are further subclassified as type Ia in MET if they leverage interactions with the solvent front and residue G1163. In the prior literature referenced, tepotinib was classified as type Ib, which would imply that it lacks solvent-front interactions, like savolitinib (PDB 6SDE) or NVP-BVU972 (PDB 3QTI). However, in the tepotinib experimental structure (PDB 4R1V), we observed a greater structural resemblance to other type Ia inhibitors than to type Ib inhibitors (Figure 1 - figure supplement 1b).
(3) The measure used to discuss resistance and sensitivity is ultimately a resistance score derived from the increase or decrease of the presence of a variant during cell growth. This is not a measure of direct binding. It would be helpful if the authors discussed alternative mechanisms through which these variants may impact resistance and/or sensitivity, such as stability, protonation effects, or kinase activity. The score itself may be convolving over all these potential mechanisms to drive GOF and LOF observed behavior.
See the response to the public review. Indeed, our ML framework explicitly included conformational and stability effects as significant in improving predictions.
(4) While it is promising to try and improve the predictive properties of ESM1b, it is not exactly clear why the authors considered their structural data of 11 inhibitors a sufficient dataset with which to augment the model. It would be useful for the authors to provide some additional context for why they wished to augment ESM1b in particular with their dataset, and provide any metrics indicating that their training data of 11 inhibitors provided an adequate statistical sample.
We don’t understand what this means. Sorry!
(5) The authors use ESM-1b to predict the fitness impact of each mutation and augment it using protein structural data of drug-target interactions. However, using an XGBoost regressor on a single set of 11 kinase-inhibitor interaction pairs is an incredibly sparse dataset to train upon. It would be useful for the authors to consider the limitations of their model, as well as its extensibility in the context of alternate binding poses, alternate conformations, or changes in protonation states of ligand or inhibitor.
On the contrary - this is 11 chemicals across 3000 mutations. We have discussed alternative interpretations above.
Minor points:
(1) It would also be useful for the authors to provide more context around their choice of regressor. XGBoost is a powerful regressor but can easily overfit high dimensional data when paired with language models such as ESM-1b. This would be particularly useful since some of the features to train on were also generated using existing models such as ThermoMPNN.
Yes - we are quite concerned about overfitting and have tried to assess overfitting by careful design of test and validation sets.
(2) The authors also mention excluding their DMSO and AMG458 scores in the model training and testing due to overfitting issues - it would be useful to have an SI figure pointing to this data.
No - we exclude the DMSO because that is the reference (baseline) and AMG because it has a different binding mode. This isn’t related to overfitting.
(3) The authors mention in their docking pipeline that 5 binding modes were used for each ligand docking, but it appears that only one binding mode is considered in the main figures. It would be useful for the authors to provide additional details about what were the other binding modes used for, how different were each binding mode, and how was the "primary" mode selected (and how much better was its score than the others).
The reviewer misinterprets the difference between poses shown in figures, based mostly on crystal structures or carefully selected templates, and the use of docked models in feature engineering for the ML part of the study. Where crystal structures do not exist, we docked capmatinib, cabozantinib, glumetinib, and glesatinib onto reference structures bound to type I (2WGJ) and type II (4EEV) inhibitors. We selected one representative binding mode based on the reference inhibitor, and while not exact, at a minimum these models provide a basis for structural interpretation.
Reviewer #2 (Recommendations for the authors):
My main suggestion is for the authors to add a few sentences (in non-technical language) to the results section, specifically before the results shown in Figure 3, defining gain-of-function, loss-of-function, resistance, and sensitivity. While these definitions are present in the materials and methods section, explicitly discussing them prior to the relevant results would significantly improve the overall readability of the manuscript.
We defined “gain-of-function” and “loss-of-function” mutations as those with fitness scores statistically greater or lower than wild-type. Within the DMSO condition, gain-of-function and loss-of-function labels describe mutational perturbation to protein function, whereas within inhibitor conditions, the labels describe the difference in fitness introduced by an inhibitor.
We have also clarified these definitions where the terms are first introduced: “As expected, the DMSO control population displayed a bimodal distribution with mutations exhibiting wild-type fitness centered around 0, with a wider distribution of mutations that exhibited loss- or gain-of-function effects, as defined by fitness scores with statistically significant lower or greater scores than wild-type, respectively.”
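As an illustration of the definition above, the following minimal sketch labels a variant by testing whether its fitness score differs from wild-type (taken as 0). The two-sided z-test, the 0.05 threshold, and the names `VariantScore`, `two_sided_p`, and `classify` are assumptions made for this example; the statistical test actually used in the study is not stated in this response.

```python
from dataclasses import dataclass
from math import erf, sqrt


@dataclass
class VariantScore:
    """Hypothetical container for one variant's score (not the study's data format)."""
    name: str
    score: float      # fitness score relative to wild-type, with wild-type at 0
    std_error: float  # standard error of the score


def two_sided_p(z: float) -> float:
    """Two-sided p-value of a z statistic under a standard normal distribution."""
    cdf = 0.5 * (1.0 + erf(abs(z) / sqrt(2.0)))
    return 2.0 * (1.0 - cdf)


def classify(variant: VariantScore, alpha: float = 0.05) -> str:
    """Label a variant relative to wild-type (assumed z-test, assumed alpha)."""
    if variant.std_error <= 0:
        raise ValueError("std_error must be positive")
    z = variant.score / variant.std_error
    if two_sided_p(z) >= alpha:
        return "wild-type-like"
    return "gain-of-function" if z > 0 else "loss-of-function"
```

In the DMSO condition such a label would describe the effect of the mutation on protein function, whereas in an inhibitor condition it would describe the fitness difference introduced by the inhibitor, as stated above.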
Figure 7D. Please add a bit more detail to the legend on how fold change (y-axis) was calculated.
Here, fold change represents the number of viable cells at each inhibitor concentration relative to the TKI-free control, measured with the CellTiter-Glo® Luminescent Cell Viability Assay (Promega) as an end point readout. We have updated the legend of Figure 7D with calculation details: “Dose-response for each inhibitor concentration is represented as the fraction of viable cells relative to the TKI free control.”
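The calculation described in the legend can be sketched as follows. This is a minimal example, assuming that a luminescence signal stands in for the number of viable cells; the function name `fraction_viable` and the input layout are hypothetical and not taken from the manuscript.

```python
def fraction_viable(signal_by_concentration: dict[float, float],
                    control_signal: float) -> dict[float, float]:
    """Express each viability signal as a fraction of the inhibitor-free control.

    signal_by_concentration maps an inhibitor concentration to its viability
    signal; control_signal is the signal of the inhibitor-free control.
    """
    if control_signal <= 0:
        raise ValueError("control_signal must be positive")
    return {
        concentration: signal / control_signal
        for concentration, signal in signal_by_concentration.items()
    }
```

A value of 1.0 then means viability equal to the control, and values below 1.0 indicate growth inhibition at that concentration.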
I must admit, I did not understand what "Specific inhibitor fitness landscapes also aid in identifying mutations with potential drug sensitivity, such as R1086 and C1091 in the MET P-loop" means. These are positions where most mutations lead to greater sensitivity to crizotinib. Is the idea that there are potentially clinically-relevant MET mutations that can be targeted over wild type with crizotinib?
Thank you for highlighting this! The P-loop (phosphate-binding loop) is a glycine-rich structural motif conserved in kinase domains. This motif is located in the N-lobe, where its primary role is to gate ATP entry into the active site and stabilize the phosphate groups of ATP when bound. Therefore, the P-loop is a common target region for ATP-competitive inhibitor design, but also a site where resistance can emerge (Roumiantsev et al., 2002). The idea we’d like to convey is that identifying residues that offer the potential for drug stabilization, with the added benefit of a lower risk of resistance, is an attractive consideration for novel inhibitor design.
We have added to the text: “Individual inhibitor resistance landscapes also aid in identifying target residues for novel drug design by providing insights into mutability and known resistance cases. This enables the selection of vectors for chemical elaboration with potential lower risk of resistance development. Sites with mutational profiles such as R1086 and C1091, located in the common drug target P-loop of MET, could be likely candidates for crizotinib.”
Reviewer #3 (Recommendations for the authors):
(1) Suggested Improvements to the Figures:
a) Figure 4A - T1261 seems to be mislabeled
b) In Figure 3A it's suggested to highlight mutants determined to be resistance mutants by this scheme.
c) In Figure 3D it would be informative to highlight which of these resistance mutants have already been previously reported and which are novel to this study
d) Throughout figures 3A, 3D, and 4G the graphical choices on how to highlight synonymous mutations and mutations not performed in the assay needs improvement.
The Green vs Grey 'TRUE' vs 'FALSE' boxes are confusing. Just a green box indicating synonymous mutations would be sufficient. Additionally these green boxes are hard to see, and often edges of this green box are currently missing making it even more difficult to see and interpret.
* In Figure 4A mutants do not seem to be indicated by a line or plus sign, but this is not explained in the legend or the caption. Please add.
* In 3D and 4G it is not clear if the mutants not performed are indicated at all - perhaps they are indicated in white, making them indistinguishable from scores with 0. Please clarify.
T1261 and G1242 are now correctly labeled.
In text we have also highlighted reported resistance mutations for crizotinib, which are inclusive of clinical reports and in vitro characterization: “These sites, and many of the individual mutations, have been noted in prior reports, such as: D1228N/H/V/Y, Y1230C/H/N/S, G1163R.”
We have adjusted the heatmaps to improve visual clarity. Mutations with score 0 are white, as indicated by the scale bar, and mutations uncaptured by the screen are now in light yellow. The green outline distinguishing WT synonymous mutations has also been adjusted so edges are no longer cut off. In our representations, we only distinguished mutations by the score color scale bar and WT outline. What looked like a “plus” or “line” in the original figure was only the heatmap background, which now should be resolved in the updated figure and legends for Figure 3 and Figure 4.
(2) Some Minor Suggested Improvements to the Text:
a) The abbreviation CBL for 'CBL docking site' is used without being defined.
b) Figure 3G is referenced, but it does not exist.
c) In the sentence 'Beyond these well characterized sites, regions with sensitivity occurred throughout the kinase, primarily in loop-regions which have the greatest mutational tolerance in DMSO, but do not provide a growth advantage in the presence of an inhibitor (Figure 1 - Figure Supplement 1; Figure 1 - Figure Supplement 2).'. It is not clear why these supplemental figures are being referenced.
d) In the supplement section 'Enrich2 Scoring' has what seem like placeholders for citations in [brackets]
Cbl is an E3 ubiquitin ligase that plays a role in MET regulation through engagement with exon 14, specifically at Y1003 when phosphorylated. This mode of regulation was highlighted in more detail in our previous study. However, since Cbl was only mentioned briefly in this study, we have removed reference to it to simplify the text.
In addition, we have removed the figure 3G reference and corrected the in-text range. We have also removed references to figure supplements where unnecessary and edited the “Enrich2 scoring” method section to now reference missing citations.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
In this study from Zhu and colleagues, a clear role for MED26 in mouse and human erythropoiesis is demonstrated that is also mapped to amino acids 88-480 of the human protein. The authors also show the unique expression of MED26 in later-stage erythropoiesis and propose transcriptional pausing and condensate formation mechanisms for MED26's role in promoting erythropoiesis. Despite the author's introductory claim that many questions regarding Pol II pausing in mammalian development remain unanswered, the importance of transcriptional pausing in erythropoiesis has actually already been demonstrated (Martell-Smart, et al. 2023, PMID: 37586368, which the authors notably did not cite in this manuscript). Here, the novelty and strength of this study is MED26 and its unique expression kinetics during erythroid development.
Strengths:
The widespread characterization of kinetics of mediator complex component expression throughout the erythropoietic timeline is excellent and shows the interesting divergence of MED26 expression pattern from many other mediator complex components. The genetic evidence in conditional knockout mice for erythropoiesis requiring MED26 is outstanding. These are completely new models from the investigators and are an impressive amount of work to have both EpoR-driven deletion and inducible deletion. The effect on red cell number is strong in both. The genetic over-expression experiments are also quite impressive, especially the investigators' structure-function mapping in primary cells. Overall the data is quite convincing regarding the genetic requirement for MED26. The authors should be commended for demonstrating this in multiple rigorous ways.
Thank you for your positive feedback.
Weaknesses:
(1) The authors state that MED26 was nominated for study based on RNA-seq analysis of a prior published dataset. They do not however display any of that RNA-seq analysis with regards to Mediator complex subunits. While they do a good job showing protein-level analysis during erythropoiesis for several subunits, the RNA-seq analysis would allow them to show the developmental expression dynamics of all subunit members.
Thank you for this helpful suggestion. While we did not originally nominate MED26 based on RNA-seq analysis, we have analyzed the transcript levels of Mediator complex subunits in our RNA-seq data across different stages of erythroid differentiation (Author response image 1). The results indicate that most Mediator subunits, including MED26, display decreased RNA expression over the course of differentiation, with the exception of MED25, as reported previously (Pope et al., Mol Cell Biol 2013. PMID: 23459945).
Notably, our study is based on initial observations at the protein level, where we found that, unlike most other Mediator subunits that are downregulated during erythropoiesis, MED26 remains relatively abundant. Protein expression levels more directly reflect the combined influences of transcription, translation and degradation processes within cells, and are likely more closely related to biological functions in this context. It is possible that post-transcriptional regulation (such as m6A-mediated improvement of translational efficiency) or post-translational modifications (like escape from ubiquitination) could contribute to the sustained levels of MED26 protein, and this will be an interesting direction for future investigation.
Author response image 1.
Relative RNA expression of Mediator complex subunits during erythropoiesis in human CD34+ erythroid cultures. Different differentiation stages from HSPCs to late erythroblasts were identified using CD71 and CD235a markers, progressing sequentially as CD71-CD235a-, CD71+CD235a-, CD71+CD235a+, and CD71-CD235a+. Expression levels were presented as TPM (transcripts per million).
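For readers unfamiliar with the unit, TPM can be computed as in the minimal sketch below, which follows the standard definition: read counts are divided by transcript length, and the resulting rates are rescaled to sum to one million. The function name `tpm` and the gene labels used with it are placeholders, and the study's actual quantification pipeline is not described here.

```python
def tpm(counts: dict[str, float], lengths_kb: dict[str, float]) -> dict[str, float]:
    """Convert read counts to transcripts per million (TPM).

    counts maps a gene to its read count; lengths_kb maps the same gene to
    its transcript length in kilobases.
    """
    # Length-normalized rate for each gene (reads per kilobase).
    rates = {gene: counts[gene] / lengths_kb[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("all counts are zero; TPM is undefined")
    # Rescale so that the values across all genes sum to one million.
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}
```

Because the values in each sample sum to the same total, TPM describes the relative share of each transcript within a sample, so a decrease in TPM reflects a smaller share of the transcript pool rather than an absolute copy number.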
(2) The authors use an EpoR Cre for red cell-specific MED26 deletion. However, other studies have now shown that the EpoR Cre can also lead to recombination in the macrophage lineage, which clouds some of the in vivo conclusions for erythroid specificity. That being said, the in vitro erythropoiesis experiments here are convincing that there is a major erythroid-intrinsic effect.
Thank you for this insightful comment. We recognize that EpoR-Cre can drive recombination in both erythroid and macrophage lineages (Zhang et al., Blood 2021, PMID: 34098576). However, EpoR-Cre remains the most widely used Cre for studying erythroid lineage effects in the hematopoietic community. Numerous studies have employed EpoR-Cre for erythroid-specific gene knockout models (Pang et al, Mol Cell Biol 2021, PMID: 22566683; Santana-Codina et al., Haematologica 2019, PMID: 30630985; Xu et al., Science 2013, PMID: 21998251.).
While a GYPA (CD235a)-Cre model with erythroid specificity has recently been developed (https://www.sciencedirect.com/science/article/pii/S0006497121029074), it has not yet been officially published. We look forward to utilizing the GYPA-Cre model for future studies. As you noted, our in vivo mouse model and primary human CD34+ erythroid differentiation system both demonstrate that MED26 is essential for erythropoiesis, suggesting that the regulatory effects of MED26 in our study are predominantly erythroid-intrinsic.
(3) The donor chimerism assessment of mice transplanted with MED26 knockout cells is a bit troubling. First, there are no staining controls shown and the full gating strategy is not shown. Furthermore, the authors use the CD45.1/CD45.2 system to differentiate between donor and recipient cells in erythroblasts. However, CD45 is not expressed from the CD235a+ stage of erythropoiesis onwards, so it is unclear how the authors are detecting essentially zero CD45-negative cells in the erythroblast compartment. This is quite odd and raises questions about the results. That being said, the red cell indices in the mice are the much more convincing data.
Thank you for your careful and thorough feedback. We have now included negative staining controls (Author response image 2A, top). We agree that CD45 is typically not expressed in erythroid precursors in normal development. Prior studies have characterized BFU-E and CFU-E stages as c-Kit+CD45+Ter119−CD71low and c-Kit+CD45−Ter119−CD71high cells in fetal liver (Katiyar et al, Cells 2023, PMID: 37174702).
However, our observations indicate that erythroid surface markers differ during hematopoiesis reconstitution following bone marrow transplantation. We found that nearly all nucleated erythroid progenitors/precursors (Ter119+Hoechst+) express CD45 after hematopoiesis reconstitution (Author response image 2A, bottom).
To validate our assay, we performed next-generation sequencing by first mixing mouse CD45.1 and CD45.2 total bone marrow cells at a 1:2 ratio. We then isolated nucleated erythroid progenitors/precursors (Ter119+Hoechst+) by FACS and sequenced the CD45 gene locus by targeted sequencing. The resulting CD45 allele distribution matched our initial mixing ratio, confirming the accuracy of our approach (Author response image 2B).
Moreover, a recent study supports that reconstituted erythroid progenitors can indeed be distinguished by CD45 expression following bone marrow transplantation (He et al., Nature Aging 2024, PMID: 38632351. Extended Data Fig. 8).
In conclusion, our data indicate that newly formed erythroid progenitors/precursors post-transplant express CD45, enabling us to identify nucleated erythroid progenitors/precursors by Ter119+Hoechst+ and determine their origin using CD45.1 and CD45.2 markers.
Author response image 2.
Representative flow cytometry gating strategy of erythroid chimerism following mouse bone marrow transplantation. A. Gating strategy used in the erythroid chimerism assay. B. Targeted sequencing result of Ter119+Hoechst+ cells isolated by FACS. The cell sample was pre-mixed with 1/3 CD45.2 and 2/3 CD45.1 bone marrow cells. Ptprc is the gene locus for CD45.
(4) The authors make heavy use of defining "erythroid gene" sets and "non-erythroid gene" sets, but it is unclear what those lists of genes actually are. This makes it hard to assess any claims made about erythroid and non-erythroid genes.
Thank you for this helpful suggestion. We defined "erythroid genes" and "non-erythroid genes" based on RNA-seq data from Ludwig et al. (Cell Reports 2019. PMID: 31189107. Figure 2 and Table S1). Genes downregulated from stages k1 to k5 are classified as “non-erythroid genes,” while genes upregulated from stages k6 to k7 are classified as “erythroid genes.” We will add this description in the revised manuscript.
(5) Overall the data regarding condensate formation is difficult to interpret and is the weakest part of this paper. It is also unclear how studies of in vitro condensate formation or studies in 293T or K562 cells can truly relate to highly specialized erythroid biology. This does not detract from the major findings regarding genetic requirements of MED26 in erythropoiesis.
Thank you for the rigorous feedback. Assessing the condensate properties of MED26 protein in primary CD34+ erythroid cells or mouse models is indeed challenging. As is common in many condensate studies, we used in vitro assays and cellular assays in HEK293T and K562 cells to examine the biophysical properties (Figure S7), condensate formation capacity (Figure 5C and Figure S7C), key phase-separation regions of MED26 protein (Figure S6), and recruitment of pausing factors (Figure 6A-B) in live cells. We then conducted functional assays to demonstrate that the phase-separation region of MED26 can promote erythroid differentiation similarly to the full-length protein in the CD34+ system and K562 cells (Figure 5A). Specifically, overexpressing the MED26 phase-separation domain accelerates erythropoiesis in primary human erythroid culture, while deleting the Intrinsically Disordered Region (IDR) impairs MED26’s ability to form condensates and recruit PAF1 in K562 cells.
In summary, we used HEK293T cells to study the biochemical and biophysical properties of MED26, and the primary CD34+ differentiation system to examine its developmental roles. Our findings support the conclusion that MED26-associated condensate formation promotes erythropoiesis.
(6) For many figures, there are some panels where conclusions are drawn, but no statistical quantification of whether a difference is significant or not.
Thank you for your thorough feedback. We have checked all figures for statistical quantification and added the relevant statistical analysis methods to the corresponding figure legends (Figure 2L and Figure S4C) to clarify the significance of the observed differences. The updated information will be incorporated into the revised manuscript.
Reviewer #2 (Public review):
Summary:
The manuscript by Zhu et al describes a novel role for MED26, a subunit of the Mediator complex, in erythroid development. The authors have discovered that MED26 promotes transcriptional pausing of RNA Pol II, by recruiting pausing-related factors.
Strengths:
This is a well-executed study. The authors have employed a range of cutting-edge and appropriate techniques to generate their data, including: CUT&Tag to profile chromatin changes and mediator complex distribution; nuclear run-on sequencing (PRO-seq) to study Pol II dynamics; knockout mice to determine the phenotype of MED26 perturbation in vivo; an ex vivo erythroid differentiation system to perform additional, important, biochemical and perturbation experiments; immunoprecipitation mass spectrometry (IP-MS); and the "optoDroplet" assay to study phase-separation and molecular condensates.
This is a real highlight of the study. The authors have managed to generate a comprehensive picture by employing these multiple techniques. In doing so, they have also managed to provide greater molecular insight into the workings of the MEDIATOR complex, an important multi-protein complex that plays an important role in a range of biological contexts. The insights the authors have uncovered for different subunits in erythropoiesis will very likely have ramifications in many other settings, in both healthy biology and disease contexts.
Thank you for your thoughtful summary and encouraging feedback.
Weaknesses:
There are almost no discernible weaknesses in the techniques used, nor the interpretation of the data. The IP-MS data was generated in HEK293 cells when it could have been performed in the human CD34+ HSPC system that they employed to generate a number of the other data. This would have been a more natural setting and would have enabled a more like-for-like comparison with the other data.
Thank you for your positive feedback and insightful suggestions. We will perform validation of the immunoprecipitation results in CD34+ derived erythroid cells to further confirm our findings.
Reviewer #3 (Public review):
Summary:
The authors aim to explore whether other subunits besides MED1 exert specific functions during the process of terminal erythropoiesis with global gene repression, and finally they demonstrated that MED26-enriched condensates drive erythropoiesis through modulating transcription pausing.
Strengths:
Through both in vitro and in vivo models, the authors showed that while MED1 and MED26 co-occupy a plethora of genes important for cell survival and proliferation at the HSPC stage, MED26 preferentially marks erythroid genes and recruits pausing-related factors for cell fate specification. Gradually, MED26 becomes the dominant factor in shaping the composition of transcription condensates and transforms the chromatin towards a repressive yet permissive state, achieving global transcription repression in erythropoiesis.
Thank you for your positive summary and feedback.
Weaknesses:
In the in vitro model, the author only used CD34+ cell-derived erythropoiesis as the validation, which is relatively simple, and more in vitro erythropoiesis models need to be used to strengthen the conclusion.
Thank you for your thoughtful suggestions. We have shown that MED26 promotes erythropoiesis using the primary human CD34+ differentiation system (Figure 2 K-M and Figure S4) and have demonstrated its essential role in erythropoiesis through multiple mouse models (Figure 2A-G and Figure S1-3). Together, these in vitro and in vivo results support our conclusion that MED26 regulates erythropoiesis. However, we are open to further validating our findings with additional in vitro erythropoiesis models, such as iPSC or HUDEP erythroid differentiation systems.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the current reviews.
Reviewer #2 (Public review):
Dipasree Hajra et al demonstrated that Salmonella was able to modulate the expression of Sirtuins (Sirt1 and Sirt3) and regulate the metabolic switch in both host and Salmonella, promoting its pathogenesis. The authors found Salmonella infection induced high levels of Sirt1 and Sirt3 in macrophages, which were skewed toward the M2 phenotype allowing Salmonella to hyper-proliferate. Mechanistically, Sirt1 and Sirt3 regulated the acetylation of HIF-1alpha and PDHA1, therefore mediating Salmonella-induced host metabolic shift in the infected macrophages. Interestingly, Sirt1 and Sirt3-driven host metabolic switch also had an effect on the metabolic profile of Salmonella. Counterintuitively, inhibition of Sirt1/3 led to increased pathogen burdens in an in vivo mouse model. Overall, this is a well-designed study.
The revised manuscript has addressed all of the previous comments. The re-analysis of flow cytometry and WB data by authors makes the results and conclusion more complete and convincing.
We are immensely grateful to the reviewer for improving the strength of the manuscript by providing insightful comments and for appreciating the work.
Reviewer #3 (Public review):
Summary:
In this paper Hajra et al have attempted to identify the role of Sirt1 and Sirt3 in regulating metabolic reprogramming and macrophage host defense. They have performed gene knock down experiments in RAW macrophage cell line to show that depletion of Sirt1 or Sirt3 enhances the ability of macrophages to eliminate Salmonella Typhimurium. However, in mice inhibition of Sirt1 resulted in dissemination of the bacteria but the bacterial burden was still reduced in macrophages. They suggest that the effect they have observed is due to increased inflammation and ROS production by macrophages. They also try to establish a weak link with metabolism. They present data to show that the switch in metabolism from glycolysis to fatty acid oxidation is regulated by acetylation of Hif1a, and PDHA1.
Strengths:
The strength of the manuscript is that the role of Sirtuins in host-pathogen interactions has not been previously explored in-depth making the study interesting. It is also interesting to see that depletion of either Sirt1 or Sirt3 results in a similar outcome.
Weaknesses:
The major weakness of the paper is the low quality of data, making it harder to substantiate the claims. Also, there are too many pathways and mechanisms being investigated. It would have been better if the authors had focussed on either Sirt1 or Sirt3 and elucidated how it reprograms metabolism to eventually modulate host response against Salmonella Typhimurium. Experimental evidence is also lacking to prove the proposed mechanisms. For instance they show correlative data that knock down of Sirt1 mediated shift in metabolism is due to HIF1a acetylation but this needs to be proven with further experiments.
As the reviewer’s public review is unchanged from the previous version and includes no further recommendations for the authors, we stand by our earlier author response. We respect the reviewer’s opinion and thank the reviewer for the critical analysis of our work.
---------
The following is the authors’ response to the previous reviews.
Reviewer #2 (Public Review):
Dipasree Hajra et al demonstrated that Salmonella was able to modulate the expression of Sirtuins (Sirt1 and Sirt3) and regulate the metabolic switch in both host and Salmonella, promoting its pathogenesis. The authors found Salmonella infection induced high levels of Sirt1 and Sirt3 in macrophages, which were skewed toward the M2 phenotype allowing Salmonella to hyper-proliferate. Mechanistically, Sirt1 and Sirt3 regulated the acetylation of HIF-1alpha and PDHA1, therefore mediating Salmonella-induced host metabolic shift in the infected macrophages. Interestingly, Sirt1 and Sirt3-driven host metabolic switch also had an effect on the metabolic profile of Salmonella. Counterintuitively, inhibition of Sirt1/3 led to increased pathogen burdens in an in vivo mouse model. Overall, this is a well-designed study.
Comments on revised version:
The authors have performed additional experiments to address the discrepancy between in vitro and in vivo data. While this offers some potential insights into the in vivo role of Sirt1/3 in different cell types and how this affects bacterial growth/dissemination, I still believe that Sirt1/3 inhibitors could have some effect on the gut microbiota contributing to increased pathogen counts. This possibility can be discussed briefly to give a better scenario of how Sirt1/3 inhibitors work in vivo. Additionally, the manuscript would improve significantly if some of the flow cytometry analysis and WB data could be better analyzed.
We are highly grateful for your valuable and insightful comments. Thank you for appreciating the merit of our manuscript. As rightly pointed out by the eminent reviewer, we acknowledge the probable effect of Sirtuins on the gut microbiota and its contribution to increased bacterial loads, as indicated by previous studies (PMID: 22115311, PMID: 19228061). These reports suggested that treating rats with a low dose of the Sirt1 activator resveratrol for 25 days under 5% DSS-induced colitis altered the gut microbiota profile, increasing lactobacilli and bifidobacteria while reducing the abundance of enterobacteria. These findings are consistent with our study, in which we detected enhanced Salmonella (belonging to the Enterobacteriaceae family) loads under Sirt1/3 in vivo knockdown or inhibitor treatment in C57BL/6 mice, and a reduced burden under treatment with the Sirt1 activator SRT1720.
As per your valid suggestion, we have discussed this possibility in our discussion section. (Line- 541-548).
We have incorporated the suggestions for the improvement in the analysis of WB data and flow cytometry.
Reviewer #3 (Public Review):
Summary:
In this paper Hajra et al have attempted to identify the role of Sirt1 and Sirt3 in regulating metabolic reprogramming and macrophage host defense. They have performed gene knock down experiments in RAW macrophage cell line to show that depletion of Sirt1 or Sirt3 enhances the ability of macrophages to eliminate Salmonella Typhimurium. However, in mice inhibition of Sirt1 resulted in dissemination of the bacteria but the bacterial burden was still reduced in macrophages. They suggest that the effect they have observed is due to increased inflammation and ROS production by macrophages. They also try to establish a weak link with metabolism. They present data to show that the switch in metabolism from glycolysis to fatty acid oxidation is regulated by acetylation of Hif1a, and PDHA1.
Strengths:
The strength of the manuscript is that the role of Sirtuins in host-pathogen interactions has not been previously explored in-depth making the study interesting. It is also interesting to see that depletion of either Sirt1 or Sirt3 results in a similar outcome.
Weaknesses:
The major weakness of the paper is the low quality of data, making it harder to substantiate the claims. Also, there are too many pathways and mechanisms being investigated. It would have been better if the authors had focussed on either Sirt1 or Sirt3 and elucidated how it reprograms metabolism to eventually modulate host response against Salmonella Typhimurium. Experimental evidence is also lacking to prove the proposed mechanisms. For instance they show correlative data that knock down of Sirt1 mediated shift in metabolism is due to HIF1a acetylation but this needs to be proven with further experiments.
We appreciate the reviewer’s critical analysis of our work. In the revised manuscript, we aimed to eliminate the low-quality data sets and have tried to substantiate them with better and conclusive ones, as directed in the recommendations for the author section. We agree with the reviewer that the inclusion of both Sirtuins 1 and 3 has resulted in too many pathways and mechanisms and focusing on one SIRT and its mechanism of metabolic reprogramming and immune modulation would have been a less complicated alternative approach. However, as rightly pointed out, our work demonstrated the shared and few overlapping roles of the two sirtuins, SIRT1 and SIRT3, together mediating the immune-metabolic switch upon Salmonella infection. As per the reviewer’s suggestion, we have performed additional experiments with HIF-1α inhibitor treatment in our revised manuscript to substantiate our correlative findings on SIRT1-mediated regulation of host glycolysis (Fig. 7G). We wanted to clarify our claim in this regard. Our results suggested that loss of SIRT1 function triggered increased host glycolysis alongside hyperacetylation of HIF-1α. HIF-1α is reported to be one of the important players in glycolysis regulation (Kierans SJ, Taylor CT. Regulation of glycolysis by the hypoxia-inducible factor (HIF): implications for cellular physiology. J Physiol. 2021;599(1):23-37. doi:10.1113/JP280572.) and additionally, SIRT1 has been shown to regulate HIF-1α acetylation status (Lim JH, Lee YM, Chun YS, Chen J, Kim JE, Park JW. Sirtuin 1 modulates cellular responses to hypoxia by deacetylating hypoxia-inducible factor 1 alpha. Mol Cell. 2010;38(6):864-878. doi:10.1016/j.molcel.2010.05.023.) Further, ectopic expression of SIRT1 has been demonstrated to reduce glycolysis by negatively regulating HIF-1α. (Wang Y, Bi Y, Chen X, et al. Histone Deacetylase SIRT1 Negatively Regulates the Differentiation of Interleukin-9-Producing CD4(+) T Cells. Immunity. 2016;44(6):1337-1349. doi:10.1016/j.immuni.2016.05.009). We have subsequently shown in Fig. 7G that the increase in host glycolysis upon SIRT knockdown in the infected macrophages gets lowered upon HIF-1α inhibitor treatment, suggesting that one of the mechanisms of SIRT-mediated regulation of host glycolysis is via regulation of HIF-1α. However, this warrants further mechanistic research.
Recommendations for the authors:
Reviewer #2 (Recommendations For The Authors):
(1) Figures 8I-S: are only viable cells used for analysis? Please provide gating strategy used for these analyses.
(2) Many changes seen in WB seem to be marginal. Since the authors used densitometric plot to quantify the band intensities, I expect these experiments were repeated at least three times. Please indicate the number of repeats. For instance, Figures 7C, 7I (UI SCR vs UI shSIRT3), 7J, show marginal changes or no changes. What do other WB images look like? Are they more convincing than the ones currently shown? Please provide them in the response letter.
(3) Figure 7C: label is a bit misleading. Please relabel the figure title to Acetylated HIF vs total levels
(4) Figure 7J: which band is AcPDHA1?
(1) We are highly apologetic for not clarifying our gating strategy for the analysis.
We initially gated the viable splenocyte population based on Forward scatter (FSC) and Side Scatter (SSC). This gated population was further subjected to gating based on cell FSC-H (height) versus FSC-A (area). Subsequently, the population was gated as per SSC-A and GFP (expressed by intracellular bacteria) based on the autofluorescence exhibited by the uninfected control (Fig. 8I-J).
Author response image 1.
UNINFECTED
Author response image 2.
VEHICLE CONTROL INFECTED
Author response image 3.
EX-527 INFECTED
Author response image 4.
3TYP INFECTED
Author response image 5.
SRT 1720 INFECTED
For gating different cell types such as F4/80 (PE) positive population in Fig. 8K-L, the viable cell population was gated based on SSC-A versus PE-A to gate the macrophage population. These macrophage populations were gated further based on GFP (Salmonella) + population to obtain the percentage of macrophage population harboring GFP+ bacteria. Similar strategies were followed for other cell types as depicted in Fig. 8M-S, Fig. S8.
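The sequential gating described above can be sketched in code as follows. This is a minimal illustration only, not the analysis pipeline used for the figures: the event field names, the threshold values, and the function names are all hypothetical, and in practice the gates are drawn in flow cytometry software against the uninfected control.

```python
# Minimal sketch of sequential gating. All field names, thresholds, and
# function names are hypothetical and chosen only for illustration.

def in_range(value, low, high):
    """Return True if value lies inside the closed interval [low, high]."""
    return low <= value <= high

def gate_viable_singlets(events, fsc=(200, 900), ssc=(100, 800),
                         max_area_height_ratio=1.2):
    """Apply the FSC/SSC scatter gate, then keep events whose FSC-A is
    close to FSC-H (FSC-H versus FSC-A gate)."""
    gated = []
    for e in events:
        if not (in_range(e["fsc_a"], *fsc) and in_range(e["ssc_a"], *ssc)):
            continue  # outside the scatter gate
        if e["fsc_a"] / e["fsc_h"] > max_area_height_ratio:
            continue  # area much larger than height: excluded by the FSC-H/FSC-A gate
        gated.append(e)
    return gated

def gfp_cutoff_from_control(control_events):
    """Set the GFP cutoff just above the autofluorescence of the uninfected control."""
    return max(e["gfp"] for e in control_events)

def percent_gfp_positive(events, marker, marker_cutoff, gfp_cutoff):
    """Percentage of marker-positive cells (e.g. F4/80-PE) that are also GFP-positive."""
    marker_pos = [e for e in events if e[marker] > marker_cutoff]
    if not marker_pos:
        return 0.0
    infected = [e for e in marker_pos if e["gfp"] > gfp_cutoff]
    return 100.0 * len(infected) / len(marker_pos)
```

In this sketch the GFP cutoff is taken from the uninfected control so that autofluorescence is not counted as infection, mirroring the rationale given for Fig. 8I-J.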
(2) We agree with the reviewer’s concern with the marginal changes in the western blots (Figures 7C, 7I (UI SCR vs UI shSIRT3), 7J). As per the suggestions, we have provided the alternate blot images and have indicated the number of repeats in the manuscript. The alternate blot images are provided herewith:
Author response image 6.
Alternate blot images for Fig. 7B-C
Author response image 7.
Alternate blot images for Fig. 7I, J
(3) We are highly thankful to the reviewer for this suggestion. We have relabelled Fig. 7C as acetylated HIF-1α over total HIF-1α, as suggested.
(4) The acetylated PDHA1 band in Fig. 7J has been marked as per the suggestion. We are extremely apologetic for the inconvenience caused.
Author response image 8.
Reviewer #3 (Recommendations For The Authors):
The authors have done some work to improve the manuscript. However, the data presented lacks clarity.
Fig 4B: I still do not see a change in Ac p65 in the less saturated blot. It looks reduced as the band is distorted. I am not sure how this could be quantified.
Fig S2 b-actin bands are hyper saturated, and it is not possible to decipher the knockdown efficiency. It is probably better to provide a ponceau staining similar to S2C. The band intensity values are out of place.
Fig 5F HADHA blot: Lane 1 expression appears to be significantly higher than lane 3, but the values mentioned do not match the intensity of the bands.
It is hard to interpret the authors' claim that the shift in metabolism is HIF1a-dependent.
Fig 7B: I would expect HIF1a acetylation to be increased in UI ShSIRT1 compared to UI SCR. The blot shows reduced HIF1a acetylation.
Fig 7D: SIRT1 immunoprecipitates with HIF1a equally under all conditions. Is this what the authors expect? Labelling of the blots are not clear. It looks like the bottom SIRT1 blot is from Beads IgG control.
Fig 7H: How does PDHA1 interact with SIRT3 so strongly in shSIRT3 cells (lane 2)?
Authors have mentioned in their response that a knockdown of 40% has been achieved in the uninfected but the blot does not reflect that. SIRT3 expression seems to be more in the knockdown.
Blots are also not labelled properly especially Input. The lanes are not marked.
We thank the reviewer for acknowledging the improvements in the revised version and for suggesting further clarifications and improvements.
We have tried to incorporate the specified modifications to the best of our abilities in the revised manuscript.
We are highly apologetic for the inconclusive blot image in Fig. 4B. We have provided an alternative blot image with better clarity for Fig. 4B, which was used for the quantification analysis.
Author response image 9.
As per the reviewer’s valuable suggestions, we have provided the ponceau image in Fig. S2B.
We thank the reviewer for rightly pointing out the discrepancy in the band intensity quantification in Fig. 5F. We have re-evaluated the intensities in ImageJ and have provided the corrected band intensities. We are highly apologetic for the inaccuracies.
As per the reviewer’s previous suggestion, we have performed additional experiments with HIF-1α inhibitor treatment in our revised manuscript to substantiate our correlative findings on SIRT1-mediated regulation of host glycolysis (Fig.7G). We wanted to clarify our claim in this regard. Our results suggested that loss of SIRT1 function triggered increased host glycolysis alongside hyperacetylation of HIF-1α. HIF-1α is reported to be one of the important players of glycolysis regulation (Kierans SJ, Taylor CT. Regulation of glycolysis by the hypoxia-inducible factor (HIF): implications for cellular physiology. J Physiol. 2021;599(1):23-37. doi:10.1113/JP280572.) and additionally, SIRT1 has been shown to regulate HIF-1α acetylation status (Lim JH, Lee YM, Chun YS, Chen J, Kim JE, Park JW. Sirtuin 1 modulates cellular responses to hypoxia by deacetylating hypoxia-inducible factor 1alpha. Mol Cell. 2010;38(6):864-878. doi:10.1016/j.molcel.2010.05.023.) Further, ectopic expression of SIRT1 has been demonstrated to reduce glycolysis by negatively regulating HIF-1α. (Wang Y, Bi Y, Chen X, et al. Histone Deacetylase SIRT1 Negatively Regulates the Differentiation of Interleukin-9-Producing CD4(+) T Cells. Immunity. 2016;44(6):1337-1349. doi:10.1016/j.immuni.2016.05.009). We have subsequently shown in Fig. 7G, that the increase in host glycolysis upon SIRT knockdown in the infected macrophages gets lowered upon HIF-1α inhibitor treatment, suggesting that one of the mechanisms of SIRT-mediated regulation of host glycolysis is via regulation of HIF-1α. However, this warrants further future mechanistic research.
We agree with the reviewer that HIF-1α acetylation is expected to increase in UI sh1 compared with UI SCR. The apparently reduced acetylation in UI sh1 in Fig. 7B could be attributed to lower HIF-1α levels in UI sh1 than in UI SCR. Therefore, we have provided an alternate blot image that has been used for quantification in Fig. 7C (Author response image 6).
In answer to the reviewer’s question on Fig. 7D, we observed a more or less equal degree of HIF-1α immunoprecipitation upon HIF-1α pull-down in all the sample cohorts under conditions of SIRT1 inhibitor treatment. However, we observed reduced interaction of HIF-1α with SIRT1 in the infected sample upon SIRT1 inhibitor treatment.
We thank the reviewers for suggesting improvements in the blot labelling and for raising this concern. We have corrected the blot labelling to avoid the previous confusion.
We appreciate the reviewer’s concern and have therefore provided an alternate blot image for Fig. 7H, in which we achieved an enhanced SIRT3 knockdown percentage and which might address the previously stated concern.
We are extremely apologetic for the improper labelling of the Input blot with unmarked lanes. We have addressed this issue by labelling the lanes in the input section of the blots.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Recommendations For The Authors):
Figures 1 and 2. How do the authors know that the lysine mutations are specific to constitutive activity and not because it is causing the channel to be now voltage sensitive?
As shown in the revised Figs. 1b, S2a, and 3b, TMEM16F I521K/M522K, TMEM16F I521E, and TMEM16A I546K/I547K spontaneously expose PS, respectively. Neither membrane depolarization nor calcium stimulation was introduced under these conditions, and the cells were grown in calcium-free media after transfection to limit calcium-dependent activation. Our new experiments further demonstrate that TMEM16F T526K (Fig. 1b) and TMEM16A E551K (Fig. 3b), which are further away from the activation gate, exhibit either strongly attenuated spontaneous lipid scrambling activity or none at all. According to these results, the gain-of-function mutants (TMEM16F I521K/M522K/I521E and TMEM16A I546K/I547K) are indeed constitutively active. This constitutive scramblase activity is not due to a gain of voltage sensitivity, as ion channel activity is also minimal around the resting membrane potential of a HEK cell (Fig. 1d, e and Fig. 3d, e).
The authors see very large currents of 5-10 nA in their electrophysiology experiments in Figures 2D and 3D. I understand that Figure 2D are whole-cell recordings, but are the authors confident that the currents they are recording from the mutants are indeed specific to TMEM16A? More importantly, in Figure 3D they see 3-5 nA currents in inside-out patches, which is huge. They have no added divalent in their bath solution, which could lead to larger single-channel amplitudes, but 3-5 nA seems excessive. Some control to demonstrate that these are indeed OSCA1.2 currents is important.
TMEM16A and TMEM16F are well-known for their high cell surface expression. Therefore, the current amplitude is usually huge even in excised inside-out or outside-out patches—please see our previous publications for details: 1) 10.1016/j.cell.2012.07.036, 2) 10.7554/eLife.02772, 3) 10.1038/s41467-019-11784-8, 4) 10.1038/s41467-019-09778-7, 5) 10.1016/j.celrep.2020.108570, 6) 10.1085/jgp.202012704, and 7) 10.1085/jgp.202313460.
HEK293 cells do not have endogenous TMEM16A (https://doi.org/10.1038/nature07313, 10.1016/j.cell.2008.09.003 , DOI: 10.1126/science.1163518). It therefore serves as a widely used cell line for studying TMEM16A biophysics. As overexpressing the WT control barely elicited any obvious current in 0 Ca2+ (Fig. 3d), there is no doubt that the large outward-rectifying current (hallmark of CaCC) in the revised Fig. 3d (previous Fig. 2D) was elicited from the mutant TMEM16A channels. The strong outward rectification also rules out the possibility of this being leak current.
Regarding Fig. 4d (previous Fig. 3D), OSCA1.2 has excellent surface expression as shown in Fig. 4b. OSCA1.2 also has much higher single channel conductance (121.8 ± 3.4 pS, 10.7554/eLife.41844) than TMEM16A (~3-8 pS) and TMEM16F (<1 pS). Therefore, recording nA OSCA1.2 current from excised patches is normal, given that OSCA1.2 current is larger at depolarized voltages than at hyperpolarized voltages (please see our explanation in the next response). As the reviewer pointed out, lack of divalent ions in our experimental conditions may also partially contribute to the large conductance. To further verify, we conducted mock transfection recordings (please see Author response image 1 below). WT- but not mock (GFP)-transfected cells gave rise to large current, further supporting that the recorded current was indeed through OSCA1.2.
Author response image 1.
Representative inside-out currents for mock (GFP)- and OSCA1.2 WT-transfected cells. OSCA1.2 is responsible for nA currents elicited by the pressure and voltage protocols shown.
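As a rough consistency check on the argument above, the number of open channels needed to carry a nanoampere-scale patch current can be estimated from the single-channel conductance. The sketch below assumes an ohmic channel, a reversal potential of 0 mV, and an illustrative test voltage of +100 mV; these values and the function names are assumptions for illustration and are not taken from the recordings.

```python
def single_channel_current_pA(conductance_pS, voltage_mV, reversal_mV=0.0):
    """Ohmic single-channel current i = gamma * (V - E_rev).

    pS * mV = 1e-12 S * 1e-3 V = 1e-15 A, i.e. 1e-3 pA.
    """
    return conductance_pS * (voltage_mV - reversal_mV) * 1e-3

def open_channels_needed(total_current_nA, conductance_pS, voltage_mV,
                         reversal_mV=0.0):
    """Number of simultaneously open channels required for a given patch current."""
    i_pA = single_channel_current_pA(conductance_pS, voltage_mV, reversal_mV)
    return total_current_nA * 1e3 / i_pA

# Illustrative comparison at an assumed +100 mV and an assumed 3 nA patch current.
osca_channels = open_channels_needed(3.0, 121.8, 100.0)   # OSCA1.2, 121.8 pS
tmem16a_channels = open_channels_needed(3.0, 8.0, 100.0)  # TMEM16A, upper bound 8 pS
```

Under these assumptions, a 3 nA current corresponds to roughly 250 simultaneously open 121.8 pS channels, whereas an 8 pS channel would require several thousand, which illustrates why the higher conductance makes nA-scale patch currents plausible.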
Figure 3D and 5D. Most of the traces and current quantification is done at positive potentials and is outward current. Do the authors observe inward currents? It is difficult to judge by the figures since currents are so large. OSCA/TMEM63s are cationic channels and all published data on these channels have demonstrated robust inward currents at negative, physiologically relevant potentials. The lack of inward currents but only large outward currents suggests that these mutations could be doing something else to the channel.
Yes. We indeed observe inward current at negative holding potentials under pressure clamp (Author response image 2). However, mechanosensitive OSCA and TMEM63A channels are also voltage dependent. Their outward current is an order of magnitude larger at depolarized voltages (e.g., Author response image 2, also 10.7554/eLife.41844, see Fig. 1H).
Author response image 2.
Voltage-dependent rectification of OSCA1.2 current. a. Representative OSCA1.2 trace (bottom) elicited by a voltage-ramp under -50 mmHg (top). b. The difference in inward and outward current amplitudes.
We found that quantifying the OSCA1.2 outward current has advantages over the inward current. Usually, using the gold standard pressure clamp protocol at negative holding voltages, peak inward current amplitude is quantified. However, OSCA inward current quickly inactivates (10.7554/eLife.41844, see Fig. 1C). This makes robust quantification and comparison with mutant channels difficult. Holding the membrane at a constant pressure and measuring OSCA1.2 G-V overcomes these issues associated with the classical inward current measurements. The large depolarization-driven outward current does not inactivate, and robust tail current (Response Fig. 1, 2) allows us to construct G-V relationships. We found quantifying mutants’ voltage dependence at constant pressure is more consistent than quantifying pressure dependence at constant voltage. These advantages make our new protocol preferable to the commonly used gold standard pressure clamp protocol for characterizing and comparing the gating mutations identified in this manuscript.
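The G-V construction described above can be illustrated with a short sketch. It normalizes tail-current amplitudes to their maximum and estimates V50 and the slope factor of a two-state Boltzmann function by a coarse least-squares grid search. The Boltzmann form, the grid search, and all names are assumptions for illustration; this is not the fitting procedure used in the manuscript.

```python
import math

def boltzmann(v, v50, k):
    """Two-state Boltzmann: fraction of maximal conductance at voltage v (mV)."""
    return 1.0 / (1.0 + math.exp((v50 - v) / k))

def normalize(tail_currents):
    """Scale tail-current amplitudes to their maximum to obtain G/Gmax."""
    peak = max(tail_currents)
    return [i / peak for i in tail_currents]

def fit_v50(voltages, g_norm, v50_grid, k_grid):
    """Least-squares grid search over candidate (v50, k) pairs.

    Returns the (v50, k) pair with the smallest sum of squared errors.
    """
    best = None
    for v50 in v50_grid:
        for k in k_grid:
            sse = sum((boltzmann(v, v50, k) - g) ** 2
                      for v, g in zip(voltages, g_norm))
            if best is None or sse < best[0]:
                best = (sse, v50, k)
    return best[1], best[2]
```

A shift in the fitted V50 between wild-type and mutant channels at the same constant pressure is then the quantity being compared. Note that normalizing to the maximum is only meaningful when the voltage range is wide enough for the conductance to approach saturation.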
Figure 3 and 5. Why are mechanically activated currents being recorded at random pressure stimuli (-50 mmHg for OSCA) and (-80 mmHg for Tmem63a)? The gold standard in the field is to run an entire pressure response curve. Given that only outward currents are observed at membrane potentials +120mV and above at 0mmHg, this questions whether they are indeed constitutively active.
As we explained in the previous response, both voltage and membrane stretch activate OSCA/TMEM63A channels. We found measuring voltage dependence under constant pressure provided more consistent quantification than the gold standard pressure response protocol. This may be due to the variability of applied membrane tension under repeated stretches versus the more consistent applied voltage. Additionally, we chose -50 mmHg and -80 mmHg to reflect the reported differences in half-maximal pressures between OSCA1.2 and TMEM63A (e.g., P50 ~55 mmHg for 1.2 and ~61 mmHg for 63A in 10.7554/eLife.41844 versus ~86 mmHg for 1.2 and -123 mmHg for 63A in 10.1016/j.neuron.2023.07.006).
We also used higher pressure in cell-attached mode to increase TMEM63A current amplitudes, which are usually tiny. We have updated our method section (Lines 329-334) to further clarify why we used these protocols.
Please note that in TMEM16 proteins, ions and lipids may not always be co-transported. This means that under certain conditions, only one type of substrate may go through. For instance, in WT TMEM16F, Ca2+ stimulation can easily trigger PS exposure at resting membrane potential. No ionic currents are elicited until strong depolarization is applied. Similarly, the TMEM16F GOF mutations spontaneously transport lipids, leading to loss of lipid asymmetry (Fig. 1b, c). However, in 0 Ca2+, these TMEM16F mutant channels still need strong depolarization for ion conduction (Fig. 1d, e). Although the detailed mechanism still needs to be further investigated, the OSCA1.2 and TMEM63A GOF mutations share similar features with TMEM16 proteins, exhibiting ion conduction under high pressures and depolarizing voltages, yet constitutively active scrambling.
Some clarity is needed for their choice of residues. I understand that a lot of this is also informed by the structures of these ion channels. According to the alignment shown in Supplementary Figure 1, they chose LA for OSCA1.2, which is in line with the IM (TMEM16F) and II(TMEM16A) residues but for Tmem63a they chose the hydrophobic gate residue W and S. Was the A476 tested? Also, OSCA1.2 already has a K in the hydrophobic gating residue region. How do the authors reconcile this with their model?
We appreciate this critical comment. We have included the characterization of TMEM63A A476K (Fig. 6, corresponding to M522 in 16F, I547 in 16A, and A439 in OSCA1.2). Interestingly, A476K transfected cells did not show obvious spontaneous PS exposure yet exhibited a modest shift in V50 comparable to W472K and S475K. These differences may reflect the high-tension activated nature of the TMEM63 proteins (10.1016/j.neuron.2023.07.006) as compared to OSCA1.2, where the corresponding mutation (A439K, Fig. 4b, c) showed very little spontaneous activity and required hypotonic stimulation to promote more robust PS exposure (Fig. 5).
Furthermore, as we showed in Figs. 1b-c and 3b-c, there is a lower limit (towards the C-terminus) of the TM 4 lysine mutation effect, beyond which the mutation becomes insufficient to cause a constitutively open pore for spontaneous lipid scrambling. It is possible that TMEM63A A476K represents the lower limit of TM 4 mutations that can convert TMEM63A into a spontaneous lipid scramblase.
Regarding OSCA1.2 K435 and TMEM63A W472, these sites correspond to the hydrophobic gate residues on TM 4 in TMEM16F (F518, Fig. 1a) and TMEM16A (L543, Fig. 3a), so it is unsurprising to us that a lysine mutation at this site causes constitutive scramblase activity in TMEM63A (Fig. 6b, c). For OSCA1.2, it is more intriguing since this residue is already a lysine (K435). In Supplementary Fig. 5 our new experiments show that neutralizing K435 with leucine (K435L) in the background of L438K significantly attenuates spontaneous PS exposure from ~63% PS positive for L438K alone (two lysine residues) to ~31% for K435L/L438K (one lysine). On the other hand, the K435L mutation by itself is also insufficient to induce PS exposure. Therefore, the endogenous lysine at residue 435 has an additive effect on the spontaneous scramblase activity of L438K. We believe the explanation for this result lies in experiments conducted in model transmembrane helices, which have shown that stacking hydrophilic side chains within the membrane interior promotes trans-bilayer lipid flipping (see 10.1248/cpb.c22-00133).
These same studies also support our observation (10.1038/s41467-019-09778-7) that highly hydrophilic side chains (such as lysine or glutamic acid) accelerate trans-bilayer lipid flipping more effectively than hydrophobic side chains such as isoleucine or alanine (Author response image 3, see also 10.1021/acs.jpcb.8b00298).
Author response image 3.
Trans-bilayer lipid flipping rates (kflip) accelerate with increasing side chain hydropathy for a residue placed in the center of a model transmembrane helical peptide
How do the authors know that osmotic shock is indeed activating OSCA1.2 and TMEM63A? If they can record from the channels then electrophysiology data that confirms activation of the channel in the presence of hypoosmotic shock will strengthen the osmolarity active scramblase activity demonstrated in Figure 4. So far, there is conclusive data showing that they are mechanically activated but conclusive electrophysiological data for OSCA/TMEM63 osmolarity activation is not described yet, including the reference (38) they indicate in line 132. Although osmotic shock can perturb mechanical properties of the membrane it can also activate volume-regulated anion channels, which are also present in HEK cells.
Thank you for raising this important question. While reference 38 (now reference 39) shows direct electrophysiological evidence of hypertonicity-induced current (e.g., Fig. 4 f, g, i, and j in 10.1038/nature13593), direct electrophysiological evidence that OSCA/TMEM63 can be activated by hypotonic stimulation is still missing. To address this question, we conducted whole-cell patch clamp experiments on mock-transfected and OSCA1.2 WT-transfected cells stimulated with 120 mOsm/kg hypotonic solution, comparable to the conditions used for the hypotonic-induced scrambling shown in Fig. 5. As shown in Supplementary Fig. 6, our whole-cell recording detected a slowly evolving yet robust outward rectifying current in OSCA1.2-transfected cells, which was not observed in mock-transfected cells.
To avoid the contamination from endogenous SWELL osmo-/volume-regulated chloride channels, our new experiment used 140 mM Na gluconate to replace NaCl in both the pipette and the bath solution. Because SWELL/VRAC channels are minimally permeable to gluconate anions (e.g., 10.1007/BF00374290), we conclude that hypotonic stimulation can indeed activate OSCA1.2 albeit with perhaps lower efficiency compared to mechanical stimulation.
Minor comments
What is the timeline for the scramblase assay for all the experiments (except Figure 4)? How long is the AnnexinV incubated before imaging?
Thank you for pointing out that we had not provided sufficient detail here. In the scramblase assay (including in Fig. 4, now revised Fig. 5), cells were imaged in AnnexinV-containing buffer immediately, without a formal incubation period, because AnnexinV binding to exposed PS proceeds rapidly. We have included additional detail in the methods section to eliminate any confusion (Lines 310-312).
In some places of the document, it says OSCA/TMEM63, and in other places, it is denoted as TMEM63/OSCA. The literature so far has always called the family OSCA/TMEM63- please stay consistent with the field.
Thank you for pointing this out, we have corrected these instances to be consistent with the field.
Reviewer #2 (Recommendations For The Authors):
(1) The authors' statement that the channel/scramblase family members have a relatively low "energetic barrier for scramblase" activity needs further support. While mutating the hydrophobic channel gate certainly could destabilize ion conduction to cause a GOF effect on channel activity, it is still not clear why scramblase activity, which is tantamount to altered permeation, happens in the mutant channels. Are permeation and channel gating (opening) coupled in these channels? If so, what is the basis for the coupling? Is scramblase activity only observed when the gating is destabilized or are they separable?
We appreciate these great questions. For the question about the ‘energetic barrier’ statement, please see our response to point (3) where we have carried out MD simulations of the OSCA1.2 WT and L438K mutant to provide insight into how the permeation pathway is altered by these mutations.
Regarding why TMEM16A can be converted into a scramblase, we use the extensively studied TMEM16 proteins as examples to improve our current understanding of OSCA/TMEM63 proteins. For further details please see our original paper (10.1038/s41467-019-09778-7) and our review (10.3389/fphys.2021.787773), which are summarized as follows:
(1) The “neck region”, consisting of the exofacial halves of TMs 3-6, forms the pore-gate region for both ion and lipid permeation (Author response image 4B). In the closed state, the neck region is constricted and TMs 4 and 6 interact with each other, preventing substrate permeation. The hydrophobic inner activation gate that we identified (10.1038/s41467-019-09778-7) resides right underneath the inner mouth of the neck region, controlling both ion permeation and lipid scrambling.
(2) Based on our functional observations and the available scramblase structures of TMEM16 proteins in multiple conformations, we proposed a clamshell-like gating model to describe TMEM16 lipid scrambling (Author response image 4D). According to this model, Ca2+-induced conformational changes weaken the TM 4/6 interface. This promotes the separation of the two transmembrane segments, analogous to the opening of a clam shell, allowing a membrane-spanning groove to facilitate permeation of the lipid headgroup.
(3) For the CaCC, TMEM16A, Ca2+ binding dilates the pore. However, the binding energy likely cannot open the TM 4/6 interface at the neck region so, in the absence of groove formation, only Cl- ions but not lipids can permeate. (Pore dilation model, Author response image 4C).
(4) Introducing charged residues near the inner activation gate disrupts the neck region, potentially by weakening the hydrophobic interactions between TMs 4 and 6. This mutational effect results in constitutively active TMEM16F scramblases and enables spontaneous lipid permeation in the TMEM16A CaCC.
(5) In our revision, we tested additional mutations with different side chain properties (Supplementary Fig. 2), validating previous findings by us (10.1038/s41467-019-09778-7) and others (10.1038/s41467-022-34497-x) that gate disruption increases with the side chain hydropathy of the mutation.
(6) We further extended lysine mutations to two helical turns below the inner activation gate on TM 4 and identified a lower limit for mutation-induced spontaneous scramblase activity in TMEM16F and TMEM16A (Figs. 1b, c and 3b, c, respectively).
Together, all these points lend additional support to our proposed gating models for TMEM16 proteins, which we postulate may also relate to the OSCA/TMEM63 family based on the evidence provided in our manuscript.
Author response image 4.
Model of gating (and regulatory) mechanisms in the TMEM16 family. (B) overall architecture and proposed modules, (C) pore-dilation gating model for CaCCs, (D) Clamshell gating model for CaPLSases.
Regarding the relationship between ion and lipid permeation through TMEM16 scramblases, the following is the summary of our current understanding:
(1) Functionally, ion and lipid permeation are not necessarily obligatory to each other. This is evidenced by our previous biophysical characterizations of TMEM16F ion channel and lipid scramblase activities. Ca2+ can trigger TMEM16F lipid scrambling at resting membrane potentials, however, Ca2+ alone is insufficient to record TMEM16F current. Strong membrane depolarization synergistically with elevated intracellular Ca2+ is required to activate ion permeation. Based on these observations, we postulate that ions and lipids may have different extracellular gates, despite sharing an inner activation gate (10.1038/s41467-019-09778-7). Ca2+ alone may sufficiently open the inner gate (and extracellular gate) for lipids, whereas depolarization is likely required to open the extracellular gate and allow ion flux. Further structure-function studies are needed to test this hypothesis.
(2) Structurally, the open conformations of TMEM16 scramblases such as the fungal orthologs and human TMEM16K (Supplementary Fig. 1 b-d) are wide open, which allows lipid and ion co-transport. Ion and lipid co-transport has also been demonstrated in various MD simulations (e.g., 10.7554/eLife.28671, 10.3389/fmolb.2022.903972, and 10.1038/s41467-021-22724-w).
(3) Functionally, we (10.1085/jgp.202012704) and others (10.7554/eLife.06901.001) have measured dual recording of channel and scramblase activities, also demonstrating that ions and lipids are co-transported simultaneously when the proteins are fully activated.
(4) In this manuscript, we also provide multiple examples (TMEM16F in Fig. 1, TMEM16A in Fig. 3, OSCA1.2 in Fig. 4, and TMEM63A in Fig. 6) of mutations showing spontaneous phospholipid scramblase activities, yet their channel activities require strong depolarization or, in the case of TMEM63A, high pressures to be elicited.
Together, this new evidence further supports our hypothesis that there might be multiple gates for ion and lipid permeation, in addition to the shared inner gate we previously identified. We hope these detailed explanations help convey the intricacy of these intriguing questions. Of course, future studies are needed to test our hypothesis and elucidate the complex relationship between ion and lipid permeation of these proteins.
(2) One weakness in the experimental approach is the very limited number of substitutions used to infer the conclusion regarding the energetic barrier and other conclusions relating to scramblase activity. Additional substitutions of charged and polar amino acids at the hydrophobic gate would be helpful in illuminating the molecular determinants of the GOF phenotype and also reveal varying patterns of lipid permeation which could be enormously informative. These additional mutations for analysis of TMEM16F and OSCA should be added to the study.
We appreciate these great suggestions which were shared by multiple reviewers. We have included our duplicated response below.
“Response to reviewers 2 & 3: In our 2019 paper (10.1038/s41467-019-09778-7), we systematically tested the effect of side chain properties at the inner activation gate of TMEM16F on lipid scrambling activity (Response Fig. 6) and, since then, these results have been supplemented by others as well (10.1038/s41467-022-34497-x). In summary, mutating the inner activation gate residues to polar or charged residues generally results in constitutively activated scramblases without requiring Ca2+ (Fig 5a in 10.1038/s41467-019-09778-7). Because these residues form a hydrophobic gate, introducing smaller side chains via alanine substitution also results in gain-of-function, with the Y563A mutant as well as the F518A/Y563A/I612A variant being constitutively active (Fig. 3a in 10.1038/s41467-019-09778-7). Meanwhile, mutating these gate residues to hydrophobic amino acids causes no change for I612W, a slight gain-of-function for F518W, a slight loss-of-function for F518L, and complete loss-of-function for Y563W (Fig. 4b in 10.1038/s41467-019-09778-7). These findings clearly demonstrate that the side-chain properties are critical for regulating the gate opening. Charged mutations including lysine and glutamic acid are the most effective at promoting gate opening (Fig 5a in 10.1038/s41467-019-09778-7).
Similarly, others have observed that side chain hydropathy at the F518 site in TMEM16F correlates with shifts in the Ca2+ EC50 (Fig. 2 of 10.1038/s41467-022-34497-x). Note that this publication resolved the structure of the TMEM16F F518H mutant, revealing a previously unseen conformation that we have highlighted in Supplementary Fig. 1e and discussed in lines 235-238. Please also see our response to Reviewer #1 above, where we discuss discoveries in model transmembrane helical peptide systems showing that transbilayer lipid flipping rates correlate with side chain hydropathy (Author response image 3), distance between stacked hydropathic residues (schematic in 10.1248/cpb.c22-00133), and even helical angle between stacked side chains (not shown).
Following the reviewers’ suggestions, we have tested additional mutations in alternative locations and with different side chains.
(1) We have added data for TMEM16F I521A and I521E to demonstrate a similar effect of alternative side chains to what has previously been reported by us and others. We found that I521A failed to show spontaneous scrambling activity (Supplementary Fig. 2), yet I521E (Supplementary Fig. 2) is a constitutively active lipid scramblase, similar to I521K (Fig. 1). This further demonstrates that gate disruption correlates with the side chain hydropathy and that this site lines a critical gating interface.
(2) We also added lysine mutations two helical turns below the conserved inner activation gate for TMEM16F T526 (Fig. 1) and TMEM16A E551 (Fig. 3). We found that there is indeed a lower limit for the observed effect in TMEM16, where lysine mutations no longer induce spontaneous lipid scrambling activity. This indicates that where the TM 4/6 interaction is weaker toward the intracellular side (Figs. 1a, 3a), the TM 4 lysine mutation loses the ability to promote lipid scrambling by disrupting the TM 4/6 interface to enable clamshell-like opening of the permeation pathway.
(3) We added a TMEM16F lysine mutation on TM 6 at residue I611 (Fig. 2). Similar to I612K (Response Fig. 6), I611K also leads to spontaneous lipid scrambling and enhanced channel activity in the absence of calcium (Fig. 2). This shows that charged mutations along TM 6 can also promote lipid scrambling, strengthening our model that hydrophobic interactions along the TM 4/6 interface are critical for gating and lipid permeation.”
(3) Related to the above point, it would be enormously useful to perform even limited computational modelling to support the "energetic barrier" statement. Specifically, can the authors model waters in the putative pore to examine water occupancy in the WT and mutant channels to better understand how the barrier for ions and lipids is altered in the TMEM16?
We appreciate this suggestion and have now conducted atomistic MD simulations of OSCA1.2 WT and the L438K mutant for ~1 μs (Supplementary Fig. 4). The simulations revealed elevated water occupancy in the pore region of the L438K mutant, likely due to a widening at the TM 4/6 interface. Conversely, the WT interface remained constricted, largely disallowing water occupancy. These computational results support our previously proposed clamshell-like gating model for TMEM16 scramblases and provide strong support that the L438K mutation disrupts the interaction at the TM 4/6 interface, in turn reducing the energetic barrier for both ion and lipid permeation.
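As an illustration of what "water occupancy in the pore region" can mean operationally, the following is a minimal sketch that counts water oxygens inside a z-aligned cylinder in each trajectory frame. It is not the analysis pipeline used for Supplementary Fig. 4; the function name `water_occupancy`, the cylinder geometry, and the toy coordinates are all assumptions made for this example.

```python
def water_occupancy(frames, center_xy, radius, z_min, z_max):
    """Count water oxygens inside a z-aligned cylinder for each frame.

    `frames` is a list of frames; each frame is a list of (x, y, z)
    water-oxygen coordinates in angstroms. The cylinder is a hypothetical
    stand-in for the pore region, not a definition taken from the study.
    """
    cx, cy = center_xy
    r2 = radius * radius
    counts = []
    for frame in frames:
        n = 0
        for x, y, z in frame:
            inside_slab = z_min <= z <= z_max
            inside_disc = (x - cx) ** 2 + (y - cy) ** 2 <= r2
            if inside_slab and inside_disc:
                n += 1
        counts.append(n)
    return counts


# Toy two-frame "trajectories" (made-up coordinates, for illustration only)
wt_frames = [[(10.0, 0.0, 0.0)], [(0.0, 0.0, 20.0)]]
mutant_frames = [[(1.0, 1.0, 0.0), (0.0, 2.0, 3.0)], [(0.5, 0.0, -2.0)]]

wt = water_occupancy(wt_frames, (0.0, 0.0), 4.0, -5.0, 5.0)
mut = water_occupancy(mutant_frames, (0.0, 0.0), 4.0, -5.0, 5.0)
print(wt, mut)
```

In a real analysis the coordinates would come from the simulation trajectory and the region would be defined relative to the TM 4/6 interface; comparing the per-frame counts between WT and mutant then gives the kind of occupancy contrast described above.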
(4) I am puzzled about the ability of OSCA and the TMEM63 proteins which are cation channels to conduct negatively charged lipids. How can the pore be selective for cations and yet permeate negatively charged molecules when lipids are presented?
This is a great question. TMEM16 scramblases (as well as other known scramblases, such as the Xkr and Opsin families) are surprisingly non-selective to phospholipids (all major phospholipid species, not just anionic lipids like PS). It is still debated whether lipid headgroups indeed insert into an open pore or hydrophilic groove (Response Fig. 5), or if they may traverse the bilayer by the so-called ‘out-of-groove’ model. Regardless of the model, the consensus is that Ca2+-induced conformational changes catalyze lipid permeation and the mutations we have introduced are designed to mimic these conformational changes by separating the TM 4/6 interface.
Additionally, TMEM16F channel activity was first characterized as cation non-selective (10.1016/j.cell.2012.07.036), similar to OSCA/TMEM63s, which may even exhibit some chloride permeability (10.7554/eLife.41844.001). Thus, it appears as though scramblase activity is agnostic to headgroup charge and compatible with both a mutant anion channel (TMEM16A) and mutant cation channels (TMEM16F, OSCA1.2, and TMEM63A). However, more detailed structural, functional, and computational studies are needed to further clarify ion and lipid co-transport mechanisms.
(5) Do pore blockers like Gd3+ which block permeation also inhibit the scramblase activity of the mutant channels? This should be tested for the mutant channels.
While extracellular Gd3+ has been previously reported as an inhibitor of OSCA1.2 (10.7554/eLife.41844.001), we did not observe this effect (Author response image 5), but instead saw inhibition by intracellular Gd3+ (Author response image 6). Given this discrepancy, we did not test Gd3+ inhibition of the OSCA1.2 scramblases, but instead tested Ani9, a paralog-specific inhibitor of TMEM16A, on the TMEM16A I546K gain-of-function mutant and found that it attenuated both ion channel and phospholipid scramblase activities (Supplementary Fig. 3).
Author response image 5.
200 µM Gd3+ext fails to inhibit OSCA1.2 currents in cell-attached patches. Pressure-elicited peak currents (n=6 each). Statistical test is an unpaired Student’s t-test.
Author response image 6.
200 µM Gd3+int completely inhibits OSCA1.2 currents in inside-out patches. (a) Representative traces before (black), during (red), and after (blue) Gd3+ application. (b) Representative application timecourse. (c) Quantification of peak currents (n=8 each). Statistical test is one-way ANOVA.
Minor:
- Some of the current amplitudes shown in Figures 2 and 3 are enormous. Is liquid junction potential corrected in these experiments? If not, it would be preferable to correct this to avoid voltage errors.
Thanks for the question. The large current amplitude is due to 1) high surface expression of the proteins, 2) the large single-channel conductance of OSCA channels, and 3) much larger currents at positive voltages for OSCA channels. Our control experiment showed that WT TMEM16A at 0 Ca2+ did not give rise to any current (Fig. 3d), further demonstrating that the large current was not due to liquid junction potential. For the OSCA recordings, we also did not observe current in mock-transfected cells, further excluding possible interference from liquid junction potential (Response Fig. 1).
- Related, authors could consider adding some evidence using selective pharmacology to support the conclusions that the observed currents arise from TMEM or OSCA channels.
Thanks for the suggestion. As mentioned above, we have added experiments with Ani9, a specific inhibitor of TMEM16A, in Supplementary Fig. 3. We found that Ani9 robustly attenuated both ion channel and phospholipid scramblase activities for the TMEM16A I546K gain-of-function mutant. This is also consistent with our previous publication (10.1038/s41467-019-09778-7), where Ani9 efficiently inhibited the TMEM16A L534K mutant scramblases. Additionally, we have provided mock controls (Response Fig. 1, Fig. 6d, e) to show that the observed currents are indeed attributable to OSCA1.2 and TMEM63A.
Reviewer #3 (Recommendations For The Authors):
Given that the authors postulate that the introduction of a positive charge via the lysine side chain is essential to the constitutive activity of these proteins, additional mutation controls for side chain size (e.g. glutamine/methionine) or negative charge (e.g. glutamic acid), or a different positive charge (i.e. arginine) would have strengthened their argument. To more comprehensively understand the TM4/TM6 interface, mutations at locations one turn above and one turn below could be studied until there is no phenotype. In addition, the equivalent mutations on the TM6 side should be explored to rule out the effects of conformational changes that arise from mutating TM4 and to increase the strength of evidence for the importance of side-chain interactions at the TM6 interface.
We appreciate these great suggestions which were shared by multiple reviewers. We have included our previous responses below.
“Response to reviewers 2 & 3: In our 2019 paper (10.1038/s41467-019-09778-7), we systematically tested the effect of side chain properties at the inner activation gate of TMEM16F on lipid scrambling activity (Response Fig. 6) and, since then, these results have been supplemented by others as well (10.1038/s41467-022-34497-x). In summary, mutating the inner activation gate residues to polar or charged residues generally results in constitutively activated scramblases without requiring Ca2+ (Fig 5a in 10.1038/s41467-019-09778-7). Because these residues form a hydrophobic gate, introducing smaller side chains via alanine substitution also results in gain-of-function, with the Y563A mutant as well as the F518A/Y563A/I612A variant being constitutively active (Fig. 3a in 10.1038/s41467-019-09778-7). Meanwhile, mutating these gate residues to hydrophobic amino acids causes no change for I612W, a slight gain-of-function for F518W, a slight loss-of-function for F518L, and complete loss-of-function for Y563W (Fig. 4b in 10.1038/s41467-019-09778-7). These findings clearly demonstrate that the side-chain properties are critical for regulating the gate opening. Charged mutations including lysine and glutamic acid are the most effective at promoting gate opening (Fig 5a in 10.1038/s41467-019-09778-7).
Similarly, others have observed that side chain hydropathy at the F518 site in TMEM16F correlates with shifts in the Ca2+ EC50 (Fig. 2 of 10.1038/s41467-022-34497-x). Note that this publication resolved the structure of the TMEM16F F518H mutant, revealing a previously unseen conformation that we have highlighted in Supplementary Fig. 1e and discussed in lines 235-238. Please also see our response to Reviewer #1 above, where we discuss discoveries in model transmembrane helical peptide systems showing that transbilayer lipid flipping rates correlate with side chain hydropathy (Author response image 3), distance between stacked hydropathic residues (schematic in 10.1248/cpb.c22-00133), and even helical angle between stacked side chains (not shown).
Following the reviewers’ suggestions, we have tested additional mutations in alternative locations and with different side chains.
(1) We have added data for TMEM16F I521A and I521E to demonstrate a similar effect of alternative side chains to what has previously been reported by us and others. We found that I521A failed to show spontaneous scrambling activity (Supplementary Fig. 2), yet I521E (Supplementary Fig. 2) is a constitutively active lipid scramblase, similar to I521K (Fig. 1). This further demonstrates that gate disruption correlates with the side chain hydropathy and that this site lines a critical gating interface.
(2) We also added lysine mutations two helical turns below the conserved inner activation gate for TMEM16F T526 (Fig. 1) and TMEM16A E551 (Fig. 3). We found that there is indeed a lower limit for the observed effect in TMEM16, where lysine mutations no longer induce spontaneous lipid scrambling activity. This indicates that where the TM 4/6 interaction is weaker toward the intracellular side (Figs. 1a, 3a), the TM 4 lysine mutation loses the ability to promote lipid scrambling by disrupting the TM 4/6 interface to enable clamshell-like opening of the permeation pathway.
(3) We added a TMEM16F lysine mutation on TM 6 at residue I611 (Fig. 2). Similar to I612K (Response Fig. 6), I611K also leads to spontaneous lipid scrambling and enhanced channel activity in the absence of calcium (Fig. 2). This shows that charged mutations along TM 6 can also promote lipid scrambling, strengthening our model that hydrophobic interactions along the TM 4/6 interface are critical for gating and lipid permeation.”
The experiments for OSCA1.2 osmolarity effects on gating and scramblase in Figure 4 could be improved by adding different levels of osmolarity in addition to time in the hypotonic solution.
We thank the reviewer for this excellent suggestion. We extensively tested this idea and found evidence (Response Fig. 10) that intermediate osmolarity (220 and 180 mOsm/kg) can also enhance the scramblase activity of the A439K mutant, albeit to a milder extent compared to 120 mOsm/kg stimulation. This suggests that swelling-induced membrane stretch may proportionally induce A439K activation and lipid scrambling. Due to the relatively mild sensitivity of OSCA to osmolarity and the variations induced by the experimental conditions, we believe it is better not to include these data to avoid overclaiming. We hope the reviewer will agree.
Author response image 7.
AnV intensities of WT- and A439K-transfected cells after 10 minutes of hypotonic stimulation at the listed osmolarities.
Some confocal images appear to be rotated relative to each other (e.g. Figures 2b and 3b).
Thank you for identifying these errors; they are corrected in the revision.
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
The authors addressed the influence of DKK2 on colorectal cancer (CRC) metastasis to the liver using an orthotopic model transferring AKP-mutant organoids into the spleens of wild-type animals. They found that DKK2 expression in tumor cells led to enhanced liver metastasis and poor survival in mice. Mechanistically, they associate Dkk2-deficiency in donor AKP tumor organoids with reduced Paneth-like cell properties, particularly Lyz1 and Lyz2, and defects in glycolysis. Quantitative gene expression analysis showed no significant changes in Hnf4a1 expression upon Dkk2 deletion. Ingenuity Pathway Analysis of RNA-Seq data and ATAC-seq data point to an Hnf4a1 motif as a potential target. They also show that HNF4a binds to the promoter region of Sox9, which leads to LYZ expression and upregulation of Paneth-like properties. By analyzing available scRNA data from human CRC data, the authors found higher expression of LYZ in metastatic and primary tumor samples compared to normal colonic tissue; reinforcing their proposed link, HNF4a was highly expressed in LYZ+ cancer cells compared to LYZ- cancer cells.
Strengths:
Overall, this study contributes a novel mechanistic pathway that may be related to metastatic progression in CRC.
Weaknesses:
The main concerns are related to incremental gains, missing in vivo support for several of their conclusions in murine models, and missing human data analyses. Additionally, methods and statistical analyses require further clarification.
Main comments:
(1) Novelty
The authors previously described the role of DKK2 in primary CRC, correlating increased DKK2 levels to higher Src phosphorylation and HNF4a1 degradation, which in turn enhances LGR5 expression and "stemness" of cancer cells, resulting in tumor progression (PMID: 33997693). A role for DKK2 in metastasis has also been previously described (sarcoma, PMID: 23204234).
(2) Mouse data
a) The authors analyzed liver mets, but the main differences between AKP and AKP/Dkk2 KO organoids could arise during the initial tumor cell egress from the intestinal tissue (which cannot be addressed in their splenic injection model), or during pre-liver stages, such as endothelial attachment. While the analysis of liver mets is interesting, given that Paneth cells play a role in the intestinal stem cell niche, it is questionable whether a study that does not involve the intestine can appropriately address this pathway in CRC metastasis.
We value the reviewer’s comment that the splenic injection model cannot represent metastasis from primary tumors, including intravasation and extravasation. Therefore, we performed orthotopic transplantation of AKP and KO organoids directly into the colon and then assessed metastasis.
Author response image 1.
Primary tumor formation and liver metastasis by orthotopic transplantation of AKP or KO colon cancer organoids. 6-8 week-old male C57BL/6J mice were treated with 2.5% DSS dissolved in drinking water for 5 days, followed by regular water for 2 days to remove gut epithelium. After recovery with the regular water, the colon was flushed with 1000 μl of 0.1% BSA in PBS. Then, 200,000 dissociated organoid cells in 200 μl of 5% Matrigel and 0.1% BSA in PBS were instilled into the colonic luminal space. After infusion, the anal verge was sealed with Vaseline. 8 weeks after transplantation, the mice were sacrificed to measure primary tumor formation and liver metastasis.
As a result, 4 out of 6 mice in the control group successfully formed colorectal primary tumors, whereas only 2 out of 6 mice showed primary tumor formation in the KO group (Author response image 1A). The size of tumors was reduced by about half (10-12 mm to 5-7 mm). Only one AKP mouse developed metastasized nodules in the liver (Author response image 1B). Next, to measure the circulating tumor cells, we harvested at least 500 μl of blood from the portal vein and then analyzed tdTomato-positive tumor cells (Author response image 2). Flow cytometry analysis of PBMCs showed the presence of tdTomato-high CD45-negative cells as well as tdTomato-mid CD45-positive cells in 2 out of 6 AKP mice, while no tdTomato-positive cells were observed in the PBMCs of KO organoid-transplanted mice.
Because only a limited number of mice showed primary and metastatic tumor formation, we cannot provide a statistical analysis of DKK2-mediated metastasis. However, our revised data indicate a trend that DKK2 KO reduced primary tumor formation, the number of circulating tumor cells, and liver metastasis. This trend is consistent with our previous report in the iScience paper, which showed that DKK2 KO reduced AOM/DSS-induced polyp formation by about 60%, and with the decreased metastasis observed in the splenic injection model in this manuscript. Further studies are necessary to confirm this trend and to provide the underlying mechanisms of intravasation and extravasation of circulating tumor cells.
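To make concrete why counts of this size do not support a statistical claim, the sketch below runs a two-sided Fisher's exact test on the reported primary tumor counts (4 of 6 AKP vs. 2 of 6 KO). This test is not part of the authors' analysis; it is an illustrative choice, the helper `fisher_exact_two_sided` is written for this example, and the metastasis table assumes that "only one AKP mouse" means 1 of 6 AKP and 0 of 6 KO mice.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum over all tables that are no more likely than the observed one
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Primary tumor formation: 4 of 6 AKP mice vs. 2 of 6 KO mice
p_tumor = fisher_exact_two_sided(4, 2, 2, 4)
# Liver metastasis (assumed reading): 1 of 6 AKP mice vs. 0 of 6 KO mice
p_met = fisher_exact_two_sided(1, 5, 0, 6)
print(round(p_tumor, 3), round(p_met, 3))  # prints: 0.567 1.0
```

With six animals per group, neither comparison approaches conventional significance, which is in line with describing the orthotopic result as a trend rather than a tested effect.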
Author response image 2.
Flow cytometry analysis of tdTomato+ circulating colon tumor cells in PBMCs. PBMCs were harvested via the portal vein after euthanasia. CD45 and tdTomato were analyzed by flow cytometry.
b) The overall number of Paneth cells found in the scRNA-seq analysis of liver mets was strikingly low (17 cells, Figure 3), and assuming that these cells are driving the differences seems somewhat far-fetched. Adding to this concern is inappropriate gating in the flow plot shown in Figure 6. This should be addressed experimentally and in the interpretation of data.
We appreciate the reviewer’s comments and the opportunity to clarify this point. Since the number of LYZ+ cells is low in our scRNA-seq analysis, we performed flow cytometry (Figure 6H), which shows a clear LYZ-expressing population in the same splenic injection model of metastasis. Figure 6H is a representative image of triplicates for each group, and we performed this experiment three times independently. As suggested, we changed the graph format and updated the gating and statistical analysis in Fig. 6H and 6I. This in vivo result confirmed our in vitro data showing that DKK2 KO reduced LYZ+ cells while increasing HNF4α1 protein levels.
c) Figures 3, 5, and 6 show the individual gene analyses with unclear statistical data. It seems that the p-values were not adjusted, and it is unclear how they reached significance in several graphs. Additionally, it was not stated how many animals per group and cells per animal/group were included in the analyses.
In Fig. 3, mouse scRNA-seq data were generated from pooled cancer samples from 5 animals per group. The Wilcoxon signed-rank test was performed for each gene and/or regulon activity. Since multiple testing was not performed, a p-value adjustment is neither needed nor applicable.
In Fig. 5, human data were analyzed. Cells from the same sample are dependent, but differential gene expression (DEG) analysis typically calculates statistics under the assumption that they are independent. This assumption may explain the low p-values observed in our data. To address this issue, we applied pseudobulk DEG analysis to our human single-cell data. Even after correcting for statistical error, we confirmed that the genes of interest still exhibited significantly different expression patterns (Author response image 3).
Author response image 3.
Pseudobulk DEG analysis confirmed the differential expression of the genes of interest.
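The pseudobulk idea mentioned above can be sketched as follows: raw counts from all cells of one sample are summed, so that each sample, rather than each cell, becomes the unit of comparison. This is a minimal illustration and not the authors' pipeline; the function names, sample labels, and counts are hypothetical, and the downstream differential expression test (normally run with a dedicated bulk RNA-seq tool) is omitted.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum raw counts per (sample, gene) so each sample yields one profile.

    `cells` is an iterable of (sample_id, {gene: count}) pairs, one per cell.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for sample_id, counts in cells:
        for gene, count in counts.items():
            totals[sample_id][gene] += count
    return {sample: dict(genes) for sample, genes in totals.items()}


def counts_per_million(profile):
    """Scale one sample's summed counts to counts per million."""
    library_size = sum(profile.values())
    return {gene: 1e6 * count / library_size
            for gene, count in profile.items()}


# Toy data: hypothetical sample names and counts, not values from the study
cells = [
    ("tumor_1", {"LYZ": 5, "HNF4A": 1, "ACTB": 94}),
    ("tumor_1", {"LYZ": 3, "HNF4A": 2, "ACTB": 95}),
    ("normal_1", {"LYZ": 0, "HNF4A": 4, "ACTB": 96}),
    ("normal_1", {"LYZ": 1, "HNF4A": 3, "ACTB": 96}),
]
bulk = pseudobulk(cells)
cpm = {sample: counts_per_million(profile) for sample, profile in bulk.items()}
print(bulk["tumor_1"]["LYZ"], cpm["normal_1"]["LYZ"])
```

Because the aggregated profiles have one value per sample and gene, the number of independent observations equals the number of samples, which avoids the inflated significance that arises when cells from the same sample are treated as independent.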
In Fig. 6H-6I, the number of animals per group is provided in the figure legend.
d) Figure 6 suggests a signaling cascade in which the absence of DKK2 leads to enhanced HNF4A expression, which in turn results in reduced Sox9 expression and hence reduced expression of Paneth cell properties. It is therefore crucial that the authors perform in vivo (splenic organoid injection) loss-of-function experiments, knockdown of Sox9 expression in AKP organoids, and Sox9 overexpression experiments in AKP/Dkk2 KO organoids to demonstrate Sox9 as the central downstream transcription factor regulating liver CRC metastasis.
Sox9 is a well-established marker gene for Paneth cell formation in the gut. Therefore, overexpression or knockout of the Sox9 gene would result in either an increase or decrease in Paneth cells in the organoids. We believe that the suggested experiments fall outside the scope of this manuscript. Instead, we demonstrated the change in the Paneth cell differentiation marker, Sox9, in the presence or absence of DKK2.
e) Given the previous description of the role of DKK2 in primary CRC, it is important to define the step of liver metastasis affected by Dkk2 deficiency in the metastasis model. Does it affect extravasation, liver survival, etc.?
We appreciate the reviewer’s insights and perspectives. Regarding liver survival, it is well known that stem cell niche formation is a critical step for the outgrowth of metastasized cancer cells (Fumagalli et al. 2019, Cell Stem Cell). LYZ+ Paneth cells are recognized as stem cell niche cells in the intestine, and human scRNA-seq data have shown that LYZ+ cancer cells express stem cell niche factors such as Wnt and Notch ligands. To determine whether LYZ+ cancer cells act as stem cell niche cells, we performed confocal microscopy to assess whether LYZ+ cancer cells express WNT3A and DLL4 in AKP organoids (Author response image 4). The results show that LYZ labeling co-localizes with DLL4 and WNT3A expression, while the organoid reporter tdTomato is evenly distributed. Additionally, our in vitro and in vivo data indicate that DKK2 deficiency leads to a reduction of LYZ+ cancer cells, which may contribute to stem cell niche formation. Based on these findings, we propose that DKK2 is an essential factor for stem cell niche formation, which is required for cancer cell survival in the liver during the early stages of metastasis. Although our revised data confirmed the trend that DKK2 deficiency decreases liver metastasis, we have not yet determined whether DKK2 is involved in extravasation. This research topic should be addressed in future studies.
Author response image 4.
Confocal microscopy analysis for lysozyme (LYZ) and Paneth cell-derived stem cell niche factors, WNT3A and DLL4 in AKP colon cancer organoids.
The method is described in the supplemental information. The list of antibodies used: DLL4 (delta-like 4) Polyclonal Antibody (Invitrogen, PA5-85931), WNT3A Polyclonal Antibody (Invitrogen, PA5-102317), Goat anti-Rabbit IgG (H+L) Cross-Adsorbed Secondary Antibody, Alexa Fluor™ 488 (Invitrogen, A-11008), Anti-Lysozyme C antibody (H-10, Santa Cruz, sc-518083), Goat anti-Mouse IgM (Heavy chain) Secondary Antibody, Alexa Fluor™ 647 (Invitrogen, A-21238).
(3) Human data
Can the authors address whether the expression of Dkk2 changes in human CRC and whether mutations in Dkk2 are correlated with metastatic disease or CRC stage?
The human data were useful in identifying the presence of LYZ+ cancer cells with Paneth cell properties. However, due to the limited number of late-stage patient samples with high DKK2 expression, the results were not statistically significant. Nevertheless, the trend suggests a positive correlation between DKK2 expression and the malignant stage of CRC.
(4) Bioinformatic analysis
The authors did not provide sufficient information on bioinformatic analyses. The authors did not include information about the software, cutoffs, or scripts used to make their analyses or output those figures in the manuscript, which challenges the interpretation and assessment of the results. Terms like "Quantitative gene expression analyses" (line 136) and "visualized in a Uniform Approximation and Projection" (line 178) do not explain what was inputted and the analyses that were executed. There are multiple ways to align, preprocess, and visualize bulk, single cell, ATAC, and ChIP-seq data, and depending on which was used, the results vary greatly. For example, in the single-cell data, the authors did not report how many cells were sequenced, nor how many cells remained after alignment and quality filtering (RNA count, mt count, etc.), so the result on Paneth+ to Goblet+ percent in lines 184 and 185 cannot be reached because it depends on this information. The absence of a clustering cutoff for the single-cell data is concerning since this greatly affects the resulting cluster number (https://www.nature.com/articles/s41592-023-01933-9). The authors should provide a comprehensive explanation of all the data analyses and the steps used to obtain those results.
We apologize for the insufficient information. Below, we provide detailed information on the data analyses, which are also available in the GEO database (Bulk RNA-seq: GSE157531, ATAC-seq: GSE157529, ChIP-seq: GSE277510). Methods are updated in the current version of supplemental information.
(5) Clarity of methods and experimental approaches
The methods were incomplete and they require clarification.
We’ve updated our methods as requested by the reviewer.
Reviewer #2 (Public Review):
Summary:
The authors propose that DKK2 is necessary for the metastasis of colon cancer organoids. They then claim that DKK2 mediates this effect by permitting the generation of lysozyme-positive Paneth-like cells within the tumor microenvironmental niche. They argue that these lysozyme-positive cells have Paneth-like properties in both mouse and human contexts. They then implicate HNF4A as the causal factor responsive to DKK2 to generate lysozyme-positive cells through Sox9.
Strengths:
The use of a genetically defined organoid line is state-of-the-art. The data in Figure 1 and the dependence of DKK2 for splenic injection and liver engraftment, as well as the long-term effect on animal survival, are interesting and convincing. The rescue using DKK2 administration for some of their phenotype in vitro is good. The inclusion and analysis of human data sets help explore the role of DKK2 in human cancer and help ground the overall work in a clinical context.
Weaknesses:
In this work by Shin et al., the authors expand upon prior work regarding the role of Dickkopf-2 in colorectal cancer (CRC) progression and the necessity of a Paneth-like population in driving CRC metastasis. The general topic of metastatic requirements for colon cancer is of general interest. However, much of the work focuses on characterizing cell populations in a mouse model of hepatic outgrowth via splenic transplantation. In particular, the concept of Paneth-like cells is primarily based on transcriptional programs seen in single-cell RNA sequencing data and needs more validation. Although including human samples is important for potential generality, the strength could be improved by doing immunohistochemistry in primary and metastatic lesions for Lyz+ cancer cells. Experiments that further bolster the causal role of Paneth-like CRC cells in metastasis are needed.
Recommendations for the Authors:
Reviewing Editor (Recommendations for the Authors):
Here we note several key concerns with regard to the main conclusions of the paper. Additional experiments to directly address these concerns would be required to substantially update the reviewer evaluation.
(1) Demonstration of a causal role of Paneth-like cells in CRC metastasis, for example by sorting the Paneth-like cells - either by the markers they identified in the subsequent single cell or by scatter - to establish whether the frequency of the Paneth-like cells in a culture of organoids is directly correlated with tumorigenicity and engraftment.
We sincerely appreciate the reviewing editor’s comment. First, as previously reported (Shin et al., iScience 2021), there is no difference in proliferation between WT and KO during in vitro organoid culture or in vivo colitis-induced tumors. However, DKK2 deficiency led to morphological changes, which we analyzed using bulk RNA-seq. As described in the manuscript, Paneth cell marker genes, such as Lysozymes and defensins, were significantly reduced in DKK2 KO AKP organoids.
Due to the nature of these markers, it is technically challenging to isolate live LYZ+ cancer cells. To address this issue in the future, we plan to develop organoids that express a reporter gene specific for Paneth cells. In this manuscript, we demonstrated a correlation between DKK2 and the formation of LYZ+ cancer cells. In both the splenic injection model (Fig. 1) and the orthotopic transplantation model (Fig. R1-R2), we observed that transplantation of cancer organoids with reduced numbers of LYZ+ cells (KO organoids) led to decreased metastatic tumor formation. The number of LYZ+ cells in KO-transplanted mice remained low in liver metastasized tumor nodules (Fig. 6H-6I). Immunohistochemistry further confirmed that LYZ+ cancer cells were barely detectable in KO samples (Author response image 5). These data suggest that DKK2 is essential for the formation of LYZ+ cancer cells, which are necessary for outgrowth following metastasis.
Author response image 5.
Histology of Lysozyme positive cells in metastasized tumor nodules in liver of colon cancer organoid transplanted mice. Immunohistochemistry of Lysozyme positive Paneth-like cells in liver metastasized colon cancer (upper panels, DAB staining). Identification of tumor nodules by H&E staining (lower panels, scale bar = 100 μm). Magnified tumor nodules are shown in the 2nd and 3rd columns (scale bar = 25 μm). Arrows indicate Lysozyme positive Paneth-like cells in tumor epithelial cells. Infiltration of Lysozyme positive myeloid cells is detected in both AKP and KO tumor nodules. AKP: Control colon cancer organoids carrying mutations in Apc, Kras and Tp53 genes. KO: Dkk2 knockout colon cancer organoids.
(2) Further characterization of Lyz+/Paneth-like cells to further the authors' argument for the unique function that they have in their tumor model. Specifically, do the cells with Paneth-like cells secrete Wnt3, EGF, Notch ligand, and DLL4 as normal Paneth cells do?
We appreciate the reviewing editor’s comment. In response, we performed confocal microscopy analysis to examine the protein levels of LYZ, Wnt3A, and DLL4 in AKP colon cancer organoids (Author response image 4). The data presented above show that LYZ+ cancer cells express both Wnt3A and DLL4, suggesting that LYZ+ colon cancer cells may function similarly to Paneth cells, which are stem cell niche cells. Furthermore, using the Panglao database, we demonstrated that LYZ+/Paneth-like cells exhibit typical Paneth cell properties in human scRNA-seq data (Fig. 4 and Fig. 5). These findings suggest that LYZ+ colon cancer cells possess Paneth cell properties.
(3) Experiments to test metastasis, ideally from orthotopic colonic tumors, to ensure phenotypes aren't restricted to the splenic model of hepatic colonization and outgrowth used at present.
We are in agreement with the reviewing editor and reviewers, which is why we conducted the orthotopic transplantation experiment. However, we encountered challenges in establishing this model effectively. After multiple trials, we observed that many mice did not form primary tumors, and the variability, particularly in metastasis, was difficult to control. Only a few AKP-transplanted mice developed liver metastasis. The representative revision data have been provided above. Nevertheless, we believe that this model needs further improvement and optimization to reliably study metastasis originating from primary tumors.
(4) To generalize claims to human cancer, the authors should test whether loss of DKK2 impacts LYZ+ cancer cells in human organoids and affects their engraftment in immunodeficient mice compared to control. Another more correlative way to validate the LYZ+ expression in human colon cancer would be to stain for LYZ in metastatic vs. primary colon cancer, expecting metastatic lesions to be enriched for LYZ+ cells.
We agree with your point, and this will be addressed in future studies.
(5) Clarifying inconsistencies regarding effect of DKK2 loss on HNF4A (Figure 1E vs Figure 6I).
In Figure 1E, we measured the mRNA levels of HNF4A in metastasized foci by qPCR, while in Figure 6I, we measured the protein level of HNF4A by flow cytometry. Recent studies, including our previous report, have shown that HNF4A protein levels are regulated by proteasomal degradation mediated by pSrc (Mori-Akiyama et al. 2007, Gastroenterology; Bastide et al. 2007, Journal of Cell Biology; Shin et al. 2021, iScience). Consequently, while the mRNA levels remained unchanged in Fig. 1E, we observed a reduction of HNF4A protein levels in Figure 6I.
(6) Addressing concerns about statistics and reporting as outlined by Reviewer 1.
Thank you very much for your assistance in improving our manuscript. The updates have been incorporated as detailed above.
These are the central reviewer concerns that would require additional experimentation to update the editorial summary. Other concerns should be addressed in a revision response but do not require additional experimentation.
Reviewer #1 (Recommendations For The Authors):
Specific comments:
• Do Dkk2-KO organoids grow normally?
Yes, in vitro.
Since the authors reported on the effects of Dkk2 in the induction/maintenance of the Paneth cell niche, changes in AKP organoid numbers or growth rate between Dkk2-WT and KO would be an expected outcome.
Disruption of Paneth cell formation in normal organoids is expected to alter growth. However, DKK2 KO colon cancer organoids with mutations in the Apc, Kras, and Tp53 genes exhibit growth rates and organoid sizes similar to those of WT AKP controls. In contrast to these in vitro observations, we observed a significant reduction in metastasized tumor growth in vivo. Further analyses of factors derived from LYZ+ cancer cells will help address the discrepancy in the effect of DKK2 loss between in vitro and in vivo conditions.
• Figure 1:
- Panel C: The legend should indicate what c.p. stands for.
c.p.m. stands for counts per minute in the in vivo imaging analysis. This has been updated in the Figure legend.
- Panel E: Please comment on the possible underlying reasons for the lack of change in HNF4a1 levels.
This has been updated in response to the reviewing editor’s comment (5) above.
- Panel E: Number of mice from which isolated cancer nodules were harvested.
Total mice per group were 5. This has been updated in the legend.
• Figure 2:
- Suggestion: Panel A should be presented in Figure 1 since Dkk2 KO organoids are already used in Figure 1.
We added this to present the recovery of DKK2 by adding recombinant DKK2 proteins in Fig.2.
- Panel B: Please explain why these genes are marked in blue.
It has been described in the legend. “Paneth cell marker genes are highlighted as blue circles (AKP=3 and KO=5 biological replicates were analyzed).”
• Figure 3:
- Indicate the number of cells recovered from AKP vs. KO mice (since liver metastasis was already reduced in KO mice). This should be shown in a UMAP.
- Panel A: 4th line in the pathways, correct "Singel" typo.
We appreciate your correction. It has been fixed.
- Panel A: There are multiple versions of PanglaoDB with different markers; a list of all that was used to determine cell type should be provided.
- Panel C: Bar value for the WNT pathway is not displayed, and there is no legend to indicate the direction of the analysis (that is, AKPvsKO or KOvsAKP).
It is KOvsAKP, described in the figure legend.
- Panel C: Ingenuity pathway analysis is not a good tool to look at this type of result because it does not include the gene fold changes in the analysis, so it only provides a Z-score of the presence of that pathway and not the degree it is increased or fold changes - recommend substituting any type of GSEA analysis, such as fgsea.
- Panel D: The term "Patient" to refer to mice is confusing. Use "Mice" or "Treatment" or "Condition" instead.
Corrected
- Panel D: Information about the number of mice per group, cells per animal (or liver let) used, and additional clarification about the statistical analysis used is required, as differences shown in this panel appear subtle given the standard variation in each group. Box plots need to show individual/raw values.
• Figure 4:
- Panel E: It would be helpful to show the cutoff lines for the Paneth cell score and Lyz expression in the graphs.
It has been updated in response to the reviewer’s request.
• Figure 5:
- Panel B: again, information about the number of "patients" or cells used and clarification about the statistical analysis used is required as the display of data generates concerns about the distribution within groups. Box plots need to show individual/raw values
It has been updated in response to the reviewer’s request.
• Figure 6:
- Panel A: Add a legend to inform the direction of the process (e.g., red, activation, blue, repression). We noticed the Yap1 bar data had no color. Is there a reason for that? Please explain this point in the revised manuscript.
Red color added for the Yap1.
- Panel A: Ingenuity pathway analysis is not a good tool to look at this type of result because it does not include the gene fold changes in the analysis, so it only provides a Z-score of the presence of that pathway and not the degree it is increased or not. I recommend substituting any type of GSEA analysis, such as fgsea.
- Panels A&B: Again, only p-value scores were provided, while fold changes are necessary to define the ratio of presence increase of normal vs. AKP.
- Panel D: No raw or pre-processed ChIP-seq data was provided. Additionally, please indicate the exact genome location (it seems the image was edited from a raw image made in the UCSC Genome Browser; it should be remade with coordinates and other important information, such as surrounding genes and epigenetic marks).
- Panel H/I: Flow cytometry gating is inappropriate, as it is catching cells that are negative for LYZ in both AKP and KO cells, resulting in an overestimation of the number of Lyz cells. Gating should specifically select very few LYZ-positive cells in the top/left quadrant.
The updates have been made, and the statistical data have been re-analyzed.
- Panel J: Information about the number of animals/organoids or cells used and clarification about the statistical analysis used is required, as the display of data generates concerns about the distribution within groups. Box plots need to show individual/raw values.
• Overall:
- A supplementary table with all the sequenced libraries and their depth, read length/cell count should be provided.
All of the information is now available in the GEO database. We used previously published human epithelial datasets for human single cell analysis (Joanito*, Wirapati*, Zhao*, Nawaz* et al, Nat Genetics, 2022, PMID: 35773407).
- The Hallmark Geneset used is very broad, and the authors should confirm the results on GO bp.
Using Gene Ontology biological processes (GO bp), we observed that glycolysis-related genes were enriched in our newly described cell population, although the adjusted p-value did not exceed 0.05.
Author response image 6.
GSEA with GOBP pathways highlighted glycoprotein and protein localization to extracellular region, both of which are related to Paneth cell functions. Paneth cells secrete α-defensins, angiogenin-4, lysozyme and secretory phospholipase A2. The enriched glycoprotein process and protein localization to extracellular region reflect the characteristics of Paneth cells.
- qPCR is not a good way to confirm sequencing results; while PCR data is pre-normalized, sequencing is normalized only after quantification, so results on 6 E and F should be shown on the sequencing data.
The expression level of Sox9 is relatively low. In our bulk RNA-seq data, the averages for Sox9 in AKP versus DKK2 KO are 28.2 and 25.1, respectively. While there is a similar trend, the difference is not statistically significant in this dataset, and we did not include an experimental group for reconstitution. Therefore, we conducted qPCR experiments for the reconstitution study by adding recombinant DKK2 (rmDKK2) protein to the culture. Furthermore, it is well established that Sox9 is an essential transcription factor for the formation of LYZ+ Paneth cells. Based on this, we assessed the levels of LYZ and Sox9 using qPCR and confocal microscopy in the presence or absence of DKK2.
• Edits in the text:
- There are several typographical errors. Specific suggestions are provided below.
- Line 43: "Chromatin immunoprecipitation followed by sequencing analysis," state analysis of what cells before continuing with "revealed..."
- Line 77: Recent findings have identified
- Line 138: were reduced in KO tumor samples → rephrase to clarify "KO-derived liver tumors"
- Line 167: Recombinant mouse DKK2 protein treatment in KO organoids partially rescued this effect. Add "partially" since adding rmDkk2 didn't fully restore Lyz1 and Lyz2 levels.
- Line 185-187: the authors should not reference Figure 6 because it has not been introduced yet.
- Line 198-199: The authors claimed a correlation between Dkk2 expression and Lgr5 expression; however, the graph presented in Figure 3B does not indicate this. The R-value was 0.11, which does not indicate a correlative expression between these genes.
- Line 232-233: the authors need to show any connection to Dkk2 gene expression in human samples in order to draw that conclusion.
- Line 294: expression, leading to the formation
- Line 347: Wnt ligand (correct Wng typo)
We have modified our manuscript in accordance with the reviewer’s suggestions.
Reviewer #2 (Recommendations For The Authors):
Specific criticisms/suggestions:
Author claim 1: Dkk2 is necessary for liver metastasis of colon cancer organoids.
This model is one of hepatic colonization and eventual outgrowth and not metastasis. Metastasis is optimally assessed using autochthonous models of cancer generation, with the concomitant intravasation, extravasation, and growth of cancer cells at the distant site. The authors should inject their various organoids in an orthotopic colonic transplantation assay, which permits the growth of tumors in the colon, and they can then identify metastasis in the liver that results from that primary cancer lesion (i.e., to better model physiologic metastasis from the colon to liver).
The orthotopic colonic transplantation data have been provided above (Author response images 1 and 2).
Author claim 2: DKK2 is required for the formation of lysozyme-positive cells in colon cancer.
It would greatly strengthen the authors' claim if supraphysiologic or very high amounts of DKK2 enhance CRC organoid line engraftment (i.e., the specific experiment being pre-treatment with high levels of DKK2 and immediate transplantation to see a number of outgrowing clones). If DKK2 is causal for the engraftment of the tumors, increased DKK2 should enhance their capacity for engraftment.
Paneth cells have physical properties permitting sorting and are readily identifiable on flow cytometry. The authors should demonstrate increased tumorigenicity and engraftment by sorting the Paneth-like cells, either by the markers they identified in the subsequent single cell or by scatter, to establish whether the frequency of the Paneth-like cells in a culture of organoids is directly correlated with engraftment potential.
Further characterization of the Paneth-like cells would help further the authors' argument for the unique function that they have in their tumor model. Specifically, do the cells with Paneth-like cells secrete Wnt3, EGF, Notch ligand, and DLL4 as normal Paneth cells do? Immunofluorescence, sorting, or western blots would all be reasonable methods to assess protein levels in the sorted population.
This has been performed and provided above (Author response images 1 and 3).
Author claim 3: Lysozyme (LYZ)+ cancer cells exhibit Paneth cell properties in both mouse and human systems.
For the claim to be general to human cancer, the author should demonstrate that loss of DKK2 impacts LYZ+ cancer cells in human organoids and affects their engraftment in immunodeficient mice compared to control. Another more correlative way to validate the LYZ+ expression in human colon cancer would be to stain for LYZ in metastatic vs. primary colon cancer, expecting metastatic lesions to be enriched for LYZ+ cells.
The claims on the metabolic function of Paneth-like cells need more clarification. Do the cancer cells with Paneth features have a distinct metabolic profile compared to the other cell populations? The authors should address this through metabolic characterization of isolated LYZ+ cells with Seahorse or comparison of Dkk2 KO to WT organoids (i.e., +/-LYZ+ cancer cell population).
To address this question, we need to develop organoids with a Paneth cell reporter gene. We appreciate the reviewer’s comment, and this should be pursued in future studies.
Author claim 4: HNF4A mediates the formation of Lysozyme (Lyz)-positive colon cancer cells by DKK2.
The authors implicate HNF4A and Sox9 as causal effectors of the Paneth-like cell phenotype and subsequent metastatic potential. There appears to be some discordance regarding the effect of DKK2 loss on HNF4A. In Figure 1E, the authors show that gene expression in metastatic colon cancer cells for HNF4A in DKK2 knockout vs AKP control is insignificant. However, in Figure 6I, there is a highly significant difference in the number of HNF4A positive cells, more than a 3-fold percentage difference, with a p-value of <0.0001. If there is the emergence of a rare but highly expressing HNF4A cell type that on aggregate bulk expression leads to no difference, but sorts differentially, why is it not identified in the single-cell data set? These data together are highly inconsistent with regards to the effect of DKK2 on HNF4A and require clarification.
Previous studies have demonstrated that HNF4A is regulated by proteasomal degradation mediated by pSrc. As a result, the mRNA level of HNF4A remains unchanged, while the protein level is significantly reduced in colon cancer cells. DKK2 KO leads to decreased Src phosphorylation, resulting in the recovery of HNF4A protein levels. This explains why HNF4A cannot be detected in scRNA-seq datasets, which measure mRNA. We have shown this in our previous report. In this manuscript, based on ChIP-seq data using an anti-HNF4A monoclonal antibody, as well as confocal microscopy and qPCR data for the Sox9 gene, we propose that HNF4A acts as a regulator of cancer cells exhibiting Paneth cell properties.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1:
The ingenious design in this study achieved the observation of 3D cell spheroids from an additional lateral view and gained more comprehensive information than the traditional one angle of imaging, which extensively extended the methods to investigate cell behaviors in the growth or migration of tumor organoids in the present study. I believe that this study opens an avenue and provides an opportunity to characterize the spheroid formation dynamics from different angles, in particular side-view with high resolution, in other organoids study in the future.
Thank you for your positive response.
(1) Figure 1A and B, the images of "First surface mirror" are unclear. The authors should capture a single image of "First surface mirror" by high resolution. The corresponding information on the mirror should also be included in the manuscript.
Thank you for your kind reminder. To make the content more intuitive, we have added a clear image of the first surface mirror to Fig. 8C.
(2) The spheroids sizes in this study are 200-300 um. Whether this size is the limitation by the device? And which is the best size by the device? The size of spheroids suitable for this device should be characterized.
Thank you very much for your question. As shown in Fig. 1D, the imaging principle indicates that the sample size is theoretically not affected by the device. For larger biological samples or samples exceeding the size of a 35 mm petri dish, a larger container and first surface mirror can be used. However, in practice, it is not recommended to use this device with laboratory microscopes for samples exceeding 4 mm in size.
Firstly, the working distance of the microscope objective lens is limited by its factory specifications. Secondly, this device is designed to fit a 35 mm petri dish, and the first surface mirror can capture a maximum sample size of 4.5 mm. Fortunately, this size is more than sufficient for cell spheroids.
(3) Figure 2F. The scale bar covered the imaging and made it unclear. It was difficult to read and evaluate the quality of the images. And it seemed no obvious difference between 5 cm and 15 cm. Please carefully check this data.
Thank you very much for your question. First, we checked the image scale and coverage issues and made adjustments in the revised version. Secondly, when the light source was placed 5 cm from the sample, the sample itself appeared relatively clear, but the boundary with the background was less distinct. At a distance of 15 cm, the light source not only illuminated the sample effectively but also made the distinction between the spheroid and the background more apparent. To ensure consistency and stability in image capture, we ultimately selected a 15 cm distance between the sample and the light source for imaging.
(4) Figure 3A. It seemed that the seeded cells were initially located as a ring with a hole in the center. Why not seed the cells evenly in the well?
Thank you very much for your question. First, the cells were added as a suspension, naturally settling at the bottom of the well during imaging. When seeded in agarose wells, the cells spontaneously aggregated over time, as shown in sVideo4. Our previous study showed that the use of agarose wells offers high fault tolerance and efficiency in cell spheroid culture (Pan, R. et al. Biofabrication, 2024, 16, 035016).
(5) I just wonder whether this design could be extended to fluorescence imaging and how to do it. Please give an expectation in the discussion.
Thank you very much for raising this key question regarding the imaging capability of this device. As shown in Author response image 1A, owing to the nature of fluorescence imaging light sources, fluorescence imaging of cell spheroids can be performed with the microscope's built-in light source. Using 4′,6-diamidino-2-phenylindole (DAPI) staining, we captured fluorescence images of cell spheroids in both bottom-view and side-view modes (Author response image 1B), demonstrating that side-view observation of cell spheroids with this device is indeed feasible.
Author response image 1.
(A) The schematic diagram of the principle of fluorescence images of spheroids using an inverted microscope with the side-view observation petri dish/device. (B) Bottom-view and side-view images of a 3D cell spheroid. Scale bar = 500 µm.
(6) The first sentence in the introduction. "Three-dimensional (3D) spheroids" should be "Three-dimensional (3D) tumor spheroids".
(7) P11, Line 7, "both lethal and lethal" should be corrected.
(8) The writing and grammar should be polished.
Thank you very much for your suggestions to improve the quality of the article. We have made the necessary revisions in the updated version.
Reviewer #2:
Summary:
The author developed a new device to overcome current limitations in the imaging process of 3D spheroidal structures. In particular, they created a system to follow in real-time tumour spheroid formation, fusion and cell migration without disrupting their integrity. The system has also been exploited to test the effects of a therapeutic agent (chemotherapy) and immune cells.
Strengths:
The system allows the in situ observation of the 3D structures along the 3 axes (x,y and z) without disrupting the integrity of the spheroids; in a time-lapse manner it is possible to follow the formation of the 3D structure and the spheroids fusion from multiple angles, allowing a better understanding of the cell aggregation/growth and kinetic of the cells.
Interestingly the system allows the analysis of cell migration/ escape from the 3D structure analyzing not only the morphological changes in the periphery of the spheroids but also from the inner region demonstrating that the proliferating cells in the periphery of the structure are more involved in the migration and dissemination process. The application of the system in the study of the effects of doxorubicin and NK cells would give new insights in the description of the response of tumor 3D structure to killing agents.
We sincerely thank you for your detailed and supportive review of our manuscript. Your recognition of our system’s capabilities for in situ observation of 3D structures along multiple axes, as well as its potential applications in studying therapeutic effects, is highly encouraging. Your comments on the advantages of this system for analyzing cell migration, morphological changes, and responses to therapeutic agents are especially appreciated.
Thank you again for your thoughtful feedback and for highlighting the contributions of our work. Your insights have been invaluable in refining the focus and clarity of our study, and we hope that our revisions meet your expectations.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
From the Reviewing Editor:
Four reviewers have assessed your manuscript on valence and salience signaling in the central amygdala. There was universal agreement that the question being asked by the experiment is important. There was consensus that the neural population being examined (GABA neurons) was important and the circular shift method for identifying task-responsive neurons was rigorous. Indeed, observing valenced outcome signaling in GABA neurons would considerably increase the role of the central amygdala in valence. However, each reviewer brought up significant concerns about the design, analysis and interpretation of the results. Overall, these concerns limit the conclusions that can be drawn from the results. Addressing the concerns (described below) would work towards better answering the question at the outset of the experiment: how does the central amygdala represent salience vs valence.
A weakness noted by all reviewers was the use of the terms 'valence' and 'salience' as well as the experimental design used to reveal these signals. The two outcomes used emphasized non-overlapping sensory modalities and produced unrelated behavioral responses. Within each modality there are no manipulations that would scale either the value of the valenced outcomes or the intensity of the salient outcomes. While the food outcomes were presented many times (20 times per session over 10 sessions of appetitive conditioning) the shock outcomes were presented many fewer times (10 times in a single session). The large difference in presentations is likely to further distinguish the two outcomes. Collectively, these experimental design decisions meant that any observed differences in central amygdala GABA neuron responding are unlikely to reflect valence, but likely to reflect one or more of the above features.
We appreciate the reviewers’ comments regarding the experimental design. When assessing fear versus reward, we chose stimuli that elicit known behavioral responses, freezing versus consumption. The use of stimuli of the same modality is unlikely to elicit easily definable fear or reward responses or to be precisely matched for sensory intensity. For example, sweet or bitter tastes can be used, but even these activate different taste receptors and vary in the duration of the activation of taste-specific signaling (e.g. how long the taste lingers in the mouth). The approach we employed is similar to that of Yang et al., 2023 (doi: 10.1038/s41586-023-05910-2) that used water reward and shock to characterize the response profiles of somatostatin neurons of the central amygdala. Similar to what was reported by Yang and colleagues, we observed that the majority of CeA GABA neurons responded selectively to one unconditioned stimulus (~52%). We observed that 15% of neurons responded in the same direction, either activated or inhibited, to the food or shock US. These were defined as salience encoding based on the definitions of Lin and Nicolelis, 2008 (doi: 10.1016/j.neuron.2008.04.031), in which basal forebrain neurons responded similarly to reward or punishment irrespective of valence. The designation of valence encoding based on opposite responses to the food or shock is straightforward (~10% of cells); however, we agree that the designation of modality-specific encoding neurons as valence encoding is less straightforward.
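The response categories described in this reply can be read as a simple decision rule. The following is a minimal sketch of that rule, assuming each neuron has already been labeled +1 (activated), -1 (inhibited) or 0 (non-responsive) for each unconditioned stimulus. The function name and the category labels are hypothetical and are not taken from the manuscript's analysis code.

```python
def classify_us_response(food: int, shock: int) -> str:
    """Label a neuron from its signed responses to the food and shock US.

    Each argument is +1 (activated), -1 (inhibited) or 0 (non-responsive).
    Category names are illustrative assumptions, not the manuscript's labels.
    """
    for value in (food, shock):
        if value not in (-1, 0, 1):
            raise ValueError("responses must be -1, 0 or +1")
    if food == 0 and shock == 0:
        return "non-responsive"
    if food == 0 or shock == 0:
        return "selective"   # responds to only one US
    if food == shock:
        return "salience"    # same direction for both US
    return "valence"         # opposite directions for the two US
```

Under this rule a neuron activated by both food and shock is labeled "salience", whereas a neuron activated by one and inhibited by the other is labeled "valence". The rule uses response sign only, so it does not capture the scaling with stimulus intensity that the reviewers ask for.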
A second weakness noted by a majority of reviewers was a lack of cue-responsive units and a lack of exploration of the diversity of response types and the relationship between cue and outcome firing. The lack of large numbers of neurons increasing firing to one or both cues is particularly surprising given the critical contribution of central amygdala GABA neurons to the acquisition of conditioned fear (which the authors measured) as well as to conditioned orienting (which the authors did not measure). Regression-like analyses would be a straightforward means of identifying neurons varying their firing in accordance with these or other behaviors. It was also noted that appetitive behavior was not measured in a rigorous way. Instead of measuring time near hopper, measures of licking would have been better. Further, measures of orienting behaviors such as startle were missing.
The authors also missed an opportunity for clustering-like analyses which could have been used to reveal neurons uniquely signaling cues, outcomes or combinations of cues and outcomes. If the authors' calcium imaging approach is not able to detect expected central amygdala cue responding, might it be missing other critical aspects of responding?
As stated in the manuscript, we were surprised by the relatively low number of cue responsive cells; however, when using a less stringent statistical method (Figure 5 - Supplement 2), we observed that 13% of neurons responded to the food associated cue and 23% responded to the shock associated cue. The differences are therefore likely a reflection of the rigor of the statistical measure used to define the responsive units. The number of CS responsive units is less than reported in the CeAl by Ciocchi et al., 2010 (doi: 10.1038/nature09559), who observed 30% activated by the CS and 25% inhibited, but is not that dissimilar from the results of Duvarci et al., 2011 (doi: 10.1523/JNEUROSCI.4985-10.2011), who observed 11% activated in the CeAl and 25% inhibited by the CS. These numbers are also consistent with previous single cell calcium imaging of cell types in the CeA. For example, Yang et al., 2023 (doi: 10.1038/s41586-023-05910-2) observed that 13% of somatostatin neurons responded to a reward CS and 8% responded to a shock CS. Yu et al., 2017 (doi: 10.1038/s41593-017-0009-9) observed that 26.5% of PKCdelta neurons responded to the shock CS. It should also be noted that our analysis was not restricted to the CeAl. Finally, food learning was assessed in an operant chamber in freely moving mice with reward pellet delivery. Because liquids were not used for the reward US, licking is not a metric that can be used.
All reviewers point out that the evidence for salience encoding is even more limited than the evidence for valence. Although the specific concern for each reviewer varied, they all centered on an oversimplistic definition of salience. Salience ought to scale with the absolute value and intensity of the stimulus. Salience cannot simply be responding in the same direction. Further, even though the authors observed subsets of central amygdala neurons increasing or decreasing activity to both outcomes - the outcomes can readily be distinguished based on the temporal profile of responding.
We thank the reviewers for their comments relating to the definition of salience and valence encoding by central amygdala neurons. We have addressed each of the concerns below.
Additional concerns are raised by each reviewer. Our consensus is that this study sought to answer an important question - whether the central amygdala signals salience or valence in cue-outcome learning. However, the experimental design, analyses, and interpretations do not permit a rigorous and definitive answer to that question. Such an answer would require additional experiments whose designs would address the significant concerns described here. Fully addressing the concerns of each reviewer would result in a re-evaluation of the findings. For example, experimental design better revealing valence and salience, and analyses describing diversity of neuronal responding and relationship to behavior would likely make the results Important or even Fundamental.
We appreciate the reviewers’ comments and have addressed each concern below.
Reviewer #2 (Public review):
In this article, Kong and authors sought to determine the encoding properties of central amygdala (CeA) neurons in response to oppositely valenced stimuli and cues predicting those stimuli. The amygdala and its subregional components have historically been understood to be regions that encode associative information, including valence stimuli. The authors performed calcium imaging of GABA-ergic CeA neurons in freely-moving mice conditioned in Pavlovian appetitive and fear paradigms, and showed that CeA neurons are responsive to both appetitive and aversive unconditioned and conditioned stimuli. They used a variant of a previously published 'circular shifting' technique (Harris, 2021), which allowed them to delineate between excited/non-responsive/inhibited neurons. While there is considerable overlap of CeA neurons responding to both unconditioned stimuli (in this case, food and shock, deemed "salience-encoding" neurons), there are considerably fewer CeA neurons that respond to both conditioned stimuli that predict the food and shock. The authors finally demonstrated that there are no differences in the order of Pavlovian paradigms (fear - shock vs. shock - fear), which is an interesting result, and convincingly presented given their counterbalanced experimental design.
In total, I find the presented study useful in understanding the dynamics of CeA neurons during a Pavlovian learning paradigm. There are many strengths of this study, including the important question and clear presentation, the circular shifting analysis was convincing to me, and the manuscript was well written. We hope the authors will find our comments constructive if they choose to revise their manuscript.
While the experiments and data are of value, I do not agree with the authors interpretation of their data, and take issue with the way they used the terms "salience" and "valence" (and would encourage them to check out Namburi et al., NPP, 2016) regarding the operational definitions of salience and valence which differ from my reading of the literature. To be fair, a recent study from another group that reports experiments/findings which are very similar to the ones in the present study (Yang et al., 2023, describing valence coding in the CeA using a similar approach) also uses the terms valence and salience in a rather liberal way that I would also have issues with (see below). Either new experiments or revised claims would be needed here, and more balanced discussion on this topic would be nice to see, and I felt that there were some aspects of novelty in this study that could be better highlighted (see below).
One noteworthy point of alarm is that it seems as if two data panels including heatmaps are duplicated (perhaps that panel G of Figure 5-figure supplement 2 is a cut and paste error? It is duplicated from panel E and does not match the associated histogram).
We thank the reviewer for their insightful comments and assessment of the manuscript.
Major concerns:
(1) The authors wish to make claims about salience and valence. This is my biggest gripe, so I will start here.
(1a) Valence scales for positive and negative stimuli and as stated in Namburi et al., NPP, 2016 where we operationalize "valence" as having different responses for positive and negative values and no response for stimuli that are not motivational significant (neutral cues that do not predict an outcome). The threshold for claiming salience, which we define as scaling with the absolute value of the stimulus, and not responding to a neutral stimulus (Namburi et al., NPP, 2016; Tye, Neuron, 2018; Li et al., Nature, 2022) would require the lack of response to a neutral cue.
We appreciate the reviewer’s comment on the definitions of salience and valence and agree that there is not a consistent classification of these response types in the field. As stated above, we used the designation of salience encoding if the cells respond in the same direction to different stimuli regardless of the valence of the stimulus, similar to what was described previously (Lin and Nicolelis, 2008, doi: 10.1016/j.neuron.2008.04.031). Similar definitions of salience have also been reported elsewhere (for examples see: Stephenson-Jones et al., 2020, doi: 10.1016/j.neuron.2019.12.006; Zhu et al., 2018, doi: 10.1126/science.aat0481; and Comoli et al., 2003, doi: 10.1038/nn1113P). Per the suggestion of the reviewer, we longitudinally tracked cells on the first day of Pavlovian reward conditioning and the fear conditioning day. Although there were considerably fewer head entries on the first day of reward conditioning, we were able to identify 10 cells that were activated by both the food US and shock US. We compared the responses to the first five and last five head entries, and to the first five and last five shocks. Consistent with what has been reported for salience encoding neurons in the basal forebrain (Lin and Nicolelis, 2008, doi: 10.1016/j.neuron.2008.04.031), we observed that the responses were highest when the US was most unexpected and decreased in later trials.
Author response image 1.
(1b) The other major issue is that the authors choose to make claims about the neural responses to the USs rather than the CSs. However, being shocked and receiving sucrose also would have very different sensorimotor representations, and any differences in responses could be attributed to those confounds rather than valence or salience. They could make claims regarding salience or valence with respect to the differences in the CSs but they should restrict analysis to the period prior to the US delivery.
Perhaps the reviewer missed this, but analyses of valence and salience encoding to the different CSs are presented in Figure 5G, Figure 5 – Supplement 1C-D, and Figure 5 – Supplement 2N-O. Responsiveness to CSFood and CSShock was analyzed during the conditioning sessions (Figure 3E-F, Figure 4B-C, Figure 5 – Supplement 2J-O, and Figure 5 – Supplement 3K-L) and during recall probe tests for both CSFood and CSShock (Figure 5 – Supplement 1C-J).
(1c) The third obstacle to using the terms "salience" or "valence" is the lack of scaling, which is perhaps a bigger ask. At minimum either the scaling or the neutral cue would be needed to make claims about valence or salience encoding. Perhaps the authors disagree - that is fine. But they should at least acknowledge that there is literature that would say otherwise.
(1d) In order to make claims about valence, the authors must take into account the sensory confound of the modality of the US (also mentioned in Namburi et al., 2016). The claim that these CeA neurons are indeed valence-encoding (based on their responses to the unconditioned stimuli) is confounded by the fact that the appetitive US (food) is a gustatory stimulus while the aversive US (shock) is a tactile stimulus.
We provided the same analysis for the US and CS. The US responses were larger and more prevalent, but similar types of encoding were observed for the CS. We agree that the food reward and the shock are very different sensory modalities. As stated above, the use of stimuli of the same modality is unlikely to elicit easily definable fear or reward responses or to be precisely matched for sensory intensity. We agree that the definition of cells that respond to only one stimulus is difficult to define in terms of valence encoding, as opposed to being specific for the sensory modality and without scaling of the stimulus it is difficult to fully address this issue. It should be noted however, that if the cells in the CeA were exclusively tuned to stimuli of different sensory modalities, we would expect to see a similar number of cells responding to the CS tones (auditory) as respond to the food (taste) and shock (somatosensory) but we do not. Of the cells tracked longitudinally 80% responded to the USs, with 65% of cells responding to food (activated or inhibited) and 44% responding to shock (activated or inhibited).
(2) Much of the central findings in this manuscript have been previously described in the literature. Yang et al., 2023 for instance shows that the CeA encodes salience (as demonstrated by the scaled responses to the increased value of unconditioned stimuli, Figure 1 j-m), and that learning amplifies responsiveness to unconditioned stimuli (Figure 2). It is nice to see a reproduction of the finding that learning amplifies CeA responses, though one study is in SST::Cre and this one in VGAT::cre - perhaps highlighting this difference could maximize the collective utility for the scientific community?
We agree that the analysis performed here is similar to what was conducted by Yang et al., 2023, with the major difference being the types of neurons sampled. Yang et al. imaged only somatostatin neurons, whereas we recorded all GABAergic cell types within the CeA. Moreover, because we imaged from 10 mice, we sampled neurons that ostensibly covered the entire dorsal to ventral extent of the CeA (Figure 1 – Supplement 1). Remarkably, we found that the vast majority of CeA neurons (80%) are responsive to food or shock. Within this 80% there are 8 distinct response profiles, consistent with the heterogeneity of cell types within the CeA based on connectivity, electrophysiological properties, and gene expression. Moreover, we did not find any spatial distinction between food or shock responsive cells, with the responsive cell types being intermingled throughout the dorsal to ventral axis (Figure 5 – Supplement 3).
(3) There is at least one instance of copy-paste error in the figures that raised alarm. In the supplementary information (Figure 5- figure supplement 2 E;G), the heat maps for food-responsive neurons and shock-responsive neurons are identical. While this almost certainly is a clerical error, the authors would benefit from carefully reviewing each figure to ensure that no data is incorrectly duplicated.
We thank the reviewer for catching this error. It has been corrected.
(4) The authors describe experiments to compare shock and reward learning; however, there are temporal differences in what they compare in Figure 5. The authors compare the 10th day of reward learning with the 1st day of fear conditioning, which effectively represent different points of learning and retrieval. At the end of reward conditioning, animals are utilizing a learned association to the cue, which demonstrates retrieval. On the day of fear conditioning, animals are still learning the cue at the beginning of the session, but they are not necessarily retrieving an association to a learned cue. The authors would benefit from recording at a later timepoint (to be consistent with reward learning- 10 days after fear conditioning), to more accurately compare these two timepoints. Or perhaps, it might be easier to just make the comparison between Day 1 of reward learning and Day 1 of fear learning, since they must already have these data.
We agree that there are temporal differences between the food and shock US deliveries. This is likely a reflection of the fact that the shock delivery is passive and easily resolved based on the time of the US delivery, whereas the food responses are variable because they are dependent upon the consumption of the sucrose pellet. Because of these differences, the kinetics of the responses cannot be accurately compared. This is why we restricted our analysis to whether the cells were food or shock responsive. Aside from reporting the temporal differences in the signals, we did not draw major conclusions about the differences in kinetics. In our experimental design, we counterbalanced the animals that received fear conditioning first and then food conditioning, or food conditioning and then fear conditioning, to ensure that order effects did not influence the outcome of the study. It is widely known that Pavlovian fear conditioning can facilitate the acquisition of conditioned stimulus responses with just a single day of conditioning. In contrast, Pavlovian reward conditioning generally progresses more slowly. Because of this, we restricted our analysis to the last day of reward conditioning and the first and only day of fear conditioning. However, as stated above, we compared the responses of neurons defined as salience encoding during day 1 of reward conditioning and fear conditioning. As would be predicted based on previous definitions of salience encoding (Lin and Nicolelis, 2008, doi: 10.1016/j.neuron.2008.04.031), we observed that the responses were highest when the US was most unexpected.
(5) The authors make a claim of valence encoding in their title and throughout the paper, which is not possible to make given their experimental design. However, they would greatly benefit from actually using a decoder to demonstrate their encoding claim (decoding performance for shock-food versus shuffled labels) and simply make claims about decoding food-predictive cues and shock-predictive cues. Interestingly, it seems like relatively few CeA neurons actually show differential responses to the food and shock CSs, and that is interesting in itself.
As stated above, valence and salience encoding were defined similar to what has been previously reported (Li et al., 2019, doi: 10.7554/eLife.41223; Yang et al., 2023, doi: 10.1038/s41586-023-05910-2; Huang et al., 2024, doi: 10.1038/s41586-024-07819; Lin and Nicolelis, 2008, doi: 10.1016/j.neuron.2008.04.031; Stephenson-Jones et al., 2020, doi: 10.1016/j.neuron.2019.12.006; Zhu et al., 2018, doi: 10.1126/science.aat0481; and Comoli et al., 2003, doi: 10.1038/nn1113P). Interestingly, many of these studies did not vary the US intensity.
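For readers who want to see what the decoder comparison suggested in point (5) might involve, the following is a minimal sketch, not an analysis from the manuscript. It assumes a hypothetical trial-by-cell response matrix `X` and trial labels `y` (for example 0 for food and 1 for shock), and it uses a simple nearest-centroid classifier with leave-one-out cross-validation as a stand-in for whatever decoder one would actually choose. All function names are illustrative.

```python
import numpy as np

def loo_centroid_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    X: (n_trials, n_cells) response matrix (hypothetical input).
    y: (n_trials,) integer labels; each class needs at least 2 trials.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    correct = 0
    for i in range(len(y)):
        train = np.ones(len(y), dtype=bool)
        train[i] = False  # hold out trial i
        centroids = np.stack(
            [X[train & (y == c)].mean(axis=0) for c in classes]
        )
        dists = np.linalg.norm(centroids - X[i], axis=1)
        correct += int(classes[np.argmin(dists)] == y[i])
    return correct / len(y)

def decoding_vs_shuffle(X, y, n_shuffles=200, seed=0):
    """Compare decoding accuracy with a shuffled-label null distribution."""
    rng = np.random.default_rng(seed)
    observed = loo_centroid_accuracy(X, y)
    null = np.array(
        [loo_centroid_accuracy(X, rng.permutation(y)) for _ in range(n_shuffles)]
    )
    # One-sided permutation p-value with the usual +1 correction.
    p_value = (np.sum(null >= observed) + 1) / (n_shuffles + 1)
    return observed, null.mean(), p_value
```

Decoding accuracy well above the shuffled-label distribution would support a claim that the population discriminates the two cue or outcome types, without committing to a particular definition of valence.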
Reviewer #3 (Public review):
Summary:
In their manuscript entitled Kong and colleagues investigate the role of distinct populations of neurons in the central amygdala (CeA) in encoding valence and salience during both appetitive and aversive conditioning. The study expands on the work of Yang et al. (2023), which specifically focused on somatostatin (SST) neurons of the CeA. Thus, this study broadens the scope to other neuronal subtypes, demonstrating that CeA neurons in general are predominantly tuned to valence representations rather than salience.
We thank the reviewer for their insightful comments and assessment of the manuscript.
Strengths:
One of the key strengths of the study is its rigorous quantitative approach based on the "circular-shift method", which carefully assesses correlations between neural activity and behavior-related variables. The authors' findings that neuronal responses to the unconditioned stimulus (US) change with learning are consistent with previous studies (Yang et al., 2023). They also show that the encoding of positive and negative valence is not influenced by prior training order, indicating that prior experience does not affect how these neurons process valence.
Weaknesses:
However, there are limitations to the analysis, including the lack of population-based analyses, such as clustering approaches. The authors do not employ hierarchical clustering or other methods to extract meaning from the diversity of neuronal responses they recorded. Clustering-based approaches could provide deeper insights into how different subpopulations of neurons contribute to emotional processing. Without these methods, the study may miss patterns of functional specialization within the neuronal populations that could be crucial for understanding how valence and salience are encoded at the population level.
We appreciate the reviewer’s comments regarding clustering-based approaches. In order to classify cells as responsive to the US or CS we chose to develop a statistically rigorous method for classifying cell response types. Using this approach, we were able to define cell responses to the US and CS. Importantly, we identified 8 distinct response types to the USs. It is not clear how additional clustering analysis would improve cell classifications.
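As an illustration of the general idea behind circular-shift classification, here is a minimal sketch under stated assumptions; it is not the authors' implementation, and the function name, window length, and thresholds are hypothetical. The whole activity trace is shifted by random lags, which preserves its autocorrelation while breaking its alignment to the stimulus times, and the observed event-locked mean is compared with the resulting null distribution.

```python
import numpy as np

def circular_shift_test(trace, event_idx, window, n_shifts=1000, alpha=0.05, seed=0):
    """Label one cell 'excited', 'inhibited', or 'non-responsive'.

    trace: 1-D activity trace for one cell (e.g. dF/F).
    event_idx: sample indices of stimulus onsets.
    window: number of samples after each onset to average.
    """
    rng = np.random.default_rng(seed)
    trace = np.asarray(trace, dtype=float)
    event_idx = np.asarray(event_idx)
    n = trace.size
    offsets = np.arange(window)

    def mean_response(x):
        # Indices of every post-event window, wrapped at the trace end.
        idx = (event_idx[:, None] + offsets[None, :]) % n
        return x[idx].mean()

    observed = mean_response(trace)
    # Random lags of at least one window, so a shifted trace never
    # re-aligns an event window with its own response.
    lags = rng.integers(window, n - window, size=n_shifts)
    null = np.array([mean_response(np.roll(trace, lag)) for lag in lags])

    p_high = (np.sum(null >= observed) + 1) / (n_shifts + 1)
    p_low = (np.sum(null <= observed) + 1) / (n_shifts + 1)
    if p_high < alpha / 2:  # two-sided test split across both tails
        return "excited"
    if p_low < alpha / 2:
        return "inhibited"
    return "non-responsive"
```

Applying such a test separately to each stimulus gives per-cell labels that can then be cross-tabulated (for example food-excited and shock-inhibited), which is one way a set of discrete response types could arise.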
Furthermore, while salience encoding is inferred based on responses to stimuli of opposite valence, the study does not test whether these neuronal responses scale with stimulus intensity-a hallmark of classical salience encoding. This limits the conclusions that can be drawn about salience encoding specifically.
As stated above, we used salience classifications similar to those previously described (Lin and Nicolelis, 2008, doi: 10.1016/j.neuron.2008.04.031; Stephenson-Jones et al., 2020, doi: 10.1016/j.neuron.2019.12.006; Zhu et al., 2018, doi: 10.1126/science.aat0481; and Comoli et al., 2003, doi: 10.1038/nn1113P). We agree that varying the stimulus intensity would provide a more rigorous assessment of salience encoding; however, several of the studies mentioned above classify cells as salience encoding without varying stimulus intensity. Additionally, the inclusion of recordings with varying US intensities on top of the Pavlovian reward and fear conditioning would further decrease the number of cells that can be longitudinally tracked and would likely decrease the number of cells that could be classified.
In sum, while the study makes valuable contributions to our understanding of CeA function, the lack of clustering-based population analyses and the absence of intensity scaling in the assessment of salience encoding are notable limitations.
Reviewer #4 (Public review):
Summary:
The authors have performed endoscopic calcium recordings of individual CeA neuron responses to food and shock, as well as to cues predicting food and shock. They claim that a majority of neurons encode valence, with a substantial minority encoding salience.
Strengths:
The use of endoscopic imaging is valuable, as it provides the ability to resolve signals from single cells, while also being able to track these cells across time. The recordings appear well-executed, and employ a sophisticated circular shifting analysis to avoid statistical errors caused by correlations between neighboring image pixels.
Weaknesses:
My main critique is that the authors didn't fully test whether neurons encode valence. While it is true that they found CeA neurons responding to stimuli that have positive or negative value, this by itself doesn't indicate that valence is the primary driver of neural activity. For example, they report that a majority of CeA neurons respond selectively to either the positive or negative US, and that this is evidence for "type I" valence encoding. However, it could also be the case that these neurons simply discriminate between motivationally relevant stimuli in a manner unrelated to valence per se. A simple test of this would be to check if neural responses generalize across more than one type of appetitive or aversive stimulus, but this was not done. The closest the authors came was to note that a small number of neurons respond to CS cues, of which some respond to the corresponding US in the same direction. This is relegated to the supplemental figures (3 and 4), and it is not noted whether the same-direction CS-US neurons are also valence-encoding with respect to different USs. For example, are the neurons excited by CS-food and US-food also inhibited by shock? If so, that would go a long way toward classifying at least a few neurons as truly encoding valence in a generalizable way.
As stated above, valence and salience encoding were defined similar to what has been previously reported (Li et al., 2019, doi: 10.7554/eLife.41223; Yang et al., 2023, doi: 10.1038/s41586-023-05910-2; Huang et al., 2024, doi: 10.1038/s41586-024-07819; Lin and Nicolelis, 2008, doi: 10.1016/j.neuron.2008.04.031; Stephenson-Jones et al., 2020, doi: 10.1016/j.neuron.2019.12.006; Zhu et al., 2018, doi: 10.1126/science.aat0481; and Comoli et al., 2003, doi: 10.1038/nn1113P). As reported in Figure 5 and Figure 5 – Supplement 3, ~29% of CeA neurons responded to both food and shock USs (15% in the same direction and 13.5% in the opposite direction). In contrast, only 6 of 303 cells responded to both the CSfood and CSshock, all in the same direction.
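The percentages quoted in these responses can be cross-checked with simple inclusion-exclusion. The sketch below assumes that the 80%, 65%, and 44% figures and the ~29% dual-responsive figure all refer to the same population of longitudinally tracked cells, which the responses suggest but do not state explicitly.

```python
# Percentages quoted in the responses above.
pct_food = 65.0     # responded to the food US (activated or inhibited)
pct_shock = 44.0    # responded to the shock US (activated or inhibited)
pct_either = 80.0   # responded to at least one US

# Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
pct_both_expected = pct_food + pct_shock - pct_either

# Reported breakdown of cells responding to both USs.
pct_same_direction = 15.0
pct_opposite_direction = 13.5
pct_both_reported = pct_same_direction + pct_opposite_direction

# CS overlap: 6 of 303 cells responded to both CSs.
pct_cs_both = 100 * 6 / 303

print(pct_both_expected, pct_both_reported, round(pct_cs_both, 1))
```

The two estimates of the dual-responsive fraction (29% and 28.5%) agree to within rounding, and both are more than ten times the fraction of cells responding to both CSs (about 2%).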
A second and related critique is that, although the authors correctly point out that definitions of salience and valence are sometimes confused in the existing literature, they then go on themselves to use the terms very loosely. For example, the authors define these terms in such a way that every neuron that responds to at least one stimulus is either salience or valence-encoding. This seems far too broad, as it makes essentially unfalsifiable their assertion that the CeA encodes some mixture of salience and valence. I already noted above that simply having different responses to food and shock does not qualify as valence-encoding. It also seems to me that having same-direction responses to these two stimuli similarly does not qualify a neuron as encoding salience. Many authors define salience as being related to the ability of a stimulus to attract attention (which is itself a complex topic). However, the current paper does not acknowledge whether they are using this, or any other definition of salience, nor is this explicitly tested, e.g. by comparing neural response magnitudes to any measure of attention.
As stated in response to reviewer 2, we longitudinally tracked cells on the first day of Pavlovian reward conditioning and the fear conditioning day. Although there were considerably fewer head entries on the first day of reward conditioning, we were able to identify 10 cells that were activated by both the food US and shock US. We compared the responses to the first five and last five head entries, and to the first five and last five shocks. Consistent with what has been reported for salience encoding neurons in the basal forebrain (Lin and Nicolelis, 2008, doi: 10.1016/j.neuron.2008.04.031), we observed that the responses were highest when the US was most unexpected and decreased in later trials.
The impression I get from the authors' data is that CeA neurons respond to motivationally relevant stimuli, but in a way that is possibly more complex than what the authors currently imply. At the same time, they appear to have collected a large and high-quality dataset that could profitably be made available for additional analyses by themselves and/or others.
Lastly, the use of 10 daily sessions of training with 20 trials each seems rather low to me. In our hands, Pavlovian training in mice requires considerably more trials in order to effectively elicit responses to the CS. I wonder if the relatively sparse training might explain the relative lack of CS responses?
It is possible that learning would have occurred more quickly if we had used greater than 20 trials per session. However, we routinely used 20-25 trials for Pavlovian reward conditioning (doi: 10.1073/pnas.1007827107; doi: 10.1523/JNEUROSCI.5532-12.2013; doi: 10.1016/j.neuron.2013.07.044; and doi: 10.1016/j.neuron.2019.11.024).
-
-
www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This is a useful report of a spatially-extended model to study the complex interactions between immune cells, fibroblasts, and cancer cells, providing insights into how fibroblast activation can influence tumor progression. The model opens up new possibilities for studying fibroblast-driven effects in diverse settings, which is crucial for understanding potential tumor microenvironment manipulations that could enhance immunotherapy efficacy. While the results presented are solid and follow logically from the model’s assumptions, some of these assumptions may require further validation, as they appear to oversimplify certain aspects in light of complex experimental findings, system geometry, and general principles of active matter research.
We thank the editor for recognizing the usefulness of our work. This work does not aim to precisely describe the complexity of the tumor microenvironment in lung cancer, but rather to classify and rigorously calibrate a minimum number of parameters to the clinical data we collect and generate, and reproduce the global structures of the microenvironment. We identify different scenarios, and show how they depend on the local interactions within this framework. Although we started in the first version with coalescence in the main text and anisotropic geometry in the supporting information, we realized that we needed to provide more directions to better show how our model can be extended. Thus, in Section III-4 we added an analysis of a microenvironment with blood vessels, and showed how to introduce anisotropic friction as a function of fiber orientation, as well as active stress, paving the way for further studies, that would make our model more complex. However, in a first step, it is crucial to start with a limited number of parameters that can be rigorously determined, and this is how this first work was conceived.
Public Reviews:
Reviewer #1 (Public review):
The authors present an important work where they model some of the complex interactions between immune cells, fibroblasts and cancer cells. The model takes into account the increased ECM production of cancer-associated fibroblasts. These fibres trap the cancer but also protect it from immune system cells. In this way, these fibroblasts’ actions both promote and hinder cancer growth. By exploring different scenarios, the authors can model different cancer fates depending on the parameters regulating cancer cells, immune system cells and fibroblasts. In this way, the model explores non-trivial scenarios. An important weakness of this study is that, though it is inspired by NSCLC tumors, it is restricted to modelling circular tumor lesions and does not explore the formation of ramified tumors, as in NSCLC. In this way, it is only a general model and it is not clear how it can be adapted to simulate more realistic tumor morphologies.
We thank the reviewer for highlighting the importance of our work. We acknowledge that although we provided anisotropic geometries and the study of the coalescence in the first version, more effort was needed to provide tools to extend our formalism to non-ideal cases. This is now added as Section III-4, where we analyze the impact of blood vessels, and the anisotropic friction due to the nematic order for the fibers; this nematic order can also be used to introduce active nematic stress.
Reviewer #2 (Public review):
Summary:
The authors develop a computational model (and a simplified version thereof) to treat an extremely important issue regarding tumor growth. Specifically, it has been argued that fibroblasts have the ability to support tumor growth by creating physical conditions in the tumor microenvironment that prevent the relevant immune cells from entering into contact with, and ultimately killing, the cancer cells. This inhibition is referred to as immune exclusion. The computational approach follows standard procedures in the formulation of models for mixtures of different material species, adapted to the problem at hand by making a variety of assumptions as to the activity of different types of fibroblasts, namely “normal” versus “cancer-associated”. The model itself is relatively complex, but the authors do a convincing job of analyzing possible behaviors and attempting to relate these to experimental observations.
Strengths:
As mentioned, the authors do an excellent job of analyzing the behavior of their model both in its full form (which includes spatial variation of the concentrations of the different cellular species) and in its simplified mean field form. The model itself is formulated based on established physical principles, although the extent to which some of these principles apply to active biological systems is not clear (see Weaknesses). The results of the model do offer some significant insights into the critical factors which determine how fibroblasts might affect tumor growth; these insights could lead to new experimental ways of unraveling these complex sets of issues and enhancing immunotherapy.
We thank the referee for this summary and for recognizing the strengths of our paper.
Weaknesses:
Models of the form being studied here rely on a large number of assumptions regarding cellular behavior. Some of these seemed questionable, based on what we have learned about active systems. The problem of T cell infiltration as well as the patterning of the extracellular matrix (ECM) by fibroblasts necessarily involve understanding cell motion and cell interactions due e.g. to cell signaling. Adopting an approach based purely on physical systems driven by free energies alone does not consider the special role that active processes can play, both in motility itself and in the type of self-organization that can occur due to these cell-cell interactions. This to me is the primary weakness of this paper.
We thank the referee for this important comment, which allows us to clarify this point. Although biological materials are out of equilibrium, their behavior often resembles that dictated by thermodynamics, hence the usefulness of constructing a free energy in terms of these variables. In a first approach to decipher the complex interactions and describe the different and sometimes non-trivial outcomes in this system that involves many components, we must start by minimizing the number of parameters and identifying the complex processes that control the evolution of the system. The free energy that we build for this biological system therefore contains out-of-equilibrium processes that can be approximated by a “close to equilibrium” description. Our approach is a classical one in the statistical physics of active systems, namely the effort to construct an equivalent free energy for out-of-equilibrium systems. This allows us to gain a clearer insight into those complex processes.
We have added a sentence in the main text, section III.1, to clarify this point:
“Building a free-energy density for a biological material is justified, because, although biological materials are out of equilibrium, their behavior often resembles that dictated by thermodynamics. It is therefore useful to write a free energy in terms of state variables.”
Nevertheless, we recognize that we should have provided more tools for using our formalism by making it active. This is why we introduced the nematic order in the fibers in Section III-4. This nematic order can be used to introduce active stress, and we have cited previous works by some of us see [?, ?, ?] as references for building active processes out of it.
We must also note that cell signaling has been introduced minimally in our system, to provide the cue for the arrival of T-cells and NAFs from the boundaries. However, we found that although we had evoked in the text the other role of the chemicals, in the transformation from NAFs to CAFs, the details were not well explained. We have therefore corrected and added some explanations in the introduction of section III, and in sections III.1 and III.2.
A separate weakness concerns the assumption that fibroblasts affect T cell behavior primarily by just making a more dense ECM. There are a number of papers in the cancer literature (see, for some examples, Carstens, J., Correa de Sampaio, P., Yang, D. et al. Spatial computation of intratumoral T cells correlates with survival of patients with pancreatic cancer. Nat Commun 8, 15095 (2017); Sun, Xiujie, Bogang Wu, Huai-Chin Chiang, Hui Deng, Xiaowen Zhang, Wei Xiong, Junquan Liu et al. “Tumour DDR1 promotes collagen fibre alignment to instigate immune exclusion.” Nature 599, no. 7886 (2021): 673-678) that seem to indicate that density alone is not a sufficient indicator of T cell behavior. Instead, the organization of the ECM (for example, its anisotropy) could be playing a much more essential role than is given credit for here. This possibility is hinted at in the Discussion section but deserves much more emphasis.
The referee is right in his comment, and we thank him for raising this issue. We have therefore introduced the anisotropic orientation of the fibers, which induces an anisotropic friction, in a new section III-4. In addition, the references pointed out are included in this section. However, although the anisotropy strongly influences the fate of the tumor when the fibers are oriented perpendicular to the surface of the cancer nest, it is less effective when the fibroblasts are oriented along the surface of the cancer nest. In the latter situation, which is often the case before cancer cells reshape the tumor microenvironment, the matrix density should correlate with the friction.
Finally, the mixed version of the model is, from a general perspective, not very different from many other published models treating the ecology of the tumor microenvironment (for a survey, see Arabameri A, Asemani D, Hadjati J (2018), A structural methodology for modeling immune-tumor interactions including pro-and anti-tumor factors for clinical applications. Math Biosci 304:48-61). There are even papers in this literature that specifically investigate effects due to allowing cancer cells to instigate changes in other cells from being tumor-inhibiting to tumor-promoting. This feature occurs not only for fibroblasts but also for example for macrophages which can change their polarization from M1 to M2. There needed to be some more detailed comparison with this existing literature.
The referee is right that the first part of our approach, namely the dynamical system, may be common in this kind of study, and this needs to be mentioned. We therefore added the following sentence in the discussion: “This is in line with several similar mathematical models, that study through this lens the inhibition/activation of the immune system by cancer cells either by means of compartmental nonlinear models similar to our dynamical system, for instance regarding macrophage recruitment and cytokine signaling {arabameri2018structural} {li2019computational}, or mixture models {fotso2024mixture}. We combine the two approaches in order to rigorously derive the parameters of the model and gain insights from both.”
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
The authors should address the following points:
Major issues
(1) The shape of tumors simulated differs immensely from the observed tumors in Fig. 2. Here, the tumor is constituted by irregular domains, not dissimilar from domains in phase separating mixtures. The domains simulated are circular. Since the authors are using the space dependent model to model the increase in tumor cells with time in the different scenarios (immune-desert, immune-excluded, immune inflamed), it should explain how non-spherical tumor structures can be observed in these scenarios. The authors introduce tumor coalescence in page 28, however, it is not expected that the structures observed in Fig 2 are the result from different tumors merging and coalescing, because that would result from an unlikely large number of initial mutation events in the same region of the tissue. The authors should explain what mechanisms present in the model can lead to non-spherical forms.
We agree with the reviewer that real tumors are rarely round contrary to what our numerics suggests. In fact, only the last figure of our paper in the supporting information was more appropriate for such a discussion. We are now adding discussions and new figures to better illustrate our spatial model, see Figure 6 and section III-4. The in situ geometry of tumors depends on the shape of the host organ, the diffusive (chemical) or advected species such as T cells and fibroblasts, and on the nutrients. Thus, in our case, only cancer cells are produced locally, but during growth the tumor is strongly constrained by the microenvironment, and thus the geometry of the domain we model in the numerics and its boundary conditions. This is also true for the chemicals responsible for growth, cellular advection and phenotypic transformation. Their concentration depends on a convection-diffusion equation and boundary conditions. For a tumor in situ, such as in the lung, the available space is a constraint that will dominate the final geometry of the tumor nests. We do not think that coalescence is controlled by mutational events, but most likely by the search for space necessary for growth. Compared to the first version, we add new figures (Figure 6) that show that the geometry of the organ, as well as the localization of blood vessels, are a cause of the irregularity of the tumor shapes. We also introduce orientational order, which as suggested in section III-4, can induce anisotropic friction and stresses, as well as anisotropic growth. We cite (Ackermann, Joseph, and Martine Ben Amar. ”Onsager’s variational principle in proliferating biological tissues, in the presence of activity and anisotropy.” The European Physical Journal Plus 138.12 (2023): 1103.) where we described active stresses and coupling related to anisotropic growth.
(2) According to the authors, the model presented in equations (1) and onwards simulates the evolution of the fraction of tumor cells in the tissue. However the fraction of tumor cells, for example, depends itself on the variation of other cell types. For example, if fibroblasts were to proliferate with rate alpha, even without tumor cells proliferating, the fraction of tumor cells in the mixture should decrease as alpha times the tumor cells fraction. These terms are missing. The equations do not describe the evolution of the cells’ fractions but of the amount of cells of each type, normalised by the total carrying capacity of non-normal cells in the tissue. The text should be rewritten accordingly.
We agree with the referee: our definition of cell density was not precise enough and may appear misleading. In paragraph II.1, we now introduce more explicitly the term mass fraction, which is the correct physical quantity to use in the spatial model.
”All these cells have the same mass density and the sum of their mass fraction satisfies the relationship S = C + T + F<sub>NA</sub> + F<sub>A</sub> = 1-N, where N is a healthy non active component as healthy cells, for example.”
It is less intuitive than “number of cells per unit volume” but necessary for what follows (Section III).
(3) The authors start by calculating fixed points of different versions of the dynamical system without spatial dependence. They should explain what is the relevance of these fixed points: in a real situation, where the concentration of tumor fibroblasts and T-cells depend on position, in which conditions are these fixed points relevant?
The referee is right and we will clarify this point: the dynamical analysis helps to understand and predict the scenarios occurring in the system. After all the steps of paragraph 2.2, we are left with 11 independent parameters for the dynamical system alone, not counting the parameters generated by the spatial modeling itself. Our estimation concerns only lung cancer. These parameters do not appear in the literature. The parameters introduced in Sec. III, which are more related to physical interactions such as friction, cell-cell adhesion, etc., can be found in the literature or can be estimated and thus measured in in vitro experiments (see Ackermann and Ben Amar, EPJP 2023, P. Benaroch, J. Nikolic et al. 2024, biorxiv). So what are the fixed points for? They help to get the right numbers for the spatial analysis. To recover special features of cancer evolution, we need a model, but also correct estimates of the data in a code that is quite technical and heavy, with each simulation taking a certain amount of time. For users who only need rough predictions, the analysis in section 2 is sufficient.
It is also important to note that the global result depends only on the source terms and on the boundary conditions. This can be illustrated with a simple example. Consider the governing equation for the density ρ of a component with velocity v and source term Γ:
$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho\, \mathbf{v}) = \Gamma .$$
Integrating the equation over a fixed volume V of surface S, with outward unit normal n, gives:
$$\frac{\mathrm{d}}{\mathrm{d}t} \int_V \rho \,\mathrm{d}V = -\oint_S \rho\, \mathbf{v} \cdot \mathbf{n} \,\mathrm{d}S + \int_V \Gamma \,\mathrm{d}V .$$
This integrated equation can then be approximated by the dynamical system that we write. Thus, while the dynamical system does not give any information about the local structure of the system, it may be indicative of its global outcome.
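The statement that the global outcome depends only on the sources and the boundary fluxes can be checked numerically. The following is a minimal sketch under stated assumptions, not the code used in the manuscript: a one-dimensional density advected at a constant positive velocity with an upwind finite-volume scheme, a logistic source term chosen purely for illustration, and hypothetical parameter values (`rho_in`, `growth`, grid size). It compares the measured rate of change of the total mass with the sum of inflow, outflow and integrated source.

```python
import numpy as np

def upwind_step(rho, v, source, rho_in, dx, dt):
    """Advance d(rho)/dt + d(rho*v)/dx = source by one explicit step (v > 0)."""
    # Flux through each cell face; face 0 is the inflow boundary.
    flux = v * np.concatenate(([rho_in], rho))
    return rho + dt * (source - (flux[1:] - flux[:-1]) / dx)

def global_rate(rho, v, source, rho_in, dx):
    """Inflow minus outflow plus integrated source (the global, ODE-like view)."""
    return v * rho_in - v * rho[-1] + source.sum() * dx

def max_balance_error(n=200, length=1.0, v=0.5, rho_in=0.2, growth=0.3, steps=400):
    """Largest mismatch between the measured and predicted rate of change of total mass."""
    dx = length / n
    dt = 0.4 * dx / v                       # respects the CFL stability condition
    x = (np.arange(n) + 0.5) * dx           # cell centers
    rho = np.exp(-((x - 0.5) / 0.1) ** 2)   # illustrative initial bump
    worst = 0.0
    for _ in range(steps):
        source = growth * rho * (1.0 - rho)  # illustrative logistic source
        predicted = global_rate(rho, v, source, rho_in, dx)
        new_rho = upwind_step(rho, v, source, rho_in, dx, dt)
        measured = (new_rho.sum() - rho.sum()) * dx / dt
        worst = max(worst, abs(measured - predicted))
        rho = new_rho
    return worst

if __name__ == "__main__":
    print("max mismatch in global balance:", max_balance_error())
```

In this discretization the interior fluxes cancel pairwise, so the mismatch should stay at the level of floating-point rounding whatever the local profile looks like. This is the discrete counterpart of the integrated equation.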
(4) In page 15, the authors identify that α<sub>NA</sub> is proportional to δ𝝐<sup>4</sup>. However, in equation (7), they replace α<sub>NA</sub> by δ𝝐<sup>4</sup> without the proportionality constant. This should be corrected.
Thank you for your remark. This typo is now corrected.
(5) The tumor cell movement should be much slower than the T-cells. Here, the authors assign a similar friction coefficient for the cancer cells and T-cells, for example. However, in lung cancer tumor cells are epithelial, and adhere to each other in the tissue. Their movement is very restricted by the basement membranes and by cell-cell adhesion. Immune cells and T-cells on the other hand move rapidly throughout the stroma. It is a gross simplification to not consider the low epithelial tissue mobility in the context of lung cancer.
It is possible to assume different friction coefficients for each phase pair. This has been done in a previous publication, Ackermann et al., Physics Reports 2021. It is also possible to adjust the cell-cell adhesion in the energy density and the diffusion coefficient introduced in the Flory-Huggins free energy. Cell-cell adhesion is taken into account in the energy, and this makes the tumor a denser phase, while T-cells can move towards the cancer cells to which they are attracted. In the last part of the paper, we show the role of an anisotropic friction due to a nematic order for activated fibroblasts and all the other cells.
(6) What is the biological mechanism by which the T-cells form a colony with a surface tension? In the phase-field model, the authors have a surface tension assigned to the cancer cells, T-cells and fibroblasts. Can the authors justify biologically why do they consider these surface tensions?
The fact that T-cells form a colony is due to the accumulation of T-cells at the outer boundary of the tumor, as they are attracted to it but cannot penetrate due to the strong cell-cell adhesion of the tumor cells in the nest. Adding a gradient-square term is standard in continuous models: it limits sharp variations in cell density, which are not physical.
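For readers less familiar with this type of term, a generic form can be written as follows. This is a standard Cahn-Hilliard-type expression given here as an illustration; the symbols φ (a mass fraction), f (a bulk free-energy density) and κ (a positive coefficient) are notational assumptions and not necessarily those of the manuscript.

```latex
\mathcal{F}[\phi] = \int_V \left[ f(\phi) + \frac{\kappa}{2}\,\lvert \nabla \phi \rvert^{2} \right] \mathrm{d}V,
\qquad \kappa > 0 .
```

The second term penalizes steep gradients of φ, so that the boundary between two phases has a finite width. In such models the energetic cost of this interface plays the role of an effective surface tension.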
Minor issues
(a) Page 6 (end), characterisation of the fibre barrier produced by CAFs missing: what is the fibre density, how it can hinder the spread of cancer and T-cell motility? Is it so dense that it prevents ameboid movement? Can cells move through it using matrix degradation proteins?
The fiber density corresponds to the fibrous organic extracellular matrix secreted by cancer-associated fibroblasts. In desmoplastic (highly fibrous) tumors such as PDAC or NSCLC, this extracellular matrix deposited around the tumor forms a physical barrier around the tumor nest, preventing both cell migration and the penetration of capillaries and immune cells. In these cases, the fibrous belt actually prevents ameboid movement and cells must deform significantly to migrate. The role of this barrier was particularly demonstrated in the reference (Grout, John A., et al. “Spatial positioning and matrix programs of cancer-associated fibroblasts promote T-cell exclusion in human lung tumors.” Cancer Discovery 12.11 (2022): 2606-2625.). In later stages of cancer, the tumor may adapt and develop strategies to metastasize, such as matrix degradation. This matrix can be oriented, organized or disordered. To build a minimal model, we first considered an isotropic friction and also an anisotropic friction of the nematic belt, due to the activated fibroblasts. In the case of T-cells, as mentioned in section I.1, it is true that the biological literature also considers a phenotypic transformation of the T cells by the activated fibroblasts: this concerns their proliferative capacities, antigen recognition and cytotoxic function. To better document the different mechanisms, we add the following publication: Cancer associated fibroblasts-an impediment to effective anti-cancer T cell immunity, by Koppensteiner, Lilian and Mathieson, Layla and O’Connor, Richard A and Akram, Ahsan R, Frontiers in immunology (2022).
However, our goal is to build a minimal model and to characterize and quantify the physical process in which CAFs are involved, namely their role as a physical barrier, which has been documented as discussed above.
(b) Page 19 (Fig 3), in the figure legend it is written ”resting fibroblasts”, should be ”non-activated fibroblasts”.
The referee is right: it will be better to write non-activated fibroblasts. This is now changed in the main text.
(c) Page 21 (equation), what is dΩ? It is dr?
We thank the referee for raising this point. The text was indeed ambiguous as sometimes dΩ was replaced by dr. To be clearer, all volume elements are now denoted dV, and surface elements of the system are denoted dS.
In the article the units are in italic and should be in roman.
Thank you for raising this point. It has been corrected.
(d) Page 25 (beginning section III.3), the authors mention that the simulation is 2D, however, the simulation has radial symmetry. A 1D simulation in radial coordinates could simulate a 3D spherical system. Is the simulation of this section equivalent to a 1D radial simulation (in 2D)?
The referee is right that in radial symmetry, a 1d equation may be written. We therefore present numerics with irregular shapes of the tumor nest in order to make the system fully 2d.
(e) Page 26 (Fig 4). Legends inside the plots of plates A, B, C and D are not clear. Colorbar range of plates A and D is different. Would facilitate if the ranges were the same.
The referee is right: the surface plots presented in figure 4 would be easier to compare with the same colorbar range for the legends. In fact, as the referee noted, figures in A, B and C have the same legends, while figure in D has a different one. This is because D represents the case of the immune-inflamed tumor, where the cancer mass fraction nearly vanishes, resulting in values 3 orders of magnitude lower than those in A, B and C. They would therefore disappear if the colorbar range were equal to the others. In the new version, we emphasize the change of scale in the legend of Figure 4.
(f) Page 29 (Fig 5), would facilitate if the order of immune-desert, immune-excluded, immune-inflamed was maintained throughout the document. In this figure the immune-inflamed case appears first.
We agree with the reviewer that following the same order in which the different cases are presented throughout the manuscript would be helpful in comparing the different figures. Therefore, we have modified Figure 5.
(g) Page 31, the authors indicate that pharmacodynamics and pharmacokinetics are highly dependent on tumour spatial structure. Can they provide examples and citations?
In the discussion, we have added references concerning pharmacodynamics.
(h) Page 33 (Fig Sup2), would facilitate if the order of immune-desert, immune-excluded, immune-inflamed was maintained throughout the document.
We thank the reviewer for pointing this out, the order of the different scenarios in Fig Sup 2 has now been changed.
Reviewer #2 (Recommendations for the authors):
Major points
(1) Following on from the discussion in the public review, I feel that there are a number of critical issues that need to be addressed regarding modeling assumptions. I would like to understand why the authors believe it is possible to use a free energy-driven model of the microenvironment when many of the processes relevant for their study have an undeniably ”active media” flavor.
The referee is right that processes in biology are active processes. However, it is a classical approach to model physical interactions between biological components with a free energy, especially cell adhesion, as they often lead to quasi-stationary equilibrium-like patterns. The free-energy approach also has the advantage of allowing a straightforward derivation of complex phenomena involving many components. Activity can indeed be introduced in such a framework, if we know that the fibroblasts transform into myo-fibroblasts, see for example our previous publication Ackermann and Ben Amar, EPJP 2023. However, in the interest of simplification and reduction of the number of free parameters, we have not considered further complications of the model here, as a minimal model makes it possible to distinguish the main processes that occur. Nevertheless, introducing activity more precisely, within the nematic approach already used for the friction, is a natural continuation of our work: see the new Section III-4, where we introduce the nematic order and indicate that active nematic stresses can be written from it.
Next, I don’t understand the assumption that T cells do not proliferate once they detect neoantigens on the cancer cells; activation of T cells usually causes them to become more proliferative.
We thank the referee for this question. The T-cell fraction has two origins: proliferation of T-cells in situ, in the stroma or inside the tumor nest, or external arrival from the sources, which we favor. We recognize that a full analysis of the tumor microenvironment would require considering proliferation near the tumor, as well as many other processes, which is doable but requires more biological data. In addition, the proliferation of T-cells would be equivalent to increasing the killing ability of T-cells, and these two effects overlap in our approach.
In order to clarify this point, we modify the following sentence in Section II.2:
“Although proliferation of cytotoxic T-cells has been observed, we do not consider explicitly proliferation in our study as we focus on their ability to infiltrate the tumor.”
Rather, we consider that T-cells proliferate outside the domain boundaries, so that this proliferation is included in the boundary source contributions.
Finally, the issue of whether the density of fibers is sufficient to understand the role of fibroblasts is not at all settled. There should be a full discussion of this issue including mentioning of the Nature paper (cited in the public review) that argues that orientation (and not density) is the key to the role of fibers, as well as the earlier cited work of Kalluri and collaborators on the role of ECM density in pancreatic cancer.
We thank the referee for this remark. As we wrote above in the response to the public review, we introduced significant additions that aim to tackle this question in the article.
(2) The authors present a picture of a tumor cell with fibroblasts apparently arrayed circumferentially around the tumor boundary and therefore blocking infiltration. This type of tumor structure has been seen before, for example in ”On the mechanism of long-range orientational order of fibroblasts.” Proceedings of the National Academy of Sciences 114, no. 34 (2017): 8974-8979, which should be cited. More importantly, in that paper the argument is made that positive feedback between fibroblasts and ECM geometry can cause structures like this to form. If this is indeed what is occurring, this would indicate the crucial importance of a mechanism beyond what is contained in the current model. This issue should therefore be discussed within this paper. This issue is of course connected to the previous point regarding the role of ECM structure beyond density.
We completely agree that the interplay between the fibroblast layer and the tumor shapes the tumor boundary. One of the authors has worked recently on this precise topic (Aging and freezing of active nematic dynamics of cancer-associated fibroblasts by fibronectin matrix remodeling, C Jacques, J Ackermann, S Bell, C Hallopeau, CP Gonzalez, ... bioRxiv, 2023.11.22.568216; Ordering, spontaneous flows and aging in active fluids depositing tracks, S Bell, J Ackermann, A Maitra, R Voituriez, arXiv preprint arXiv:2409.05195). Since the fibroblast layer is an active material, it contributes an anisotropic stress that can be introduced into the model. Our first strategy was to present the simplest model in order to focus on the most important interactions, such as cell-cell adhesion and cell-tissue adhesion. However, we recognize that these questions should be discussed in the text, and we discuss them in the new section III-4.
Minor points
There are also a number of more minor points to consider:
(1) Since the parameter is taken to be O(1), why exactly does it matter how the other parameters scale with it?
It is very important to compare the orders of magnitude of the other parameters, given that the selected parameter of order O(1) is really the driving parameter of the coupling. This gives a first picture of the main interactions that have to be considered.
(2) I didn’t understand the relevance of referring specifically to IL 6 among many other possibly relevant signals, as is currently done on page 7.
This corresponds to studies aiming to correlate lung cancer risks and the concentration of interleukins, mostly IL6 and IL8 (McKeown, D. J., et al. “The relationship between circulating concentrations of C-reactive protein, inflammatory cytokines and cytokine receptors in patients with non-small-cell lung cancer.” British journal of cancer 91.12 (2004): 1993-1995; Brenner, Darren R., et al. “Inflammatory cytokines and lung cancer risk in 3 prospective studies.” American journal of epidemiology 185.2 (2017): 86-95.), but in the absence of very detailed biological information, the modeling and its results are not modified if other chemicals intervene. We slightly modified the following phrase in section I.1:
“In particular, in the family of inflammatory proteins, also called cytokines, Interleukin-6 (IL6) and (IL8) seem, among others, to stimulate the infiltration of CD8<sup>+</sup>.”
(3) The authors need to mention the possibility of T-cell chemotaxis to the tumor being “self-amplified” in the T cell system, as put forth in Galeano Niño, Jorge Luis, Sophie V. Pageon, Szun S. Tay, Feyza Colakoglu, Daryan Kempe, Jack Hywood, Jessica K. Mazalo et al. “Cytotoxic T cells swarm by homotypic chemokine signalling.” eLife 9 (2020): e56554. This might again reveal a needed extension of the current modelling strategy.
We thank the referee for his/her comment on the self-amplification of the T-cell population in the stroma, and we mention the indicated reference in our paper. This auto-chemotactic process, which induces a more efficient recruitment dynamic towards the tumor, may be important for immunotherapy. More efficient arrival of T-cells at the tumor site should lead to a better outcome for the patient, if the swarming organization is maintained in a desmoplastic nematic stroma.
(4) It is not obvious to me that in sub figures 3F and 3H the tumor is enroute to being totally eradicated, as is stated in the text. The blue lines seemed to asymptote at non-zero population values.
Looking at sub-figures 3F and 3H, we stated in the main text that the tumor is eradicated because the representative population approaches a fraction of 0, or at least decays to values close to 0 (0.01/0.05 to be more precise). This is even more evident when compared with the other cases, where the tumor mass fraction reaches values of a higher order (up to 0.6), which leads us to distinguish between these different scenarios.
(5) The description of the interaction of cells with fibers as being increased friction might be misleading, as the real effect could be actual trapping in the network (as opposed to just slowing down the motion).
We thank the referee for this question as it allows us to make an important distinction. Indeed, what the referee describes seems to correspond to a discrete event, namely a cell trapped in a network. However, coarse-graining the dynamics to a continuous model leads, in our view, to an effective friction between the two phases. Moreover, we have now also introduced an anisotropic friction which can represent trapping. The velocities are not only directed around the tumor but can also be oriented towards the tumor, so that eventually the friction along the radius mimics trapping (see Fig.4 on top). We have introduced this anisotropic friction via a nematic model, see the appendix.
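To make the idea of an anisotropic friction concrete, here is a minimal sketch that should be read as an illustration rather than as the model of the appendix. It assumes a uniaxial friction tensor built from a unit director n, with a coefficient `xi_par` for motion along the fibers and `xi_perp` for motion across them, and a force balance of the form friction tensor times velocity equals driving force. All names and numerical values are hypothetical.

```python
import numpy as np

def friction_tensor(director, xi_par, xi_perp):
    """Uniaxial friction tensor: xi_par along the director, xi_perp across it."""
    n = np.asarray(director, dtype=float)
    n = n / np.linalg.norm(n)               # make sure the director has unit length
    nn = np.outer(n, n)                     # projector onto the fiber direction
    return xi_par * nn + xi_perp * (np.eye(n.size) - nn)

def velocity(force, director, xi_par, xi_perp):
    """Solve the overdamped force balance xi . v = force for the velocity v."""
    xi = friction_tensor(director, xi_par, xi_perp)
    return np.linalg.solve(xi, np.asarray(force, dtype=float))

if __name__ == "__main__":
    fibers = [0.0, 1.0]                     # fibers aligned with the y axis
    # Same force magnitude, applied across and then along the fibers.
    print(velocity([1.0, 0.0], fibers, xi_par=1.0, xi_perp=10.0))
    print(velocity([0.0, 1.0], fibers, xi_par=1.0, xi_perp=10.0))
```

With a large ratio between the two coefficients, motion in the high-friction direction is strongly suppressed while motion in the other direction remains easy. In a continuous description, this is how an effective friction can mimic trapping along one direction.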
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
In organisms with open mitosis, nuclear envelope breakdown at mitotic entry and re‐assembly of the nuclear envelope at the end of mitosis are important, highly regulated processes. One key regulator of nuclear envelope re‐assembly is the BAF (Barrier‐to‐Autointegration) protein, which contributes to cross‐linking of chromosomes to the nuclear envelope. Crucially, BAF has to be in a dephosphorylated form to carry out this function, and PP2A has been shown to be the phosphatase that dephosphorylates BAF. The Ankle2/LEM4 protein has previously been identified as an important regulator of PP2A in the dephosphorylation of BAF but its precise function is not fully understood, and Li and colleagues set out to investigate the function of Ankle2/LEM4 in both Drosophila flies and Drosophila cell lines.
Strengths:
The authors use a combination of biochemical and imaging techniques to understand the biology of Ankle2/LEM4. On the whole, the experiments are well conducted and the results look convincing. A particular strength of this manuscript is that the authors are able to study both cellular phenotypes and organismal effects of their mutants by studying both Drosophila D‐mel cells and whole flies.
The work presented in this manuscript significantly enhances our understanding of how Ankle2/LEM4 supports BAF dephosphorylation at the end of mitosis. Particularly interesting is the finding that Ankle2/LEM4 appears to be a bona fide PP2A regulatory protein in Drosophila, as well as the localisation of Ankle2/LEM4 and how this is influenced by the interaction between Ankle2 and the ER protein Vap33. It would be interesting to see, though, whether these insights are conserved in mammalian cells, e.g. does mammalian Vap33 also interact with LEM4? Is LEM4 also a part of the PP2A holoenzyme complex in mammalian cells?
We feel that conducting experiments to test the level of conservation of our findings in mammalian cells is outside the scope of our study, and we will leave it for other labs to investigate.
Weaknesses:
This work is certainly impactful but more discussion and comparison of the Drosophila versus mammalian cell system would be helpful. Also, to attract the largest possible readership, the Ankle2 protein should be referred to as Ankle2/LEM4 throughout the paper to make it clear that this is the same molecule.
We have reinforced our presentation and discussion of similarities and differences between Ankle2 from Drosophila vs humans where relevant throughout the Introduction and Discussion sections. Additionally, we have added the mention that Ankle2 is also called LEM4 in humans in the Abstract and Introduction. However, when referring to Drosophila Ankle2, we do not use LEM4 because it is not listed as an alternate name for this gene/protein in FlyBase.
A schematic model at the end of the final figure would be very useful to summarise the findings.
We have already provided a schematic model in Figure S3, where we think it is better placed.
Reviewer #2 (Public review):
The authors first identify Ankle2 as a regulatory subunit and direct interactor of PP2A, showing they interact both in vitro and in vivo to promote BAF dephosphorylation. The Ankyrin domain of Ankle2 is important for the interaction with PP2A. They then show Ankle2 also interacts with the ER protein Vap33 through FFAT motifs and they particularly co‐localize during mitosis. The recruitment of Ankle2 to Vap33 is essential to ER and nuclear envelop membrane in telophase while earlier in mitosis, it relies on the C terminus but not the FFAT motifs for recruitments to the nuclear membrane and spindle envelop in early mitosis. The molecular determinants and receptors are currently not known. The authors check the function of the PP2A recruitment to Ankle2/Vap33 in the context of embryos and show this recruitment pathway is functionally important. While the Ankle2/Vap33 interaction is dispensable in adult flies ‐looking at wing development, the PP2A/Ankle2 interaction is essential for correct wing and fly development. Overall, this is a very complete paper that reveals the molecular mechanism of PP2A recruitment to Ankle2 and studies both the cellular and the physiological effect of this interaction in the context of fly development.
Strengths:
The paper is well written and the narrative is well‐developed. The figures are of high quality, well‐controlled, clearly labelled, and easy to understand. They support the claims made by the authors.
Weaknesses:
The study would benefit from being discussed in the context of what is already known on Ankle2 biology in C.elegans and human cells. It is important to highlight the structures shown in the paper are alphafold models, rather than validated structures.
We have enhanced our presentation of what is known about LEM‐4L/Ankle2 in C. elegans and humans in the Introduction, and further developed comparisons of our findings regarding Drosophila Ankle2 with these orthologs in the Results and Discussion sections. We have also specified in all sections and figure legends that the structures shown are AlphaFold3 models.
Reviewer #3 (Public review):
Summary:
The authors were interested in how Ankle2 regulates nuclear envelope reformation after cell division. Other published manuscripts, including those from the authors, show without a doubt that Ankle2 plays a role in this critical process. However, the mechanism by which Ankle2 functions was unclear. Previous work using worms and humans (Asencio et al., 2012) established that human ANKLE2 could bind endogenous PP2A subunits. The binding was direct and was mediated through a region before and including the first ankyrin repeat in human ANKLE2. In addition to its interaction with PP2A, Asencio et al., 2012 also show that ANKLE2 regulates VRK1 kinase activity. Together PP2A and VRK1 regulate BAF phosphorylation for proper nuclear envelope reformation. Here, the authors provide more evidence for interaction with PP2A by also mapping the domain of interaction to the ankyrin repeat in Drosophila. In addition, the ankyrin repeat is essential for nuclear envelope reformation after division. They show that Ankle2 can bind in a PP2A complex without other known regulatory subunits of PP2A. The authors also identify a novel interaction with ER protein Vap33, but functional relevance for this interaction in nuclear envelope reformation is not provided in the manuscript, which the authors explicitly state. This manuscript does not comment on the activity of Ballchen/VRK1 in relation to Ankle2 loss and BAF phosphorylation or nuclear envelope reformation, even though links were previously shown by multiple studies (Asencio et al., Link et al., Apridita Sebastian et al.,). Nuclear envelope defects were rescued by the reduction of VRK1 in two of these manuscripts. It is possible that BAF phosphorylation phenotypes can be contributed by both PP2A inactivity and VRK1 overactivity due to the loss of Ankle2.
Strengths:
This manuscript is a useful finding linking Ankle2 function during nuclear envelope reformation to the PP2A complex. The authors present solid data showing that Ankle2 can form a complex with PP2A‐29B and Mts and generate a phosphoproteomic resource that is fundamentally important to understanding Ankle2 biology.
Weaknesses:
However, the main findings/conclusions about subcellular localization might be incomplete since they are drawn from overexpression experiments. In addition, throughout the text, some conclusions are overstated or are not supported by data.
It is true that all experiments studying subcellular localization were done with tagged proteins overexpressed in flies and cell culture. Nevertheless, we show that Ankle2‐GFP is functional since it rescues phenotypes resulting from the loss of endogenous Ankle2 in both flies and cultured cells. The antibodies we generated against Ankle2 were unable to reliably detect the endogenous protein by immunofluorescence. We have now stated this caveat in our manuscript. Regarding the validity of our conclusions in relation to our data, we address each point raised by the reviewer under the Recommendations for the authors. In some cases, we have adjusted our conclusions and in other cases, we have provided additional clarification or justification.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
There are a few experimental issues that should be addressed; specific comments are listed below:
(1) Figure 1F: In this experiment, the authors immunoprecipitate GFP‐PP2A‐29B or PP2A‐29B‐GFP and Western blot for Ankle2 and Mts to demonstrate that both are co‐immunoprecipitated. To demonstrate that these interactions are specific, the authors should also blot for a protein that is definitely NOT expected to co‐immunoprecipitate with PP2A‐29B, e.g. tubulin.
Our conclusion that GFP‐PP2A‐29B and PP2A‐29B‐GFP specifically interact with Ankle2 and Mts is also based on mass spectrometry analysis of the purification products from embryos and cells in culture, comparing with products of purification of GFP alone (Fig 1E‐F, S1C‐D and Tables S2, S3). The lists of identified proteins reveal that most proteins (including tubulins) are not enriched with GFP‐PP2A‐29B or PP2A‐29B‐GFP like Ankle2 and Mts are.
(2) Figure 2A: The colour coding of the dots is not explained in the figure legend.
We have now added the explanation.
(3) Figure 2B: The competition experiment is a good idea. Do the authors get the same results when they conduct the experiment the other way round, i.e. keep the concentration of Tws the same but increase the concentration of Ankle2?
We have tried this reverse experiment but saw little effect. The failure to observe displacement of Tws by Ankle2 in this context could be due to a higher affinity of Tws than Ankle2 in the PP2A complex, or to lower expression levels achieved for Ankle2 (a larger protein) relative to Tws.
(4) Figure 5D: The hyperphosphorylation of BAF is very difficult to see, and it is impossible to tell whether the hyperphosphorylation has been rescued or not by the different Ankle2 constructs. Can the phosphorylated and the hyperphosphorylated bands be separated better? This panel needs significant improvements to support the claims in the text.
In our opinion, the hyperphosphorylated (upper band) and unphosphorylated (lower band) forms of BAF are well resolved and readily distinguishable. The fainter band in the middle could correspond to a partially phosphorylated form of BAF but we do not venture to speculate on its precise identity nor do we need it to draw our conclusions. The important information from this blot is that the level of unphosphorylated BAF after Ankle2 RNAi increases when Ankle2WT‐GFP and Ankle2Fm+FL1‐GFP are expressed but not when Flag‐GFP or Ankle2ANK‐GFP are expressed. In these experiments, the rescue of unphosphorylated BAF is incomplete because not all cells express the GFP‐tagged protein in our non‐clonal stable cell lines.
Reviewer #2 (Recommendations for the authors):
(1) The AlphaFold models need to be labelled more clearly as such on the figures, to distinguish them from X‐ray crystallography structures. AlphaFold will always propose a solution, but it is not necessarily correct.
We have added the note “MODEL” directly in Figures 2C, 2D, 4F and S3B, in addition to the information already provided in the text and figure legends specifying that these are models generated by AlphaFold3.
(2) Figure 4F: Annotate the Ankle2 FL1 peptide.
We have indicated the amino acid residues in the figure.
(3) Problems with the statistical tests. T‐tests cannot be used for comparing multiple groups, as this favors error propagation.
All of our t‐tests compare only two groups at a time, as indicated. In this regard, our labeling in Fig 5C may have been misleading. We have now changed it.
(4) Close‐ups of ring canal in Figure S2. In Figure S2, there seem to be lots of GFP‐Ankle2 vesicles in the cytoplasm of the oocyte.
We agree that the image showing Ankle2‐GFP alone in the RNAi Vap33 condition suggested a cytoplasmic granular localization of unknown nature. However, upon examination, we realized that this image did not correspond to the same z‐step as the matching merged image (which also included DNA staining). We have now replaced the image with the correct one.
Reviewer #3 (Recommendations for the authors):
Be more accurate about what conclusions can be made from reported data, particularly from overexpression and deletion studies.
(1) The domain analysis for physical interaction is quite thorough. However, localization information is taken from overexpressed constructs. While these data show what could happen, the authors are not using endogenous levels of Ankle2 in cells or tissues that are known to require Ankle2. As a result, it is difficult to determine whether localization results are biologically meaningful.
We have added the following text at the end of the third Results section:
“We were unable to examine the localization of endogenous Ankle2 because the antibodies that we generated gave inconclusive results in immunofluorescence. For the remainder of our study, we relied on the overexpression of Ankle2‐GFP, which may not perfectly reflect the localization and function of endogenous Ankle2. However, Ankle2‐GFP is functional as it can rescue phenotypes observed when endogenous Ankle2 is depleted (see below).”
(2) The data showing that Ankle2 is a regulatory subunit of the PP2A complex also rely on in vitro binding assays in an over‐expression context. The data certainly show that Ankle2 can bind proteins in the PP2A complex when overexpressed. However, the authors could not isolate enough of the complex from the animal to test function, so Ankle2 acting as a regulatory subunit isn't functionally shown. There are other possibilities, such as Ankle2 acting as a scaffold for complex assembly.
The competition experiments shown in Fig 2 are based on complexes assembling in cells and are not in vitro binding assays. We show 4 lines of evidence supporting the idea that Ankle2 functions as a regulatory subunit of PP2A: 1) Ankle2 interacts with the structural (PP2A‐29B) and catalytic (Mts) subunits of PP2A without any known regulatory subunit of PP2A. 2) Depletion of Ankle2 leads to the hyperphosphorylation of the known PP2A substrate BAF. 3) The PP2A regulatory subunit Tws/B55 competes with Ankle2 for formation of a complex with PP2A. 4) AlphaFold3 predicts that Ankle2 engages in a complex with PP2A at a position similar to that of known regulatory subunits of PP2A including Tws/B55, and consistent with their mutually exclusive presence in PP2A complexes. If Ankle2 acted as a scaffold for the formation of a PP2A complex containing other regulatory subunits, we would expect to detect Ankle2 and another regulatory subunit in the same complex.
(3) Throughout the text, some conclusions are overstated or are not supported by data. Examples are below:
a. Page 1: "we show for the first time that Ankle2 is a regulatory subunit of PP2A" The authors show binding and changes in BAF phosphorylation levels, but changes in PP2A activity with modulation of Ankle2 weren't shown.
We have replaced this phrase with this one:
“…we provide several lines of evidence that suggest that Ankle2 is a regulatory subunit of PP2A…”
b. Page 3: "The requirement for Ankle2 in the development of the central nervous system was initially discovered through its targeting by the microcephaly‐causing Zika virus (Shah et al., 2018)."
This is not the first paper showing ANKLE2 plays a role in the development of the CNS. Yamamoto et al., 2014 identified mutants in Ankle2 with defects in CNS development in flies and humans, establishing it as a human microcephaly‐causing gene.
We are sorry for this oversight. We have now cited this important work.
c. Page 6: "Moreover, BAF appears to be the only obligatory substrate of Ankle2‐dependent dephosphorylation for cell proliferation as lowering the dose of the BAF kinase NHK‐1/Ballchen rescues wing development defects caused by the partial depletion of Ankle2 (Li et al., 2024)." It is unclear why the authors conclude this since Ballchen/VRK1 can phosphorylate many things besides BAF.
Although the conclusion cannot be drawn categorically, it seems to be by far the most likely scenario. We agree that, in principle, other mechanisms could also account for these genetic observations, such as the dephosphorylation of another, still unidentified obligatory substrate of PP2A‐Ankle2 that would also be phosphorylated by NHK‐1/Ballchen. However, we have also shown that expression of an unphosphorylatable mutant form of BAF rescues phenotypes observed upon loss of Ankle2 function (Li et al., 2024). We have changed our sentence as follows:
"Moreover, BAF could be the only obligatory substrate of Ankle2‐dependent dephosphorylation for cell proliferation as lowering the dose of the BAF kinase NHK‐1/Ballchen or expression of an unphosphorylatable mutant form of BAF rescues wing development defects caused by the partial depletion of Ankle2 (Li et al., 2024).”
d. Page 10: "These results suggest that a Vap33‐Ankle2‐PP2A complex can mediate the recruitment of a pool of PP2A at the NE."
There is insufficient evidence to indicate that Vap33‐Ankle2‐PP2A exists in a stable state in the cell and that this complex mediates recruitment of PP2A at the NE. The images do not include Vap33, showing no evidence it is present when PP2A is at the NE and the complex could only be detected with overexpression.
We agree with this caveat and recognize the need to be cautious when proposing our model. In this regard, we feel that our wording is reasonable and appropriate, using “suggest” rather than “prove”, “show” or “indicate”.
e. Page 11: "These results suggest that the interaction of Ankle2 with PP2A is essential for its function in BAF dephosphorylation and nuclear reassembly." Page 14: "these results indicate that the interaction of Ankle2 with PP2A is essential during embryo". Page 14: "These results indicate that the interaction of Ankle2 with PP2A but not with Vap33 is essential for its function during cell proliferation in imaginal wing disc development."
These experiments show that the ankyrin repeat in Ankle2 is necessary for these processes. It does not say PP2A interaction with Ankle2 is necessary because other things could bind the domain.
We have revised the segments of the text mentioned, taking the reviewer’s legitimate concerns into consideration. We have also added the following sentence to the Discussion:
“However, it remains formally possible that the deletion of Ankyrin repeats used to disrupt the Ankle2‐PP2A interaction abrogated another, unknown aspect of Ankle2 function.”
f. Page 12: "Overall, we conclude that in addition to its N‐terminal PP2A‐interacting Ankyrin domain, Ankle2 requires the integrity of its C‐terminal portion for its essential function in nuclear reassembly."
No data was shown for differences in nuclear reassembly, only the ability for ANKLE2 truncation mutants to localize to the nuclear envelope. It isn't clear whether the nuclear envelope reformation is normal in Figure S6 which the authors refer to. Lamin staining could help determine and conclude the C‐terminal region is important for nuclear envelope reformation.
Our conclusion is drawn from the results shown in Figures S4 and S5 (described in the same section), where a rescue assay in cells was performed to assess the functionality of different variants of Ankle2‐GFP when endogenous Ankle2 was depleted. In this assay, Lamin and DNA staining were used to examine nuclear reassembly (as in Figure 5). Figure S6 shows the localizations of the different variants of Ankle2‐GFP, but endogenous Ankle2 is not depleted in these cells.
g. Page 13: "We conclude that the ability of Ankle2 to interact with PP2A is required for the timely recruitment of BAF at reassembling nuclei and ensuing NE reassembly."
It's possible the Ankyrin domain in ANKLE2 is interacting with proteins other than PP2A to recruit BAF at reassembling nuclei, especially since ANKLE2 is found to regulate VRK1 (Link 2019) which has been found to phosphorylate BAF during the cell cycle (Molitor 2014). Additionally, the images in Figure 6A appear to show fully reassembled nuclear envelopes in all mutants by 180s.
This point relates to point e, raised above by this reviewer. We have re‐written the sentence as follows:
“We conclude that the Ankyrin domain, required for the ability of Ankle2 to interact with PP2A, is necessary for the timely recruitment of BAF at reassembling nuclei and ensuing NE reassembly.”
Please note that in this paragraph, we discuss a delay in RFP‐BAF recruitment, rather than the complete elimination of this recruitment.
h. Page 16: "Our unbiased phosphoproteomic analysis confirmed that BAF dephosphorylation depends on Ankle2, despite the absence of a detectable interaction between Drosophila Ankle2 and BAF, which may be due to the lack of a LEM domain in the former (Fishburn et al., 2024). Moreover, while Ankle2 was shown to bind and inhibit the BAF counteracting kinase VRK1 in humans (Asencio et al., 2012), we detected no interaction between Ankle2 and NHK‐1/Ballchen (VRK1 ortholog) in Drosophila. This suggests that the loss of Ankle2 causes BAF hyperphosphorylation by preventing PP2A‐dependent dephosphorylation rather than by preventing inhibition of NHK‐1"
There could be transient binding between Ankle2 and Ballchen/VRK1/NHK‐1 or activity can be indirect, but that doesn't mean there is not a contribution of BAF phosphorylation by Ballchen/VRK1/NHK‐1. Genetic evidence from three model systems, including Drosophila, indicates there is a strong genetic interaction between Ankle2 and Ballchen/VRK1/NHK‐1 that includes rescue of lethality.
We agree and we have re‐written in this way:
“While a putative interaction between Ankle2 and NHK‐1 in Drosophila could occur transiently, thereby escaping detection, the simplest interpretation of our results is that the loss of Ankle2 causes BAF hyperphosphorylation by preventing PP2A‐dependent dephosphorylation rather than by preventing inhibition of NHK‐1.”
We do not question the fact that Ballchen/VRK1/NHK‐1 phosphorylates BAF and genetically interacts with Ankle2. The antagonistic relationship between Ballchen/VRK1/NHK‐1 and Ankle2 observed genetically can be explained by the fact that the kinase phosphorylates BAF while PP2A‐Ankle2 dephosphorylates it, without the need to invoke an additional inhibition of the kinase by Ankle2.
-
www.biorxiv.org
-
Author response:
The following is the authors’ response to the current reviews.
Public Reviews:
Reviewer #1 (Public review):
The hypothesis is based on the idea that inversions capture genetic variants that have antagonistic effects on male sexual success (via some display traits) and survival of females (or both sexes) until reproduction. Furthermore, a sufficiently skewed distribution of male sexual success will tend to generate synergistic epistasis for male fitness even if the individual loci contribute to sexually selected traits in an additive way. This should favor inversions that keep these male-beneficial alleles at different loci together in cis-LD. A series of simulations are presented and show that the scenario works at least under some conditions. While a polymorphism at a single locus with large antagonistic effects can be maintained for a certain range of parameters, a second such variant with somewhat smaller effects tends to be lost unless closely linked. It becomes much more likely for genomically distant variants that add to the antagonism to spread if they get trapped in an inversion; the model predicts this should drive accumulation of sexually antagonistic variants on the inversion versus standard haplotype, leading to the evolution of haplotypes with very strong cumulative antagonistic pleiotropic effects. This idea has some analogies with one of the predominant hypotheses for the evolution of sex chromosomes, and the authors discuss these similarities. The model is quite specific, but the basic idea is intuitive and thus should be robust to the details of the model assumptions. It makes perfect sense in the context of the geographic pattern of inversion frequencies. One prediction of the model (notably, that it leads to the evolution of nearly homozygously lethal haplotypes) does not seem to reflect the reality of chromosomal inversions in Drosophila, as the authors carefully discuss, but it is the case for some other "supergenes", notably in ants. So the theoretical part is a strong novel contribution.
We appreciate the detailed and accurate summary of our main theoretical results.
To provide empirical support for this idea, the authors study the dynamics of inversions in population cages over one generation, tracking their frequencies through amplicon sequencing at three time points: parents (young adults), embryos and very old adult offspring of either sex (>2 months from adult emergence). Out of four inversions included in the experiment, two show patterns consistent with antagonistic effects on male sexual success (competitive paternity) and the survival of offspring, especially females, until an old age, which the authors interpret as consistent with their theory.
As I have argued in my comments on previous versions, the experiment only addresses one of the elements of the theoretical hypothesis, namely antagonistic effects of inversions on male reproductive success and other fitness components, in particular of females. Furthermore, the design of this experiment is not ideal from the viewpoint of the biological hypothesis it is aiming to test. This is in part because, rather than testing for the effects of inversion on male reproductive success versus the key fitness components of survival to maturity and female reproductive output, it looks at the effects on male reproductive success versus survival to a rather old age of 2 months. The relevance of survival until old age to fitness under natural conditions is unclear, as the authors now acknowledge. Furthermore, up to 15% of males that may have contributed to the next generation did not survive until genotyping, and thus the difference between these males' inversion frequency and that in their offspring may be confounded by this potential survival-based sampling bias. The experiment does not test for two other key elements of the proposed theory: the assumption of frequency-dependence of selection on male sexual success, and the prediction of synergistic epistasis for male fitness among genetic variants in the inversion. To be fair, particularly testing for synergistic epistasis would be exceedingly difficult, and the authors have now included a discussion of the above caveats and limitations, making their conclusions more tentative. This is good but of course does not make these limitations of the experiment go away. These limitations mean that the paper is stronger as a theoretical than as an empirical contribution.
We discuss the choice to focus on exploring the potential antagonistic effects of the inversion karyotype on male reproductive success and survival in our general response above. Primarily, this prediction seemed to be the most specific to the proposed model as compared to alternative models. Still, further studies are clearly needed to elucidate the potential frequency dependence and genetic architecture of the inversions.
Regarding the choice of age at collection, it is unknown to what degree our selected collection age of 10 weeks correlates with survival in the wild, but we feel confident that there will be some positive correlation.
We now further clarify that across our experiments, a minimum of 5% and a mean of 9% of the males used in the parental generation died before collection. These proportions do not appear sufficient to explain the differences between paternal and embryo inversion frequencies shown in Figure 9.
Reviewer #2 (Public review):
Summary:
In their manuscript, the authors address the question of whether the inversion polymorphism in D. melanogaster can be explained by sexually antagonistic selection. They designed a new simulation tool to perform computer simulations, which confirmed their hypothesis. They also show a tradeoff between male reproduction and survival. Furthermore, some inversions display sex-specific survival.
Strengths:
It is an interesting idea on how chromosomal inversions may be maintained.
Weaknesses:
The authors motivate their study by the observation that inversions are maintained in D. melanogaster and because inversions are more frequent closer to the equator, the authors conclude that it is unlikely that the inversion contributes to adaptation in more stressful environments. Rather the inversion seems to be more common in habitats that are closer to the native environment of ancestral Drosophila populations.
While I do agree with the authors that this observation is interesting, I do not think that it rules out a role in local adaptation. After all, the inversion is common in Africa, so it is perfectly conceivable that the non-inverted chromosome may have acquired a mutation contributing to the novel environment.
Based on their hypothesis, the authors propose an alternative strategy, which could maintain the inversion in a population. They perform some computer simulations, which are in line with the predicted behavior. Finally, the authors perform experiments and interpret the results as empirical evidence for their hypothesis. While the reviewer is not fully convinced about the empirical support, the key problem is that the proposed model does not explain the patterns of clinal variation observed for inversions in D. melanogaster. According to the proposed model, the inversions should have a similar frequency along latitudinal clines. So in essence, the authors develop a complicated theory because they felt that the current models do not explain the patterns of clinal variation, but this model also fails to explain the pattern of clinal variation.
To the contrary – in the Discussion paragraph beginning on Line 671, we explain why we would predict that a tradeoff between survival and reproduction should lead to clinal inversion frequencies. We suggest that a karyotype associated with a survival penalty should be increasingly disadvantageous in more challenging environments (such as high altitudes and latitudes for this species). Furthermore, an advantage in male reproductive competition conferred by that same haplotype may be reduced by the lower population densities that we would expect in more challenging environments (meaning that each female should encounter fewer males). Individually or jointly, these two factors predict that the equilibrium frequency of a balanced inversion frequency polymorphism should depend on a local population’s environmental harshness and population density, with the ensuing prediction that inversion frequency should correlate with certain environmental variables.
Reviewer #3 (Public review):
Summary:
In this study, McAllester and Pool develop a new model to explain the maintenance of balanced inversion polymorphism, based on (sexually) antagonistic alleles and a trade-off between male reproduction and survival (in females or both sexes). Simulations of this model support the plausibility of this mechanism. In addition, the authors use experiments on four naturally occurring inversion polymorphisms in D. melanogaster and find tentative evidence for one aspect of their theoretical model, namely the existence of the above-mentioned trade-off in two out of the four inversions.
Strengths:
(1) The study develops and analyzes a new (Drosophila melanogaster-inspired) model for the maintenance of balanced inversion polymorphism, combining elements of (sexually) antagonistically (pleiotropic) alleles, negative frequency-dependent selection and synergistic epistasis. Simulations of the model suggest that the hypothesized mechanism might be plausible.
(2) The above-mentioned model assumes, as a specific example, a trade-off between male reproductive display and survival; in the second part of their study, the authors perform laboratory experiments on four common D. melanogaster inversions to study whether these polymorphisms may be subject to such a trade-off. The authors observe that two of the four inversions show suggestive evidence that is consistent with a trade-off between male reproduction and survival.
Open issues:
(1) A gap in the current modeling is that, while a diploid situation is being studied, the model does not investigate the effects of varying degrees of dominance. It would thus be important and interesting, as the authors mention, to fill this gap in future work.
(2) It will also be important to further explore and corroborate the potential importance and generality of trade-offs between different fitness components in maintaining inversion polymorphisms in future work.
We appreciate the work put into evaluating, improving, and summarizing our study. We agree that further work studying the effects of dominance and of the fitness components of the inversions is important.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
l. 354: I don't understand what the authors mean by "an antagonistic and non-antagonistic allele". If there is an antagonistic polymorphism at a locus, then both alleles have antagonistic effects; i.e., allele B increases trait 1 and reduces trait 2 relative to allele A and vice versa.
Edited; we agree that the terminology used here was sub-optimal.
Reviewer #2 (Recommendations for the authors):
The motivation for their model is their claim that the clinal inversion frequencies are not compatible with local adaptation. The reviewer doubts this strong statement. Furthermore, the proposed model also fails to explain the inversion frequencies in natural populations.
Hence, rather than building a straw man, it would be better if the authors first show their experiments and then present their model as an explanation for the empirical results. Nevertheless, it is also clear that the empirical data are not very strong and cannot be fully explained by the proposed model.
This claim that we reject any role of local adaptation in clinal variation and selection upon inversion polymorphism does not hold up in a reading of our manuscript. We even suggest that locally varying selective pressures must be playing some role, although that does not imply that local adaptation is the ultimate driver of inversion frequencies. Indeed, we suggest that local adaptation alone is an insufficient explanation for inversion frequency clines in D. melanogaster, because (1) these frequency clines do not approach the alternate fixed genotypes predicted by local directional selection, and (2) these derived inversions tend to be more frequent in more ancestral environments (l.113-158).
In our public review response above, and in the Discussion section of our paper, we explain why our model can predict both the clinal frequencies of many Drosophila inversions and their intermediate maximal frequencies. Of course, we do not predict that most inversions in this species should follow the specific tradeoff investigated here. In fact, we were surprised to find even two inversions that experimentally supported our predicted tradeoff. Still, it remains possible that other inversions in this species are subject to other balanced tradeoffs not investigated here, which could help explain why they rarely reach high local frequencies.
Reviewer #3 (Recommendations for the authors):
My previous comments have been adequately addressed.
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
[…]
To provide empirical support for this idea, the authors study the dynamics of inversions in population cages over one generation, tracking their frequencies through amplicon sequencing at three time points: parents (young adults), embryos and very old adult offspring of either sex (>2 months from adult emergence). Out of four inversions included in the experiment, two show patterns consistent with antagonistic effects on male sexual success (competitive paternity) and the survival of offspring, especially females, until an old age, which the authors interpret as consistent with their theory.
There are several reasons why the support from these data for the proposed theory is not watertight.
(1) As I have already pointed out in my previous review, survival until 2 months (in fact, it is 10 weeks and so 2.3 months) of age is of little direct relevance to fitness, whether under natural conditions or under typical lab conditions.
The authors argue this objection away with two arguments.
First, citing Pool (2015) they claim that the average generation time (i.e. the average age at which flies reproduce) in nature is 24 days. That paper made an estimate of 14.7 generations per year under the North Carolina climate. As also stated in Pool (2015), the conditions in that locality for Drosophila reproduction and development are not suitable during three months of the year. This yields an average generation length of about 19.5 days during the 9 months during which the flies can reproduce. On the highly nutritional food used in the lab and at the optimal temperature of 25 °C, Drosophila need about 11-12 days to develop from egg to adult. Even assuming these perfect conditions, the average age at reproduction (counted from adult eclosion) would be about 8 days. In practice, larval development in nature is likely longer for nutritional and temperature reasons, and thus the genomic data analyzed by Pool imply that the average adult age of reproducing flies in nature would be about 5 days, and not 24 days, let alone 10 weeks. This corresponds neatly to the 2-6 days median life expectancy of Drosophila adults in the field based on capture-recapture (e.g., Rosewell and Shorrocks 1987).
Second, the authors also claim that survival over a period of 2 months is highly relevant because flies have to survive long periods where reproduction is not possible. However, to survive the winter flies enter a reproductive diapause, which involves profound physiological changes that indeed allow them to survive for months, remaining mostly inactive, stress resistant and hidden from predators. Flies in the authors' experiment were not diapausing, given that they were given plentiful food and kept warm. It is still possible that survival to the ripe old age of 10 weeks under these conditions correlates well with surviving diapause under harsh conditions, but if so, the authors should cite relevant data. Even then, I do not think this allows the authors to conclude that longevity is "the main selective pressure" on Drosophila (l. 936).
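The generation-time arithmetic in the reviewer's first argument above can be retraced with a short sketch. It uses only figures quoted in the review (14.7 generations per year, an in-season generation length of about 19.5 days, and 11-12 days of egg-to-adult development); the variable names are illustrative assumptions and are not taken from Pool (2015).

```python
# Figures quoted in the review; the names are illustrative assumptions.
GENERATIONS_PER_YEAR = 14.7        # estimate attributed to Pool (2015)
IN_SEASON_GENERATION_DAYS = 19.5   # reviewer's figure for the 9 breeding months
DEVELOPMENT_DAYS = (11.0, 12.0)    # egg-to-adult time at 25 °C on lab food

# Spreading the generations evenly over a calendar year gives a value
# close to the 24-day generation time cited by the authors.
year_round_generation_days = 365.0 / GENERATIONS_PER_YEAR

# Adult age at reproduction = generation length minus development time.
# The longer development time yields the lower bound, hence reversed().
adult_age_days = tuple(
    IN_SEASON_GENERATION_DAYS - d for d in reversed(DEVELOPMENT_DAYS)
)

print(f"year-round generation length: {year_round_generation_days:.1f} days")
print(f"adult age at reproduction: "
      f"{adult_age_days[0]:.1f}-{adult_age_days[1]:.1f} days")
```

With these inputs the adult age at reproduction falls between 7.5 and 8.5 days, which matches the reviewer's "about 8 days" under ideal laboratory conditions.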
This is overall a thoughtfully presented critique and we have endeavored to improve our discussion of Pool (2015) and to clarify some of the language used about survival elsewhere. While we agree that challenges other than survival to 10 weeks are very relevant to Drosophila melanogaster, collection at 10 weeks does encompass some of these other challenges. Egg to adult viability still contributes to the frequencies of the inversions at collection and is not separable from longevity in this data. Collection at longevity was chosen in part to encompass all lifetime fitness challenges that might influence the inversion frequency at collection, albeit still within permissive laboratory conditions. Future experiments exploring specific stressors independently and beyond permissive lab conditions would generate a clearer picture.
In addition to general edits, the specific phrase mentioned at l. 936 [now line 1003] has been revised from “In many such cases females are in reproductive diapause, and so longevity is the main selective pressure.” to “While longevity is a key selective pressure underlying overwintering, the relationship between longevity in permissive lab conditions without diapause and in natural conditions under diapause is unclear (Schmidt et al. 2005; Flatt 2020), and our experiment represents just one of many possible ways to examine tradeoffs involving survival.”
(2) It appears that the "parental" (in fact, paternal) inversion frequency was estimated by sequencing sires that survived until the end of the two-week mating period. No information is provided on male mortality during the mating period, but substantial mortality is likely given constant courtship and mating opportunities. If so, the difference between the parental and embryo inversion frequency could reflect the differential survival of males until the point of sampling rather than / in addition to sexual selection.
We have further clarified that when referenced as parental frequency, the frequency presented is ½ the paternal frequency as the mothers were homokaryotypic for the standard arrangement. We chose to present both due to considerations in representing the frequency change from paternal to embryo frequencies, where a hypothetical change from 0.20 frequency in fathers to 0.15 frequency in embryos represents a selective benefit (a frequency increase in the population), despite the reality that this is a decrease in allele frequency between paternal and embryo cohorts.
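The distinction between paternal and parental frequency can be made concrete with a minimal sketch. It assumes neutral transmission and mothers that carry no inversions, and it reuses the hypothetical 0.20 and 0.15 values from the paragraph above; the function and variable names are illustrative and do not come from the manuscript or its code.

```python
def expected_embryo_frequency(paternal_freq, maternal_freq=0.0):
    """Expected inversion frequency in embryos under neutral transmission.

    Each embryo receives one paternal and one maternal gamete, so the
    expectation is the mean of the two parental frequencies. The default
    maternal_freq=0.0 encodes mothers homokaryotypic for the standard
    arrangement (an assumption matching the described crossing design).
    """
    return 0.5 * (paternal_freq + maternal_freq)


# Hypothetical values taken from the example in the text above.
paternal_freq = 0.20
embryo_freq = 0.15

# "Parental" frequency: half the paternal frequency.
parental_freq = expected_embryo_frequency(paternal_freq)

# Positive: the inversion did better than neutral expectation.
change_vs_parental = embryo_freq - parental_freq
# Negative: the raw allele frequency still dropped between cohorts.
change_vs_paternal = embryo_freq - paternal_freq

# Frequency among successful male gametes implied by the embryo pool,
# given that no inversions are inherited through the mothers.
implied_gamete_freq = 2.0 * embryo_freq

print(f"parental (expected embryo) frequency: {parental_freq:.2f}")
print(f"change relative to parental: {change_vs_parental:+.2f}")
print(f"change relative to paternal: {change_vs_paternal:+.2f}")
print(f"implied frequency among successful male gametes: {implied_gamete_freq:.2f}")
```

Under these assumptions the same embryo frequency reads as an increase against the parental baseline and as a decrease against the paternal cohort, which is why both quantities are reported.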
We mentioned a maximum 15% paternal mortality at line 827 [now l.1056], but have now added complete data on the counts of flies in the experiment as a supplemental table (Table S1) and have added or corrected further references to this in the results and methods [lines 555, 638, 975]. It is true that this may influence the observed frequency changes to some degree, and while we adjusted our sampling method to account for the effects of this mortality on statistical power [l.1056ff], we have now edited the manuscript to better highlight potential effects of this phenomenon on the recorded frequency changes.
It is also worth noting that, if mortality among fathers over the mating period is codirectional with mortality among aged offspring, this would bias the results against detecting an opposing antagonistic selective effect of the inversions on paternity share. This is now also mentioned in the manuscript, l.639ff.
(3) Finally, irrespective of the above caveats, the experimental data only address one of the elements of the theoretical hypothesis, namely antagonistic effects of inversions on reproduction and survival, notably that of females. It does not test for two other key elements of the proposed theory: the assumption of frequency-dependence of selection on male sexual success, and the prediction of synergistic epistasis for male fitness among genetic variants in the inversion. To be fair, particularly testing the latter prediction would be exceedingly difficult. Nonetheless, these limitations of the experiment mean that the paper is a much stronger theoretical than empirical contribution.
This is a fair criticism of the limitations of our results, and we now summarize such caveats more directly in the discussion summary, lines 876ff.
Reviewer #2 (Public Review):
[…]
Comments on the latest version:
I would like to give an example of the confusing terminology of the authors:
"Additionally, fitness conveyed by an allele favoring display quality is also frequency-dependent: since mating success depends on the display qualities of other males, the relative advantage of a display trait will be diminished as more males carry it..."
I do not understand the difference from an advantageous allele: as it increases in frequency, the frequency increase of this allele decreases, but this has nothing to do with frequency-dependent selection. In my opinion, the authors re-define frequency-dependent selection: for selection to be frequency dependent, it needs to change with frequency, but from their verbal description this is not clear.
We have edited this text for greater clarity, now line 232ff. We did not seek to redefine frequency dependence, and did mean by “the relative advantage of a display trait will be diminished” that an equivalent s would diminish with frequency. We have now remedied terminological issues introduced in the prior revision with regard to frequency dependent selection.
One example of how challenging the style of the manuscript is comes from their description of the DNA extraction procedure. In principle a straightforward method, but even here the authors provide a convoluted uninformative description of the procedure.
We have edited for clarity the text on lines 1016-1020. Citing a published protocol and mentioning our modifications seems an appropriate trade-off between representing what was done accurately, citing the sources we relied on in doing it, and limiting the volume of information in the main text for such a straightforward and common method.
It is not apparent to the reviewer why the authors have not invested more effort to make their manuscript digestible.
We have invested a great deal of effort in making this manuscript as clear as we are able to. We regret that our writing has not been to this reviewer’s liking. We believe we have been highly responsive to all specific criticisms, including revising all passages cited as unclear. In this round, we have again scrutinized the entire manuscript for any opportunity to clarify it, and we have made further changes throughout. Although our subject matter is conceptually nuanced, we nevertheless remain optimistic that a careful, fresh reading of our revised manuscript would yield a more favorable impression.
Reviewer #3 (Public Review):
[…]
Weaknesses:
A gap in the current modeling is that, while a diploid situation is being studied, the model does not investigate the effects of varying degrees of dominance. It would be important and interesting to fill this gap in future work.
Agreed, and now reinforced at lines 892ff.
Comments on the latest version:
Most of the comments which I have made in my public review have been adequately addressed.
Some of the writing still seems somewhat verbose and perhaps not yet maximally succinct; some additional line-by-line polishing might still be helpful at this stage in terms of further improving clarity and flow (for the authors to consider and decide).
We have made further changes and some polishing in this draft, and greatly appreciate the guidance provided in improving the draft so far.
Reviewer #1 (Recommendations For The Authors):
(1) While the model results are convincing, some of the verbal interpretation is confusing. In particular, the authors state that in their model the allele favoring male display quality shows a negative frequency dependence whereas the alternative allele has a positive frequency dependence. This does not make sense to me in the context of population genetics theory. For a one-locus, two-allele model the change of allele frequency under selection depends on the fitness of the genotypes concerned relative to each other. Thus, at least under no dominance assumed in this model, if the relative fitness of AA decreases with the frequency of allele A, the relative fitness of aa must decrease with the frequency of allele a. I.e., if selection is negatively frequency dependent, then it is so for both alleles.
This phrasing was wrong, and we have edited the relevant section.
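The reviewer's argument can be checked numerically with a minimal sketch. It assumes a one-locus, two-allele model with no dominance, in which the advantage of allele A declines linearly with its frequency; the linear form and the `s_max` value are illustrative assumptions, not parameters of the manuscript's model.

```python
def genotype_fitnesses(p_A, s_max=0.5):
    """Additive (no-dominance) fitnesses when the advantage of A
    declines linearly with the frequency of A (assumed form)."""
    s = s_max * (1.0 - p_A)
    return {"AA": 1.0 + s, "Aa": 1.0 + 0.5 * s, "aa": 1.0}


def relative_fitness(genotype, reference, p_A, s_max=0.5):
    """Fitness of one genotype relative to another at allele frequency p_A."""
    w = genotype_fitnesses(p_A, s_max)
    return w[genotype] / w[reference]


freqs = [0.1, 0.3, 0.5, 0.7, 0.9]

# Fitness of AA relative to aa as allele A becomes more common.
AA_vs_aa = [relative_fitness("AA", "aa", p) for p in freqs]

# Fitness of aa relative to AA as allele a becomes more common
# (frequency of a is q, so the frequency of A is 1 - q).
aa_vs_AA = [relative_fitness("aa", "AA", 1.0 - q) for q in freqs]

for f, x, y in zip(freqs, AA_vs_aa, aa_vs_AA):
    print(f"allele frequency {f:.1f}: AA/aa = {x:.3f}, aa/AA = {y:.3f}")
```

In this sketch both ratios fall as the focal allele becomes more common, so negative frequency dependence applies to both alleles at once, as the comment states.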
(2) I am still not entirely sure that the synergistic epistasis assumed in the verbal model is actually generated in the simulations; this would be easy enough to check by extracting the mating success of males with different genotypes from the simulation output, which should be reported, e.g., as a figure supplement.
Our new Figure S2, which depicts haplotype frequencies for a set of the simulations presented in Figure 4, should demonstrate a necessary presence of synergistic epistasis. These results further clarify that the weaker allele B is only kept when linked to A. The same fitness classes of genotype are present in the simulations with and without the inversion, so the only mechanical difference is the rate of recombination, and the only way this might change selection on the alleles is if a variant has a different fitness in one haplotype background than another – i.e. epistasis. The maintenance of haplotypes AB and ab to the exclusion of Ab and aB relies on the lesser relative fitness of Ab and aB. And since survival values are multiplicative, this additional contribution must come from the mate success of AB being disproportionately larger than Ab or aB, indicating the emergent synergistic epistasis posited by our model. We have clarified this point in the text at line 363ff.
(3) l. 318ff: What was this set number of males? I could not find this information anywhere. Also, this model of the mating system is commonly referred to as "best of N", so the authors may want to include this label in the description.
We indicate this detail just after the referenced line, now reworded and on l. 338-340 as “For each female’s mating competition, 100 males were sampled, though see Figure S1 for plots with varying encounter number.” Among these edits, “one hundred” has been changed to a numeral for easier skimming, and Figure S1 is now referenced here earlier in the text. Several edits have also been made in the caption of Figures 2 and 3, and in the relevant methods section to clarify the number of encountered males simulated, mention best of N terminology, and clarify how the quality score is used in the mate competition.
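For readers unfamiliar with the term, a "best of N" mate competition can be sketched as follows. This is a minimal illustration, not the SAIsim implementation: it assumes males are encountered by sampling without replacement and that the female always picks the highest quality score, and the function name and the uniform random scores are hypothetical.

```python
import random


def best_of_n(male_qualities, n, rng):
    """Return the index of the male that wins one female's mating competition.

    n males are drawn at random without replacement (assumed), and the one
    with the highest quality score is chosen deterministically (assumed).
    """
    n = min(n, len(male_qualities))
    encountered = rng.sample(range(len(male_qualities)), k=n)
    return max(encountered, key=lambda i: male_qualities[i])


rng = random.Random(0)

# Hypothetical population of 1000 males with arbitrary quality scores.
qualities = [rng.random() for _ in range(1000)]

# One competition with 100 encountered males, as in the quoted text.
winner = best_of_n(qualities, 100, rng)
print(f"winning male index: {winner}, quality: {qualities[winner]:.3f}")
```

Raising the encounter number concentrates matings on the highest-quality males, which is the parameter varied in the plots the response points to.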
(4) The description of the experiment is still confusing. The number of individuals of each sex entered in each mating cage is missing from the Methods (l. 914); although I did finally find it in the Results. These flies were laying over 2 weeks - does this mean that offspring from the entire period were used to obtain the embryo and aged offspring frequencies, or only from a particular egg collection? If the former, does this mean that the offspring obtained from different egg batches were aged separately? Were the offspring aged in cages or bottles, at what density? Given that only those males that survived until the end of the two-week mating period were sequenced, it is important to know what % of the initial number of males these survivors were. A substantial mortality of the parental males could bias the estimate of parental frequencies. How many parental males, embryos and aged offspring were sequenced? Were all individuals of a given cage and stage extracted and sequenced as a single pool or were there multiple pools? The description could also be structured better. For example, the food and grape agar recipes and cage construction are inserted at random points of the description of the crossing design, which does not help.
We have now reorganized and edited these portions of the Methods text. Portions of this comment overlap with edits responding to (2) of the Public Review and below for l. 921 in Details. Offspring from different laying periods were aged in different bottles, further separated by the time at which they eclosed. They were then pooled for DNA extraction and library preparation by sex and a binary early or late eclosion time. This data was present in the “D. mel. Sample Size” column of supplemental tables S6 and S7 (now S7 and S8), but we have added and referenced a new table to specifically collate the sample sizes of different experimental stages, table S1. Now referenced at lines 555, 638, 975, 1057.
(5) The caption of figure 9 and the discussion of its results should be clear and explicit about the fact that "adult offspring" in Fig 9A and "female" and "male" refers to adults surviving to old age (whereas "parental" in Fig 9A refers to young adults in their reproductive prime). This has consequences for the interpretation of the difference between "parental" and "adult offspring", as it combines one generation of usual selection as it occurs under the conditions of the lab culture (young adult at generation t -> young adult in generation t+1) with an additional step of selection for longevity. Thus, a marked change in allele frequency does not imply that the "parental" frequency does not represent an equilibrium frequency of the inversions under the lab culture conditions. Furthermore, it would be useful to state explicitly that Figure 9B represents the same results as figure 9A, but with the aged offspring split by sex.
Figure caption edited to provide further clarity on the age of cohorts and presented data, along with the relevant results section (2.3) referencing this figure.
We avoid making any statements about the equilibrium frequencies of inversions under lab conditions, and whether or not any step of our experiment reflects such equilibria, because our investigation does not rely upon or test for such conditions. Instead, our analysis focuses on whether inversions have contrasting effects (as indicated by frequency changes that are incompatible with neutral sampling) between different life history components. Under our model, such frequency reversals might be detectable both at equilibrium balanced inversion frequencies and also at frequencies some distance away from equilibria. We have now clarified this point at l. 970-972.
Details:
l. 211: this should be modified as male-only costs are now included.
Edited. “survival likelihood (of either or both sexes).”
l. 343: misplaced period
Edited.
l. 814: "We confirmed model predictions...": This sounds like it refers to an empirical confirmation of a theory prediction, but I think the authors just want to say that their simulations predicted antagonistic variants can be maintained at an intermediate equilibrium frequency. So the wording should be changed to avoid ambiguity.
Edited. Now line 869.
l. 853: How can a genome be "empty"? Do the authors mean an absence of any polymorphism?
Edited to: “In SAIsim, a population is instantiated as a python object, and populated with individuals which are also represented by python objects. These individuals may be instantiated using genomes specified by the user, or by default carry no genomic variation.” Lines 913ff.
l. 853: I do not see this diagramed in Figure 5
Apologies, fixed to Fig. 2
l. 864: is crossing-over in the model limited to female gametogenesis (reflecting the Drosophila case) or does it occur in both sexes?
There is a variable in the simulator to make crossover female-specific. All simulations were performed with female-only crossover. Edited for clarity. “While the simulator can allow recombination in both sexes, all simulations presented only generate crossovers and gene conversion events for female gametes, in accordance with the biology of D. melanogaster.” Lines 928-929.
l. 906: "F2" is ambiguous; does this mean that the mix of lines was allowed to breed for two generations? Also, in other places in the manuscript these flies appear to be referred to as "parental". So do not use F2.
Edited, F2 language removed and replaced with being allowed to breed for two generations. Now lines 967ff.
l. 910: this is incorrect/imprecise; what can be inferred is the frequency of the inversions in male gametes that contributed to fertilization. This would correspond to the frequency in successful males only if each successful male genotype had the same paternity share.
Edited, now “Since no inversions could be inherited through the mothers, inversion frequencies among successful male gametes could be inferred from their pooled offspring.” Now line 994.
l. 912: "without a controlled day/night cycle" meaning what? Constant light? Constant darkness? Daylight falling through the windows?
Edited to “Unless otherwise noted, all flies were kept in a lab space of 23°C with around a degree of temperature fluctuation and without a controlled day/night cycle. Light exposure was dependent on the varying use of the space by laboratory workers but amounted to near constant exposure to at least a minimal level of lighting, with some variable light due to indirect lighting from adjacent rooms with exterior windows.” Now lines 1007-1010.
l. 921: I cannot parse this sentence. Were the offspring isolated as virgins?
No, the logistics of collecting virgins would have been prohibitive, and it did not seem essential for our experiment. Hopefully the edits to this section are clearer, now lines 978ff.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Reviewer #1 (Public review):
Summary:
This manuscript reports the substrate-bound structure of SiaQM from F. nucleatum, which is the membrane component of a Neu5Ac-specific Tripartite ATP-independent Periplasmic (TRAP) transporter. Until recently, there was no experimentally derived structural information regarding the membrane components of TRAP transporters, limiting our understanding of the transport mechanism. Since 2022, there have been 3 different studies reporting the structures of the membrane components of Neu5Ac-specific TRAP transporters. While it was possible to narrow down the binding site location by comparing the structures to proteins of the same fold, a structure with substrate bound has been missing. In this work, the authors report the Na+-bound state and the Na+ plus Neu5Ac state of FnSiaQM, revealing information regarding substrate coordination. In previous studies, 2 Na+ ion sites were identified. Here, the authors also tentatively assign a 3rd Na+ site. The authors reconstitute the transporter to assess the effects of mutating the binding site residues they identified in their structures. Of the 2 positions tested, only one of them appears to be critical to substrate binding.
Strengths:
The main strength of this work is the capture of the substrate bound state of SiaQM, which provides insight into an important part of the transport cycle.
Weaknesses:
The main weakness is the lack of experimental validation of the structural findings. The authors identified the Neu5Ac binding site, but only test 2 residues for their involvement in substrate interactions, which is quite limited. However, comparison with previous mutagenesis studies on homologues supports the location of the Neu5Ac binding site. The authors tentatively identified a 3rd Na+ binding site, which if true would be an impactful finding, but this site was not sufficiently experimentally tested for its contribution to Na+ dependent transport. This lack of experimental validation prevents the authors from unequivocally assigning this site as a Na+ binding site. However, the reporting of these new data is important as it will facilitate follow up studies by the authors or other researchers.
Comments on revisions:
Overall, the authors have done a good job of addressing the reviewers' comments. It's good to know that the authors are working on the characterisation of the potential metal binding site mutants - characterizing just a few of these will provide much-needed experimental support for this potential Na+ site.
The new MD simulations provide additional support for the new Na+ site and could be included.
However, as the authors know, direct experimental characterisation of mutants is the ideal evidence of the Na+ site.
Aside from the characterisation of mutants, which seems to be held up by technical issues, the only remaining issue is the comparison of the Na+- and Na+/Neu5Ac-bound states with ASCT2. It still does not make sense to me why the authors are not directly comparing their Na+ only and Na+/Neu5Ac states with the structures of VcINDY in the Na+-only and Na+/succinate bound states. These VcINDY structures also revealed no conformational changes in the HP loops upon binding succinate, as the authors see for SiaQM. Therefore, this comparison is very supportive. It is understood that the similarity to the DASS structure is mentioned on p.17, but it is also interesting and useful to note that TRAP and DASS transporters also share a lack of substrate-induced local conformational changes, to the extent these things have been measured.
We acknowledge the summary weakness that experimental data to support the third Na binding site is critical.
Based on the reviewer’s suggestion, we added the following in the main text and a supplementary figure comparing the Na ion binding sites between VcINDY and SiaQM. Page 13.
“These two sodium ion binding sites are also conserved in the structure of VcINDY (Supplementary Figure 7) (Sauer et al., 2022). In both cases, the sodium ions are bound at the helix-loop-helix ends of HP1 and HP2. The binding sites utilize both side chains and main chain carbonyl groups. The number of main chain carbonyl interactions suggests that they are critical, and using main chain rather than side chain interactions minimizes the likelihood of point mutations affecting the binding.”
Reviewer #3 (Public review):
The manuscript by Goyal et al. reports substrate-bound and substrate-free structures of a tripartite ATP-independent periplasmic (TRAP) transporter from a previously uncharacterized homolog, F. nucleatum. This is one of the most mechanistically fascinating transporter families, by means of its QM domain (the domain reported in this manuscript) operating as a monomeric 'elevator', and its P domain functioning as a substrate-binding 'operator' that is required to deliver the substrate to the QM domain; together, this is termed an 'elevator with an operator' mechanism.
Remarkably, previous structures had not demonstrated the substrate Neu5Ac bound. In addition, they confirm the previously reported Na+ binding sites, and report a new metal binding site in the transporter, which seems to be mechanistically relevant. Finally, they mutate the substrate binding site and use proteoliposomal uptake assays to show the mechanistic relevance of the proposed substrate binding residues.
Strengths:
The structures are of good quality, the presentation of the structural data has improved, the functional data is robust, the text is well-written, and the authors are appropriately careful with their interpretations. Determination of a substrate bound structure is an important achievement and fills an important gap in the 'elevator with an operator' mechanism.
Weaknesses:
Although the possibility of the third metal site is compelling, I do not feel it is appropriate to model in a publicly deposited PDB structure without directly confirming experimentally. The authors do not extensively test the binding sites due to technical limitations of producing relevant mutants; however, their model is consistent with genetic assays of previously characterized orthologs, which will be of benefit to the field. Finally, some clarifications of EM processing would be useful to readers, and it would be nice to have a figure visualizing the unmodeled lipid densities - this would be important to contextualize to their proposed mechanism.
Reviewer #3 (Recommendations for the authors):
I appreciate the authors' responses to our critiques; the revised manuscript is much improved and has addressed most of my concerns. I look forward to seeing their follow up experiments testing mutational effects. I think MD simulations of ion-binding sites on their own are supportive but by themselves not sufficient to prove the existence of a functional Na+-binding site. Some clarifications in the methods/supplements would satisfy my concerns about data processing and analysis.
- Unliganded map: were the 141,272 particles used for one class of ab initio? This is unusual, usually multiple ab initio classes are used to further eliminate junk particles. The authors themselves use 6 classes for the substrate-bound dataset.
We classified the particles into multiple 3-D classes. There was no improvement in statistics or maps on splitting these further. Hence, we did not pursue that further.
- Substrate-bound map: how did the four 'identical' classes independently refine? Are similar Na+/substrate densities found in each separate class?
The other classes refined to worse than 4.5 Å resolution. We stopped characterizing them past that point. We were hoping to see multiple conformations that are different, and hopefully a class where only two sodium ions could be bound. However, any interpretation at 4.5 Å would be unreliable.
- Both maps: all ab initio classes prior to final refinement should be displayed in the supplementary workflow, this is common for EM processing diagrams.
We agree it is common – however, unless there is a good reason to discuss the other classes, we are not convinced of the value of crowding the figures.
- What specific refinement package and version of Phenix are the authors using? It seems unusual that it is not possible to refine without a metal in Phenix real-space refinement, I have seen many structures where there is no issue refining without critical ions/waters. The authors should double check that they are using the appropriate scattering table for cryo-EM, which should be "electron".
Sorry for the confusion – we did not mean to say we cannot refine without a metal. If we want to add something to the density, we cannot refine it without suggesting a metal or solvent. The site without anything added will refine without any issues but in the absence of additional verification, we cannot be sure of the identity of the ions. We are confident of the metal binding site – but not confident of the exact metal bound. We used Sodium as our first hypothesis.
We don’t think the scattering factors will help in the identification of the ions. Servalcat as part of CCP-EM can produce difference maps and we believe that for identification of ions, it will require higher resolution (<2.5 Å) but at this resolution, we can say that there is a non-protein density but not more than that. We were using “electron” (which we believe is default with phenix.real_space_refine). The refinement was performed using standard protocols and appropriate scattering factors (Phenix version 1.19x), and we have previously used similar refinement protocols for other maps/models (Example: Vinothkumar KR, Arya CK, Ramanathan G, Subramanian R. 2021. Comparison of CryoEM and X-ray structures of dimethylformamidase. Progress in Biophysics and Molecular Biology, CryoEM microscopy developments and their biological applications 160:66–78. doi:10.1016/j.pbiomolbio.2020.06.008).
To convince the reviewer of the quality of the maps, we have added figures that show the model-to-map fit of all of the main secondary structural elements in both the unliganded and the Neu5Ac bound forms.
- I certainly understand the authors' reluctance to not model the entirety of protein densities; however, I think it would be useful to highlight these densities in the global context of the protein. A common way to show this is to show the density proximal to protein chains in one color, and the remaining densities in a contrasting color (Figure 1 somewhat demonstrates this but it is difficult to tell). I think this would be a nice figure to show the presence and location of unmodeled densities.
We have modified supplementary figure 3 to include unmodelled densities in panels G and H for both structures.
- Small detail, "uniform" is misspelled as "unifrom" in supplementary Figure 3.
Thank you. Corrected.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
We appreciate the positive assessment and agree that the experimental data offer valuable insights into HBV capsid assembly inhibition. Based on the reviewers' suggestions, we have clarified the cryo-EM data and added structural and mechanistic details throughout the manuscript, which we believe significantly enhance its overall clarity and impact. The manuscript now better reflects a promising strategy to interfere with the HBV life cycle. We have carefully addressed all comments to improve both the clarity and quality of the manuscript.
Response to Public Reviews
We greatly appreciate the insightful comments and suggestions from the reviewers. Below, we provide responses to the points raised in the public reviews.
Reviewer #1 (Public Review):
Summary:
In this paper, the authors present an interesting strategy to interfere with the HBV life cycle: the preparation of geranyl and peptides' dimers that could impede the correct assembly of hepatitis B core protein HBc into viable capsids. These dimers are of different nature, depending on the HBc site the authors plan to target. A preliminary study with geranyl dimers (targeting a hydrophobic site of HBc) was first investigated. The second series deals with peptide-PEG linker-peptide dimers, targeting the tips of HBc dimer spikes.
Strengths:
This work is very well conducted, combining ITC experiments (for determination of dimers' KD), cellular effects (thanks to the grafting of previously developed dimers with polyarginine-based cell penetrating peptide) HBV infected HEK293 cells and Cryo-EM studies.
The findings of these research teams unambiguously demonstrated the interest of such dimeric structures in impeding the correct HBV life cycle and thus, could bring solutions in the control of its development. Ultimately, a new class of HBV Capsid Assembly Modulators could arise from this study.
There is no doubt that this work could bring very interesting information for people working on HBV.
Weaknesses:
Some minor corrections must be made, especially for a more precise description of the strategy and the chemical structure of the designed new HBV capsid assembly modulators.
We are grateful for the positive feedback on the experimental design, the combination of ITC, cellular effects, and Cryo-EM studies, and the potential for developing new classes of HBV Capsid Assembly Modulators (CAMs). In the revised version we have clarified the design rationale for the choice of the PEG linker length in the Supplementary Information, linking it to the structural measurements of the capsid. Chemical structures and detailed molecular formulas were added and terms have been corrected. A scrambled dimeric peptide served as a negative control, which showed no binding, confirming the specificity of our designed peptide and ruling out non-specific interactions from other elements of the molecules such as the linkers. Finally, we have revised the nomenclature for the geranyl dimers to better reflect the chemical structure. All figures, including Figure 3, have been updated to high-resolution. All mentioned typos have been corrected. Consultation dates have been added to the website references. HPLC terminology was corrected.
Reviewer #2 (Public Review):
Summary:
Vladimir Khayenko et al. discovered two novel binding pockets on HBc with in vitro binding and electron microscopy experiments. While the geranyl dimer targeting a central hydrophobic pocket displayed a micromolar affinity, the P1-dimer binding to the spike tip of HBc has a nanomolar affinity. In the turbidity assay and at the cellular level, an HBc aggregation from peptide crosslinking was demonstrated.
Strengths:
The study identifies two previously unexplored binding pockets on HBc capsids and develops novel binders targeting these sites with promising affinities.
Weaknesses:
While the in vitro and cellular HBc aggregation effects are demonstrated, the antiviral potential against HBV infection is not directly evaluated in this study.
Thank you for recognizing the innovative approach of our work and the potential for developing novel antivirals targeting HBc. We have now included additional discussion on potential future experiments aimed at evaluating the compounds' effects on cellular physiology and viral infectivity.
Reviewer #3 (Public Review):
Summary:
HBV is a continuing public health problem and new therapeutics would be of great value. Khayenko et al examine two sites in the HBc dimer as possible targets for new therapeutics. Older drugs that target HBc bind at a pocket between two HBc dimers. In this study Khayenko et al examine sites located in the four helix bundle at the dimer interface.
The first site is a pocket first identified as a Triton X-100 binding site. The authors suggest it might bind terpenes and use geraniol as an example. They also test a decyl maltose detergent and a geraniol dimer intended for bivalent binding. The KDs were all in the 100 µM range. Cryo-EM shows that geraniol binds the targeted site.
The second site is at the tip of the spike. Peptides based on a 1995 study (reference 43) were investigated. The authors test a core peptide, two longer peptides, and a dimer of the longest peptide. A deep scan of the longest monomer sequence shows the importance of a core amino acid sequence. The dimeric peptide (P1-dimer) binds almost 100-fold better than the monomer parent (P1). Cryo-EM structures confirm the binding site. The dimeric peptide caused HBc capsid aggregation. When HBc-expressing cells were treated with active peptide attached to a cell-penetrating peptide, the peptide caused aggregation of HBc antigen, mirroring experiments with purified proteins.
Strengths:
The two sites have not been well investigated. This paper marks a start. The small collection of substrates investigated led to discovery of a dimeric peptide that leads to capsid aggregation, presumably by non-covalent crosslinking. The structures determined could be very useful for future investigations.
Weaknesses:
In this draft, the rationale for targets for the Triton X-100 site is not well laid out. The target molecules bind with KDs weaker than 50 µM. The way the structural results are displayed, one cannot be sure of the important features of the binding site with respect to the substrate. The peptide site and substrates are better developed, but structural and mechanistic details need to be described in greater detail.
We appreciate the reviewer’s positive comments on identifying and targeting previously unexplored sites on HBc, and the potential utility of our dimeric peptides in future studies. We have revised the Results section to better explain the rationale behind targeting the hydrophobic binding site. Additionally, the structures have been revised for clearer presentation, and we now emphasize the key features of the binding site and the role of substrate specificity.
Recommendations For The Authors:
Reviewer #1 (Recommendations For The Authors):
For clarity, the chemical structure of SLLGRM peptide, geraniol and HAP molecules must be indicated, preferably in Fig. 1 (at least in the Supplementary Information section).
We have now included the chemical structures of the SLLGRM peptide, geraniol, and HAP molecules for clarity in Figure 1 and in the main manuscript to ensure they are easily accessible for reference and to provide further detail and context.
In the same idea, in Fig. 1 (and in the text): The molecular formula of heteroaryldihydropyrimidine HAP must be clearly indicated, as the nature of the heteroatom (S, O, N?) in this "heteroaryl" derivative is not indicated.
The full molecular formula of HAP, (2S)-1-[[(4R)-4-(2-chloranyl-4-fluoranyl-phenyl)-5-methoxycarbonyl-2-(1,3-thiazol-2-yl)-1,4-dihydropyrimidin-6-yl]methyl]-4,4-bis(fluoranyl)-pyrrolidine-2-carboxylic acid, is now included in the figure legend.
"with a polyethylene glycol (PEG) linker that could bridge the distance of 38 Å between the two opposing hydrophobic pockets": what is the rationale of the design of this linker? Authors must explain briefly why/how they have chosen this linker length and nature (please indicate a reference for the appropriate choice of PEG linker). Same remarks for dimers targeting the capsid spike tips, having 50 angstroms PEG linkers. So, the choice of the linker length must be clearly explained and not be only mentioned in the sentence of the discussion part "Using our structural knowledge of the capsid, particularly the distances between the spikes".
We have now clarified the rationale for the design of the PEG linker length. The linker lengths were chosen based on structural knowledge of the capsid, particularly the measured distances between the spike tips (60 Å) and the hydrophobic pockets (40 Å). In the Supplementary Information (Supplementary Figure 1), we now explain how these measurements guided the choice of PEG linker length, allowing for optimal bridging and interaction between the binding sites. This supplementary figure now explicitly connects the design rationale to the specific structural features of the capsid.
I do not agree with the authors when they claim a "nanomolar affinity of 312 nM". To me, a nanomolar affinity would require several of few tens of nanoM (but not three hundreds) ... So, please correct with "sub-micromolar affinity of 312 nM" and all the other parts of the manuscript (title and caption of Figure 3..., "the peptide dimer (P1dC) with nanomolar affinity" "nanomolar levels"...).
We thank Reviewer #1 for pointing this out. Since the term "nanomolar affinity" can indeed be interpreted as referring to the lower end of the nanomolar range, rather than values close to 300 nM, we have revised the manuscript to refer to "sub-micromolar affinity" where applicable. This change has been made throughout the manuscript, including the subtitles, figure captions, and text.
The drug design strategy was to combine two peptides showing low affinity, attached by a PEG linker with an appropriate length and appears obvious to me. But a control experiment is anyway missing: the peptide-PEG linker derivative (not the dimer peptide-PEG linker-peptide...) should have been evaluated for an unambiguous proof of concept of these dimeric peptides. To my opinion, for the publication of this work, these experiments should be brought (eg, when describing the affinities of SLLGR dimers). I agree that Cryo-EM experiments bring evidences of the dimer binding but the affinity values for (peptide-PEG linker) derivatives would bring an additional proof (as the PEG flexible linkers was not resolved by Cryo-EM).
Thank you for your thoughtful comment regarding the use of a monovalent control for the peptide-PEG linker. A scrambled dimeric peptide serves as a negative control. In ITC, it showed no binding at all, thereby ruling out possible nonspecific interactions mediated by the introduced PEG linker or the handle itself.
Given the complete lack of binding with the scrambled dimeric peptide, we believe an additional monovalent control is not needed, as the scrambled control provides strong evidence that the observed binding is driven specifically by the designed peptide sequence and not by the linker or other structural components. We have now made this clarification more explicit in the revised manuscript to avoid any ambiguity. We hope this addresses your concern, and we appreciate your suggestion to further strengthen the rigor of the work. Despite its identical charge, molecular weight, and atom composition, the scrambled control did not cause HBc aggregation in living cells, indicating sequence-specific action of the aggregating dimer.
The nomenclature of the dimers must be modified because there is no logic between the name "long dimer" and the chemical structure. Particularly, the number of ethylene glycol motifs must be indicated: authors have to find an appropriate nomenclature indicating both the linker length and nature (small molecule or peptide) of the bivalent parts (and hence, do not mention anymore "short geranyl dimer" "long geranyl dimer").
Thank you for your valuable suggestion regarding the nomenclature of the dimers. We agree that the terms "short geranyl dimer" and "long geranyl dimer" do not fully reflect the chemical structure of the molecules. In response, we have revised the nomenclature to provide a clearer indication of both the linker length and the nature of the bivalent parts. We now refer to the dimers as (Geranyl)<sub>2</sub>-Lys for the dimer with two geranyl groups attached to lysine and (Geranyl-PEG3)<sub>2</sub>-Lys for the dimer with a PEG3 linker (three ethylene glycol units) between the lysine amine and the geranyl groups. These revised names more accurately describe the structural differences and should avoid any ambiguity.
Lines 198-199: "Among these, the dimerized P1 exhibited a higher occupation of the binding site, as illustrated in Supplementary Figure 9." But in Supp. Fig. 9, dimer P1dC (10) is described. As the text above is describing P1-dimer (9), the Supp. Fig. 9 must be provided, if available. If not, please modify this conclusion accordingly. In the text, when mentioning dimerized P1 peptide, authors must indicate with which compound it deals: (9) or (10)?
Thank you for your careful reading of the manuscript and for pointing out the discrepancy. In Supplementary Figure 9, the dimer described is P1dC, not P1d. The text has been revised to clarify this. We appreciate your attention to detail.
Please note that the graphic quality of Figure 3 is bad as it results in pixelized drawings (especially for the chemical structures).
Thank you for your feedback regarding the quality of Figure 3. We have now updated all figures, including Figure 3, to high-resolution PNG format with 300-500 dpi to ensure optimal graphic quality. This should resolve the pixelization issue, particularly for the chemical structures.
Minor typos: "clinical studies, a third are CAMs.[6]" "to the spike base hydrophobic pocket" "geraniol affinity to the central hydrophobic pocket, we designed"
We have corrected the punctuation in the mentioned sentences and appreciate your careful review of the manuscript.
Concerning the citation of a website (references 5 and 6), I guess that the consultation date should be mentioned.
We have now updated the references accordingly, including the consultation dates.
In the Materials and Methods part, Peptide synthesis paragraph, authors must write "semi-preparative HPLC".
This is now corrected to "semi-preparative HPLC".
In the supplementary information file, 1H and 13C NMR spectrum for the small molecule "Short Geranyl Dimer (SGD)" should be provided.
The purity and identity of this geranyl derivative were confirmed through UV detection in LC-MS and supported by the mass spectra, which provide robust and clear evidence of the compound's structure; LC-MS is a well-accepted method for confirming the structure in this context. While we understand the value of NMR in structural analysis, we believe that additional analytical evidence is not critical for this study.
Reviewer #2 (Recommendations For The Authors):
Overall, this study presents an innovative approach to target the HBV core protein and paves the way for developing new classes of antivirals with a distinct mechanism of action. The findings expand the current knowledge of druggable sites on HBc capsids and provide promising lead compounds. Future studies exploring the antiviral effects and optimizing the binders for therapeutic applications would be valuable next steps.
We sincerely thank the reviewer for the positive assessment of our work and for highlighting its innovative approach to targeting the HBV core protein. We appreciate your recognition of the study's potential in paving the way for developing new classes of antivirals with distinct mechanisms of action. Below, we provide responses to each of the points raised.
The significance of the central hydrophobic pocket as a target may require additional experiments for validation. Currently, the substrate binding activity is relatively low and appears to have a non-significant impact on HBc.
We agree that the central hydrophobic pocket exhibits relatively weak binding affinity with the ligands tested in this study. However, we have provided additional structural evidence and affinity data to support its relevance as a druggable site. In recognition of the weak affinity of these small molecules, we expanded our focus to include peptide-based binders, which yielded higher affinities, particularly when dimerized.
It might be more effective to present Figure 1B after summarizing all the results.
We understand the reviewer’s suggestion. However, we decided to highlight and summarize the major findings early in the manuscript. We included Figure 1B at the beginning to allow readers to quickly grasp the core concepts and outcomes of our study.
The labels for P1/P2 are presented in Figure 1A, yet their definitions are not provided until the second part of the Results section.
We appreciate the reviewer’s observation. While we see a benefit in showing three trackable sites on HBV early as an overview, we also agree that the early presentation of P1/P2 could lead to some confusion. To resolve this, we have revised the figure to introduce only the minimal peptide to avoid any ambiguity. The full dimer sequences and names are introduced later.
Further investigation of the cytotoxic potential of peptide-induced HBc aggregation is necessary.
Investigating the cytotoxicity together with infectivity is an important future direction but outside the scope of this study. We now elaborate on this point in the discussion.
Reviewer #3 (Recommendations For The Authors):
Two sites in the dimer interface are shown to bind ligands. It is not shown that filling these regions will change infection. The exhaustive studies by Bruss showed point mutations directly alter infection and would be of value to discuss.
We thank Reviewer #3 for this very helpful comment. We now highlight how point mutations in these regions were shown to affect HBV infectivity, thereby providing a link between our findings and how ligand binding might influence the viral life cycle.
It is not shown whether the two sites interact. Molecular dynamics by Hadden or Gumbart may be informative. The failure to look for a connection between these sites is an oversight.
We thank Reviewer #3 for the insightful suggestion to explore potential interactions between the two binding sites. We acknowledge that molecular dynamics (MD) simulations, such as those performed by Gumbart et al. and Hadden et al., could provide valuable insights into the structural dynamics and potential cooperativity between these sites. Indeed, molecular dynamics of the HBV capsid by Perilla and Hadden has demonstrated significant flexibility in the capsid spikes and their interactions with neighboring subunits, suggesting that the dynamics of binding sites could influence ligand accessibility and potential crosstalk.
We believe that our own previous structural studies, together with data in this work, provide substantial experimental evidence on this topic. In Makbul et al. 2021a (doi.org/10.3390/microorganisms9050956) we observed that peptide binding (particularly P2) did not stabilize the spikes; instead, the upper part of the spikes exhibited considerable wobbling. This variability mirrored the conformational diversity reported in MD simulations. Using local classification, we noted that the variability in the spike's upper region was greater when P2 was bound than in its absence. Additionally, in Makbul et al. 2021b (doi.org/10.3390/v13112115), we showed that peptide binding had little effect on the hydrophobic pocket beneath the mobile spike region, located in the more rigid part of the capsid. While we observed F97 in the D-monomer adopting two alternate rotamer orientations upon P2 binding, this was not exclusive to P2, as similar changes were noted in the L60V mutant even without bound peptide.
We have updated the manuscript to briefly discuss this crosstalk, which provides additional context to our findings. Interestingly, only TX100, but not geraniol, completely flipped F97 into an alternate orientation, forming a new π-π stacking interaction with the mobile region of the spike. This finding suggests that interactions within the hydrophobic pocket are transmitted to the tips of the spikes in a ligand-specific manner. It thus supports and refines the concept of crosstalk between binding sites, primarily initiated from the hydrophobic pocket in a ligand-specific fashion.
The logic for proposing a terpene ligand is strained. Comparisons are made to HBs and the HDV delta antigen. However, HBs is myristoylated not farnesylated and delta antigen binds HBs not HBc.
We have revised the text to clarify the rationale for testing terpenes as ligands, focusing instead on the specific properties of the hydrophobic pocket targeted by geraniol.
The authors suggest larger terpenes as binding agents, but there does not appear to be room for a longer molecule in the binding site. The authors do not discuss whether a longer molecule could be modeled in the site based on their density.
We appreciate this observation and agree that the potential for larger terpenes to bind this site is not obvious from the structural data presented in this work. We have now included a more detailed visualization (Fig. 2D) and discussion of the hydrophobic binding pocket, based on the density observed in the presented geraniol structure and the previous Triton structure, and we discuss the implications for the binding of larger hydrophobic molecules in the site.
The authors note that the structure could explain molecular details of this site, but these are not discussed. A more complete analysis of the geraniol protein is necessary, including an estimate of the resolution of that density.
We agree that a more complete analysis of the hydrophobic binding site was warranted. We have now expanded the discussion of the structural details of this binding site based on the geraniol-bound structure, including the density and occupancy accounted for by this ligand. These additional details (Fig. 2C,D and Fig. 5) should provide a clearer understanding of the binding interactions observed.
The dimeric geraniol is marginally better binding than the monomer, two-fold, but this could be due to doubling the number of geraniols per ligand or due to an undefined interaction of the extended molecule with the surface of the capsid. A geraniol linker should be tested.
The modest improvement in binding may indeed only reflect the doubled number of geraniols rather than linker-mediated avidity effects. Interaction of the linker with the capsid surface is ruled out by the scrambled control, which included the same linkers but did not show any capacity to bind.
Is the enhanced binding of dimer due to bivalent binding of dimer to one capsid? Is it a chance interaction of the linker with the surface of HBc, which is easily tested? Is it an avidity effect due to aggregation of capsids?
Thank you for this insightful question. Our data suggest that the enhanced binding is due to bivalent interactions. To address the possibility of non-specific interactions from either the handle or the linker, we included a scrambled dimeric peptide as a negative control, which showed no binding. This rules out non-specific interactions from the linker or handle. Given this, we believe an additional monovalent control is unnecessary, as the scrambled control confirms that the binding is driven by the geraniol and peptide warheads alone. We have clarified this in the revised manuscript and appreciate your suggestion to strengthen the study.
The experimental analysis of point mutation of P1 is not analyzed beyond stating that it shows the importance of the core peptide sequence. Is there rationale for the effect of R3 to E and K10 to E mutation?
We appreciate the reviewer's curiosity and request for a more detailed discussion of the P1 deep mutational scan data and its implications. The observed low mutation tolerance of the core peptide sequence SLLGRM with respect to HBc binding is highly consistent with our prior structural data and binding studies in solution (https://doi.org/10.3390/microorganisms9050956), with the results from the original phage library screening (M. R. Dyson, K. Murray, Proceedings of the National Academy of Sciences 1995, 92, 2194–2198), and with the binding data presented here. Notably, the data set does not suggest that additional binding interfaces contribute to the aggregation seen with the N-terminally elongated P1 and P2 versus the non-aggregating shorter SLLGRM. While the positional scan largely aligns with the previous phage binding hierarchy and quantified ligands, we were previously struck, in the same way as Reviewer #3, by surprising affinity gains for positive-to-negative amino acid exchanges in related peptides. Specifically, “SLLGEM” has been predicted, previously and here, to show enhanced affinity over “SLLGRM”. Quantification in solution, however, could not confirm this enhanced HBV binding affinity (Makbul et al. 2021, Microorganisms). In the revised version of the manuscript we now highlight the possibly limited predictive power of this assay for positions where positively charged residues are exchanged for negatively charged residues (figure legend of Fig. 3D).
The fluctuations in Figure 3B could be largely magnification of noise due to changing the y-axis. The fluctuations can be characterized as standard variation, excluding the injections, to allow a quantitative judgment.
Isothermal titration calorimetry heat fluctuations without injections are now shown in the supplementary information scaled to the same y-axis (Supplementary Figure 3D).
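The baseline-noise estimate requested by the reviewer can be illustrated with a minimal sketch. It assumes the ITC trace is available as paired time and power readings and that injection start times are known; the function name, the argument names, and the fixed exclusion window are hypothetical choices for illustration, not the procedure used for the supplementary figure.

```python
import statistics

def baseline_noise_sd(times, powers, injection_times, exclusion_window):
    """Sample standard deviation of the power signal outside injection windows.

    A reading at time t is excluded when it falls within
    [start, start + exclusion_window) of any injection start (assumed layout).
    """
    baseline = [
        p
        for t, p in zip(times, powers)
        if not any(start <= t < start + exclusion_window for start in injection_times)
    ]
    if len(baseline) < 2:
        raise ValueError("need at least two baseline points")
    return statistics.stdev(baseline)
```

Comparing this value between titrations on a common y-axis gives a quantitative basis for judging whether apparent fluctuations exceed baseline noise.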
Molecular graphics throughout are too small and poorly labeled.
We have revised the molecular graphics throughout the manuscript to increase their size and improve labeling for clarity. All figures are now provided at 500 dpi.
In Figure 2, compounds 1 and 2 are pyrophosphates. The label in the figure should be corrected.
Thank you for pointing this out. These compounds were removed for clarity.
In the introduction, the phrase "discontinuation frequently leads to relapse" should be changed to something less ambiguous.
Thank you for highlighting this point regarding the phrasing in the introduction. We have revised the statement to more accurately reflect the clinical situation by specifying that stopping treatment often results in viral rebound and disease recurrence in many patients. This adjustment clarifies the intended meaning and addresses the ambiguity you identified. We hope this revision better aligns with the clinical context of HBV management and improves the overall clarity of the manuscript.
Define "functional cure" in the introduction.
Thank you for your suggestion to clarify the term "functional cure". We have revised the manuscript, and instead of "functional cure" we mention the goal of sustained viral suppression without detectable HBV DNA and loss of hepatitis B surface antigen (HBsAg) without the need for continuous therapy. This should provide greater clarity for readers and improve the overall comprehensibility of the introduction.
The sentence beginning line 92 is not clear unless one has already read the paper. Figure 1 is not well described.
Thank you for your valuable feedback regarding the clarity of this sentence and the legend of Figure 1. We have revised the text and legend to provide more context and improve the flow for readers who are unfamiliar with the specifics of the study. The revised version now clearly explains the targeted binding sites and the purpose of the bivalent binders at the beginning of the results section.
In line 235 the meaning is not clear. What is in excess? Is there free CPP in solution? Is it the charge on the CPP?
We have clarified the passage as requested.
When describing peptide-induced aggregation, Figures 5 and 6, figure 1B is never referred to. Figure 1B would work better as part of Figure 6.
We understand the reviewer’s suggestion. However, we decided to highlight and summarize the major findings and the underlying hypothesis early in the manuscript. We included Figure 1B at the beginning to allow readers to quickly grasp a core concept and outcome of our study.
We now, however, refer to Figure 1B and, together with all the other changes, hope that we have improved the clarity and quality of the manuscript.
We appreciate your constructive feedback and the opportunity to further refine the work.
-
-
www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer 1 (Public Comments):
(1) The central concern for this manuscript is the apparent lack of reproducibility. The way the authors discuss the issue (lines 523-554) it sounds as though they are unable to reproduce their initial results (which are reported in the main text), even when previous versions of AlphaFold2 are used. If this is the case, it does not seem that AlphaFold can be a reliable tool for predicting antibody-peptide interactions.
The driving point behind the multiple sequence alignment (MSA) discussion was indeed to point out that AlphaFold2 (AF2) performance when predicting scFv:peptide complexes is highly dependent upon the MSA, but that is a function of the MSA generation algorithm (MMseqs2, HHblits, jackhmmer, hhsearch, kalign, etc.) and sequence databases, and less an intrinsic function of AF2. It is important to report MSA-dependent performance precisely because this results in changing capabilities with respect to peptide prediction.
Performance also varies significantly with changes to the target peptide and the scFv framework. By reporting the varying success rates (as a function of MSA, peptide target, and framework changes) we aim to help future researchers craft modified algorithms that can achieve increased reliability at protein-peptide binding predictions. Ultimately, tracking down how MSA generation details alter results (especially when the MSAs are hundreds of sequences long) is significantly outside the scope of this paper. Our goal for this paper was to show a general method for identification of linear antibody epitopes using only sequence information, and future work by us or others should focus on optimization of the process.
(2) Aside from the fundamental issue of reproducibility, the number of validating tests is insufficient to assess the ability of AlphaFold to predict antibody-peptide interactions. Given the authors' use of AlphaFold to identify antibody binding to a linear epitope within a whole protein (in the mBG17:SARS-Cov-2 nucleocapsid protein interaction), they should expand their test set well beyond Myc- and HA-tags using antibody-antigen interactions from existing large structural databases.
Performing the calculations at the scale that the reviewer is requesting is not feasible at this time. We showed in this manuscript that we were able to predict 3 of 3 epitopes, including one antigen and antibody pair that have not been deposited into the PDB with no homologs. While we feel that an N=3 is acceptable to introduce this method to the scientific community, we will consider adding more examples of success and failure in the future to optimize and refine the method as computational resources become available. Notably, future efforts that attempt high-throughput predictions of this class using existing databases should take particular care to avoid contamination.
(3) As discussed in lines 358-361, the authors are unsure if their primary control tests (antibody binding to Myc-tag and HA-tag) are included in the training data. Lines 324-330 suggest that even if the peptides are not included in the AlphaFold training data because they contain fewer than 10 amino acids, the antibody structures may very well be included, with an obvious "void" that would be best filled by a peptide. The authors must confirm that their tests are not included in the AlphaFold training data, or re-run the analysis with these templates removed.
First, we address the simpler question of templates.
The reruns of AF2 with the local 2022 rebuild, which was the most reproducible method and gave results most on par with the MMseqs2 server in the fall of 2022, were run without templates. Because the MSA was generated locally, no templates were matched or generated. The only information passed was therefore the locally generated MSA and the FASTA sequences of the unchanging scFv and the dynamic epitope. Given how well this setup performed despite the absence of templates, we can confidently say that the template flag does not significantly affect how accurately PAbFold identifies the correct epitope.
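The pairing of a fixed scFv sequence with a changing epitope sequence can be sketched as follows. This is only an illustration under stated assumptions: the sliding-window scheme, the 10-residue default window, the function names, and the two-record FASTA layout are hypothetical and are not taken from the PAbFold code.

```python
def sliding_peptides(antigen, window=10, step=1):
    """Return (1-based start, peptide) pairs tiling the antigen sequence."""
    if window < 1 or step < 1:
        raise ValueError("window and step must be positive")
    last_start = len(antigen) - window
    # An antigen shorter than the window yields an empty list.
    return [(i + 1, antigen[i:i + window]) for i in range(0, last_start + 1, step)]

def fasta_records(scfv, antigen, window=10, step=1):
    """Build one two-chain FASTA string per candidate peptide.

    The scFv record is identical in every job; only the peptide record changes.
    """
    records = []
    for start, peptide in sliding_peptides(antigen, window, step):
        records.append(f">scFv\n{scfv}\n>peptide_{start}\n{peptide}\n")
    return records
```

For example, calling `sliding_peptides("EQKLISEEDLN")` on the 11-residue sequence from PDB entry 2or9 yields two overlapping 10-residue candidates, each of which would be submitted with the same scFv sequence.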
Second, we can partially address the question of whether the AlphaFold models had access to models suitable, in theory, for “memorization” of pertinent structural details.
With respect to tracking the exact role and inclusion of specific PDB entries, the AF2 paper provides the following:
“Structures from the PDB were used for training and as templates (https://www.wwpdb.org/ftp/pdb-ftp-sites; for the associated sequence data and 40% sequence clustering see also https://ftp.wwpdb.org/pub/pdb/derived_data/ and https://cdn.rcsb.org/resources/sequence/clusters/bc-40.out). Training used a version of the PDB downloaded 28 August 2019, while the CASP14 template search used a version downloaded 14 May 2020. The template search also used the PDB70 database, downloaded 13 May 2020 (https://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/).”
Three of these links are dead. As such, it is difficult to definitively assess the role of any particular PDB entry with respect to AF2 training/testing, or to determine what impact homologous training structures had, given the very large number of immunoglobulin structures in the training set. That said, we can summarize information for the potentially relevant PDB entries (2or9, which is shown in Fig. 1, and 1frg), and believe it is most conservative to assume that each such entry was within the training set.
PDB entry 2or9 (released 2008): the anti-c-myc antibody 9E10 Fab fragment in complex with an 11-amino acid synthetic epitope: EQKLISEEDLN. This crystal structure is also noteworthy for featuring a binding mode where the peptide is pinned between two Fab fragments. The apo structure (2orb) is also in the database but lacks the peptide and a resolved structure for CDR H3.
PDB entry 1a93 (released 1998): a c-Myc-Max leucine zipper structure, where the c-Myc epitope (in a 34-amino acid protein) adopts an alpha helical conformation completely different from the epitope captured in entry 2or9.
PDB entries 5xcs and 5xcu (released 2017): engineered Fv-clasps (scFv alternatives) in complex with the 9-amino acid synthetic HA epitope: YPYDVPDYA.
PDB entry 1frg (released 1994): anti-HA peptide Fab in complex with HA epitope subset Ace-DVPDYASL-NH2.
Since the 2or9 entry has our target epitope (10 aa) embedded within an 11aa sequence, we have revised this line in the manuscript:
“The AlphaFold2 training set was reported to exclude chains of less than 10, which would eliminate the myc and HA epitope peptides.” => “The AlphaFold2 training set was reported to exclude chains of less than 10, which would eliminate the HA epitope peptide from potential training PDB entries such as 5xcs or 5xcu.”
It is important to note that we obtained the best prediction performance for the scFv:peptide pair that had no pertinent PDB entries (mBG17). Specifically, a protein BLAST search against the PDB using the mBG17 scFv revealed diverse homologs, but a maximum sequence identity of 89.8% for the heavy chain (to an unrelated antibody) and 93.8% for the light chain (to an unrelated antibody). Additionally, while it is possible that the AF2 models might have learned from the complex in PDB entry 2or9, Supplemental Figure 3 shows how often the peptide is “misplaced”, and the performance does not exceed the performance for mBG17.
(4) The ability of AlphaFold to refine the linear epitope of antibody mBG17 is quite impressive and robust to the reproducibility issues the authors have run into. However, Figure 4 seems to suggest that the target epitope adopts an alpha-helical structure. This may be why the score is so high and the prediction is so robust. It would be very useful to see along with the pLDDT by residue plots a structure prediction by residue plot. This would help to see if the high confidence pLDDT is coming more from confidence in the docking of the peptide or confidence in the structure of the peptide.
The reviewer is correct that the target mBG17 epitope adopts an alpha helical conformation, and we concur that this likely contributes to the more reliable structure prediction performance. When we predict the structure of the epitope alone without the mBG17 scFv, AF2 confidently predicts an alpha helix with an average pLDDT of 88.2 (ranging from 74.6 to 94.4).
Author response image 1.
The AF2 prediction for the mBG17 epitope by itself.
However, as one interesting point of comparison, a 10 a.a. poly-alanine peptide is also consistently folded into an alpha-helical coil by AF2. The A<sub>10</sub> peptide is also predicted to bind among the traditional scFv CDR loops, but the pLDDT scores are very poor (Supplemental Figure 5J). We also observed the opposite case, in which a peptide has a very unstructured region in the binding domain but is nonetheless placed confidently, as seen in Supplemental Figure 3 C&D. Therefore, while we suspect peptides with strong alpha helical propensity are more likely to be accurately predicted, the data suggest that alpha helix adoption is neither necessary nor sufficient to reach a confident prediction.
(5) Related to the above comment, pLDDT is insufficient as a metric for assessing antibody antigen interactions. There is a chance (as is nicely shown in Figure S3C) that AlphaFold can be confident and wrong. Here we see two orange-yellow dots (fairly high confidence) that place the peptide COM far from the true binding region. While running the recommended larger validation above, the authors should also include a peptide RMSD or COM distance metric, to show that the peptide identity is confident, and the peptide placement is roughly correct. These predictions are not nearly as valuable if AlphaFold is getting the right answer for the wrong reasons (i.e. high pLDDT but peptide binding to a nonCDR loop region). Eventual users of the software will likely want to make point mutations or perturb the binding regions identified by the structural predictions (as the authors do in Figure 4).
We agree with the reviewer that pLDDT is not a perfect metric, and we are following with great interest the evolving community discussion as to what metrics are most predictive of binding affinity (e.g. pAE, or pITM as a decent predictor for binding, but not affinity ranking). To our knowledge, there is not yet a consensus for the most predictive metrics for protein:protein binding nor protein:peptide binding. Intriguingly, since the antigen peptides are so small in our case, the pLDDT of the peptide residues should be mostly reporting on the confidence of the distances to neighboring protein residues.
As to the suggestion for an RMSD or COM distance metric, we agree that these are useful, with the caveat that they require a reference structure. The goal of our method is to quickly narrow down candidate linear epitopes and thereby guide experimentalists to more efficiently determine the actual binding sequence of an antibody-antigen pair. Presumably this would not be necessary if a reference structure were known.
It may also be possible to invent a method to filter unlikely binding modes that is specific to antibodies and peptide epitopes that does not require a known reference structure, but this would be an interesting problem for subsequent study.
Reviewer 1 (Recommendations for the Authors):
(1) "Linear epitope" should be more precisely defined in the text. It isn't clear whether the authors hope that they can use AlphaFold to predict where on a given protein antigen an antibody will bind, or which antigenic peptide the antibody will bind to. The authors discuss both problems, and there is an important distinction between the two. If the authors are only concerned with isolated antigenic peptides, rather than linear epitopes in their full length structural contexts, they should be more precise in the introduction and discussion.
We thank the reviewer for the prompt towards higher precision. We are using the short contiguous antigen definition of “linear epitope” that depends on secondary rather than tertiary structure. The linear epitopes this paper considers are short “peptides” that form secondary structure independent of their structure in the complete folded antigen protein. We have clarified our definition of “linear epitope” in the text (lines 64-66).
(2) Line 101: "Not all portions of the antibody are critical". First, this is not consistent with the literature, particularly where computational biology is concerned (see https://pubs.acs.org/doi/10.1021/acs.jctc.7b00080 ). Second, while I largely agree with what I think the authors are trying to say (that we can largely reduce the problem to the CDR loops), this is inconsistent with what the authors later find, which is that inexplicably the VH/VL scaffold used alters results strongly.
We have adopted verbiage that should be less provocative: “Fortunately, with respect to epitope specificity, antibody constant domains are less critical than the CDR loops and the remainder of the variable domain framework regions.”
(3) Related to the above comment, do the authors have any idea why epitope prediction performance improved for the chimeric scFvs? Is this due to some stochasticity in AlphaFold? Or is there something systematic? Expanding the test dataset would again help answer this question.
We agree that future study with a larger test set could help address this intriguing result, for which we currently lack a conclusive explanation. Part of our motivation for this publication was to bring to light this unexpected result. Notably, these framework differences are not only implicated as a factor in driving AF2 performance, but also changing experimental intracellular performance as reported by our group (DOI: 10.1038/s41467-019-10846-1 ). We can generate a variety of hypotheses for this phenomenon. Just as MSA sub-sampling has been a popular approach to drive AF2 to sample alternative conformations, sequence recombination may be a generically effective way to generate usefully different binding predictions. However, it is difficult to discriminate between recombination inducing subtle structural tweaks that increase protein intracellular fitness and binding, from recombination causing changes to the MSA that affect the likelihood of sampling a good epitope binding conformation. It is also possible that the chimeras are more deftly predicted by AF2 due to differences in sequence representation during the training of the AF2 models (e.g. more exposure to models containing 15F11 or 2E2 structures). We attempted to deconvolute MSA differences by using single-sequence mode (Supplementary Figure 13) but this ablated performance.
(4) Figure 2: The reported consensus pLDDT scores are actually quite low here, suggesting low confidence in the result. This is in strong contrast to the reported consensus scores for mBG17. Again, a larger test dataset would help set a quantitative cutoff for where to draw the line for "trustworthy" AlphaFold predictions in antibody-peptide binding applications.
We agree that a larger dataset will be useful to begin to establish metrics and thresholds and will contribute to the aforementioned community discussion about reliable predictors of binding. Our current focus is not structure prediction per se. In the current work we are more focused on relative binding likelihood and increasing the efficiency of experimental epitope verification by flagging the most likely linear epitopes. Thus, while the pLDDT scores are low for Myc in Figure 2, it is remarkable (and worth reporting) that there is still useful signal in the relative variation in pLDDT. The utility of the signal variation is evident in the ability to short-list correct lead peptides via the two methods we demonstrate (consensus and per-residue max).
(5) Figure 4: if the authors are going to draw conclusions from the actual structure predictions of AlphaFold (not just the pLDDT scores), the side-chain accuracy placement should be assessed in the test dataset (RMSD or COM distance).
We agree with the reviewer that side-chain placement accuracy is important when evaluating the accuracy of AF2 structure predictions. However, here our focus was relative binding likelihood rather than structure prediction. The one case where we attempted to draw conclusions from the structure prediction was in the context of mBG17, where there is not yet an experimental reference structure. Absolutely, if we were to obtain a crystal structure for that complex, we would assess side-chain placement accuracy.
(6) Lines 493-508: I am not sure that this assessment for why AlphaFold has difficulty with antibody-antigen interactions is correct. If the authors' interpretation is correct (larger complicated structures are more challenging to move) then AlphaFold-Multimer (https://www.biorxiv.org/content/10.1101/2021.10.04.463034v2.full) wouldn't perform as well as it does. Instead, the issue is likely due to the incredibly high diversity in antibody CDR loops, which reduces the ability of the AlphaFold MSA step (which the authors show is quite critical to predictions: Figure S13) to inform structure prediction. This, coupled with the importance of side chain placement in antibody and TCR interactions, which is notoriously difficult (https://elifesciences.org/articles/90681), are likely the largest source of uncertainty in antibody-antigen interaction prediction.
We agree with the reviewer that CDR loop diversity (and the associated side chain placement challenges) is a major barrier to successfully predicting antibody-antigen complexes. Presumably this is true for both peptide antigens and protein antigens. Indeed, the authors of AlphaFold-multimer admit that the updated model struggles with antibody-antigen complexes, saying “As a limitation, we observe anecdotally that AlphaFold-Multimer is generally not able to predict binding of antibodies and this remains an area for future work.” The point about how loop diversity could reduce MSA quality is well taken. Thanks to the guidance of the reviewer, we have included the following where MSA sensitivity is discussed later on (lines 570-572):
“These challenges are presumably compounded by the incredible diversity of the CDR loops in antibodies which could decrease the useful signal from the MSA as well as drive inconsistent MSA-dependent performance”.
With respect to lines 493-508, we have also rephrased a key sentence to try to better explain that we are comparing the often-good recognition performance for short epitopes to the never-good performance when those epitopes are embedded within larger sequences. Instead of saying, “In contrast, a larger and complicated structure may be more challenging to move during the AlphaFold2 structure prediction or recycle steps.” we now say in lines 520-522, “In contrast, embedding the epitope within a larger and more complicated structure appears to degrade the ability of AlphaFold2 to sample a comparable bound structure within the allotted recycle steps.”
(7) Related to major comment 1: Are AlphaFold predictions deterministic? That is, if you run the same peptide through the PAbFold pipeline 20 times, will you get the same pLDDT score 20 times? The lack of reproducibility may be in part due to stochasticity in AlphaFold, which the authors could actually leverage to provide more consistent results.
This is a good question that we addressed while dissecting the variable performance. When the random seed is fixed, AF2 returns the same prediction every time. After running this 10 times with a fixed seed, the mBG17 epitope was predicted with an average pLDDT of 88.94, with a standard deviation of 1.4 x 10<sup>-14</sup>. In contrast, when no seed is specified, AF2 did not return an *identical* result. However, the results were still remarkably consistent. Running the mBG17 epitope prediction 10 times with a different seed gave an average pLDDT of 89.24, with a standard deviation of 0.49.
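As an illustration of how such run-to-run summary statistics can be computed, the minimal Python sketch below returns the mean and standard deviation of per-run average pLDDT values. The function name `summarize_runs` and all input values are hypothetical placeholders rather than values from the study, and the choice of the sample (n-1) standard deviation is an assumption, since the response does not state which estimator was used.

```python
from statistics import mean, stdev

def summarize_runs(avg_plddt_per_run):
    """Return (mean, sample standard deviation) of per-run average pLDDT values."""
    return mean(avg_plddt_per_run), stdev(avg_plddt_per_run)

# Hypothetical values for illustration only (not measured data).
fixed_seed_runs = [88.5] * 10          # identical predictions when the seed is fixed
varied_seed_runs = [88.0, 89.0, 90.0]  # small run-to-run variation without a fixed seed

print(summarize_runs(fixed_seed_runs))
print(summarize_runs(varied_seed_runs))
```

With identical inputs the standard deviation is exactly zero in this sketch; a reported value on the order of 10<sup>-14</sup> would be consistent with floating-point round-off in a numerical library rather than real variation.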
(8) Related to major comment 2: Using, for example, this previous survey of 1833 antibody-antigen interactions (https://www.sciencedirect.com/science/article/pii/S2001037023004725), the authors could likely pull out multiple linear epitopes to test AlphaFold's performance on antibody-peptide interactions. A large number of tests are necessary for validation.
We thank the reviewer for this report of antibody-antigen interactions and will use it as a source of complexes in a future expanded study. Given the quantity and complexity of the data that we are already providing, as well as the compute and personnel demands of the requested expansion, we must defer it to future work.
(9) Related to major comment 3: Apologies if this is too informal for a review, but this Issue on the AlphaFold GitHub may be useful: https://github.com/googledeepmind/alphafold/issues/416 .
We thank the reviewer for the suggestion. Per our response above, we have indeed run predictions with no templates. Since we are using local AlphaFold2 calculations with localcolabfold, the use or non-use of templates is fairly simple: including a “--templates” flag or not.
(10) Related to major comment 4: I am not sure if AlphaFold outputs by-residue secondary structure prediction by default, but I know that Phyre2 does http://www.sbg.bio.ic.ac.uk/~phyre2/html/page.cgi?id=index .
To our knowledge, AF2 does not predict secondary structure independent of the predicted tertiary structure. When we need to analyze the secondary structure we typically use the program DSSP from the tertiary structure.
(11) The documentation for this software is incomplete. The GitHub ReadMe should include complete guidelines for users with details of expected outputs, along with a thorough step-by-step walkthrough for use.
We thank the reviewer for pointing this out, but we feel that the level of detail we provide in the GitHub is sufficient for users to utilize the method described.
Stylistic comments:
(1) I do not think that the heatmaps (as in 1C, top) add much information for the reader. They are largely uniform across the y-axis (to my eyes), and the information is better conveyed by the bar and line graphs (as in 1C, middle and bottom panels).
We thank the reviewer for this feedback but elect to leave it in on the premise that more data presented is (usually) better. Including the y-axis reveals common patterns such as the lower confidence of the peptide termini, as well as the lack of some patterns that might have occurred. For example, if a subset of five contiguous residues was necessary and sufficient for local high confidence, this could be visually apparent as a “staircase” in the heat map.
(2) A discussion of some of the shortcomings of other prediction-based software (lines 71-77) might be useful. Why are these tools less well-equipped than AlphaFold for this problem? And if they have tried to predict antibody-antigen interactions, why have they failed?
We agree with the reviewer that a broader review of multiple methods would be interesting and useful. One challenge is that the suite of available methods is evolving rapidly, though only a subset work for multimeric systems. Some detail on deficiencies of other approaches was provided in lines 71-77 originally, although we did not go into exhaustive detail since we wanted to focus on AF2. We view using AF2 in this manner as novel, and we believe that providing additional options to predict antibody epitopes will be of interest to the scientific community. We also chose AF2 because we have ample experience with it and because it is software that many in the scientific community are already using and comfortable with. Additionally, AF2 provided us with a quantification parameter (pLDDT) to assess the peptides’ binding abilities. We think a future study that compares the ability of multiple emerging tools for scFv:peptide prediction will be quite interesting.
(3) Similar to the above comment, more discussion focused on why AlphaFold2 fails for antibodies (lines 126-128) might be useful for readers.
We thank the reviewer for the suggestion. The following line has been added shortly after lines 135-137:
“Another reason for selecting AF2 is to attempt to quantify its abilities to compare simple linear epitopes, since the team behind AF-multimer reported that conformational antibody complexes were difficult to predict accurately (14).”
Per earlier responses, we also added text that flags one particular possible reason for the general difficulty of predicting antibody-antigen complexes (the diversity of the CDR loops and associated MSA challenges).
(4) The first two paragraphs of the results section (lines 226-254) could likely be moved to the Methods. Additionally, details of how the scores are calculated, not just how the commands are run in python, would be useful.
Per the reviewer suggestion, we moved this section to the end of the Methods section. Also, to aid in the reader’s digestion of the analysis, the following text has been added to the Results section (lines 256-264):
“Both the ‘Simple Max’ and ‘Consensus’ methods were calculated first by parsing every pLDDT score received by every residue in the antigen sequence sliding window output structures. From the resulting data structure, the Simple Max method simply finds the maximum pLDDT value ever seen for a single residue (across all sliding windows and AF2 models). For the Consensus method, per-residue pLDDT was first averaged across the 5 AF2 models. These averages are reported in the heatmap view, and further averaged per sliding window for the bar chart below.
In principle, the strategy behind the Consensus method is to take into account agreement across the 5 AF2 models and provide insight into the confidence of entire epitopes (whole sliding windows of n=10 default) instead of disconnected, per-residue pLDDT maxima.”
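The two scoring schemes described in the quoted text can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the PAbFold implementation: the nested-list layout `plddt[w][m][p]` (sliding window, AF2 model, position within the window), the function names `simple_max` and `consensus`, and the `step` parameter are all hypothetical.

```python
from statistics import mean

def simple_max(plddt, step=1):
    """Highest pLDDT ever seen for each antigen residue,
    across all sliding windows and all AF2 models."""
    window_len = len(plddt[0][0])
    n_residues = (len(plddt) - 1) * step + window_len
    best = [float("-inf")] * n_residues
    for w, models in enumerate(plddt):
        for scores in models:
            for p, score in enumerate(scores):
                residue = w * step + p  # position in the full antigen sequence
                best[residue] = max(best[residue], score)
    return best

def consensus(plddt):
    """Average per-residue pLDDT across models (heatmap rows),
    then average each row to get one score per sliding window."""
    heatmap = [[mean(col) for col in zip(*models)] for models in plddt]
    window_scores = [mean(row) for row in heatmap]
    return heatmap, window_scores

# Toy input: 2 windows of length 3, 2 models each (hypothetical numbers).
toy = [[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]],
       [[6.0, 7.0, 8.0], [9.0, 10.0, 11.0]]]
print(simple_max(toy))
print(consensus(toy)[1])
```

In this layout the Simple Max values for one residue may come from different windows or models, whereas each Consensus score belongs to a single sliding window.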
(5) Figure 1 would be more useful if you could differentiate specifically how the Consensus and Simple Max scoring is different. Providing examples for how and why the top 5 peptide hits can change (quite significantly) using both methods would greatly help readers understand what is going on.
Per the reviewer suggestion, we have added text to discuss the variable hit selection that results from the two scoring metrics. The new text (lines 264-271) adds onto the added text block immediately above:
“Having two scoring metrics is useful because the selection of predicted hits can differ. As shown in Figure 2, part of the Myc epitope makes it into the top 5 peptides when selection is based on summing per-residue maximum pLDDT (despite there being no requirement that these values originate in the same physical prediction). In contrast, a Consensus method score more directly reports on a specific sliding window, and the strength of the highest confidence peptides is more directly revealed with superior signal to noise as shown in Figure 3. Variability in the ranking of top hits between the two methods arises from the fundamental difference in strategy (peptide-centric or residue-centric scoring) as well as close competition between the raw AF2 confidence in the known peptide and competing decoy sequences.”
(6) Hopefully the reproducibility issue is alleviated, but if not the discussion of it (lines 523-554) should be moved to the supplement or an appendix.
The ability of the original AF2 model to predict protein-protein complexes was an emergent behavior, and then an explicit training goal for AF2.multimer. In this vein, the ability to predict scFv:peptide complexes is also an emergent capability of these models. It is our hope that by highlighting this capacity, as well as the high level of sensitivity, this capability will be enhanced and not degraded in future models/algorithms (both general and specialized). In this regard, with an eye towards progress, we think it is actually important to put this issue in the scientific foreground rather than the background. When it comes to improving machine learning methods, negative results are also exceedingly important.
Reviewer 2 (Recommendations for the Author):
- Line 113, page 3 - change “the structures of the novel scFv chimeras can be rapidly and confidently be predicted by AlphaFold2” to “the structures of the novel scFv chimeras can be rapidly and confidently predicted by AlphaFold2”.
The superfluous “be” was removed from the text.
- Line 276 and 278 page 9 - peptide sequences QKLSEEDLL and EQKLSEEDL in the text are different from the sequences reported in Figures 1 and 2 (QKLISEEDLL and EQKLISEEDL). Please check throughout the manuscript and also in the Figure caption (as in Figure 2).
These changes were made throughout the text.
- I would include how you calculate the pLDDT score for both Simple Max approach and Consensus analysis.
Good suggestion, this should be covered via the additions noted above.
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
Rigor in the design and application of scientific experiments is an ongoing concern in preclinical (animal) research. Because findings from these studies are often used in the design of clinical (human) studies, it is critical that the results of the preclinical studies are valid and replicable. However, several recent peer-reviewed published papers have shown that some of the research results in cardiovascular research literature may not be valid because their use of key design elements is unacceptably low. The current study is designed to expand on and replicate previous preclinical studies in nine leading scientific research journals. Cardiovascular research articles that were used for examination were obtained from a PubMed Search. These articles were carefully examined for four elements that are important in the design of animal experiments: use of both biological sexes, randomization of subjects for experimental groups, blinding of the experimenters, and estimating the proper size of samples for the experimental groups. The findings of the current study indicate that the use of these four design elements in the reported research in preclinical research is unacceptably low. Therefore, the results replicate previous studies and demonstrate once again that there is an ongoing problem in the experimental design of preclinical cardiovascular research.
Strengths:
This study selected four important design elements for study. The descriptions in the text and figures of this paper clearly demonstrate that the rate of use of all four design elements in the examined research articles was unacceptably low. The current study is important because it replicates previous studies and continues to call attention once again to serious problems in the design of preclinical studies, and the problem does not seem to lessen over time.
Weaknesses:
The current study uses both descriptive and inferential statistics extensively in describing the results. The descriptive statistics are clear and strong, demonstrating the main point of the study, that the use of these design elements is quite low, which may invalidate many of the reported studies. In addition, inferential statistical tests were used to compare the use of the four design elements against each other and to compare some of the journals. The use of inferential statistical tests appears weak because the wrong tests may have been used in some cases. However, the overall descriptive findings are very strong and make the major points of the study.
We sincerely appreciate the reviewer’s comments and detailed feedback and their recognition of the importance of this work in replicating previous studies and calling attention to the problems in preclinical study design. In response to the reviewer’s suggestions, we have recalculated our inferential statistics. In place of our previous inferential statistics, we have used an alternative correction calculation for p-values (Holm-Bonferroni corrections) and used median-based linear model analyses and nonparametric Kruskal-Wallis tests that are more appropriate for analyzing this dataset. Our overall trends in results remain the same.
Reviewer #2 (Public Review):
Summary
This study replicates a 2017 study in which the authors reviewed papers for four key elements of rigor: inclusion of sex as a biological variable, randomization of subjects, blinding outcomes, and pre-specified sample size estimation. Here they screened 298 published papers for the four elements. Over a 10 year period, rigor (defined as including any of the 4 elements) failed to improve. They could not detect any differences across the journals they surveyed, nor across models. They focused primarily on cardiovascular disease, which both helps focus the research but limits the potential generalizability to a broader range of scientific investigation. There is no reason, however, to believe rigor is any better or worse in other fields, and hence this study is a good 'snapshot' of the progress of improving rigor over time.
Strengths
The authors randomly selected papers from leading journals (e.g., PNAS). Each paper was reviewed by 2 investigators. They pulled papers over a 10-year period, 2011 to 2021, and have a good sample of time over which to look for changes. The analysis followed generally accepted guidelines for a structured review.
Weaknesses
The authors did not use the exact same journals as they did in the 2017 study. This makes comparing the results complicated. Also, they pulled papers from 2011 to 2021, and hence cannot assess the impact of their own prior paper.
The authors write "the proportion of studies including animals of both biological sexes generally increased between 2011 and 2021, though not significantly (R2= 0.0762, F(1,9)= 0.742, p= 0.411 (corrected p=8.2". This statement is not rigorous because the regression result is not statistically significant. Their data supports neither a claim of an increase nor a decrease over time. A similar problem repeats several times in the remainder of their results presentation.
I think the Introduction and the Discussion are somewhat repetitive and the wording could be reduced.
Impact and Context
Lack of reproducibility remains an enormous problem in science, plaguing both basic and translational investigations. With the increased scrutiny on rigor, and requirements at NIH and other funding agencies for more rigor and transparency, one would expect to find increasing rigor, as evidenced by authors including more study design elements (SDEs) that are recommended. This review found no such change, and this is quite disheartening. The data imply that journals (editors and reviewers) will have to increase their scrutiny and standards applied to preclinical and basic studies. This work could also serve as a call to action to investigators outside of cardiovascular science to reflect on their own experiences and when planning future projects.
We sincerely appreciate the reviewer’s insights and comments and recognition of our work contributing to the growing body of evidence on the lack of rigor in preclinical cardiovascular research study design. Regarding the weaknesses the reviewer noted, the referenced 2017 publication details a study by Ramirez et al. and was not conducted by our group. Our study aimed to expand upon their findings by using a more recent timeframe and an alternative list of highly respected cardiovascular research journals. We have now better clarified this distinction in the manuscript. We have also addressed our phrasing regarding the lack of statistical significance in the increase of the proportion of studies including animals of both sexes from 2011-2021.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Many of the methods in this study were strong or adequate. Although the descriptive statistics appear solid, there are significant problems that need to be addressed in the selection and use of inferential statistics.
(1) One of the design elements that was studied was sample size estimation. This is usually done by a power analysis. The authors should consider what group size for the examined journals is adequate for their statistics to be valid. Or they could report the power of their studies to achieve a given meaningful difference.
We thank the reviewer for this excellent observation. We unfortunately failed to conduct an a priori power analysis. Previous research (Gupta, et al. 2016) suggests that post-hoc power calculations should not be carried out after the study has been conducted. We acknowledge the importance of establishing a sufficient sample size to draw sound conclusions based on an adequate effect size, and we regret that we did not carry out the appropriate estimations. We are very appreciative of the reviewer’s suggestions and aim to implement such an appropriate study design element in future studies.
Gupta KK, Attri JP, Singh A, Kaur H, Kaur G. Basic concepts for sample size calculation: Critical step for any clinical trials! Saudi J Anaesth. 2016;10(3):328-331. doi:10.4103/1658-354X.174918
(2) A Bonferroni correction was used extensively. Because of its use, the corrected p values often appear much too high. The Bonferroni test becomes much too conservative for more than 3 or 4 tests. I suggest using a different test for multiple comparisons.
We thank the reviewer for their insightful suggestion. We have updated all p-values to reflect a Holm-Bonferroni correction instead. All p-values have been corrected and updated.
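For readers unfamiliar with the procedure, a minimal Python sketch of Holm-Bonferroni adjusted p-values follows. It illustrates the standard step-down method and is not the code used for the analysis; the function name `holm_bonferroni` and the example p-values are hypothetical.

```python
def holm_bonferroni(pvalues):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by (m - k + 1),
        # capped at 1 so the result is still a valid probability.
        candidate = min(1.0, (m - rank) * pvalues[i])
        # Enforce monotonicity: an adjusted value never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted

# Hypothetical raw p-values for three comparisons.
print(holm_bonferroni([0.01, 0.04, 0.03]))
```

Unlike a plain Bonferroni correction, which multiplies every p-value by the total number of tests, the multiplier here shrinks as the ranks increase, and capping at 1 prevents adjusted values greater than 1.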
(3) The use of the chi-square test for categorical data is appropriate. However, the t-test and multiple regression tests are designed for continuous variables. Here, it appears that they were used for the nominal variables (Table 1). For these nominal data, other nonparametric tests should be used.
We thank the reviewer for this valuable insight. We have updated our statistical analysis methods and now use nonparametric Kruskal-Wallis tests, instead of the chi-square test, to analyze differences in SDE reporting across journals. Our reported p-values have been adjusted accordingly.
(4) It is not clear exactly when each test is used. The stats section in Methods should better delineate when each test is used. In addition, it would be helpful to include the test used in the figure legends.
We thank the reviewer for bringing up this important point. We have now updated the methods section to better delineate which tests were used, and also included the specific tests in the figure legends.
(5) You will need to rewrite some sections of the text to reflect the changes due to changing your use of statistics.
We have rewritten the sections of the text to reflect the changes in our use of statistics.
Here are a few comments on the presentation.
(1) Some of the figure legends are almost impossible to read. They are too congested.
We thank the reviewer for pointing this out. We have edited the figure legends to make them more readable. We will also attach a pdf with the graphs to allow for easier formatting.
(2) Also, is it possible to drop some of the panels in Figure 1?
The panels in Figure 1 have been rearranged to make them more readable. We believe that each panel provides a valuable visual summary of our data that will aid readers in understanding our results.
(3) It is not mandatory that values of the y-axis on the graphs go up to 100% (Figs 2 and 3). Using a maximum value of 100% clumps the lines visually. I suggest a max value on the y-axis of the graph of 50% or 60%. That will spread the lines better visually so differences can better be seen.
We thank the reviewer for considering the experience of our paper’s readers. The y-axes of Figures 2 and 3 have been truncated to 50%. The trend lines in each Figure now appear more separated and differences can better be seen.
Reviewer #2 (Recommendations For The Authors):
The authors did not use the exact same journals as they did in the 2017 study. This makes comparing the results complicated. Also, they pulled papers from 2011 to 2021, and hence cannot assess the impact of their own prior paper.
We appreciate the reviewer’s concern about maintaining consistency with the paper published by Ramirez et al. in 2017. To clarify, our efforts focused on providing a replication study that expanded upon the original Ramirez publication, with which we have no affiliation. For our study, we used different academic journals than those used by Ramirez et al., as well as a different time frame. We have updated the language in the manuscript to better clarify the purpose and parameters of our study relative to the previous, unaffiliated study.
The authors write "the proportion of studies including animals of both biological sexes generally increased between 2011 and 2021, though not significantly (R2= 0.0762, F(1,9)= 0.742, p= 0.411 (corrected p=8.2". This statement is not rigorous because the regression result is not statistically significant. Their data supports neither a claim of an increase nor a decrease over time. A similar problem repeats several times in the remainder of their results presentation.
Thank you for bringing this information to our attention. We agree with the concern regarding the statement, “the proportion of studies including animals of both biological sexes generally increased between 2011 and 2021, though not significantly (R2= 0.0762, F(1,9)= 0.742, p= 0.411 (corrected p=8.2.” We have rephrased the statement. Our updated Holm-Bonferroni corrected p-value is now noted in this more appropriately worded description of our results. Lastly, we have addressed the wording and redundancy seen in both the introduction and discussion and have made both more concise.
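For readers unfamiliar with the Holm-Bonferroni correction referred to above, the following is a minimal sketch of the step-down adjustment. The raw p-values are hypothetical and do not come from the manuscript. Note that adjusted p-values are capped at 1 and kept monotone in the order of the sorted raw values.

```python
def holm_bonferroni(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        scaled = (m - rank) * p_values[idx]      # multiplier shrinks from m down to 1
        running_max = max(running_max, scaled)   # enforce monotonicity
        adjusted[idx] = min(1.0, running_max)    # a p-value cannot exceed 1
    return adjusted

# Hypothetical raw p-values from four regressions
raw = [0.01, 0.04, 0.03, 0.411]
adjusted = holm_bonferroni(raw)
```

With these illustrative inputs, the smallest raw p-value is multiplied by 4, the next by 3, and so on, so a corrected value can be larger than its raw value but never larger than 1.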
I think the Introduction and the Discussion are somewhat repetitive and the wording could be reduced.
We thank the reviewer for bringing this to our attention. We have addressed the redundancy across the Introduction and the Discussion. We have also altered the wording to reflect a more concise explanation of our study.
The 'trends' are not statistically significant. A non-significant trend does not exist and no claim of a 'trend' is justified by the data.
We thank the reviewer for this observation. We have updated the phrasing of ‘trends’ in all areas of the manuscript.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
Authors of this article have previously shown the involvement of the transcription factor Zinc finger homeobox-3 (ZFHX3) in the function of the circadian clock and the development/differentiation of the central circadian clock in the suprachiasmatic nucleus (SCN) of the hypothalamus. Here, they show that ZFHX3 plays a critical role in the transcriptional regulation of numerous genes in the SCN. Using inducible knockout mice, they further demonstrate that the deletion of Zfhx3 induces a phase advance of the circadian clock, both at the molecular and behavioral levels.
Strengths:
- Inducible deletion of Zfhx3 in adults
- Behavioral analysis
- Properly designed and analyzed ChIP-Seq and RNA-Seq supporting the conclusion of the behavioral analysis
Weaknesses:
- Further characterization of the disruption of the activity of the SCN is required.
(1) We thank the reviewer for their valuable input. Indeed, a comprehensive behavioral assessment of mice of this genotype was carried out in the Wilcox et al. (2017) study. In Wilcox et al. (2017), Figure 4, a 6-h phase advance (jet lag) clearly showed faster re-entrainment in ZFHX3-KO mice than in controls.
- The description of the controls needs some clarification.
(2) We agree with the reviewer and will modify the text to clearly describe the controls wherever mentioned.
Reviewer #2 (Public review):
Summary:
ZFHX3 is a transcription factor expressed in discrete populations of adult SCN and was shown by the authors previously to control circadian behavioral rhythms using either a dominant missense mutation in Zfhx3 or conditional null Zfhx3 mutation using the Ubc-Cre line (Wilcox et al., 2017). In the current manuscript, the authors assess the function of ZFHX3 by using a multi-omics approach including ChIPSeq in wildtype SCNs and RNAseq of SCN tissues from both wildtype and conditional null mice. RNAseq analysis showed a loss of oscillation in Bmal1 and changes in expression levels of other clock output genes. Moreover, a phase advance gene transcriptional profile using the TimeTeller algorithm suggests the presence of a regulatory network that could underlie the observed pattern of advanced activity onset in locomotor behavior in knockout mice.
In Figure 1, the authors identified the ZFHX3 bound sites using ChIPseq and compared the loci with other histone marks that occur at promoters, TSS, enhancers and intergenic regions. And the analysis broadly points to a role for ZFHX3 in transcriptional regulation. The vast majority of nearly 40000 peaks overlapped H3K4me3 and K27ac marks, active promoters which also included genes falling under the GO category circadian rhythms. However, no significant differential ZFHX3 bound peaks were detected between ZT3 and ZT15. In these experiments, it is not clear if and how the different ChIP samples (ZFHX3 and histone PTM ChIPs) were normalized/downsampled for analysis. Moreover, it seems that ZFHX3 binding or recruitment has little to do with whether the promoters are active.
(3) We thank the reviewer for their valuable comment. The different ChIP samples (ZFHX3 and histone PTM ChIPs) were treated in the same manner throughout preprocessing (quality control by FastQC, trimming, alignment to the mm10 genome, and peak calling using MACS2), as mentioned in Methods. The data were normalized using bamCoverage, and bigWig files were generated for visual inspection in the UCSC Genome Browser. These additional details will be added to Methods. Finally, BEDTools was employed to study overlapping peaks between ZFHX3 and histone PTMs.
We agree that the current data alone do not support a claim that ZFHX3 is crucial for a promoter to be active. Our data show that the vast majority of ZFHX3 genomic binding in the SCN occurs at active promoters marked by H3K4me3 and H3K27ac, where it potentially regulates gene transcription.
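To make the peak-overlap step described above concrete, the following is a minimal sketch of interval intersection between two peak sets. It is not BEDTools and does not reproduce the authors' pipeline; the function name, coordinates, and peak lists are hypothetical, and intervals are treated as half-open, as in the BED format.

```python
def overlapping_peaks(peaks_a, peaks_b):
    """Return peaks in peaks_a that overlap at least one peak in peaks_b.

    Each peak is a (chrom, start, end) tuple with a half-open interval,
    so peaks that merely touch end-to-start do not count as overlapping.
    """
    by_chrom = {}
    for chrom, start, end in peaks_b:
        by_chrom.setdefault(chrom, []).append((start, end))
    hits = []
    for chrom, start, end in peaks_a:
        for b_start, b_end in by_chrom.get(chrom, []):
            if start < b_end and b_start < end:
                hits.append((chrom, start, end))
                break  # report each peak in peaks_a at most once
    return hits

# Hypothetical peak coordinates for illustration only
tf_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
histone_peaks = [("chr1", 150, 300), ("chr2", 150, 250)]
shared = overlapping_peaks(tf_peaks, histone_peaks)
```

This quadratic scan is only suitable for small examples; dedicated tools sort or index the intervals to handle genome-scale peak sets.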
Based on an enrichment of ARNT domains next to K4Me3 and K27ac PTMs, the authors propose a model where the core-clock TFs and ZFHX3 interact. If the authors develop other assays beyond just predictions to test their hypothesis, it would strengthen the argument for a role in circadian transcription in the SCN. It would be important in this context to perform a ChIP-seq experiment for ZFHX3 in the knockout animal (described from Figure 2 onwards) to eliminate the possibility of non-specific enrichment of signal from "open chromatin". Alternatively, a ChIPseq analysis for BMAL1 or CLOCK could also strengthen this argument to identify the sites co-occupied by ZFHX3 and core-clock TFs.
(4a) We agree that follow-up experiments such as BMAL1/CLOCK ChIPseq suggested by the reviewer will further confirm the proposed interaction of ZFHX3 with core-clock TFs. However, this is beyond the scope of the current study.
(4b) Again, conducting complementary ChIPseq in ZFHX3 knockout mice would strengthen the findings, but conducting TF-ChIPseq in a specific brain tissue such as the SCN (unlike peripheral tissues such as liver) not only requires multiple animals per sample but is also technically challenging and time-consuming if the specificity of the sample is to be ensured. For these reasons, datasets such as ours on the SCN are uncommon. Furthermore, in this particular context, we are confident, based on the current dataset, that the (narrow) ZFHX3 peaks we observed were well defined and met the specified statistical criteria, mitigating the risk of signal arising from non-specific enrichment at open-chromatin regions.
Next, they compared locomotor activity rhythms in floxed mice with or without tamoxifen treatment. As reported before in Wilcox et al 2017, the loss of ZFHX3 led to a shorter free running period and reduced amplitude and earlier onset of activity. Overall, the behavioral data in Figure 2 and supplementary figure 2 has been reported before and are not novel.
(5) We recognise that a detailed circadian behavior assessment of adult mice lacking ZFHX3 has been conducted previously by the Nolan lab (Wilcox et al., 2017). In the current study, however, we used a separate cohort of mice to focus on the behavioral advance noted in a 24-h LD cycle and to generate a more refined assessment. Importantly, these mice were also used for the transcriptomic studies detailed in Figure 3, which we consider to be a positive feature of our experimental design: behavioral and molecular analyses were performed on the same animals.
Next, the authors performed RNAseq at 4hr intervals on wildtype and knockout animals maintained in light/dark cycles to determine the impact of loss of ZFHX3. Overall transcriptomic analysis indicated changes in gene expression in nearly 36% of expressed genes, with nearly half being upregulated while an equal fraction was downregulated. Pathways affected included mostly neuropeptide neurotransmitter pathways. Surprisingly, there was no correlation between the direction in change in expression and TF binding since nearly all the sites were bound by ZFHX3 and the active histone PTMs. The ChIP-seq experiment for ZFHX3 in the UBC-Cre+Tam mice again could help resolve the real targets of ZFHX3 and the transcriptional state in knockout animals.
(6) We agree with the reviewer that most of the differentially expressed genes showed ZFHX3 binding at active promoter sites. That said, the current dataset is in line with recently published ZFHX3 ChIPseq data from Baca et al., 2024 [PMID: 38412861] in human neural stem cells and Hu et al., 2024 [PMID: 38871709] in human prostate cancer cells, which suggest that ZFHX3 binds at active promoters and acts as a chromatin remodeller/mediator that modulates gene transcription depending on the accessory TFs assembled at target genes. Therefore, finding no correlation in the direction of change in expression is not surprising.
To determine the fraction of rhythmic transcripts, using dryR, the authors categorise the rhythmic transcriptome into modules that include genes that lose rhythmicity in the KO, gain rhythmicity in the KO or remain unaffected or partially affected. The analysis indicates that a large fraction of the rhythmic transcriptome is affected in the KO model. However, among core-clock genes only Bmal1 expression is affected showing a complete loss of rhythm. The authors state a decrease in Clock mRNA expression (line 294) but the panel figure 4A does not show this data. Instead it depicts the loss in Avp expression - {{ misstated in line 321 ( we noted severe loss in 24-h rhythm for crucial SCN neuropeptides such as Avp (Fig. 3a).}}
(7a) Indeed, among the core-clock genes, rhythmic expression is lost after ZFHX3 knockout only for Bmal1. However, given that the mice were rhythmic (as assessed by wheel-running activity) in LD conditions, the observed 24-h gene expression rhythm in the majority of core-clock genes (Pers and Crys) is consistent with the behavioral data and points to a functioning molecular clock, with plausible scenarios explained at line 439. That said, the distinct and well-defined changes in amplitude and phase demonstrated in Figure 5 support a model in which ZFHX3 exerts differential control: for example, Per2 showed an advance in its molecular rhythm (~2 h), whereas Cry showed no such change. This presents an opportunity to delineate further the regulation of TTFL genes.
(7b) Line 294 states both a loss of Bmal1 rhythm and a reduction in Clock mRNA. Figure 4a supports the former. We shall revise the text for clarity.
(7c) As rightly pointed out by the reviewer, line 321 refers to the loss of Avp expression, and we shall correct the typo by replacing “Figure 3a” with “Figure 4a”. Thank you.
However, core-clock genes such as Pers and Crys show minor or no change in expression patterns while Per2 and Per3 show a ~2hr phase advance. While these could only weakly account for the behavioral phase advance, the authors used TimeTeller to assess circadian phase in wildtype and ZFHX3 deficient mice. This approach clearly indicated that while the clock is not disrupted in the knockout animals, the phase advance can be correctly predicted from a network of gene expression patterns.
Strengths:
The authors use a multiomic strategy in order to reveal the role of the ZFHX3 transcription factor with a combination of TF and histone PTM ChIPseq, time-resolved RNAseq from wildtype and knockout mice and modeling the transcriptomic data using TimeTeller. The RNAseq experiments are nicely controlled and the analysis of the data indicates a clear impact on gene-expression levels in the knockout mice and the presence of a regulatory network that could underlie the advanced activity onset behavior.
Weaknesses:
It is not clear whether ZFHX3 has a direct role in any of the processes and seems to be a general factor that marks H3K4me3 and K27ac marked chromatin. Why it would specifically impact the core-clock TTFL clock gene expression or indeed daily gene expression rhythms is not clear either. Details for treatment of different ChIP samples (ZFHX3 and histone PTM ChIPs) on data normalization for analysis are needed. The loss of complete rhythmicity of Avp and other neuropeptides or indeed other TFs could instead account for the transcriptional deregulation noted in the knockout mice.
(8) We thank the reviewer for the constructive feedback. The current data suggest that ZFHX3 acts as a mediating factor, occupying targeted active promoter sites and regulating gene expression by partnering with other key TFs in the SCN. Please see point 7 for clarification. The binding sites of ZFHX3 clearly showed enrichment for the E-box (CACGTG) motif bound by CLOCK/BMAL1, along with binding sites for key SCN-specific TFs such as RFX (please see Supplementary Fig. 1). Our data thereby show that ZFHX3 affects both core-clock and clock output genes (at varied levels), exercising pervasive control over the SCN transcriptome.
For treatment of ChIP samples please see point 4. We followed ENCODE guidelines strictly.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We sincerely appreciate the insightful feedback and constructive suggestions provided by the reviewers. We thank reviewers for their valuable support in improving our manuscript.
In response to the public reviews raised by reviewers, we plan to make the following revisions:
(1) Most metadata have been rectified through collaborative review of original literature sources rather than automated processes. We intend to incorporate a detailed discussion on this matter in the revised manuscript.
(2) We will include a corrections table for entries to provide clarity and transparency regarding any amendments made.
(3) Additional references will be included to explain the rationale behind the methods used to define interacting residues and the chosen threshold. The threshold is not fixed. In the current version, we used a 5Å cutoff and list all residues with distances of less than 5Å alongside the corresponding distances. Researchers can screen the residues by distance according to their own cutoff. To offer researchers flexibility, we will also provide interacting residues and corresponding distances at higher cutoffs for custom screening. These enhancements will be detailed in the revised manuscript.
(4) We acknowledge the importance of expanding the database to include a wider range of experimental information and complexes with diverse target sizes. Regrettably, immediate updates to address these limitations are not feasible at this time. We will therefore elaborate on this point in the later detailed response to the reviewers.
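The custom distance screening described in point (3) could look something like the following minimal sketch. The residue names, distances, and function name are hypothetical and are not taken from the database; the only detail borrowed from the text is the default 5Å cutoff applied as a strict "less than" comparison.

```python
def screen_residues(residue_distances, cutoff=5.0):
    """Return (residue, distance) pairs closer than cutoff (in Å), nearest first."""
    selected = [
        (residue, distance)
        for residue, distance in residue_distances.items()
        if distance < cutoff
    ]
    return sorted(selected, key=lambda item: item[1])

# Hypothetical minimum residue-to-partner distances in Å
distances = {"ARG45": 2.9, "TYR101": 3.8, "GLU12": 4.6, "LEU77": 6.2}

default_contacts = screen_residues(distances)             # 5 Å default
strict_contacts = screen_residues(distances, cutoff=4.0)  # tighter custom cutoff
loose_contacts = screen_residues(distances, cutoff=7.0)   # higher custom cutoff
```

A tighter cutoff can be applied to the listed 5Å data directly, whereas a looser cutoff only works if distances beyond 5Å are also provided, which is why the higher-cutoff listings matter.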
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We very much appreciate the reviewers’ and editor’s overall positive responses to our manuscript "Evolution of lateralized gustation in nematodes".
Reviewer #1:
The mechanism of lsy-6-independent establishment of ASEL/R asymmetry in P. pacificus remains uncharacterized.
We thank the reviewer for recognizing the novel contributions of our work in revealing the existence of alternative pathways for establishing neuronal lateral asymmetry despite the absence of the lsy-6 miRNA in a divergent nematode species. We are certainly encouraged now to search for genetic factors that abolish asymmetric expression of gcy-22.3.
Reviewer #2:
(1) The authors observe only weak attraction of C. elegans to NaCl. These results raise the question of whether the weak attraction observed is the result of the prior salt environment experienced by the worms. More generally, this study does not address how prior exposure to gustatory cues shapes gustatory responses in P. pacificus. Is salt sensing in P. pacificus subject to the same type of experience-dependent modulation as salt sensing in C. elegans?
Proposed revision: For our live imaging experiments, we had not considered if starved P. pacificus animals in the presence of salt may exhibit responses different from a well-fed state. However, we will venture to address the effect of experience-dependent modulation in P. pacificus chemotaxis behavior using NH4Cl.
(2) A key finding of this paper is that the Ppa-CHE-1 transcription factor is expressed in the Ppa-AFD neurons as well as the Ppa-ASE neurons, despite the fact that Ce-CHE-1 is expressed specifically in Ce-ASE. However, additional verification of Ppa-AFD neuron identity is required. Based on the image shown in the manuscript, it is difficult to unequivocally identify the second pair of CHE-1-positive head neurons as the Ppa-AFD neurons. Ppa-AFD neuron identity could be verified by confocal imaging of the CHE-1-positive neurons, co-expression of Ppa-che-1p::GFP with a likely AFD reporter, thermotaxis assays with Ppa-che-1 mutants, and/or calcium imaging from the putative Ppa-AFD neurons.
We are happy to provide additional evidence to confirm Ppa-AFD neuron identity since the expression of Ppa-CHE-1 in non-ASE amphid neurons is one of the major differences between the two nematode species.
Proposed revision: We will provide results showing the Ppa-ttx-1::gfp reporter expression in finger-like neuronal endings and Ppa-TTX-1::ALFA co-localization with Ppa-che-1::gfp in the putative AFD neurons and discuss the possible role of Ppa-CHE-1 in AFD differentiation. We attempted to obtain AFD markers using several reporter strains. However, Ppa-gcy-8.1p::gfp(csuEx101) (PPA24212) showed no expression while Ppa-gcy-8.2p::gfp(csuEx100) (PPA41407) showed only expression in pharyngeal cells.
(4) The authors show that silencing Ppa-ASE has a dramatic effect on salt chemotaxis behavior. However, these data lack control with histamine-treated wild-type animals, with the result that the phenotype of Ppa-ASE-silenced animals could result from exposure to histamine dihydrochloride. This is an especially important control in the context of salt sensing, where histamine dihydrochloride could alter behavioral responses to other salts.
Proposed revision: Thank you for noticing this oversight. The control for histamine-treated wild-type worms in the Ppa-ASE silencing experiments was inadvertently left out in the original submission. Because the HisCl transgene is on a randomly segregating transgene array, we have scored worms with and without the transgene expressing the co-injection marker (Ppa-egl-20p::rfp expressed in the tail) to show that the presence of the transgene is necessary for the knockdown of NH4Br attraction.
We will also address most of the other more minor suggestions and clarifications sought by the reviewers.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
In this paper Kawasaki et al describe a regulatory role for the PIWI/piRNA pathway in rRNA regulation in Zebrafish. This regulatory role was uncovered through a screen for gonadogenesis defective mutants, which identified a mutation in the meioc gene, a coiled-coil germ granule protein. Loss of this gene leads to redistribution of Piwil1 from germ granules to the nucleolus, resulting in silencing of rRNA transcription.
Strengths:
Most of the experimental data provided in this paper is compelling. It is clear that in the absence of meioc, PiwiL1 translocates into the nucleolus and results in down regulation of rRNA transcription. The genetic compensation of meioc mutant phenotypes (both organismal and molecular) through reduction in PiwiL1 levels is evidence for a direct role for PiwiL1 in mediating the phenotypes of the meioc mutant.
Weaknesses:
Questions remain on the mechanistic details by which PiwiL1 mediated rRNA down regulation, and whether this is a function of Piwi in an unperturbed/wildtype setting. There is certainly some evidence provided in support of the natural function for piwi in regulating rRNA transcription (figure 5A+5B). However, the de-enrichment of H3K9me3 in the heterozygous (Figure 6F) is very modest and in my opinion not convincingly different relative to the control provided. It is certainly possible that PiwiL1 is regulating levels through cleavage of nascent transcripts. Another aspect I found confounding here is the reduction in rRNA small RNAs in the meioc mutant; I would have assumed that the interaction of PiwiL1 with the rRNA is mediated through small RNAs but the reduction in numbers do not support this model. But perhaps it is simply a redistribution of small RNAs that is occurring. Finally, the ability to reduce PiwiL1 in the nucleolus through polI inhibition with actD and BMH-21 is surprising. What drives the accumulation of PiwiL1 in the nucleolus then if in the meioc mutant there is less transcription anyway?
Despite the weaknesses outlined, overall I find this paper to be solid and valuable, providing evidence for a consistent link between PIWI systems and ribosomal biogenesis. Their results are likely to be of interest to people in the community, and provide tools for further elucidating the reasons for this link.
The amount of cytoplasmic rRNA in piwi+/- was increased by 26% on average (Figure 5A+5B), the H3K9 ChIP-qPCR signal was decreased by about 26% (Figure 6F), and the Piwil1 ChIP-qPCR signal was decreased by 35% (Figure 6G), so we do not think there is a large discrepancy. On the other hand, the H3K9 ChIP-qPCR signal in meioc<sup>mo/mo</sup> was increased by about 130% (Figure 6F), while the Piwil1 ChIP-qPCR signal was increased by 50%, so there may be a mechanism by which Meioc regulates H3K9 that is not mediated by Piwil1. As for what drives the accumulation of Piwil1 in the nucleolus, although we have found that Piwil1 has affinity for rRNA (Fig. 6A), we do not know what recruits it. Significant increases in the 18-35 nt small RNAs of 18S and 28S rRNA and R2 were not detected in meioc<sup>mo/mo</sup> testes enriched for 1-8 cell spermatogonia, compared with meioc<sup>+/mo</sup> testes. The nucleolar localization of Piwil1 was revealed in this study and will be a new topic for future research.
Reviewer #2 (Public review):
Summary:
In this study, the authors report that Meioc is required to upregulate rRNA transcription and promote differentiation of spermatogonial stem cells in zebrafish. The authors show that upregulated protein synthesis is required to support spermatogonial stem cells' differentiation into multi-celled cysts of spermatogonia. Coiled coil protein Meioc is required for this upregulated protein synthesis and for increasing rRNA transcription, such that the Meioc knockout accumulates 1-2 cell spermatogonia and fails to produce cysts with more than 8 spermatogonia. The Meioc knockout exhibits continued transcriptional repression of rDNA. Meioc interacts with and sequesters Piwil1 to the cytoplasm. Loss of Meioc increases Piwil1 localization to the nucleolus, where Piwil1 interacts with transcriptional silencers that repress rRNA transcription.
Strengths:
This is a fundamental study that expands our understanding of how ribosome biogenesis contributes to differentiation and demonstrates that zebrafish Meioc plays a role in this process during spermatogenesis. This work also expands our evolutionary understanding of Meioc and Ythdc2's molecular roles in germline differentiation. In mouse, the Meioc knockout phenocopies the Ythdc2 knockout, and studies thus far have indicated that Meioc and Ythdc2 act together to regulate germline differentiation. Here, in zebrafish, Meioc has acquired a Ythdc2-independent function. This study also identifies a new role for Piwil1 in directing transcriptional silencing of rDNA.
Weaknesses:
There are limited details on the stem cell-enriched hyperplastic testes used as a tool for mass spec experiments, and additional information is needed to fully evaluate the mass spec results. What mutation do these testes carry? Does this protein interact with Meioc in the wildtype testes? How could this mutation affect the results from the Meioc immunoprecipitation?
Stem cell-enriched hyperplastic testes came from wild-type adult sox17::GFP transgenic zebrafish. Sperm were found in these hyperplastic testes, and when stem cells were transplanted, they self-renewed and differentiated into sperm. It is not known if the hyperplasias develop due to a genetic variant in the line. We will add the following comment.
“The stem cell-enriched hyperplastic testes, which are occasionally found in adult wildtype zebrafish, contain cells at all stages of spermatogenesis. Hyperplasia-derived SSCs self-renewed and differentiated in the same manner as SSCs of normal testes in transplants of aggregates mixed with normal testicular cells.”
Reviewer #3 (Public review):
Summary:
The paper describes the molecular pathway to regulate germ cell differentiation in zebrafish through ribosomal RNA biogenesis. Meioc sequesters Piwil1, a Piwi homolog, which suppresses the transcription of the 45S pre-rDNA by the formation of heterochromatin, to the perinuclear bodies. The key results are solid and useful to researchers in the field of germ cell/meiosis as well as RNA biosynthesis and chromatin.
Strengths:
The authors nicely provided the molecular evidence on the antagonism of Meioc to Piwil1 in the rRNA synthesis, which is supported by the genetic evidence that the inability of the meioc mutant to enter meiosis is suppressed by the piwil1 heterozygosity.
Weaknesses:
(1) Although the paper provides very convincing evidence for the authors' claim, the scientific contents are poorly written and incorrectly described. As a result, it is hard to read the text. Checking by scientific experts would be highly recommended. For example, on line 38, "the global translation activity is generally [inhibited]", is incorrect and, rather, a sentence like "the activity is lowered relative to other cells" is more appropriate here. See minor points for more examples.
Thank you for pointing that out. We will correct the passages identified.
(2) In some figures, it is hard for readers outside of zebrafish meiosis to evaluate the results without more explanation and drawing.
We will refine Figure 1A and add a schematic of the spermatogonia culture system in a supplemental figure.
(3) Figure 1E, F, cycloheximide experiments: Please mention the toxicity of the concentration of the drug in cell proliferation and viability.
When testicular tissue culture was performed at 0.1, 1, 10, 100, 250, and 500mM, abnormally strong OP-puro signals, including in nuclei, were found in cells at 10mM or more. We will add the results to the Supplemental Material. In addition, at 1mM, growth was perturbed in fast-growing cysts of 32 or more spermatogonia, but not in 1-4-cell spermatogonia, as described in L122-125.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Reviewer #1 (Public review):
Summary:
By way of background, the Jiang lab has previously shown that loss of the type II BMP receptor Punt (Put) from intestinal progenitors (ISCs and EBs) caused them to differentiate into EBs, with a concomitant loss of ISCs (Tian and Jiang, eLife 2014). The mechanism by which this occurs was activation of Notch in Put-deficient progenitors. How Notch was upregulated in Put-deficient ISCs was not established in this prior work. In the current study, the authors test whether a very low level of Dl was responsible. But co-depletion of Dl and Put led to a similar phenotype as depletion of Put alone. This result suggested that Dl was not the mechanism. They next investigate genetic interactions between BMP signaling and Numb, an inhibitor of Notch signaling. Prior work from Bardin, Schweisguth and other labs has shown that Numb is not required for ISC self-renewal. However, the authors wanted to know whether loss of both the BMP signal transducer Mad and Numb would cause ISC loss. This result was observed for RNAi depletion from progenitors and for mad, numb double mutant clones. Of note, ISC loss was observed in 40% of mad, numb double mutant clones, whereas 60% of these clones had an ISC. They then employed a two-color tracing system called RGT to look at the outcome of ISC divisions (asymmetric (ISC/EB) or symmetric (ISC/ISC or EB/EB)). Control clones had 69%, 15% and 16%, respectively, whereas mad, numb double mutant clones had much lower ISC/ISC (11%) and much higher EB/EB (37%). They conclude that loss of Numb in moderate BMP loss of function mutants increased symmetric differentiation, which caused ISC loss. They also reported that Numb<sup>15</sup> and numb<sup>4</sup> clones had a moderate but significant increase in ISC-lacking clones compared to control clones, supporting the model that Numb plays a role in ISC maintenance. Finally, they investigated the relevance of these observations during regeneration. 
After bleomycin treatment, there was a significant increase in ISC-lacking clones and a significant decrease in clone size in numb<sup>4</sup> and Numb<sup>15</sup> clones compared to control clones. Because bleomycin treatment has been shown to cause variation in BMP ligand production, the authors interpret the numb clone under bleomycin results as demonstrating an essential role of Numb in ISC maintenance during regeneration.
Strengths:
(i) Most data is quantified with statistical analysis
(ii) Experiments have appropriate controls and large numbers of samples
(iii) Results demonstrate an important role of Numb in maintaining ISC number during regeneration and a genetic interaction between Mad and Numb during homeostasis.
Weaknesses:
(i) No quantification for Fig. 1
Thank you for your suggestion. Quantification of Fig.1 will be added.
(ii) The premise is a bit unclear. Under homeostasis, strong loss of BMP (Put) leads to loss of ISCs, presumably regardless of Numb level (which was not tested). But moderate loss of BMP (Mad) does not show ISC loss unless Numb is also reduced. I am confused as to why numb does not play a role in Put mutants. Did the authors test whether concomitant loss of Put and Numb leads to even more ISC loss than Put-mutation alone.
Thank you for your comment. We have tested the genetic interaction between punt and numb using punt RNAi and numb RNAi driven by esg<sup>ts</sup>. According to the results in this study and our previously published data, punt mutant clones or esg<sup>ts</sup>> punt RNAi induce a rapid loss of ISCs (within 8 days). We did not observe further enhancement by numb RNAi of the stem cell loss phenotype caused by punt RNAi.
(iii) I think that the use of the word "essential" is a bit strong here. Numb plays an important role but during either homeostasis or regeneration, most numb clones or mad, numb double mutant clones still have ISCs. Therefore, I think that the authors should temper their language about the role of Numb in ISC maintenance.
Thank you. We will revise the language.
Reviewer #2 (Public review):
Summary:
This work assesses the genetic interaction between the Bmp signaling pathway and the factor Numb, which can inhibit Notch signalling. It follows up on the previous studies of the group (Tian, Elife, 2014; Tian, PNAS, 2014) regarding BMP signaling in controlling stem cell fate decision as well as on the work of another group (Sallé, EMBO, 2017) that investigated the function of Numb on enteroendocrine fate in the midgut. This is an important study providing evidence of a Numb-mediated back up mechanism for stem cell maintenance.
Strengths:
(1) Experiments are consistent with these previous publications while also extending our understanding of how Numb functions in the ISC.
(2) Provides an interesting model of a "back up" protection mechanism for ISC maintenance.
Weaknesses:
(1) Aspects of the experiments could be better controlled or annotated:
(a) As they "randomly chose" the regions analyzed, it would be better to have all from a defined region (R4 or R2, for example) or to at least note the region as there are important regional differences for some aspects of midgut biology.
Thank you. Since we mainly focus on region 4, we have added the clarification in the manuscript.
(b) It is not clear to me why MARCM clones were induced and then flies grown at 18°C? It would help to explain why they used this unconventional protocol.
To avoid spontaneous clones, we kept the flies at 18°C.
(2) There are technical limitations with trying to conclude from double-knockdown experiments in the ISC lineage, such as those in Figure 1 where Dl and put are both being knocked down: depending on how fast both proteins are depleted, it may be that only one of them (put, for example) is inactivated and affects the fate decision prior to the other one (Dl) being depleted. Therefore, it is difficult to definitively conclude that the decision is independent of Dl ligand.
In our hands, Dl-RNAi is very effective and resulted in loss of N pathway activity, as determined by the N pathway reporter Su(H)-lacZ (Fig. 1D). Therefore, the ectopic Su(H)-lacZ expression in Punt Dl double RNAi (Fig. 1E) is unlikely to be due to residual Dl expression. Nevertheless, we will change the statement “BMP signaling blocks ligand-independent N activity” to “Loss of BMP signaling results in ectopic N pathway activity even when Dl is depleted”.
(3) Additional quantification of many phenotypes would be desired.
(a) It would be useful to see esg-GFP cells/total cells and not just field as the density might change (2E for example).
We focused on the R4 region for quantification, where cell density did not change appreciably between experimental groups. In addition, we examined many guts for quantification. It is therefore unlikely that the difference in esg+ cell number is caused by a change in cell density.
(b) Similarly, for 2F and 2G, it would be nice to see the % of ISC/ total cell and EB/total cell and not only per esgGFP+ cell.
Unfortunately, we do not have the suggested quantification. However, we believe that quantifying the percentage of ISCs or EBs among all progenitor cells, as we did here, provides a faithful measurement of the self-renewal status of each experimental group.
(c) Fig1: There is no quantification - specifically it would be interesting to know how many esg+ are su(H)lacZ positive in Put- Dl- condition compared to WT or Put- alone. What is the n?
Quantification will be added.
(d) Fig2: Pros + cells are not seen in the image? Are they all DllacZ+?
Anti-Pros and anti-E(spl)mβ-CD2 staining were imaged in the same channel (magenta). Pros+ staining is nuclear and dot-like, whereas CD2 outlines the cell membrane of EB cells.
(e) Fig3: it would be nice to have the size clone quantification instead of the distribution between groups of 2 cell 3 cells 4 cell clones.
Thank you for your suggestion. In this study, we have quantified the clone size of each clone and calculated the average size for each genotype. However, the frequency distribution analysis was chosen because it highlights the significance of the clone size differences among genotypes.
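The difference between reporting an average clone size and a frequency distribution can be illustrated with a minimal sketch. The size bins and the cell counts below are hypothetical and are not data from the study; they only show how two genotypes with the same mean clone size can differ clearly once clones are grouped by size.

```python
from collections import Counter

def clone_size_distribution(clone_sizes, bins):
    """Return the percentage of clones falling in each labelled size bin.

    bins maps a label to an inclusive (low, high) range; high=None is open-ended.
    """
    counts = Counter()
    for size in clone_sizes:
        for label, (low, high) in bins.items():
            if size >= low and (high is None or size <= high):
                counts[label] += 1
                break
    total = len(clone_sizes)
    return {label: 100.0 * counts[label] / total for label in bins}

def mean_size(clone_sizes):
    """Average number of cells per clone."""
    return sum(clone_sizes) / len(clone_sizes)

# Hypothetical bins and hypothetical cells-per-clone counts (assumptions for illustration).
BINS = {"2": (2, 2), "3-8": (3, 8), ">8": (9, None)}
control = [2, 3, 5, 9, 12, 4, 10, 2]
mutant = [2, 2, 2, 3, 2, 2, 2, 32]

control_dist = clone_size_distribution(control, BINS)
mutant_dist = clone_size_distribution(mutant, BINS)
print(mean_size(control), control_dist)
print(mean_size(mutant), mutant_dist)
```

In this toy example the two means are identical, while the share of 2-cell clones differs threefold, which is the kind of shift a frequency distribution makes visible.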
(f) How many times were experiments performed?
All experiments were performed three times.
(4) The authors do not comment on the reduction of clone size in DSS treatment in Figure 6K. How do they interpret this? Does it conflict with their model of Bleo vs DSS?
Guts containing numb<sup>4</sup> clones and treated with DSS exhibited a slight reduction in clone size, evident from a higher percentage of 2-cell clones and a lower percentage of > 8 cell clones. This reduction is less pronounced in guts containing numb<sup>15</sup> clones. However, the percentage of Dl<sup>+</sup>-containing clones is similar between DSS- and mock-treated guts. It is possible that ISC proliferation is slightly reduced due to the numb<sup>4</sup> mutation or the genetic background.
(5) There is probably a mistake on sentence line 314 -316 "Indeed, previous studies indicate that endogenous Numb was not undetectable by Numb antibodies that could detect Numb expression in the nervous system".
We will correct the sentence.
Reviewer #3 (Public review):
Summary:
The authors provide an in-depth analysis of the function of Numb in adult Drosophila midgut. Based on RNAi combinations and double mutant clonal analyses, they propose that Numb has a function in inhibiting Notch pathway to maintain intestinal stem cells, and is a backup mechanism with BMP pathway in maintaining midgut stem cell mediated homeostasis.
Strengths:
Overall, this is a carefully constructed series of experiments, and the results and statistical analyses provide believable evidence that Numb has a role, albeit weak compared to other pathways, in sustaining ISCs and in promoting regeneration, especially after damage by bleomycin, which may damage enterocytes and therefore disrupt the BMP pathway more. The results overall support their claim.
The data are highly coherent, and support a genetic function of Numb, in collaborating with BMP signaling, to maintain the number and proliferative function of ISCs in adult midguts. The authors used appropriate and sophisticated genetic tools of double RNAi, mutant clonal analysis and dual marker stem cell tracing approaches to ensure the results are reproducible and consistent. The statistical analyses provide confidence that the phenotypic changes are reliable albeit weaker than many other mutants previously studied.
Weaknesses:
In the absence of Numb itself, the midgut has a weak reduction of ISC number (Fig. 3 and 5), as well as a weak, albeit not statistically significant, reduction of ISC clone size/proliferation. I think the authors published similar experiments with BMP pathway mutants. The mad<sup>1-2</sup> allele used here, as stated below, may not be very representative of other BMP pathway mutants. Therefore, it could be beneficial to compare ISC number and clone sizes with other BMP experiments to provide the readers with a clearer picture of how these two pathways individually contribute (stronger/weaker effects) to ISC number and gut homeostasis.
Thank you for your comment. We have tested other components of the BMP pathway in our previous study (Tian et al., 2014). More complete loss of BMP signaling (for example, Put clones, Put RNAi, Tkv/Sax double mutant clones or double RNAi) resulted in ISC loss regardless of the status of numb, suggesting a more predominant role of BMP signaling in ISC self-renewal compared with Numb. We speculate that the weak stem cell loss phenotype associated with numb mutant clones in an otherwise wild-type background could be due to fluctuation of BMP signaling in homeostatic guts.
The main weakness of this manuscript is the analysis of the BMP pathway components, especially the mad<sup>1-2</sup> allele. The mad RNAi and mad<sup>1-2</sup> alleles (P insertion) are supposed to be weak alleles and that might be suitable for genetic enhancement assays here together with numb RNAi. However, the mad<sup>1-2</sup> allele, and sometimes the mad RNAi, showed weakly increased ISC clone size. This is kind of counter-intuitive that they should have a similar ISC loss and ISC clone size reduction.
We used mad<sup>1-2</sup> and mad RNAi here to test the genetic interaction with numb because our previous studies showed that partial loss of BMP signaling under these conditions did not cause stem cell loss and may therefore provide a sensitized background to determine the role of Numb in ISC self-renewal. The increased ISC proliferation/clone size associated with mad<sup>1-2</sup> and mad RNAi is due to the fact that reduction of BMP signaling in either ECs or EBs non-autonomously induces stem cell proliferation. However, in mad numb double mutant clones, there was a reduction in clone size, which correlated with loss of ISCs.
A much stronger phenotype was observed when numb mutants were subjected to treatment with the tissue-damaging agent bleomycin, which causes damage in different ways than DSS. Bleomycin was previously shown to cause mainly enterocyte damage, and is therefore more likely to disrupt BMP signaling from ECs. Therefore, this treatment together with loss of numb led to a highly significant reduction of ISCs in clones and reduction of clone size/proliferation. One improvement is that it is not clear whether the authors discussed the nature of the two numb mutant alleles used in this study and the comparison to the strength of the RNAi allele. Because the phenotypes are weak and more variable, the use of specific reagents is important.
The numb<sup>15</sup> allele is a null allele, and the nature of numb<sup>4</sup> has not been elucidated. According to Domingos, P.M. et al., numb<sup>15</sup> induced a more severe phenotype than numb<sup>4</sup>. Consistently, we also found that more numb<sup>15</sup> mutant clones were devoid of stem cells than numb<sup>4</sup> clones.
Furthermore, the use of possible activating alleles of either or both pathways to test genetic enhancement or synergistic activation will provide strong support for the claims.
Activation of BMP signaling (Tkv<sup>CA</sup>) also induces stem cell tumors (Tian et al., 2014), which makes it unsuitable for synergistic activation experiments.
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This study offers a useful treatment of how the population of excitatory and inhibitory neurons integrates principles of energy efficiency in their coding strategies. The analysis provides a comprehensive characterisation of the model, highlighting the structured connectivity between excitatory and inhibitory neurons. However, the manuscript provides an incomplete motivation for parameter choices. Furthermore, the work is insufficiently contextualized within the literature, and some of the findings appear overlapping and incremental given previous work.
We are genuinely grateful to the Editors and Reviewers for taking time to provide extremely valuable suggestions and comments, which will help us to substantially improve our paper. We decided to do our very best to implement all suggestions, as detailed in the point-by-point rebuttal letter below. We feel that our paper has improved considerably as a result.
Public Reviews:
Reviewer #1 (Public Review):
Summary: Koren et al. derive and analyse a spiking network model optimised to represent external signals using the minimum number of spikes. Unlike most prior work using a similar setup, the network includes separate populations of excitatory and inhibitory neurons. The authors show that the optimised connectivity has a like-to-like structure, leading to the experimentally observed phenomenon of feature competition. They also characterise the impact of various (hyper)parameters, such as adaptation timescale, ratio of excitatory to inhibitory cells, regularisation strength, and background current. These results add useful biological realism to a particular model of efficient coding. However, not all claims seem fully supported by the evidence. Specifically, several biological features, such as the ratio of excitatory to inhibitory neurons, which the authors claim to explain through efficient coding, might be contingent on arbitrary modelling choices. In addition, earlier work has already established the importance of structured connectivity for feature competition. A clearer presentation of modelling choices, limitations, and prior work could improve the manuscript.
Thanks for these insights and for this summary of our work.
Major comments:
(1) Much is made of the 4:1 ratio between excitatory and inhibitory neurons, which the authors claim to explain through efficient coding. I see two issues with this conclusion: (i) The 4:1 ratio is specific to rodents; humans have an approximate 2:1 ratio (see Fang & Xia et al., Science 2022 and references therein); (ii) the optimal ratio in the model depends on a seemingly arbitrary choice of hyperparameters, particularly the weighting of encoding error versus metabolic cost. This second concern applies to several other results, including the strength of inhibitory versus excitatory synapses. While the model can, therefore, be made consistent with biological data, this requires auxiliary assumptions.
We now describe better the ratio of numbers of E and I neurons found in real data, as suggested. The first submission already contained an analysis of how the optimal ratio of E vs I neuron numbers depends in our model on the relative weighting of the loss of E and I neurons and on the relative weighting of the encoding error vs the metabolic cost in the loss function (see Fig. 7E). We revised the text on page 12 describing Fig. 7E.
To allow readers to easily form a clear idea of how the weighting of the error vs the cost may influence the optimal network configuration, we now present how optimal parameters depend on the weighting in a systematic way, by always including this type of analysis when studying all other model parameters (time constants of single E and I neurons, noise intensity, metabolic constant, ratio of mean I-I to E-I connectivity). These results are shown in Supplementary Fig. S4 A-D and H, and we comment briefly on each of them in the Results sections (pages 9, 10, 11 and 12) that analyze each of these parameters.
Following this Reviewer’s comment, we now include a joint analysis of network performance relative to the ratio of E-I neuron numbers and the ratio of mean I-I to E-I connectivity (Fig. 7J). We found a positive correlation between the optimal values of these two ratios. This implies that a lower ratio of E-I neuron numbers, such as the 2:1 ratio in human cortex mentioned by the Reviewer, predicts a lower optimal ratio of I-I to E-I connectivity and thus weaker inhibition in the network. We made sure that this finding is suitably described in the revision (page 13).
(2) A growing body of evidence supports the importance of structured E-I and I-E connectivity for feature selectivity and response to perturbations. For example, this is a major conclusion from the Oldenburg paper (reference 62 in the manuscript), which includes extensive modelling work. Similar conclusions can be found in work from Znamenskiy and colleagues (experiments and spiking network model; bioRxiv 2018, Neuron 2023 (ref. 82)), Sadeh & Clopath (rate network; eLife, 2020), and Mackwood et al. (rate network with plasticity; eLife, 2021). The current manuscript adds to this evidence by showing that (a particular implementation of) efficient coding in spiking networks leads to structured connectivity. The fact that this structured connectivity then explains perturbation responses is, in the light of earlier findings, not new.
We agree that the main contribution of our manuscript in this respect is to show how efficient coding in spiking networks can lead to structured connectivity implementing lateral inhibition similar to that proposed in the recent studies mentioned by the Reviewer. We apologize if this was not clear enough in the previous version. We streamlined the presentation to make it clearer in revision. We nevertheless think it useful to report the effects of perturbations within this network because these results give information about how lateral inhibition works in our network. Thus, we kept presenting it in the revised version, although we de-emphasized and simplified its presentation. We now give more emphasis to the novelty of the derivation of this connectivity rule from the principles of efficient coding (pages 4 and 6). We also describe better (page 8) what the specific results of our simulated perturbation experiments add to the existing literature.
(3) The model's limitations are hard to discern, being relegated to the manuscript's last and rather equivocal paragraph. For instance, the lack of recurrent excitation, crucial in neural dynamics and computation, likely influences the results: neuronal time constants must be as large as the target readout (Figure 4), presumably because the network cannot integrate the signal without recurrent excitation. However, this and other results are not presented in tandem with relevant caveats.
We improved the Limitations paragraph in Discussion, and also anticipated caveats in tandem with results when needed, as suggested.
We now mention the assumption of equal time constants between the targets and readouts in the Abstract.
We have now added the analysis of network performance and dynamics as a function of the time constant of the target (τ<sub>x</sub>) to Supplementary Fig. S5 (C-E). These results are briefly discussed in the text on page 13. The only measure sensitive to τ<sub>x</sub> is the encoding error of E neurons, with a minimum at τ<sub>x</sub> = 9 ms, while I neurons and the metabolic cost show no dependency. Firing rates, variability of spiking, and the average and instantaneous balance show no dependency on τ<sub>x</sub>. We note that τ<sub>x</sub> = τ, with τ = 1/λ the time constant of the population readout (Eq. 9), is an assumption we use when we derive the model from the efficiency objective (Eq. 18 to 23). In our new and preliminary work (Koren, Emanuel, Panzeri, bioRxiv 2024), we derived a more general class of models where this assumption is relaxed, which gives a network with E-E connectivity that adapts to the time constant of the stimulus. Thus, the Reviewer is correct in the intuition that the network requires E-E connectivity to better integrate target signals with a time constant different from that of the membrane. We now better emphasize this limitation in the Discussion (page 16).
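To make the role of the readout time constant concrete, the following is a minimal sketch of a leaky population readout, assuming the readout is an exponentially filtered spike train whose leak rate is the inverse of its time constant. The function name, the decoding weight, and the numerical values are hypothetical and are not taken from the manuscript's Eq. 9.

```python
import math

def leaky_readout(spikes, weight, tau, dt):
    """Filter a binary spike train with an exponential kernel of time constant tau.

    Between time steps the estimate decays by exp(-dt / tau), i.e. with leak
    rate lam = 1 / tau; each spike adds `weight` to the estimate.
    """
    decay = math.exp(-dt / tau)
    estimate = 0.0
    trace = []
    for s in spikes:
        estimate = estimate * decay + weight * s
        trace.append(estimate)
    return trace

# Hypothetical values: 10 ms time constant, 1 ms step, a single spike at t = 0.
tau_ms, dt_ms = 10.0, 1.0
spikes = [1] + [0] * 30
trace = leaky_readout(spikes, weight=1.0, tau=tau_ms, dt=dt_ms)
print(trace[0], trace[10])  # value at the spike and one time constant later
```

Under these assumptions, a readout with a short time constant forgets each spike quickly, so tracking a slower target signal would require either more spikes or additional recurrent integration.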
(4) On repeated occasions, results from the model are referred to as predictions claimed to match the data. A prediction is a statement about what will happen in the future – but most of the “predictions” from the model are actually findings that broadly match earlier experimental results, making them “postdictions”.
This distinction is important: compared to postdictions, predictions are a much stronger test because they are falsifiable. This is especially relevant given (my impression) that key parameters of the model were tweaked to match the data.
We now comment on every result from the model as either matching earlier experimental results, or being a prediction for experiments.
In Section “Assumptions and emergent properties of the efficient E-I network derived from first principles”, we report (page 4) that neural networks have connectivity structure that relates to tuning similarity of neurons (postdiction).
In Section “Encoding performance and neural dynamics in an optimally efficient E-I network” we report (page 5) that in a network with optimal parameters, I neurons have higher firing rate than E neurons (postdiction), that single neurons show temporally correlated synaptic currents (postdiction) and that the distribution of firing rates across neurons is log-normal (postdiction).
In Section “Competition across neurons with similar stimulus tuning emerging in efficient spiking networks” we report (page 6) that the activity perturbation of E neurons induces lateral inhibition on other E neurons, and that the strength of lateral inhibition depends on tuning similarity (postdiction). We show that activity perturbation of E neurons induces lateral excitation in I neurons (prediction). We moreover show that the specific effects of the perturbation of neural activity rely on structured E-I-E connectivity (prediction for experiments, but similar result in Sadeh and Clopath, 2020). We show strong voltage correlations but weak spike-timing correlations in our network (prediction for experiments, but similar result in Boerlin et al. 2013).
In Section “The effect of structured connectivity on coding efficiency and neural dynamics”, we report (page 7) that our model predicts a number of differences between networks with structured and unstructured (random) connectivity. In particular, structured networks differ from unstructured ones by showing better encoding performance, lower metabolic cost, weaker variance over time in the membrane potential of each neuron, lower firing rates and weaker average and instantaneous balance of synaptic currents.
In Section “Weak or no spike-triggered adaptation optimizes network efficiency”, we report (page 9) that our model predicts better encoding performance in networks with adaptation compared to facilitation. Our results suggest that adaptation should be stronger in E compared to I (PV+) neurons (postdiction). In the same section, we report (page 10) that our results suggest that the instantaneous balance is a better predictor of model efficiency than average balance (prediction).
In Section “Non-specific currents regulate network coding properties”, we report (page 10) that our model predicts that more than half of the distance between the resting potential and firing threshold is taken by external currents that are unrelated to feedforward processing (postdiction). We also report (page 11) that our model predicts that moderate levels of uncorrelated (additive) noise are beneficial for efficiency (prediction for experiments, but similar results in Chalk et al., 2016, Koren et al., 2017, Timcheck et al. 2022).
In Section “Optimal ratio of E-I neuron numbers and of mean I-I to E-I synaptic efficacy coincide with biophysical measurements”, we predict the optimal ratio of E to I neuron numbers to be 4:1 (postdiction) and the optimal ratio of mean I-I to E-I connectivity to be 3:1 (postdiction). Further, we report (page 13) that our results predict that a decrease in the ratio of E-I neuron numbers is accompanied by a decrease in the ratio of mean I-I to E-I connectivity.
Finally, in Section “Dependence of efficient coding and neural dynamics on the stimulus statistics”, we report (page 13) that our model predicts that the efficiency of the network has almost no dependence on the time scale of the stimulus (prediction).
Reviewer #2 (Public Review):
Summary:
In this work, the authors present a biologically plausible, efficient E-I spiking network model and study various aspects of the model and its relation to experimental observations. This includes a derivation of the network into two (E-I) populations, the study of single-neuron perturbations and lateral-inhibition, the study of the effects of adaptation and metabolic cost, and considerations of optimal parameters. From this, they conclude that their work puts forth a plausible implementation of efficient coding that matches several experimental findings, including feature-specific inhibition, tight instantaneous balance, a 4 to 1 ratio of excitatory to inhibitory neurons, and a 3 to 1 ratio of I-I to E-I connectivity strength. It thus argues that some of these observations may come as a direct consequence of efficient coding.
Strengths:
While many network implementations of efficient coding have been developed, such normative models are often abstract and lacking sufficient detail to compare directly to experiments. The intention of this work to produce a more plausible and efficient spiking model and compare it with experimental data is important and necessary in order to test these models.
In rigorously deriving the model with real physical units, this work maps efficient spiking networks onto other more classical biophysical spiking neuron models. It also attempts to compare the model to recent single-neuron perturbation experiments, as well as some longstanding puzzles about neural circuits, such as the presence of separate excitatory and inhibitory neurons, the ratio of excitatory to inhibitory neurons, and E/I balance. One of the primary goals of this paper, to determine if these are merely biological constraints or come from some normative efficient coding objective, is also important.
Though several of the observations have been reported and studied before (see below), this work arguably studies them in more depth, which could be useful for comparing more directly to experiments.
Thanks for these insights and for the kind words of appreciation of the strengths of our work.
Weaknesses:
Though the text of the paper may suggest otherwise, many of the modeling choices and observations found in the paper have been introduced in previous work on efficient spiking models, thereby making this work somewhat repetitive and incremental at times. This includes the derivation of the network into separate excitatory and inhibitory populations, discussion of physical units, comparison of voltage versus spike-timing correlations, and instantaneous E/I balance, all of which can be found in one of the first efficient spiking network papers (Boerlin et al. 2013), as well as in subsequent papers. Metabolic cost and slow adaptation currents were also presented in a previous study (Gutierrez & Deneve 2019). Though it is perfectly fine and reasonable to build upon these previous studies, the language of the text gives them insufficient credit.
We indeed built our work on these important previous studies, and we apologize if this was not clear enough. We thus improved the text to make sure that credit to previous studies is more precisely and more clearly given (see detailed reply for the list of changes made).
To facilitate the understanding of how we built on previous work, we expanded the comparison of our results with the results of Boerlin et al. (2013) about voltage correlations and uncorrelated spiking (page 7), the comparison with the derivation of physical units of Boerlin et al. (2013) (page 3), the discussion of how results on the ratio of the number of E to I neurons relate to Calaim et al. (2022) and Barrett et al. (2016) (page 16), and the comment on the previous work by Gutierrez and Deneve about adaptation (page 8).
Furthermore, the paper makes several claims of optimality that are not convincing enough, as they are only verified by a limited parameter sweep of single parameters at a time, are unintuitive and may be in conflict with previous findings of efficient spiking networks. This includes the following.
Coding error (RMSE) has a minimum at intermediate metabolic cost (Figure 5B), despite the fact that intuitively, zero metabolic cost would indicate that the network is solely minimizing coding error and that previous work has suggested that additional costs bias the output.
Coding error also appears to have a minimum at intermediate values of the ratio of E to I neurons (effectively the number of I neurons) and the number of encoded variables (Figures 6D, 7B). These both have to do with the redundancy in the network (number of neurons for each encoded variable), and previous work suggests that networks can code for arbitrary numbers of variables provided the redundancy is high enough (e.g., Calaim et al. 2022).
Lastly, the performance of the E-I variant of the network is shown to be better than that of a single cell type (1CT: Figure 7C, D). Given that the E-I network is performing a similar computation as to the 1CT model but with more neurons (i.e., instead of an E neuron directly providing lateral inhibition to its neighbor, it goes through an interneuron), this is unintuitive and again not supported by previous work. These may be valid emergent properties of the E-I spiking network derived here, but their presentation and description are not sufficient to determine this.
With regard to the concern that our previous analyses considered optimal parameter sets determined with a sweep of a single parameter at a time, we have addressed this issue in two ways. First, we presented (Fig. 6I and 7J and text on pages 11 and 13) results of joint sweeps of pairs of parameters whose joint variations are expected to influence optimality in a way that cannot be understood by varying one parameter at a time. These new analyses complement the joint parameter sweep of the time constants of single E and I neurons (τ<sub>r</sub><sup>E</sup> and τ<sub>r</sub><sup>I</sup>) that was already presented in Fig. 5A (former Fig. 4A). Second, we conducted, within a reasonable/realistic range of possible variations of each individual parameter, a Monte-Carlo random joint sampling (10000 simulations with 20 trials each) of all 6 model parameters that we explored in the paper. We present these new results in Fig. 2 and discuss them on pages 5-6.
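The idea of a Monte-Carlo joint sampling of parameters can be sketched as follows. This is an illustration only, assuming uniform sampling within per-parameter ranges; the parameter names, ranges, and the toy loss are hypothetical stand-ins for the network simulation and are not taken from the manuscript.

```python
import random

def sample_parameters(ranges, rng):
    """Draw one parameter set, each parameter uniform within its own range."""
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}

def random_search(loss_fn, ranges, n_samples, seed=0):
    """Evaluate loss_fn on n_samples random joint draws and return the best one."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_samples):
        params = sample_parameters(ranges, rng)
        loss = loss_fn(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss

# Hypothetical parameter names and ranges (the model in the paper has six parameters).
RANGES = {"tau_e": (5.0, 30.0), "tau_i": (5.0, 30.0), "beta": (0.0, 40.0)}

def toy_loss(p):
    """Stand-in for a trial-averaged simulation loss, minimal at (10, 10, 14)."""
    return (p["tau_e"] - 10.0) ** 2 + (p["tau_i"] - 10.0) ** 2 + (p["beta"] - 14.0) ** 2

best, loss = random_search(toy_loss, RANGES, n_samples=10000)
print(best, loss)
```

Unlike a sweep of one parameter at a time, every draw here varies all parameters together, so an optimum that depends on an interaction between parameters can still be found.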
The Reviewer is correct in stating that the error (RMSE) exhibits a counterintuitive minimum as a function of the metabolic constant despite the fact that, intuitively, for vanishing metabolic constant the network is solely minimizing the coding error (Fig. 6B). In our understanding, this counterintuitive finding is due to the presence of noise in the membrane potential dynamics. In the presence of noise, a non-vanishing metabolic constant is needed to suppress “inefficient” spikes purely induced by noise that do not contribute to coding and increase the error. This gives rise to a form of “stochastic resonance”, where the noise improves detection of the signal coming from the feedforward currents. We note that the metabolic constant and the noise variance both appear in the non-specific external current (Eq. 29f in Methods), and, thus, a covariation in their optimal values is expected. Indeed, we find that the optimal metabolic constant monotonically increases as a function of the noise variance, with stronger regularization (larger beta) required to compensate for larger variability (larger sigma) (Fig. 6I). Finally, we note that a moderate level of noise (which, in turn, induces a non-trivial minimum of the coding error as a function of beta) in the network is optimal. The beneficial effect of moderate levels of noise on performance in networks with efficient coding has been shown in different contexts in previous work (Chalk et al. 2016, Koren and Deneve, 2017). The intuition is that the noise prevents the excessive synchronization of the network and insufficient single neuron variability that decrease the performance. The points above are now explained in the revised text on page 11.
The Reviewer is also correct in stating that the network exhibits an optimal performance for intermediate values of the number of I neurons and the number of encoded features. In our understanding, the optimal number of encoded features of M=3 arises simply because all the other parameters were optimized for that value of M. The purpose of those analyses was not to state that a network optimally encodes only a given number of features, but to show that a network whose parameters are optimized for a given M performs reasonably well when M is varied. We clarify this on page 13 of Results and in Discussion on page 16. In the same Discussion paragraph we also refer to the results of Calaim et al. mentioned by the Reviewer.
To address the concern about the comparison of efficiency between the E-I and the 1CT model, we took advantage of the Reviewer’s suggestions to consider this issue more deeply. In revision, we now compare the efficiency of the 1CT model with the E population of the E-I model (Fig. 8H). This new comparison changes the conclusion about which model is more efficient, as it shows the 1CT model is slightly more efficient than the E-I model. Nevertheless, the E-I model performance is more robust to small variations of optimal parameters, e.g., it exhibits biologically plausible firing rates for non-optimal values of the metabolic constant. See also the reply to point 3 of the Public Review of Reviewer 2 for more detail. We added these results and the ensuing caveats for the interpretation of this comparison on Page 14, and also revised the title of the last subsection of Results.
Alternatively, the methodology of the model suggests that ad hoc modeling choices may be playing a role. For example, an arbitrary weighting of coding error and metabolic cost of 0.7 to 0.3, respectively, is chosen without mention of how this affects the results. Furthermore, the scaling of synaptic weights appears to be controlled separately for each connection type in the network (Table 1), despite the fact that some of these quantities are likely linked in the optimal network derivation. Finally, the optimal threshold and metabolic constants are an order of magnitude larger than the synaptic weights (Table 1). All of these considerations suggest one of the following two possibilities. One, the model has a substantial number of unconstrained parameters to tune, in which case more parameter sweeps would be necessary to definitively make claims of optimality. Or two, parameters are being decoupled from those constrained by the optimal derivation, and the optima simply corresponds to the values that should come out of the derivation.
We thank the reviewer for raising these important questions.
In the first submission, we presented both the encoding error and the metabolic cost separately as a function of the parameters, so that readers could get an understanding of how stable optimal parameters would be to the change of the relative weighting of encoding error and metabolic cost. We specified this in Results (page 5) and we kept presenting separately encoding and metabolic terms in the revision.
However, we agree that it is important to present the explicit quantification of how the optimal parameters may depend on g<sub>L</sub>, the weighting of the encoding error relative to the metabolic cost. In the first submission, we showed the analysis for all possible weightings in the case of the two parameters for which we found this analysis to be the most relevant – the ratio of neuron numbers (Fig. 7E, Fig. 6E in first submission) and the optimal number of input features M (see last paragraph on page 13 and Fig. 8D). We now show this analysis also for the remaining model parameters in Supplementary Fig. S4 (A-D and H). This is discussed on pages 9, 10, 11 and 12.
With regard to the concern that the scaling of synaptic weights should not be controlled separately for each connection type in the network, we agree and we would like to clarify that we did not control such scaling separately. Apologies if this was not clear enough. From the optimal analytical solution, we obtained that the connectivity scales with the standard deviation of decoding weights (s<sub>w</sub><sup>E</sup> and s<sub>w</sub><sup>I</sup>) of the pre and postsynaptic populations (Methods, Eq. 32). We studied the network properties as a function of the ratio of average I-I to E-I connectivity (Fig. 7 F-I; Supplementary Fig. S4 D-H), which is equivalent to the ratio of standard deviations s<sub>w</sub><sup>I</sup> /s<sub>w</sub><sup>E</sup> (see Methods, Eq. 35). We clarified this in text on page 12.
Next, it is correct that our synaptic weights are an order of magnitude smaller than the metabolic constant. We analysed a simpler version of the network that has the coding and dynamics identical to our full model (Methods, Eq. 25) but without the external currents. We found that the optimal parameters determining the firing threshold in such a simpler network were biologically implausible (see Supplementary Text 2 and Supplementary Table S1). As another simple solution, we considered rescaling the synaptic efficacy so as to obtain a biologically plausible threshold. However, that gave an implausible mean synaptic efficacy (see Supplementary Text 2). Thus, to be able to define a network with a biologically plausible firing threshold and mean synaptic efficacy, we introduced the non-specific external current. After introducing this current, we were able to shift the firing threshold to biologically plausible values while keeping realistic values of mean synaptic efficacy. Biologically plausible values for the firing threshold are around 15–20 mV above the resting potential (Constantinople and Bruno, 2013), which is the value that we have in our model. A plausible value for the average synaptic strength is between a fraction of one millivolt and a couple of millivolts (Constantinople & Bruno, 2013, Campagnola et al. 2022), which also corresponds to the values that the synaptic weights take. The above results are briefly explained in the revised text on page 4.
Finally, to study the optimality of the network when changing multiple parameters at a time, we added a new analysis with Monte-Carlo random joint sampling (10,000 parameter sets with 20 trials for each set) of all 6 model parameters that we explored in the paper. We compared (Fig. 2) the results of each simulation with those obtained from the understanding gained from varying one or two parameters at a time (optimal parameters reported in Table 1 and used throughout the paper). We found (Fig. 2) that the optimal configuration in Table 1 was never improved upon by any other simulation we performed, and that the first three random simulations that came the closest to the optimal one of Table 1 had stronger noise intensity but also stronger metabolic cost than the configuration in Table 1. The second, third and fourth configurations had longer time constants of both E and I single neurons (adaptation time constants). The ratio of E-I neuron numbers and of I-I to E-I connectivity in the second, third and fourth best configurations were either jointly increased or decreased with respect to our configuration. These results are reported in Fig. 2 and in Tables 2-3 and they are discussed in Results (page 5).
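For readers who want a concrete picture of the joint-sampling procedure, the following is a minimal sketch of random joint sampling over six parameters, with each sampled set scored by its trial-averaged loss. It is not the code used for the paper. The parameter names, the ranges in `PARAM_RANGES`, and the function `toy_trial_loss` are all hypothetical placeholders; in the actual analysis, each trial would be a simulation of the spiking network.

```python
import random

# Hypothetical parameter names and ranges, for illustration only;
# these are NOT the bounds used in the paper.
PARAM_RANGES = {
    "metabolic_constant": (0.0, 40.0),
    "noise_intensity": (0.0, 10.0),
    "tau_r_E": (10.0, 100.0),
    "tau_r_I": (10.0, 100.0),
    "ratio_E_to_I_neurons": (1.0, 10.0),
    "ratio_II_to_EI_connectivity": (1.0, 10.0),
}


def sample_parameter_set(ranges, rng):
    """Draw every parameter jointly and uniformly within its range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}


def toy_trial_loss(params, ranges, rng):
    """Placeholder for one simulation trial (assumption, not the real model).

    Returns a noisy quadratic bowl centred on the midpoint of each range.
    """
    loss = 0.0
    for name, (lo, hi) in ranges.items():
        midpoint = 0.5 * (lo + hi)
        loss += ((params[name] - midpoint) / (hi - lo)) ** 2
    return loss + rng.gauss(0.0, 0.01)


def mean_loss(params, ranges, n_trials, rng):
    """Average the loss over independent trials for one parameter set."""
    total = sum(toy_trial_loss(params, ranges, rng) for _ in range(n_trials))
    return total / n_trials


def random_search(ranges, n_sets, n_trials, seed=0):
    """Return (mean_loss, params) pairs sorted from best to worst."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_sets):
        params = sample_parameter_set(ranges, rng)
        results.append((mean_loss(params, ranges, n_trials, rng), params))
    results.sort(key=lambda item: item[0])
    return results


# A small run; the analysis described above used 10,000 sets with 20 trials each.
ranked = random_search(PARAM_RANGES, n_sets=200, n_trials=20)
best_loss, best_params = ranked[0]
```

The ranked list makes it straightforward to inspect the few best random configurations and compare them with a reference configuration, which is the comparison described above.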
Reviewer #3 (Public Review):
Summary:
In their paper the authors tackle three things at once in a theoretical model: how can spiking neural networks perform efficient coding, how can such networks limit the energy use at the same time, and how can this be done in a more biologically realistic way than previous work?
They start by working from a long-running theory on how networks operating in a precisely balanced state can perform efficient coding. First, they assume split networks of excitatory (E) and inhibitory (I) neurons. The E neurons have the task to represent some lower dimensional input signal, and the I neurons have the task to represent the signal represented by the E neurons. Additionally, the E and I populations should minimize an energy cost represented by the sum of all spikes. All this results in two loss functions for the E and I populations, and the networks are then derived by assuming E and I neurons should only spike if this improves their respective loss. This results in networks of spiking neurons that live in a balanced state, and can accurately represent the network inputs.
They then investigate in-depth different aspects of the resulting networks, such as responses to perturbations, the effect of following Dale's law, spiking statistics, the excitation (E)/inhibition (I) balance, optimal E/I cell ratios, and others. Overall, they expand on previous work by taking a more biological angle on the theory and showing the networks can operate in a biologically realistic regime.
Strengths:
(1) The authors take a much more biological angle on the efficient spiking networks theory than previous work, which is an essential contribution to the field.
(2) They make a very extensive investigation of many aspects of the network in this context, and do so thoroughly.
(3) They put sensible constraints on their networks, while still maintaining the good properties these networks should have.
Thanks for this summary and for these kind words of appreciation of the strengths of our work.
Weaknesses:
(1) The paper has somewhat overstated the significance of their theoretical contributions, and should make much clearer what aspects of the derivations are novel. Large parts were done in very similar ways in previous papers. Specifically: the split into E and I neurons was also done in Boerlin et al (2008) and in Barrett et al (2016). Defining the networks in terms of realistic units was already done by Boerlin et al (2008). It would also be worth it to discuss Barrett et al (2016) specifically more, as there they also use split E/I networks and perform biologically relevant experiments.
We improved the text to make sure that credit to previous studies is more precisely and more clearly given (see rebuttal to the specific suggestions of Reviewer 2 for a full list).
We apologize if this was not clear enough in the previous version.
With regard to the specific point raised here about the E-I split, we revised the text on page 2. With regard to the realistic units, we revised the text on page 3. Finally, we commented on relation between our results and results of the study by Barrett et al. (2016) on page 16.
(2) It is not clear from an optimization perspective why the split into E and I neurons and following Dale's law would be beneficial. While the constraints of Dale's law are sensible (splitting the population in E and I neurons, and removing any non-Dalian connection), they are imposed from biology and not from any coding principles. A discussion of how this could be done would be much appreciated, and in the main text, this should be made clear.
We indeed removed non-Dalian connections because Dale’s law is a major constraint for biological plausibility. Our logic was to consider efficient coding within the space of networks that satisfy this (and other) biological plausibility constraints. We did not intend to claim that removing the non-Dalian connections was the result of an analytical optimization. We clarified this in revision (page 4).
(3) Related to the previous point, the claim that the network with split E and I neurons has a lower average loss than a 1 cell-type (1-CT) network seems incorrect to me. Only the E population coding error should be compared to the 1-CT network loss, or the sum of the E and I populations (not their average). In my author recommendations, I go more in-depth on this point.
We carefully considered these possibilities and decided to compare only the E population of the E-I model with the 1-CT model. In Fig. 8G (7C of the first submission), E neurons have a slightly higher error and cost compared to the 1-CT network. In the revision, we compared the loss of E neurons of the E-I model with the loss of the 1-CT model. Using this comparison, we found that the 1-CT network has lower loss and is more efficient compared to E neurons of the E-I model. We revised Figure 8H and the text on page 14 to address this point.
(4) While the paper is supposed to bring the balanced spiking networks they consider in a more experimentally relevant context, for experimental audiences I don't think it is easy to follow how the model works, and I recommend reworking both the main text and methods to improve on that aspect.
We tried to make the presentation of the model more accessible to a non-computational audience in the revised paper. We carefully edited the text throughout to make it as accessible as possible.
Assessment and context:
Overall, although much of the underlying theory is not necessarily new, the work provides an important addition to the field. The authors succeeded well in their goal of making the networks more biologically realistic, and incorporating aspects of energy efficiency. For computational neuroscientists, this paper is a good example of how to build models that link well to experimental knowledge and constraints, while still being computationally and mathematically tractable. For experimental readers, the model provides a clearer link between efficient coding spiking networks to known experimental constraints and provides a few predictions.
Thanks for these kind words. We revised the paper to make sure that these points emerge more clearly and in a more accessible way from the revised paper.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Referring to the major comments:
(1) Be upfront about particular modelling choices and why you made them; avoid talk of a "striking/surprising", etc. ability to explain data when this actually requires otherwise-arbitrary choices and auxiliary assumptions. Ideally, this nuance is already clear from the abstract.
We removed all the "striking/surprising" and similar expressions from the text.
We added to the Abstract the assumption of equal time constants of the stimulus and of the membrane of E and I neurons and the assumption of the independence of encoded stimulus features.
In revision, we performed additional analyses (joint parameter sweeps, Monte-Carlo joint sampling of all 6 model parameters) providing additional evidence that the network parameters in Table 1 capture the optimal solution reasonably well. These are reported in Figs. 2, 6I and 7J and in Results (pages 5, 11 and 13). See the rebuttal to the weaknesses in the public review of Reviewer 2 for details.
(2) Make even more of an effort to acknowledge prior work on the importance of structured E-I and I-E connectivity.
We have revised the text (page 4) to better place our results within previous work on structured E-I and I-E connectivity.
(3) Be clear about the model's limitations and mention them throughout the text. This will allow readers to interpret your results appropriately.
We now comment more on model's limitations, in particular the simplifying assumption about the network's computation (page 16), the lack of E-E connectivity (page 3), the absence of long-term adaptation (page 10), and the simplification of only having one type of inhibitory neurons (page 16).
(4) Present your "predictions" for what they are: aspects of the model that can be made consistent with the existing data after some fitting. Except in the few cases where you make actual predictions, which deserve to be highlighted.
We followed the suggestion of the reviewer and distinguished cases where the model is consistent with the data (postdictions) from actual predictions, where empirical measurements are not available or not conclusive. We compiled a list of predictions and postdictions in response to the point 4 of Reviewer 1. In revision, we now comment about every property of the model as either reproducing a known property of biological networks (postdiction) or being a prediction. We improved the text in Results on pages 4, 5, 6, 7, 9, 10, 11, 12 and 13 to accommodate these requests.
Minor comments and recommendations
It's a sizable list, but most can be addressed with some text edits.
(1) The image captions should give more details about the simulations and analyses, particularly regarding sample sizes and statistical tests. In Figure 5, for example, it is unclear if the lines represent averages over multiple signals and, if so, how many. It's probably not a single realization, but if it is, this might explain the otherwise puzzling optimal number of three stimuli. Box plots visualize the distribution across simulation trials, but it's not clear how many. In Figure 7d, a star suggests statistical significance, but the caption does not mention the test or its results; the y-axis should also have larger limits.
All statistical results were computed on 100 or 200 simulation trials, depending on the figure, with duration of the trial of 1 second of simulated time. To compute statistical results in Fig. 1, we used 10 trials with duration of 10 seconds for each trial. Each trial consisted of M independent realizations of Ornstein-Uhlenbeck (OU) processes as stimuli, independent noise in the membrane potential and an independent draw of tuning parameters, such that the results are general over specific realization of these random variables. Realizations of the OU processes were independent across stimulus dimensions and across trials. We added this information in the caption of each figure.
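As an illustration of the stimulus generation described above, here is a minimal sketch of M independent, zero-mean Ornstein-Uhlenbeck processes integrated with the Euler-Maruyama scheme. It is not the simulation code of the paper. The time step `dt`, the noise amplitude `sigma`, and the function name `simulate_ou` are assumptions made for the example.

```python
import math
import random


def simulate_ou(n_dims, n_steps, dt, tau, sigma, seed=None):
    """Euler-Maruyama integration of n_dims independent zero-mean OU processes.

    Each dimension follows dx = -(x / tau) * dt + sigma * dW, starting at x = 0.
    Returns a list of n_dims trajectories, each of length n_steps.
    """
    rng = random.Random(seed)
    x = [0.0] * n_dims
    trajectory = [[0.0] * n_steps for _ in range(n_dims)]
    sqrt_dt = math.sqrt(dt)
    for step in range(n_steps):
        for m in range(n_dims):
            # Independent Gaussian increment for every dimension and time step.
            x[m] += -(x[m] / tau) * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
            trajectory[m][step] = x[m]
    return trajectory


# One trial: M = 3 features, 1 second of simulated time at an assumed 1 ms step,
# with a 10 ms time constant. A new seed per trial gives independent realizations.
stimuli = simulate_ou(n_dims=3, n_steps=1000, dt=0.001, tau=0.010, sigma=1.0, seed=1)
```

Drawing a fresh seed for every trial yields realizations that are independent across trials, and the separate Gaussian draw per dimension keeps the stimulus features independent of one another.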
The optimal number of M=3 stimuli is the result of measuring the performance of the network in 100 simulation trials (for each parameter value), thus following the same procedure as for all other parameters. Boxplots in Fig. 8G-H were also generated from results computed in 100 simulation trials, which we have now specified in the caption of the figure, together with the statistical test used for assessing the significance (two-tailed t-test). We also enlarged the limits of Fig. 8H (7D in the previous version).
(2) The Oldenburg paper (reference 62) finds suppression of all but nearby neurons in response to two-photon stimulation of small neural ensembles (instead of single neurons, as in Chettih & Harvey). This isn't perfectly consistent with the model's results, even though the Oldenburg experiments seem more relevant given the model's small size, and strong connectivity/high connection probability between similarly tuned neurons. What might explain the potential mismatch?
We sincerely apologize for not having been precise enough on this point when comparing our model against Chettih & Harvey and Oldenburg et al. We corrected the sentence (page 6) to remove the claim that our model reproduces both.
We speculate that the discrepancy between perturbing our model and the Oldenburg data may arise from the lack of E-E connectivity in our model. Synaptic connections between E neurons with similar selectivity could create an enhancement instead of suppression between neuronal pairs with very similar tuning. We added a sentence about this in the section with perturbation experiments “Competition across neurons with similar stimulus tuning emerging in efficient spiking networks” (page 7) where we discuss this limitation of our model. We feel that this example shows the utility of deriving perturbation results from our model, as not all networks with some degree of lateral inhibition will show the same perturbation results. Comparing the perturbation results of our model with those from real data thus has some value for better appreciating the strengths and limitations of our approach.
(3) "Previous studies optogenetically stimulated E neurons but did not determine whether the recorded neurons were excitatory or inhibitory" (p. 11). I believe Oldenburg et al. did specifically image excitatory neurons.
The reviewer is correct about Oldenburg et al. imaging specifically excitatory neurons. We have revised this part of the Discussion (page 15).
(4) The authors write that efficiency is particularly achieved where adaptation is stronger in E compared to I neurons (p. 7; Figure 4). Although this would be consistent with experimental data (the I neurons in the model seem akin to fast-spiking Pv+ cells), I struggle to see it in the figure. Instead, it seems like there are roughly two regimes. If either of the neuronal timescales is faster than the stimulus timescale, the optimisation fails. If both are at least as slow, optimisation succeeds.
We agree with the reviewer that the adaptation properties of our inhibitory neurons are compatible with Pv+ cells. What is essential for determining the dynamical regime of the network is less the relation to the time constant of the stimulus (t<sub>x</sub>) but rather the relation between the time constant of the population readout (t, which is also the membrane time constant) and the time constant of the single neuron (t<sub>r</sub><sup>y</sup> for y=E and y=I; see Eq. 23, 25 or 29e). The relation between t and t<sub>r</sub><sup>y</sup> determines if single neurons generate spike-triggered adaptation (t<sub>r</sub><sup>y</sup> > t) or spike-triggered facilitation (t<sub>r</sub><sup>y</sup> < t; see Table 4). In regimes with facilitation in either E or I neurons (or both), the network performance strongly deteriorates compared to regimes with adaptation (Fig. 5A).
Beyond adaptation leading to better performance, we also found different effects of adaptation in E and I neurons. We acknowledge that the difference between these effects was difficult to see in Fig. 4B of the first submission. We have now replotted the results from the previously shown Fig. 4B to focus on the adaptation regime only (since Fig. 5A already establishes that this is the regime with better performance). We also added figures showing the differential effect of adaptation in the E and I cell types on the firing rate and on the average loss (Fig. 5C-D). Fig. 5B and C (top plots) show that with adaptation in E neurons, the error and the loss increase more slowly than with adaptation in I neurons. Moreover, the firing rate in both cell types decreases with adaptation in E neurons, while this is not the case with adaptation in I neurons (Fig. 5D). These results are added to the figure panels specified above and discussed in the text on page 9.
To clarify the relation between neuronal and stimulus timescale, we now also added the analysis of network performance as a function of the time constant of the stimulus t<sub>x</sub> (Supplementary Fig. S5 C-E). We found that the model's performance is optimal when the time constant of the stimulus is close to the membrane time constant t. This result is expected, because the equality of these time constants was imposed in our analytical derivation of the model (t<sub>x</sub> = t). We see a similar decrease in performance for values of t<sub>x</sub> that are faster and slower with respect to the membrane time constant (Supplementary Fig. S5C, top). These results are added to the figure panels specified above and discussed in text on page 13.
(5) A key functional property of cortical interneurons is their lower stimulus selectivity. Does the model replicate this feature?
We think that whether I neurons are less selective than E neurons is still an open question. A number of recent empirical studies reported that the selectivity of I neurons is comparable to the selectivity of E neurons (see, e.g., Kuan et al. Nature 2024, Runyan et al. Neuron 2010, Najafi et al. Neuron 2020). In our model, the optimal solution prescribes a precise structure in recurrent connectivity (see Eq. 24 and Fig. 1C(ii)) and structured connectivity endows I neurons with stimulus selectivity. To show this, we added plots of example tuning curves and the distribution of the selectivity index across E and I neurons (Fig. 8E-F) and described these new results in Results (page 14). Tuning curves in our network were similar to those computed in a previous work that addressed stimulus tuning in efficient spiking networks (Barrett et al. 2016). We evaluated tuning curves using M=3 constant stimulus features and we varied one of the features while the two others were kept fixed. We provided details on how the tuning curves and the selectivity index were computed in a new Methods subsection (“Tuning curves and selectivity index”) on page 50.
(6) The final panels of Figure 4 are presented as an approach to test the efficiency of biological networks. The authors seem to measure the instantaneous (and time-averaged) E-I balance while varying the adaptation parameter and then correlate this with the loss. If that is indeed the approach (it's difficult to tell), this doesn't seem to suggest a tractable experiment. Also, the conclusion is somewhat obvious: the tighter the single neuron balance, the fewer unnecessary spikes are fired. I recommend that the authors clearly explain their analysis and how they envision its application to biological data.
We indeed measured the instantaneous (and time-averaged) E-I balance while varying the adaptation parameters and then correlated this with the loss. We did not want to imply that the latter panels of Figure 4 are a means to test the efficiency of biological networks or that we are suggesting new and possibly unfeasible experiments. We see it as a way to better understand conceptually how spike-triggered adaptation helps the network’s coding efficiency, by tightening the E-I balance in a way that reduces the number of unnecessary spikes. We apologize if the previous text was confusing in this respect. We have now removed the initial paragraph of the former Results subsection (including the subsection title) and added new text about the different effects of adaptation in E and I neurons on page 9. We also thoroughly revised Figure 5.
(7) The external stimuli are repeatedly said to vary (or be tracked) across "multiple time scales", which might inadvertently be interpreted as (i) a single stimulus containing multiple timescales or (ii) simultaneously presented stimuli containing different timescales. These scenarios are potential targets for efficient coding through neuronal adaptation (reference 21 in the manuscript and Pozzorini et al. Nat. Neuro. 2013), but they are not addressed in the current model. I recommend the authors clarify their statements regarding timescales (and if they're up for it, acknowledge this as a limitation).
We thank the reviewer for bringing up this interesting point. To address the second point raised by the Reviewer (simultaneously presented stimuli containing multiple timescales), we performed new analyses to test the model with simultaneously presented stimuli that have different timescales. We found that the model efficiently encodes such stimuli. We tested the case with a 3-dimensional stimulus where each dimension is an Ornstein-Uhlenbeck process with a different time constant. More precisely, we kept the time constant in the first dimension fixed (at 10 ms), and varied the time constant in the second and third dimensions such that the time constant in the third dimension is double that in the second dimension. We plotted the encoding error in every stimulus dimension for E and I neurons (Fig. 8B, left plot) as well as the encoding error and the metabolic cost averaged across stimulus dimensions (Fig. 8B, right plot). The results are briefly described in the text on page 13.
Regarding the case i) (single stimulus containing multiple timescales), we considered two possibilities. One possibility is that timescales of the stimulus are separable, and in this case a single stimulus containing several time scales can be decomposed in several stimuli with a single time scale each. As we assign a new set of weights for each dimension of the decomposed stimulus, this case is similar to the case ii) that we already addressed. Another possibility is that timescales of the stimulus cannot be separated. This case is not covered in the present analysis and we listed it among the limitations of the model. We revised the text (page 13) around the question of multiple time scales and included the citation of Pozzorini et al. (2013).
(8) It is claimed that the model uses a mixed code to represent signals, citing reference 47 (Rigotti et al., Nature 2013). But whereas the model seems to use linear mixed selectivity, the Rigotti reference highlights the virtues of nonlinear mixed selectivity. In my understanding, a linearly mixed code does not enjoy the same benefits since it’s mathematically equivalent to a non-mixed code (simply rotate the readout matrix). I recommend that the authors clarify the type of selectivity used by their model and how it relates to the paper(s) they cite.
The reviewer is correct that our selectivity is a linear mixing of input variables, and differs from the selectivity in Rigotti et al. (2013) which is non-linear. We revised the sentence on page 4 to clarify better that the mixed selectivity we consider is linear and we removed Rigotti’s citation.
(9) Reference 46 is cited as evidence that leaky integration of sensory features is a relevant computation for sensory areas. I don’t think this is quite what the reference shows. Instead, it finds certain morphological and electrophysiological differences between single pyramidal neurons in the primary visual cortex compared to the prefrontal cortex. Reference 46 then goes on to speculate that these are differences relevant to sensory computation. This may seem like a quibble, but given the centrality of the objective function in normative theories, I think it's important to clarify why a particular objective is chosen.
We agree that our reference of Amatrudo et al was not the best reference and that the previous text was confusing. We thus tried to improve on its clarity. We looked at the previous theoretical efficient coding papers introducing this leaky integration and we could not find in the previous theoretical work a justification of this assumption based on experimental papers. However, there is evidence that neurons in sensory structures, and in cortical association areas respond to time varying sensory evidence by summing stimuli over time with a weight that decreases steadily going back in time from the time of firing, which suggests that neurons integrate time-varying sensory features. In many cases, these integration kernels decay approximately exponentially going back in time, and several models explaining successfully perceptual readouts of neural activity work assuming leaky integration. This suggests that the mathematical approximation of leaky integration of sensory evidence, though possibly simplistic, is reasonable. We revised the text in this respect (page 2).
(10) The definition of the objective function uses beta as a tuning parameter, but later parts of the text and figures refer to a parameter g_L which might only be introduced in the convex combination of Eq. 40a.
This is correct. Parameter optimization has been performed on a weighted sum of the average encoding error and cost as given by Eq. 39a (40a in first submission), with the weighting g<sub>L</sub> for the error versus the cost, and not the beta that is part of the objective in Eq. 10. The convex combination in Eq. 39a allowed us to find a set of optimal parameters that is within biologically realistic parameter ranges, which includes realistic values for the firing threshold. The average encoding error and metabolic cost (the two terms on the right-hand side of Eq. 39a, without weighting with g<sub>L</sub>) in our network are of the same order (see Fig. 8G for the E-I model where these values are plotted separately for the optimal network). Weighting the cost with the optimal beta, which is in the range of ~10, would have yielded a network that optimizes almost exclusively the metabolic cost and would bias the results towards solutions with poor encoding accuracy.
To document more fully how the choice of weighting of the error with the cost (g<sub>L</sub>) affects the optimal parameters, we now added a new analysis (Fig. 8D and Supplementary Fig. S4 A-D and H) showing optimal parameters as a function of this weighting. We commented on these results in the text on pages 9-11 and 12. For further details, please see also the reply to point 1 of Reviewer 1.
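The dependence of the optimal parameter on the weighting can be sketched as follows. This is a minimal illustration, assuming the convex combination takes the form g_L * error + (1 - g_L) * cost, which is our reading of the description of Eq. 39a above and not a transcription of it. The function names and the values in `toy_sweep` are made up for the example and are not results from the paper.

```python
def average_loss(g_l, encoding_error, metabolic_cost):
    """Convex combination of error and cost (assumed form of the weighted sum)."""
    if not 0.0 <= g_l <= 1.0:
        raise ValueError("g_l must lie in [0, 1]")
    return g_l * encoding_error + (1.0 - g_l) * metabolic_cost


def optimal_value(sweep, g_l):
    """Return the swept parameter value that minimizes the weighted loss.

    sweep is a list of (parameter_value, encoding_error, metabolic_cost) rows.
    """
    best_row = min(sweep, key=lambda row: average_loss(g_l, row[1], row[2]))
    return best_row[0]


# Made-up sweep over one parameter: (value, encoding error, metabolic cost).
toy_sweep = [
    (5.0, 4.0, 9.0),
    (10.0, 3.0, 5.0),
    (15.0, 3.5, 3.0),
    (20.0, 6.0, 2.0),
]

# Optimal parameter value for a few weightings between cost-only and error-only.
optimum_by_weighting = {g_l: optimal_value(toy_sweep, g_l) for g_l in (0.0, 0.5, 0.7, 1.0)}
```

Because the error and the cost are stored separately for every swept value, the optimum for any weighting can be recomputed without rerunning simulations, which is why presenting the two terms separately is informative.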
(11) Figure 1J: "In E neurons, the distribution of inhibitory and of net synaptic inputs overlap". In my understanding, they are in fact identical, and this is by construction. It might help the reader to state this.
We apologize for an unclear statement. In E neurons, net synaptic current is the sum of the feedforward current and of recurrent inhibition (Eq. 29c and Eq. 42). With our choice of tuning parameters that are symmetric around zero and with stimulus features that have vanishing mean, the mean of the feedforward current is close to zero. Because of this, the mean of the net current is negative and is close to the mean of the inhibitory current. We have clarified this in the text (page 5).
(12) A few typos:
- p1. "Minimizes the encoding accuracy" should be "maximizes..."
- p1: "as well the progress" should be something like "as well as the progress"
- p. 11: "In recorded neurons where excitatory or inhibitory.", "where" should be "were"
- Fig3: missing parentheses (B)
- Fig4B: the 200 ticks on the y-scale are cut off.
- Panel Fig. 5a: "stimulus" should be "stimuli".
- Ref 24 "Efficient andadaptive sensory codes" is missing a space.
- p. 26: "requires" should be "required".
- On several occasions, the article "the" is missing.
We thank the reviewer for kindly pointing out the typos, which we have now corrected.
Reviewer #2 (Recommendations For The Authors):
I would like to give the authors more details about the two main weaknesses discussed above, so that they may address specific points in the paper. First, there is the relation to previous work. Several published articles have presented very similar results to those discussed here, including references 5, 26, 28, 32, 33, 42, 43, 48, and an additional reference not cited by the authors (Calaim et al. 2022 eLife e73276). This includes:
(1) Derivation of an E-I efficient spiking network, which is found in refs. 28, 42, 43, and 48. This is not reflected in the text: e.g., "These previous implementations, however, had neurons that did not respect Dale's law" (Introduction, pg. 1); "Unlike previous approaches (28, 48), we hypothesize that E and I neurons have distinct normative objectives...". The authors should discuss how their derivation compares to these.
We have now fully clarified on page 3 that our model builds on the seminal previous works that introduced E-I networks with efficient coding (Supplementary text in Boerlin et al. 2013, Chalk et al. 2016, Barrett et al. 2016).
(2) Inclusion of a slow adaptation current: I believe this also appears in a previous paper (Gutierrez & Deneve 2019, ref. 33) in almost the exact same form, and is again not reflected in the text: "The strength of the current is proportional to the difference in inverse time constants ... and is thus absent in previous studies assuming that these time constants are equal (... ref. 33)". Again, the authors should compare their derivation to this previous work.
We thank the reviewer for pointing this out. We sincerely apologize if our previous version did not recognize sufficiently clearly that the previous work of Gutierrez and Deneve (eLife 2019; ref 33) introduced first the slow adaptation current that is similar to spike-triggered adaptation in our model. We have made sure that the revised text recognizes it more clearly. We also explained better what we changed or added with respect to this previous work (see revised text on page 8).
The work by Gutierrez and Deneve (2019) emphasizes the interplay between a single-neuron property (an adapting current in single neurons) and a network property (network-level coding through structured recurrent connections). They use a network that does not distinguish E and I neurons. Our contribution instead focuses on adaptation in an E-I network. To improve the presentation following the Reviewer’s comment, we now better emphasize the differential effect of adaptation in E and in I neurons in the revision (Fig. 5 B-D). Moreover, Gutierrez and Deneve studied the effect of adaptation on slower time scales (1 or 2 seconds), while we study adaptation on a finer time scale of tens of milliseconds. The revised text detailing this is on page 8.
(3) Background currents and physical units: Pg. 26: "these models did not contain any synaptic current unrelated to feedforward and recurrent processing" and "Moreover previous models on efficient coding did not thoroughly consider physical units of variables" - this was briefly described in ref. 28 (Boerlin et al. 2013), in which the voltage and threshold are transformed by adding a common constant, and additional aspects of physical units are discussed.
It is correct that Boerlin et al (2013) suggested adding a common constant to introduce physical units. We now revised the text to make clearer the relation between our results and the results of Boerlin et al. (2013) (page 3). In our paper, we built on Boerlin et al. (2013) and assigned physical units to computational variables that define the model's objective (the targets, the estimates, the metabolic constant, etc.). We assigned units to computational variables in such a way that physical variables (such as membrane potential, transmembrane currents, firing thresholds and resets) have the correct physical units. We have now clarified how we derived physical units in the section of Results where we introduce the biophysical model (page 3) and specified how this derivation relates to the results in Boerlin et al. (2013).
(4) Voltage correlations, spike correlations, and instantaneous E/I balance: this was already pointed out in Boerlin et al. 2013 (ref 28; from that paper: "Despite these strong correlations of the membrane potentials, the neurons fire rarely and asynchronously") and others including ref. 32. The authors mention this briefly in the Discussion, but it should be more prominent that this work presents a more thorough study of this well-known characteristic of the network.
We agree that it would be important to comment on how our results relate to these results in Boerlin et al. (2013). It is correct that in Boerlin et al. (2013) neurons have strong correlations in the membrane potentials, but fire asynchronously, similarly to what we observe in our model. However, asynchronous dynamics in Boerlin et al. (2013) strongly depends on the assumption of instantaneous synaptic transmission and time discretization, with a “one spike per time bin” rule in numerical implementation. This rule enforces that at most one spike is fired in each time bin, thus actively preventing any synchronization across neurons. If this rule is removed, their network synchronizes, unless the metabolic constant is strong enough to control such synchronization to bring it back to asynchronous regime (see ref. 36). Our implementation does not contain any specific rule that would prevent synchronization across neurons. We now cite the paper by Boerlin and colleagues and briefly summarize this discussion when we describe the result of Fig. 3D on page 7.
(5) Perturbations and parameters sweep: I found one previous paper on efficient spiking networks (Calaim et al. 2022) which the authors did not cite, but appears to be highly relevant to the work presented here. Though the authors perform different perturbations from this previous study, they should ideally discuss how their findings relate to this one. Furthermore, this previous study performs extensive sweeps over various network parameters, which the authors might discuss here, when relevant. For example, on pg. 8, the authors write “We predict that, if number of neurons within the population decreases, neurons have to fire more spikes to achieve an optimal population readout” – this was already shown in Calaim et al. 2022 Figure 5, and the authors should mention if their results are consistent.
We apologize for not being aware of Calaim et al. (2022) when we submitted the first version of our paper. This important study is now cited in the revised version. We have now, as suggested, performed sweeps of multiple parameters inspired by the work of Calaim et al. This new analysis is described extensively in the reply to the Weaknesses in the Public Review of Reviewer 2, is shown in Figs. 2, 6I and 7J, and is described on pages 5, 11 and 13.
The Reviewer is also correct that the compensation mechanism that applies when changing the ratio of E-I neuron numbers is similar to the one described in Barrett et al. (2016) and related to our claim “if number of neurons within the population decreases, neurons have to fire more spikes to achieve an optimal population readout”. We have now added (page 11) that this prediction is consistent with the finding of Barrett et al. (2016).
With regard to the dependence of optimal coding properties on the number of neurons, we have tried to better describe similarities and differences with our work and that of Calaim et al as well as with the work of Barrett et al. (2016) which reports highly relevant results. These additional considerations are summarized in a paragraph in Discussion (page 16).
(6) Overall, the authors should distinguish which of their results are novel, which ones are consistent with previous work on efficient spiking networks, and which ones are consistent in general with network implementations of efficient and sparse coding. In many of the above cases, this manuscript goes into much more depth and study of each of the network characteristics, which is interesting and commendable, but this should be made clear. In clarifying the points listed above, I hope that the authors can better contextualize their work in relation to previous studies, and highlight what are the unique characteristics of the model presented here.
We made a number of clarifications of the text to provide better contextualization of our model within existing literature and to credit more precisely previous publications. This includes commenting on previous studies that introduced separate objective functions of E and I neurons (page 2), spike-triggered adaptation (page 8), physical units (page 3), and changes in the number of neurons in the network (page 16).
Next, there are the claims of optimal parameters. As explained on pg. 35 (criterion for determining optimal model parameters), it appears to me that they simply vary each parameter one at a time around the optimal value. This argument appears somewhat circular, as they would need to know the optimal parameters before starting this sweep. In general, I find these optimality considerations to be the most interesting and novel part of the paper, but the simulations are relatively limited, so I would ask the authors to either back them up with more extensive parameter sweeps that consider covariations in different parameters simultaneously (as in Calaim et al. 2022). Furthermore, the authors should make sure that they are not breaking any of the required relationships between parameters necessary for the optimization of the loss function. Again, some of the results (such as coding error not being minimized with zero metabolic cost) suggests that there might be issues here.
We thank the reviewer for this insightful suggestion. We have now added a joint sweep of all relevant model parameters using a Monte-Carlo parameter search with 10,000 iterations. We randomly drew parameter configurations from predetermined parameter ranges that are detailed in the newly added Table 2. Parameters were sampled from a uniform distribution. We varied all six model parameters studied in the paper (metabolic constant, noise intensity, time constant of single E and I neurons, ratio of E to I neurons and ratio of the mean I-I to E-I connectivity). We now present these results in a new Figure 2. We did not find any set of parameters with lower loss than the parameters in Table 1 when the weighting of the error with the cost was in the following range: 0.4<g<sub>L</sub><0.81 (Fig. 2C). While our large but finite Monte-Carlo random sampling does not fully prove that the configuration we selected as optimal (in Table 1) is a global optimum, it shows that this configuration is highly efficient. Further, and as detailed in the rebuttal to the Weaknesses of the Public Review of Referee 2, analyses of the near-optimal solutions are compatible with the notion (resulting from the joint parameter sweep studies that we added to Figures 6 and 7) that network optimality may be influenced by joint covariations in parameters. These new results are reported in Results (pages 5, 11 and 13) and in Figures 2, 6I and 7J.
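The sampling procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the code used for the paper: the parameter names, the ranges, and `toy_loss` (a stand-in for simulating the network and evaluating its average loss, with an arbitrarily placed minimum) are all hypothetical; the actual ranges are those in Table 2.

```python
import random


def sample_parameters(ranges, rng):
    """Draw one configuration, each parameter uniform within its range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}


def random_search(loss_fn, ranges, n_iter, seed=0):
    """Return the lowest-loss configuration found in n_iter random draws."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_iter):
        params = sample_parameters(ranges, rng)
        loss = loss_fn(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


# Hypothetical ranges for three of the parameters (illustration only).
ranges = {
    "metabolic_constant": (0.0, 30.0),
    "noise_intensity": (0.0, 10.0),
    "ei_neuron_ratio": (1.0, 8.0),
}


def toy_loss(p):
    """Stand-in loss with an arbitrary minimum; not the network loss."""
    return (
        (p["metabolic_constant"] - 14.0) ** 2 / 100.0
        + (p["noise_intensity"] - 5.0) ** 2 / 10.0
        + (p["ei_neuron_ratio"] - 4.0) ** 2
    )


best_params, best_loss = random_search(toy_loss, ranges, n_iter=10_000)
print(best_params, best_loss)
```

As noted in the reply, a finite random search of this kind can only show that no sampled configuration beats a reference configuration; it does not prove global optimality.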
Some more specific points:
(1) In general, I find it difficult to understand the scaling of the RMSE, cost, and loss values in Figures 4-7. Why are RMSE values in the range of 1-10, whereas loss and cost values are in the range of 0-1? Perhaps the authors can explicitly write the values of the RMSE and loss for the simulation in Figure 1G as a reference point.
Encoding error (RMSE), metabolic cost (MC) and average loss for a well performing network are within the range of 1-10 (see Fig. 8G or 7C in the first submission). To ease the visualization of results, we normalized the cost and the loss on Figs. 6-8 in order to plot them on the same figure (while the computation of the optima is done following the Eq. 39 and is without normalization). We have now explicitly written the values of RMSE, MC and the average loss (non-normalized) for the simulation in Fig. 1D on page 5, as suggested by the reviewer. We have also revised Fig. 4 and now show the absolute and not the relative values of the RMSE and the MC (metabolic cost).
(2) Optimal E-I neuron ratio of 4:1 and efficacy ratio of 3:1: besides being unintuitive in relation to previous work, are these two optimal settings related to one another? If there are 4x more excitatory neurons than inhibitory neurons, won't this affect the efficacy ratio of the weights of the two populations? What happens if these two parameters are varied together?
Thanks for this insightful point. Indeed, the optima of these two parameters are interdependent and positively correlated: if we decrease the E-I neuron ratio, the optimal efficacy ratio decreases as well. To better show this relation, we added figures with a 2-dimensional parameter search (Fig. 7J) where we varied the two ratios jointly. The red cross on the right figure marks the ratios used as optimal parameters in our study. These findings are discussed on page 13.
(3) Optimal dimensionality of M=[1,4]: Again, previous work (Calaim et al. 2022) would suggest that efficient spiking networks can code for arbitrary dimensional signals, but that performance depends on the redundancy in the network - the more neurons, the better the coding. From this, I don't understand how or why the authors find a minimum in Figure 7B. Why does coding performance get worse for small M?
We optimized all model parameters with M=3 and this is the reason why M=3 is the optimal number of inputs when we vary this parameter. Our network shows a distinct minimum of the encoding error as a function of the stimulus dimensionality for both E and I neurons (Fig. 8C, top). This minimum is reflected in the minimum of the average loss (Fig. 8C, bottom). The minimum of the loss is shifted (or biased) by the metabolic cost, with strong weighting of the cost lowering the optimal number of inputs. This is discussed on pages 13-14.
Here is a list of other, more minor points that the authors can consider addressing to make the results and text clearer:
(1) Feedforward efficient coding models: in the introduction (pg. 1) and discussion (pg. 11) it is mentioned that early efficient coding models, such as that of Olshausen & Field 96, were purely feedforward, which I believe to be untrue (e.g., see Eq. 2 of O&F 96). Later models made this even more explicit (Rozell et al. 2008). Perhaps the authors can either clarify what they meant by this, or downplay this point.
We sincerely apologize for the oversight present in the previous version of the text. We agree with the reviewer that the model in Olshausen and Field (1996) indeed defines a network with recurrent connections, and the same type of recurrent connectivity has been used by Rozell et al. (2008, 2013). The structure of the connectivity in Olshausen and Field (as well as in Rozell et al (2008)) is closely related to the structure of connectivity that we derived in our model. We have corrected the text in the introduction (page 1) to remove these errors.
(2) Pg. 2 - The authors state: "We draw tuning parameters from a normal distribution...", but in the methods, it states that these are then normalized across neurons, so perhaps the authors could add this here, or rephrase it to say that weights are drawn uniformly on the hypersphere.
We rephrased the description of how weights were determined (page 2).
(3) Pg. 2 - "We hypothesize the time-resolved metabolic cost to be proportional to the estimate of a momentary firing rate of the neural population" - from what I can see, this is not the usual population rate, which would be an average or sum of rates across the population.
Indeed, the time-dependent metabolic cost is not the population rate (in the sense of the sum of instantaneous firing rates across neurons), but is proportional to it by a factor of 1/t. More precisely, we can define the instantaneous estimate of the firing rate of a single neuron i as z<sub>i</sub>(t) = 1/t<sub>r</sub> r<sub>i</sub>(t) with r<sub>i</sub>(t) as in Eq. 7. We have clarified this in the revised text on page 3.
(4) Pg. 3: "The synaptic strength between two neurons is proportional to their tuning similarity if the tuning similarity is positive" - based on the figure and results, this appears to be the case for I-E, E-I, and I-I connections, but not for E-E connections. This should be clarified in the text. Furthermore, one reference given in the subsequent sentence (Ko et al. 2011, ref. 51), is specifically about E-E connections, so doesn't appear to be relevant here.
We have now specified that the Eq. 24 does not describe E-E connections. We also agree that the reference (Ko et al. 2011) did not adequately support our claim and we thus removed it and revised the text on page 3 accordingly.
(5) Pg. 3: "the relative weight of the metabolic cost over the encoding error controls the operating regime of the network" and "and an operating regime controlled by the metabolic constant" - what do you mean by operating regime here?
We used the expression “operating regime” in the sense of a dynamical regime of the network. However, we agree that this expression may be confusing and we removed it in revision.
(6) Pg. 3: "Previous studies interpreted changes of the metabolic constant beta as changes to the firing thresholds, which has less biological plausibility" - can the authors explain why this is less plausible, or ideally provide a reference for it?
In biological networks, global variables such as brain state can strongly modulate the way neural networks respond to a feedforward stimulus. These variables influence neural activity in at least two distinct ways. One is by changing non-specific synaptic inputs to neurons, which is a network-wide effect (Destexhe and Pare, Nature Reviews Neurosci. 2003). This is captured in our model by changing the strength of the mean and fluctuations in the external currents. Beyond modulating synaptic currents, another way of modulating neural activity is by changing cell-intrinsic factors that modulate the firing threshold in biological neurons (Pozzorini et al. 2013). Previous studies on spiking networks with efficient coding interpreted the effect of the metabolic constant as changes to the firing threshold (Koren and Deneve, 2017, Gutierrez and Deneve 2019), which corresponds to cell-intrinsic factors. Here we instead propose that the metabolic constant modulates the neural activity by changing the non-specific synaptic input, homogeneously across all neurons in the network. Interpreting the metabolic constant as setting the mean of the non-specific synaptic input was necessary in our model to find an optimal set of parameters (as in Table 1) that is also biologically plausible. We revised the text accordingly (page 4).
(7) Pg. 4: Competition across neurons: since the model lacks E-E connectivity, it seems trivial to conclude that there is competition through lateral inhibition, and it can be directly determined from the connectivity. What is gained from running these perturbation experiments?
We agree that a reader with a good understanding of sparse / efficient coding theory can tell that there is competition across neurons with similar tuning already from the equation for the recurrent connectivity (Eq. 24). However, we presume that not all readers can see this from the equations and that it is worth showing this with simulations.
Following the reviewer's comment, we have now downplayed the result about the model manifesting lateral inhibition in general on page 6. We have also removed its extensive elaboration in Discussion.
One reason to run perturbation experiments was to test to what extent the optimal model qualitatively replicates empirical findings, in particular, single neuron perturbation experiments in Chettih and Harvey, 2019, without specifically tuning any of the model parameters. We found that the model reproduces qualitatively the main empirical findings, without tuning the model to replicate the data. We revised the text on page 5 accordingly.
A further reason to run these experiments was to refine predictions about the minimal amount of connectivity structure that generates perturbation response profiles that are qualitatively compatible with empirical observations. To establish this, we did perturbation experiments while removing the connectivity structure of particular connectivity sub-matrices (E-I, I-I or I-E; Fig. S3 F). This allowed us to determine which connectivity matrix has to be structured to observe results that qualitatively match empirical findings. We found that the structure of E-I and I-E connectivity is necessary, but not the structure of I-I connectivity. Finally, we tested partial removal of the connectivity structure, where we replaced the precise (and optimal) connectivity structure with a simpler connectivity rule. In the optimal connectivity, the connection strength is proportional to the tuning similarity. A simpler connectivity rule, in contrast, only specifies that neurons with similar tuning share a connection, and beyond this the connection strength is random. Running perturbation experiments in a network obeying this simpler connectivity rule still qualitatively replicated empirical results from Chettih and Harvey (2019). This is shown in Supplementary Fig. S2F and described on page 8.
(8) Pg. 4: "the optimal E-I network provided a precise and unbiased estimator of the multidimensional and time-dependent target signal" - from previous work (e.g., Calaim et al. 2022), I would guess that the estimator is indeed biased by the metabolic cost. Why is this not the case here? Did you tune the output weights to remove this bias?
Output weights were not tuned to remove the bias. On Fig. 1H in the first submission we plotted the bias for the network that minimizes the encoding error. We forgot to specify this in the text and figure caption, for which we apologize. We now replaced this figure with a new one (Fig. 1E) where we plot the bias of the network minimizing the average loss (with parameters as in Table 1). The bias of the network minimizing the error is close to zero, B^E = 0.02 and B^I = 0.03. The bias of the network minimizing the loss is stronger and negative, B^E = -0.15 and B^I=-0.34. In the text of Results, we now report the bias of both networks (i.e., optimizing the encoding error and optimizing the loss). We also added a plot showing trial-averaged estimates and a time-dependent bias in each stimulus dimension (Supplementary figure S1 F). Note that the network minimizing the encoding error requires a lower metabolic constant (β = 6) than the network optimizing the loss (β=14), however, the optimal metabolic cost in both networks is nonzero. We revised the text and explained these points on page 5.
(9) Pg. 4: "The distribution of firing rates was well described by a log-normal distribution" - I find this quite interesting, but it isn't clear to me how much this is due to the simulation of a finite-time noisy input. If the neurons all have equal tuning on the hypersphere, I would expect that the variability in firing is primarily due to how much the input correlates with their tuning. If this is true, I would guess that if you extend the duration of the simulation, the distribution would become tighter. Can you confirm that this is the stationary distribution of the firing rates?
We now simulated the network with longer simulation time (10 seconds of simulated time instead of 2 seconds used previously) and also iterated the simulation across 10 trials to report a result that is general across random draws of tuning parameters (previously a single set of tuning parameters was used). The reviewer is correct that the distribution of firing rates of E neurons has become tighter with longer simulation time, but distributions remain log-normal. We also recomputed the coefficient of variation (CV) using the same procedure. We updated these plots on Fig. 1F.
(10) Pg. 4: "We observed a strong average E-I balance" - based on the plots in Figure 1J, the inputs appear to be inhibition-dominated, especially for excitatory neurons. So by what criterion are you calling this strong average balance?
The reviewer is correct about the fact that the net synaptic input to single neurons in our optimal network shows excess inhibition and the network is inhibition-dominated, so we revised this sentence (page 5) accordingly.
(11) Pg. 4: Stronger instantaneous balance in I neurons compared to E neurons - this is curious, and I have two questions: (1) can the authors provide any intuition or explanation for why this is the case in the model? and (2) does this relate to any literature on balance that might suggest inhibitory neurons are more balanced than excitatory neurons?
In our model, I neurons receive excitatory and inhibitory synaptic currents through synaptic connections that are precisely structured. E neurons receive structured inhibition and a feedforward current. The feedforward current consists of M=3 independent OU processes projected on the tuning vectors of E neurons w<sub>i</sub><sup>E</sup>. We speculate that because the synaptic inhibition and feedforward current are different processes and the 3 OU inputs are independent, it is harder for E neurons to achieve the instantaneous balance that would be as precise as in I neurons. While we think that the feedforward current in our model reflects biologically plausible sensory processing, it is not a mechanistic model of feedforward processing. In biological neurons, real feedforward signals are implemented as a series of complex feedforward synaptic inputs from downstream areas, while the feedforward current in our model is a sum of stimulus features, and is thus a simplification of a biological process that generates feedforward signals. We speculate that a mechanistic implementation of the feedforward current could increase the instantaneous balance in E neurons. Furthermore, the presence of EE connections could potentially also increase the instantaneous balance in E neurons. We revised the Discussion about these important questions that lie on the side of model limitations and could be advanced in future work. We could not find any empirical evidence directly comparing the instantaneous balance in E versus I neurons. We have reported these considerations in the revised Discussion (page 16).
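To make the description of the feedforward current concrete, the following sketch simulates M=3 independent, zero-mean Ornstein-Uhlenbeck (OU) processes and projects them onto unit-norm tuning vectors. It is an assumption-laden illustration: the time step, time constant, noise amplitude and population size are hypothetical, and the projection omits any scaling or filtering that the model equations may apply to the stimulus features.

```python
import numpy as np


def simulate_ou(n_dims, n_steps, dt, tau, sigma, rng):
    """Euler-Maruyama simulation of independent zero-mean OU processes."""
    s = np.zeros((n_steps, n_dims))
    for t in range(1, n_steps):
        noise = rng.standard_normal(n_dims)
        s[t] = s[t - 1] - (s[t - 1] / tau) * dt + sigma * np.sqrt(dt) * noise
    return s


def draw_tuning_vectors(n_neurons, n_dims, rng):
    """Normal draws normalized to unit length (uniform on the hypersphere)."""
    w = rng.standard_normal((n_neurons, n_dims))
    return w / np.linalg.norm(w, axis=1, keepdims=True)


rng = np.random.default_rng(0)
M, N_E = 3, 400  # M=3 as in the text; N_E is a hypothetical population size
stimulus = simulate_ou(M, n_steps=5000, dt=0.02, tau=10.0, sigma=2.0, rng=rng)
w_e = draw_tuning_vectors(N_E, M, rng)

# Each E neuron receives the stimulus projected onto its tuning vector.
feedforward = stimulus @ w_e.T  # shape (n_steps, N_E)
print(feedforward.shape, feedforward.mean())
```

Because the three OU inputs are independent of each other and of the recurrent inhibition, nothing in this construction ties the feedforward current to the inhibitory current on a moment-by-moment basis, which is the intuition offered above for the weaker instantaneous balance in E neurons.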
(12) Pg. 5, comparison with random connectivity: "Randomizing E-I and I-E connectivity led to several-fold increases in the encoding error as well as to significant increases in the metabolic cost" and Discussion, pg. 11: "the structured network exhibits several fold lower encoding error compared to unstructured networks": I'm wondering if these comparisons are fair. First, regarding activity changes that affect the metabolic cost - it is known that random balanced networks can have global activity control, so it is not straightforward that randomizing the connectivity will change the metabolic cost. What about shuffling the weights but keeping an average balance for each neuron's input weights? Second, regarding coding error, it is trivial that random weights will not map onto the correct readout. A fairer comparison, in my opinion, would at least be to retrain the output weights to find the best-fitting decoder for the three-dimensional signal, something more akin to a reservoir network.
Thank you for raising these interesting questions. The purpose of comparing networks with and without connectivity structure was to observe causal effects of the connectivity structure on the neural activity. We agree that the effect on the encoding error is close to trivial, because shuffling of connectivity weights decouples neural dynamics from decoding weights. We have carefully considered Reviewer's suggestions to better compare the performance of structured and unstructured networks.
In reply to the first point, we followed the reviewer's suggestion and compared the optimal network with a shuffled network that matched the optimal network in its average balance. This was achieved by increasing the metabolic constant, decreasing the noise intensity and slightly decreasing the feedforward stimulus (we did not find a way to match the net current in both cell types by changing a single parameter). As we compared the metabolic cost between the optimal and the shuffled network with matched average balance, we still found lower metabolic cost in the optimal network, even though the difference was now smaller. We replaced Fig. 3B from the first submission with these new results in Fig. 4B and commented on them in the text (page 7).
In reply to the second point, we followed the reviewer’s suggestion and compared the encoding error (RMSE) of the optimal network and the network with shuffled connectivity where decoding weights are trained so as to optimally reconstruct the target signal. As suggested, we now analyzed the encoding error of the networks using decoding weights trained on the set of spike trains generated by the network, using linear least-squares regression to minimize the decoding error. For a fair and quantitative comparison, and because we did not train the decoding weights of our structured model, we performed this same analysis using spike trains generated by networks with structured and with shuffled recurrent connectivity. We found that the encoding error is smaller in the E population and much smaller in the I population in the structured compared to the random network. Decoding weights found numerically in the optimal network approach the uniform distribution of weights that we used in our model (Fig. 4A, right). In contrast, decoding weights obtained from the random network do not converge to a uniform distribution, but instead form a much sparser distribution, in particular in I neurons (Supplementary Fig. S3 A). These additional results, reported in the above-mentioned figures, are discussed in the text on page 14.
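A least-squares decoder of the kind mentioned above could be fitted as in the sketch below. This is only an illustration: the matrix `r` stands in for filtered spike trains and is filled with synthetic random numbers, `true_weights` and the noise level are invented, and the function names are hypothetical rather than taken from the paper's code.

```python
import numpy as np


def train_decoder(filtered_spikes, target):
    """Least-squares weights D minimizing ||filtered_spikes @ D - target||."""
    weights, *_ = np.linalg.lstsq(filtered_spikes, target, rcond=None)
    return weights


def rmse(estimate, target):
    """Root-mean-square error between estimate and target."""
    return float(np.sqrt(np.mean((estimate - target) ** 2)))


rng = np.random.default_rng(1)
n_steps, n_neurons, n_dims = 2000, 50, 3

# Synthetic stand-ins: in practice r would be low-pass filtered spike trains
# and target the three-dimensional signal to be reconstructed.
r = rng.random((n_steps, n_neurons))
true_weights = rng.standard_normal((n_neurons, n_dims))
target = r @ true_weights + 0.1 * rng.standard_normal((n_steps, n_dims))

decoder = train_decoder(r, target)
print(rmse(r @ decoder, target))
```

Applying the same fitting step to spike trains from the structured and from the shuffled network gives each network its best linear readout, so that the remaining difference in RMSE reflects the recurrent dynamics rather than a mismatch between dynamics and decoding weights.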
(13) Pg. 5: "a shift from mean-driven to fluctuation-driven spiking" and Pg. 11 "a network structured as in our efficient coding solution operates in a dynamical regime that is more stimulus-driven, compared to an unstructured network that is more fluctuation driven" - I would expect that the balanced condition dictates that spiking is always fluctuation driven. I'm wondering if the authors can clarify this.
We agree with the reviewer that networks with and without connectivity structure are fluctuation-driven, because in a mean-driven network the mean current must be suprathreshold (Ahmadian and Miller, 2021), which is not the case of either network. We removed the claim of the change from mean to fluctuation driven regime in the revised paper. We are grateful to the Reviewer for helping us tighten the elaboration of our findings.
(14) Pg. 5: "suggesting that variability of spiking is independent of the connectivity structure" - the literature of balanced networks argues against this. Is this not simply because you have a noisy input? Can you test this claim?
We thank the reviewer for the suggestion. We tested this claim by measuring the coefficient of variation in networks receiving a constant stimulus. In particular, we set the same strength in each of the M=3 stimulus dimensions and set the stimulus amplitude such as to match the firing rate of the optimal network in response to the OU stimulus. We computed the coefficient of variation in 200 simulation trials. The removal of connectivity structure did not cause significant change of the coefficient of variation in a network driven by a constant stimulus (Fig. 4E). These additional results are discussed in text on page 7.
We have also taken up the suggestion about variability of spiking being independent of the connectivity structure. We removed this claim in the revision, because we only tested a couple of specific cases where the connectivity is structured with respect to tuning similarity (fully structured, fully unstructured and partially unstructured networks). This is not exhaustive of all possible structures that recurrent connectivity may have.
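For reference, a common way to compute the coefficient of variation of spiking is from interspike intervals (ISIs), as in the short sketch below. We assume the ISI-based definition here; the spike trains are synthetic examples (a perfectly regular train and a Poisson-like train at a hypothetical rate of 10 spikes per second) and are not data from the model.

```python
import random
import statistics


def coefficient_of_variation(spike_times):
    """CV of interspike intervals: std(ISI) / mean(ISI)."""
    isis = [t1 - t0 for t0, t1 in zip(spike_times, spike_times[1:])]
    if len(isis) < 2:
        raise ValueError("need at least three spikes to estimate the CV")
    return statistics.pstdev(isis) / statistics.mean(isis)


# Perfectly regular spiking: CV is (numerically) zero.
regular = [0.1 * k for k in range(50)]

# Poisson-like spiking with exponential ISIs: CV is close to one.
rng = random.Random(0)
t, poisson = 0.0, []
for _ in range(5000):
    t += rng.expovariate(10.0)
    poisson.append(t)

print(coefficient_of_variation(regular), coefficient_of_variation(poisson))
```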
(15) Pg. 6: "we also removed the connectivity structure only partially, keeping like-to-like connectivity structure and removing all structure beyond like-to-like" - can you clarify what this means, perhaps using an equation? What connectivity structure is there besides like-to-like?
In the optimal model, the strength of the synapse between a pair of neurons is proportional to the tuning similarity of the two neurons, Y<sub>ij</sub> proportional to J<sub>ij</sub> for Y<sub>ij</sub> >0 (see Eq. 24 and Fig. 1C(ii)). Besides networks with optimal connectivity, we also tested networks with a simpler connectivity rule. Such a simpler rule prescribes a connection if the pair of neurons has similar tuning (Y<sub>ij</sub> >0), and no connection otherwise. The strength of the connection following this simpler connectivity rule is otherwise random (and not proportional to pairwise tuning similarity Y<sub>ij</sub> as it is in the optimal network). We clarified this in the revision (page 8), also by avoiding the term “like-to-like” for the second type of networks, which could indeed be prone to confusion.
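The two connectivity rules contrasted in this reply can be sketched as follows. This is an illustrative toy, not the authors' implementation: it assumes that tuning similarity Y is the dot product of hypothetical tuning vectors, draws random strengths from an arbitrary uniform distribution, and ignores cell types and self-connections.

```python
import numpy as np

rng = np.random.default_rng(1)
n_dims, n_neurons = 3, 6                    # hypothetical sizes
W = rng.normal(size=(n_dims, n_neurons))    # hypothetical tuning vectors, one column per neuron
Y = W.T @ W                                 # assumed definition of pairwise tuning similarity

# Optimal rule: strength proportional to similarity where Y_ij > 0, no connection otherwise
J_optimal = np.where(Y > 0, Y, 0.0)

# Simpler rule: a connection exists where Y_ij > 0, but its strength is random
random_strength = rng.uniform(0.1, 1.0, size=Y.shape)  # hypothetical strength distribution
J_simple = np.where(Y > 0, random_strength, 0.0)

# Both rules connect exactly the same pairs; only the strengths differ
print(np.array_equal(J_optimal > 0, J_simple > 0))
```

In this sketch the two matrices share which pairs are connected, so any difference in network behaviour would come from whether strength tracks similarity.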
(16) Pgs. 6-7: "we indeed found that optimal coding efficiency is achieved with weak adaptation in both cell types" and "adaptation in E neurons promotes efficient coding because it enforces every spike to be error-correcting" - this was not clear to me. First, it appears as though optimal efficiency is achieved without adaptation nor facilitation, i.e., when the time constants are all equal. Indeed, this is what is stated in Table 1. So is there really a weak adaptation present in the optimal case? Second, it seems that the network already enforces each spike to be error-correcting without adaptation, so why and how would adaptation help with this?
We agree with the Reviewer that the network without adaptation in E and I neurons is already optimal. It is also true that most spikes in an optimal network should already be error-correcting (besides some spikes that might be caused by the noise). However, regimes with weak adaptation in E neurons remain close to optimality. Spike-triggered facilitation, meanwhile, adds spikes that are unnecessary and decrease network efficiency. We revised Fig. 5 (Fig. 4 in the first submission) and replaced the 2-dimensional plots in Fig. 4C-F with plots that show the differential effect of adaptation in E neurons (top) and in I neurons (bottom plots) for the measures of the encoding error (RMSE), the efficiency (average loss) and the firing rate (Fig. 5B-D). In the new Fig. 5C it is evident that the loss of the E and I populations grows slowly with adaptation in E neurons (top) while it grows faster with adaptation in I neurons (bottom). These considerations are explained in the revised text on page 9.
(17) Pg. 7: "adaptation in E neurons resulted in an increase of the encoding error in E neurons and a decrease in I neurons" - it would be nice if the authors could provide any explanation or intuition for why this is the case. Could it perhaps be because the E population has fewer spikes, making the signal easier to track for the I population?
We agree that this could indeed be the case. We commented on it in revision (page 9).
(18) Pg. 7: "The average balance was precise...with strong adaptation in E neurons, and it got weaker when increasing the adaptation in I neurons (Figure 4E)" - I found the wording of this a bit confusing. Didn't the balance get stronger with larger I time constants?
By increasing the time constant of I neurons, the average imbalance got weaker (closer to zero) in E neurons (Fig. 5G, left), but stronger (further away from zero) in I neurons (Fig. 5G, right). We have revised the text on page 9 to make this clearer.
(19) Pg. 7: Figure 4F is not directly described in the text.
We have now added text (page 9) commenting on this figure in revision.
(20) Pg. 8: "indicating that the recurrent network dynamics generates substantial variability even in the absence of variability in the external current" -- how does this observation relate to your earlier claim (which I noted above) that "variability of spiking is independent of connectivity structure"?
We agree that the claim about variability of spiking being independent of connectivity structure was overstated and we thus removed it. The observation that we wanted to report is that both structured and unstructured networks have very similar levels of variability of spiking of single neurons. The fact that much of the variability of the optimal network is generated by recurrent connections is not incompatible. We revised the related text (page 11) for clarity.
(21) Pg. 9: "We found that in the optimally efficient network, the mean E-I and I-E synaptic efficacy are exactly balanced" - isn't this by design based on the derivation of the network?
True, the I-E connectivity matrix is the transpose of the E-I connectivity matrix, and their means are the same by the analytical solution. This however remains a finding of our study. We have clarified this in the revised text (page 12).
(22) Pg. 30, eq. 25: the authors should verify if they include all possible connectivity here, or if they exclude EE connectivity beforehand.
We now specify that the equation for recurrent connectivity (Eq. 24, Eq. 25 in first submission) does not include the E-E connectivity in the revised text (page 41).
Reviewer #3 (Recommendations For The Authors):
Essential
(1) Currently, they measure the RMSE and cost of the E and I population separately, and the 1CT model. Then, they average the losses of the E and I populations, and compare that to the 1CT model, with the conclusion that the 1CT model has a higher average loss. However, it seems to me that only the E population should be compared to the 1CT model. The I population loss determines how well the I population can represent the E population representation (which it can do extremely well). But the overall coding accuracy of the network of the input signal itself is only represented by the E population. Even if you do combine the E and I losses, they should be summed, not averaged. I believe a more fair conclusion would be that the E/I networks have generally slightly worse performance because of needing to follow Dale's law, but are still highly efficient and precise nonetheless. Of course, I might be making a critical error somewhere above, and happy to be convinced otherwise!
We carefully considered the reviewer's comment and tested different ways of combining the losses of the E and I population. We decided to follow the reviewer's suggestion and to compare the loss of the E population of the E-I model with the loss of the one cell type model. As already evident from Fig. 8G, such a comparison indeed changes the result, making the 1CT model more efficient. Also, the sum of losses of E and I neurons results in the 1CT model being more efficient than the E-I model. Note, however, the robustness of the E-I model to changes in the metabolic constant (Fig. 6C, top). The firing rates of the E-I model stay within physiological ranges for any value of the metabolic constant, while the firing rates of the 1CT model skyrocket for metabolic constants lower than optimal (Fig. 8I).
We added to Results (page 14) a summary of these findings.
(2) The methods and main text should make much clearer what aspects of the derivation are novel, and which are not novel (see review weaknesses for specifics).
We specified these aspects, as discussed in more detail in the above reply to point 4 of the public review of Reviewer 1.
Request:
If possible, I would like to see the code before publication and give recommendations on that (is it easy to parse and reproduce, etc.)
We are happy to share the computer code with the reviewer and the community. We added a link to our public repository containing the computer code that we used for simulations and analysis to the preprint and submission (section “Code availability” on page 17).
Suggestions:
(1) I believe that for an eLife audience, the main text is too math-heavy at the beginning, and it could be much simplified, or more effort could be made to guide the reader through the math.
We tried to do our best to improve the clarity of description of mathematical expressions in the main text.
(2) Generally vector notation makes network equations for spiking neurons much clearer and easier to parse, I would recommend using that throughout the paper (and not just in the supplementary methods).
We now use vector notation throughout the paper whenever we think that this improves the intelligibility of the text.
(3) In the discussion or at the end of the results adding a clear section summarizing what the minimal requirements or essential assumptions are for biological networks to implement this theory would be helpful for experimentalists and theorists alike.
We have added such a section in Discussion (page 15).
(5) I think the title is a bit too cumbersome and hard to parse. Might I suggest something like 'Efficient coding and energy use in biophysically realistic excitatory-inhibitory spiking networks' or 'Biophysically constrained excitatory-inhibitory spiking networks can efficiently implement efficient coding'.
We followed reviewer’s suggestion and changed the title to “Efficient coding in biophysically realistic excitatory-inhibitory spiking networks.”
(6) How the connections were shuffled exactly was not clear to me in how it was described now. Did they just take the derived connectivity, and shuffle the connections around? I recommend a more explicit methods section on it (I might have missed it).
Indeed, the connections of the optimal network were randomly shuffled, without repetition, between all neuronal pairs of a specific connectivity matrix. This preserves all properties of the distribution of connectivity weights and removes only the structure of the connectivity, which is precisely what we wanted to test. We now added a section in Methods (“Removal of connectivity structure”) on pages 51-52 where we explain how the connectivity structure is removed.
(7) Figure 1 sub-panel ordering was confusing to read (first up down, then left right). Not sure if rearranging is possible, but perhaps it could be A, B, and C at the top, with subsublabels (i) and (ii). Might become too busy though.
We followed this suggestion and rearranged Fig. 1 as suggested by the reviewer.
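The shuffling described in this reply can be illustrated with a short sketch (not the authors' code; the function name and the example matrix are hypothetical). A random permutation of all entries of one connectivity matrix moves each weight to a new neuron pair exactly once, so the weight distribution is unchanged while any relation to tuning is destroyed.

```python
import numpy as np

def shuffle_connectivity(J, rng):
    """Randomly permute the entries of J among all neuron pairs (sampling without repetition)."""
    flat = rng.permutation(J.ravel())   # returns a shuffled copy; every weight is used exactly once
    return flat.reshape(J.shape)

rng = np.random.default_rng(2)
J = np.abs(rng.normal(size=(5, 8)))     # hypothetical connectivity matrix for one connection type
J_shuffled = shuffle_connectivity(J, rng)

# The multiset of weights is identical before and after shuffling
print(np.array_equal(np.sort(J.ravel()), np.sort(J_shuffled.ravel())))
```

Applying the function separately to each connectivity matrix would keep the weight statistics of each connection type intact, in line with the description above.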
(8) Equation 3 in the main text should specify that 'y' stands for either E or I.
This has been specified in the revision (page 3).
(9) Figure 1D shows a rough sketch of the types of connectivities that exist, but I would find it very useful to also see the actual connection strengths and the effect of enforcing Dale's law.
We revised this figure (now Fig. 1B (ii)) and added connection strengths as well as a sketch of a connection that was removed because of Dale’s law.
(10) The main text mentions how the readout weights are defined (normal distributions), but I think this should also be mentioned in the methods.
Agreed. We indeed had a Methods section, “Parametrization of synaptic connectivity” (page 46), where we explain how readout weights are defined. We apologize if the pointer to this section was not salient enough in the first submission. We made sure that the revised main text contains a clear pointer to this Methods section for details.
(11) The text seems to mix ‘decoding weights’ and ‘readout weights’.
Thanks for this suggestion to use consistent language. We opted for ‘decoding weights’ and removed ‘readout weights’.
(12) The way the paper is written makes it quite hard to parse what are new experimental predictions, and what results reproduce known features. I wonder if some sort of 'box' is possible with novel predictions that experimentalists could easily look at and design an experiment around.
We now revised the text. We clarified for every property of the model if this property is a prediction of facts that were not yet experimentally tested or if it accounts for previously observed properties of biological neurons. Please see the reply to point 4 of Reviewer 1.
(13) Typo's etc.:
Page 5 bottom -- ("all") should have one of the quotes change direction (common latex typo, seems to be the only place with the issue).
We thank the reviewer for pointing out this typo, which has been corrected in the revision.
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The authors investigated the anatomical features of the synaptic boutons in layer 1 of the human temporal neocortex. They examined the size of each synapse, the macular or perforated appearance, the size of the synaptic active zone, the number and volume of the mitochondria, and the number of synaptic and dense core vesicles, also differentiating between the readily releasable, the recycling, and the resting pool of synaptic vesicles. The coverage of the synapse by astrocytic processes was also assessed, and all the above parameters were compared to other layers of the human temporal neocortex. The authors conclude that the subcellular morphology of the layer 1 synapses are suitable for the functions of the neocortical layer, i.e. the synaptic integration within the cortical column. The low glial coverage of the synapses might allow increased glutamate spillover from the synapses, enhancing synaptic crosstalk within this cortical layer.
Strengths:
The strengths of this paper are the abundant and very precious data about the fine structure of the human neocortical layer 1. Quantitative electron microscopy data (especially that derived from the human brain) are very valuable since this is a highly time- and energy-consuming work. The techniques used to obtain the data, as well as the analyses and the statistics performed by the authors are all solid, strengthen this manuscript, and mainly support the conclusions drawn in the discussion.
We would like to thank Reviewer #1 for his very positive comments on our manuscript, stating that such data about the fine structure of the human neocortex are highly relevant.
Weaknesses:
There are several weaknesses in this work. First, the authors should check and review extensively for improvements to the use of English. Second, several additional analyses performed on the existing data could substantially elevate the value of the data presented. Much more information could be gained from the existing data about the functions of the investigated layer, of the cortical column, and about the information processing of the human neocortex. Third, several methodological concerns weaken the conclusions drawn from the results.
We would like to thank the reviewer for his critical and thus helpful comments on our manuscript. We addressed the reviewer's first comment, concerning the English, by rephrasing and shortening sentences. Secondly, according to the reviewer, several additional analyses should be performed on the existing data, which could substantially elevate the value of the data presented. We will implement some of the suggestions in the improved version of the manuscript where appropriate. We give a more detailed answer to the reviewer's queries in our replies to his suggestions to the authors (see below). However, the reviewer states himself: “The techniques used to obtain the data, as well as the analyses and the statistics performed by the authors are all solid, strengthen this manuscript, and mainly support the conclusions drawn in the discussion”.
Reviewer #2 (Public review):
Summary:
The study of Rollenhagen et al. examines the ultrastructural features of Layer 1 of the human temporal cortex. The tissue was derived from drug-resistant epileptic patients undergoing surgery, and was selected as far as possible from the epilepsy focus, and as such considered to be non-epileptic. The analyses included 4 patients with different ages, sex, medication, and onset of epilepsy. The manuscript is a follow-on study with 3 previous publications from the same authors on different layers of the temporal cortex:
Layer 4 - Yakoubi et al 2019 eLife
Layer 5 - Yakoubi et al 2019 Cerebral Cortex
Layer 6 - Schmuhl-Giesen et al 2022 Cerebral Cortex.
They find, that the L1 synaptic boutons mainly have a single active zone, a very large pool of synaptic vesicles, and are mostly devoid of astrocytic coverage.
Strengths:
The manuscript is well-written and easy to read. The Results section gives a detailed set of figures showing many morphological parameters of synaptic boutons and glial elements. The authors provide comparative data of all the layers examined by them so far in the Discussion. Given that anatomical data in the human brain are still very limited, the current manuscript has substantial relevance. The work appears to be generally well done, the EM and EM tomography images are of very good quality. The analysis is clear and precise.
We would like to thank the reviewer for his very positive evaluation of our paper and the comments that such data have a substantial relevance, in particular in the human neocortex. In contrast to reviewer#1, this reviewer’s opinion is that the manuscript is well written and easy to read.
Weaknesses:
One of the main findings of this paper is that "low degree of astrocytic coverage of L1 SBs suggests that glutamate spillover and as a consequence synaptic cross-talk may occur at the majority of synaptic complexes in L1". However, the authors only quantified the volume ratio of astrocytes in all 6 layers, which is not necessarily the same as the glial coverage of synapses. In order to strengthen this statement, the authors could provide 3D data (that they have from the aligned serial sections) detailing the percentage of synapses that have glial processes in close proximity to the synaptic cleft, that would prevent spillover.
We agree with the reviewer that we only quantified the volume ratio of the astrocytic coverage but not necessarily the percentage of synapses that may or may not contribute to the formation of the ‘tripartite’ synapse. As suggested, we will re-analyze our material with respect to the percentage of coverage for individual synaptic boutons in each layer and will implement the results in the improved version of the manuscript. However, since this is a completely new and time-consuming analysis, we would like to ask the reviewer for additional time to perform this task.
A specific statement is missing on whether only glutamatergic boutons were analyzed in this MS, or GABAergic boutons were also included. There is a statement, that they can be distinguished from glutamatergic ones, but it would be useful to state it clearly in the Abstract, Results, and Methods section what sort of boutons were analyzed. Also, what is the percentage of those boutons from the total bouton population in L1?
We would like to thank the reviewer for this comment. Although our title clearly states that we focused on quantitative 3D-models of excitatory synaptic boutons, we will point this out more clearly in the Methods and Results chapters. Our data support recent findings by others (see for example Cano-Astorga et al. 2023, 2024; Shapson-Coe et al. 2024) that have evaluated the ratio between excitatory vs. inhibitory synaptic boutons in the temporal lobe neocortex, the same area as in our study, which was between 10-15% inhibitory terminals but with a significant layer- and region-specific difference. We will include the excitatory vs. inhibitory ratio and the corresponding citations in the Results section.
Synaptic vesicle diameter (that has been established to be ~40nm independent of species) can properly be measured with EM tomography only, as it provides the possibility to find the largest diameter of every given vesicle. Measuring it in 50 nm thick sections results in underestimation (just like here the values are ~25 nm) as the measured diameter will be smaller than the true diameter if the vesicle is not cut in the middle, (which is the least probable scenario). The authors have the EM tomography data set for measuring the vesicle diameter properly.
We partially disagree with the reviewer on this point. Using high-resolution transmission electron microscopy, we measured the outer-to-outer membrane distance only on those synaptic vesicles that were round in shape with a clear ring-like structure to avoid double counts and discarded all those that were only partially cut according to criteria developed by Abercrombie (1946) and Boissonnat (1988). We assumed that within a 55±5 nm thick ultrathin section (silver to gray interference contrast) all clear ring-like vesicles were distributed in this section, assuming a vesicle diameter between 25 and 40 nm. For large DCVs, double counts were excluded by careful examination of adjacent images, and DCVs were only counted in the image where they appeared largest.
In addition, we have measured synaptic vesicles using TEM tomography and came to similar results. We will state in Material and Methods that both methods were used.
It is a bit misleading to call vesicle populations at certain arbitrary distances from the presynaptic active zone as readily releasable pool, recycling pool, and resting pool, as these are functional categories, and cannot directly be translated to vesicles at certain distances. Indeed, it is debated whether the morphologically docked vesicles are the ones, that are readily releasable, as further molecular steps, such as proper priming are also a prerequisite for release.
We thank the reviewer for this comment. However, nobody before us tried to define a morphological correlate for the three functionally defined pools of synaptic vesicles since synaptic vesicles normally are distributed over the entire nerve terminal. As already mentioned above, after long and thorough discussions with Profs. Bill Betz, Chuck Stevens, Thomas Schikorski and other experts in this field we tried to define the readily releasable (RRP), recycling (RP) and resting pools by measuring the distance of each synaptic vesicle to the presynaptic density (PreAZ). Using distance as a criterion, we defined the RRP as including all vesicles that were located within a distance (perimeter) of 10 to 20 nm from the PreAZ, that is, less than an average vesicle diameter (between 25 and 40 nm). The RP was defined as vesicles within a distance of 60-200 nm, still quite close and thus rapidly available on demand, and the remaining ones beyond 200 nm were suggested to belong to the resting pool. This concept was developed for our first publication (Sätzler et al. 2002), and since then this approximation has been widely acknowledged by scientists working in the field of synaptic neuroscience and by computational neuroscientists. We were asked by several labs worldwide whether they can use our data from the perimeter analysis for modeling. We agree that our definition of the three pools can be seen as arbitrary, but we never claimed that our approach is the truth and nothing but the truth. Concerning the debate whether only docked vesicles or also those very close to the PreAZ should constitute the RRP, we have a paper in preparation using our perimeter analysis, EM tomography and simulations that tries to clarify this debate. Our preliminary results suggest that the size of the RRP should be reconsidered.
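The distance criterion described in this reply can be written as a simple classification rule. The sketch below is illustrative only and is not the authors' analysis code: the function name is hypothetical, vesicles at up to 20 nm are all treated as RRP (an assumption about how the "10 to 20 nm" perimeter is applied), and distances between 20 and 60 nm, which the reply does not assign to a pool, are labeled "unassigned".

```python
def classify_pool(distance_nm):
    """Assign a vesicle to a putative pool from its distance (nm) to the presynaptic density."""
    if distance_nm <= 20:
        return "RRP"          # readily releasable pool: within the 10 to 20 nm perimeter (assumed <= 20 nm)
    if 60 <= distance_nm <= 200:
        return "RP"           # recycling pool: 60-200 nm
    if distance_nm > 200:
        return "resting"      # resting pool: beyond 200 nm
    return "unassigned"       # 20-60 nm: not specified in the reply

# Hypothetical vesicle distances in nm
distances = [12, 18, 45, 75, 150, 320]
counts = {}
for d in distances:
    pool = classify_pool(d)
    counts[pool] = counts.get(pool, 0) + 1
print(counts)
```

Writing the rule out this way makes the arbitrariness acknowledged in the reply explicit: moving a threshold changes only a constant, so the sensitivity of pool sizes to the chosen perimeters is easy to explore.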
Tissue shrinkage due to aldehyde fixation is a well-documented phenomenon that needs compensation when dealing with density values. The authors cite Korogod et al 2015 - which actually draws attention to the problem comparing aldehyde fixed and non-fixed tissue, still the data is non-compensated in the manuscript. Since all the previous publications from this lab are based on aldehyde fixed non-compensated data, and for this sake, this dataset should be kept as it is for comparative purposes, it would be important to provide a scaling factor applicable to be able to compare these data to other publications.
We thank the reviewer for his suggestion. However, for several reasons we did not correct for shrinkage caused by aldehyde fixation. Papers by Eyre et al. (2007) and the mentioned paper by Korogod et al. 2015 have demonstrated that cryo-fixation reveals larger numbers of docked synaptic vesicles, a smaller glial volume, and a less intimate glial coverage of synapses and blood vessels compared to chemical fixation. Other structural subelements such as active zone size and shape and the total number of synaptic vesicles remained unaffected. In two further publications, Zhao et al. (2012a, b), investigating hippocampal mossy fiber boutons using cryo-fixation and substitution, came to similar results with respect to bouton and active zone size and number and diameter of synaptic vesicles compared to aldehyde fixation as described by Rollenhagen et al. 2007 for the same nerve terminal. This was one of the reasons for not correcting for shrinkage. In addition, all cited papers state that chemical fixation in general provides a much better ultrastructural preservation of tissue samples when compared with cryo-fixation and substitution, where optimal preservation is only regional within a block of tissue and therefore less suitable for large-scale ultrastructural analyses such as those we performed.
Reviewer #3 (Public review):
Summary:
Rollenhagen et al. offer a detailed description of layer 1 of the human neocortex. They use electron microscopy to assess the morphological parameters of presynaptic terminals, active zones, vesicle density/distribution, mitochondrial morphology, and astrocytic coverage. The data is collected from tissue from four patients undergoing epilepsy surgery. As the epileptic focus was localized in all patients to the hippocampus, the tissue examined in this manuscript is considered non-epileptic (access) tissue.
Strengths:
The quality of the electron microscopic images is very high, and the data is analyzed carefully. Data from human tissue is always precious and the authors here provide a detailed analysis using adequate approaches, and the data is clearly presented.
We are very thankful to the reviewer for his very positive comments about our data analysis and presentation.
Weaknesses:
The study provides only morphological details, these can be useful in the future when combined with functional assessments or computational approaches. The authors emphasize the importance of their findings on astrocytic coverage and suggest important implications for glutamate spillover. However, the percentage of synapses that form tripartite synapses has not been quantified, the authors' functional claims are based solely on volumetric fraction measurements.
We thank the reviewer for his critical comments on our findings concerning the layer-specific astrocytic coverage, as also raised by reviewer#2. As already stated above, we will analyze the astrocytic coverage and the layer-specific percentage of astrocytic contribution to the ‘tripartite’ synapse in more detail. We are, however, a bit puzzled by the comment, one that structural anatomists often receive, that our study provides only morphological details. Our thorough analysis of structural and synaptic parameters of synaptic boutons underlies and might even predict the function of synaptic boutons in a given microcircuit or network and will thus very much improve our understanding and knowledge about the functional properties of these structures, in particular in the human brain where such studies are still quite rare. The main goal of our studies in the human neocortex was the quantitative morphology of synaptic boutons and thus the synaptic organization of the cortical column, layer by layer, which to our knowledge is the first such detailed study undertaken in the human brain. Our efforts have set a gold standard in the analysis of synaptic boutons embedded in different microcircuits and are by now very well accepted internationally.
The distinction between excitatory and inhibitory synapses is not clear, they should be analyzed separately.
As already stated above in response to reviewer#1 our study focused on excitatory synaptic boutons since they represent the majority of synapses. However, in the improved version of our manuscript in the Material and Method section we included a paragraph with structural criteria to distinguish excitatory from inhibitory terminals (see also our comment to reviewer#1 concerning this point) including appropriate citations.
The text connects functional and morphological characteristics in a very direct way. For example, connecting plasticity to any measurement the authors present would be rather difficult without any additional functional experiments. References to various vesicle pools based on the location of the vesicles are also more complex than suggested in the manuscript. The text should better reflect the limitations of the conclusions that can be drawn from the authors' data.
We thank the reviewer for this comment. However, it has been shown by numerous publications that the shape and size of the active zone together with the pool of synaptic vesicles and the astrocytic coverage critically determine synaptic transmission and synaptic strength, but can also contribute to the modulation of synaptic plasticity (see also citations within the text). It has been shown that synaptic boutons can switch under certain stimulation conditions to different modes of release (uni- vs. multiquantal, uni- vs. multivesicular release) and from asynchronous to synchronous release, leading also to the modulation of synaptic short- and long-term plasticity. To the second comment: When we started with our first paper about the Calyx of Held – principal neuron synapse in the MNTB (Sätzler et al. 2002) we tried to define a morphological correlate for the three functionally defined pools. As already mentioned above in our reply to the other two reviewers, this is rather difficult since synaptic vesicles are normally distributed over the entire nerve terminal. After long and thorough discussions with Bill Betz, Chuck Stevens and other leading scientists in the field of synaptic neuroscience, we together with Bert Sakmann tried to define a morphological correlate for the functionally defined pools using a perimeter analysis. We defined the readily releasable pool as vesicles 10 to 20 nm away from the presynaptic active zone, the recycling pool as those at 60-200 nm distance and the remaining ones as belonging to the resting pool. However, it has been shown by capacitance measurements (see for example Hallermann et al 2003), FM1-43 investigations (see for example Henkel et al. 1996) and high-resolution electron microscopy (see for example Schikorski and Stevens 2001; Schikorski 2014) that our estimate of the RRP nearly perfectly matches the functionally defined pools at hippocampal and cortical synapses (Silver et al. 2003).
In addition, in one of our own papers (Rollenhagen et al. 2018) we also estimated the RP functionally from trains of EPSPs using an exponential fit analysis and came to similar results regarding its size as with the perimeter analysis.
Of course, as stated by the reviewer, the scenario could be more complex when using other criteria, but we never claimed that our morphologically defined pools are the truth and nothing but the truth; we do believe, however, that they offer a quite good approximation.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Abstract:
Avoid the numerous abbreviations in the abstract. The paragraph describing the results obtained in this study is too short. Include more results, such as the size of the active zone, the proportion of perforated synapses, the ratio of synapses terminating on dendrites/spines, the percentage of volume occupied by mitochondria, etc. In the last paragraph, compare the layer-specific data to other layers of the neocortex before writing the concluding sentence.
To meet the word limits of the abstract (150 words) defined by eLife we had to use abbreviations. We followed the suggestions by the reviewer and expanded our abstract by adding the proportion of macular vs. perforated active zone and the percentage of mitochondria within an SB. However, we did not include the comparison of structural parameters in the Abstract since this is discussed thoroughly in the MS at other places (see Results and Discussion).
Results:
First of all, wonderful data! Lots of work, very valuable quantitative electron microscopy results.
Main concerns:
Adding several analyses would give much more information about the cortical synaptic organization. It would be very useful to differentiate between excitatory and inhibitory terminals (and give their ratio) and include this information in all different analyses, such as in the SV number, SV pool analysis, mitochondrion analysis, etc., that would give functional information as well. You have all the data for this, and you know how to differentiate between inhibitory and excitatory synapses, it can be done. We could see the possible morphological differences between excitatory and inhibitory synapses (maybe one is larger/has more SVs, etc. than the other). Based on these possible differences conclusions could be drawn about functional hypotheses, such as one or the other is more efficient in inducing postsynaptic potentials, excitation or inhibition is more pronounced in layer 1, etc. Furthermore, looking at the ratio of perforated synapses, we could gain information about the formation of new synapses. Maybe there is a difference between excitatory and inhibitory circuits in this point of view.
To the first point: Since our focus was on excitatory synaptic boutons, as already stated in the title, we have not analyzed inhibitory SBs. To do so, we would have to re-analyze our complete data set, which is time-consuming and would mean a substantial additional workload. However, we can give the proportion of inhibitory synaptic boutons, which was between 10-15% but with layer-specific differences. Our findings are in good agreement with a recent publication in Science by the Lichtman group (Shapson-Coe et al. 2024) and work by the DeFelipe group (Cano-Astorga et al. 2023, 2024), which, as we did, estimated the proportion of inhibitory boutons in different layers of the temporal lobe neocortex at 10-15%. We included a small paragraph about inhibitory synapses and their percentage, together with the citations, in our Results section. Concerning the ratio between macular, non-perforated and perforated active zones, we stated that the majority of synaptic boutons were of the macular, non-perforated type (~75%; see improved version of the MS). Perforations, if present, were found predominantly on the postsynaptic site, but were quite rare in L1 SBs. Since GABAergic terminals had only a small or no clearly visible PSD, this would be hard to assess for them.
To the last point, it has been demonstrated that the number of dense core vesicles and their fusion with the presynaptic density could be a critical factor in the build-up of the active zone. In addition, the findings of the Geinisman group suggesting that perforated synapses are more efficient than non-perforated ones are nowadays highly controversial, since other factors such as the size of the active zone (see for example Matz et al. 2010; Holderith et al. 2012) and the astrocytic coverage contribute to synaptic efficacy and strength.
Related to this topic: although in the case of rat CA1 pyramidal cells all inhibitory synapses terminated on dendritic shafts (Megias et al., Neuroscience 2001), please be aware that both excitatory and inhibitory synapses can terminate on both dendritic shafts and spines in humans (inhibitory synapses are though rare on spines, usually less than 10%, but they do exist, see for example Wittner et al, Neuroscience, 2001). Please, define the excitatory/inhibitory nature of the synapses based on morphological features (not on their postsynaptic target), i.e., flattened vesicles and thin postsynaptic density for GABAergic synapses, whereas larger, round vesicles and thick postsynaptic density for glutamatergic synapses. Anyway, the ratio of excitatory and inhibitory synapses on dendrites and spines in the two sublamina would also give useful information about the synaptic organization of the human neocortical layer 1.
We are aware that not all terminals targeting spines are excitatory; in turn, it has been shown that not all terminals on shafts are inhibitory, as was long thought (Silver et al. 2003). However, as stated by the reviewer, their abundance on spines is rather low. At the moment it is rather unclear which functional impact inhibitory terminals on spines have, apart from local inhibition (see for example Kubota et al. eLife 2015), and thus their role remains rather speculative since excitatory synapses are the predominant class on dendritic spines. As already stated above, the proportion of inhibitory terminals is between 10-15% and not significantly different between the two sublaminae. We have added this to the Results section (see the improved version of the manuscript).
(2) About the glial coverage: Please, specify how glial elements were determined. What were the morphological features specific to astroglial processes? In Figure 5, how could we know whether the glial element marked by green is not a spine neck? The lack of morphological features specific to glial processes makes this analysis weak. The most accurate would be to make it with the aid of GFAP staining. I know this is not possible with your existing data, but at least, provide information on how glial processes were identified.
We used the criteria first described by Peters et al. (1991) and Ventura and Harris (1999), identifying astrocytic profiles by their irregular stellate shape, relatively clear cytoplasm, numerous glycogen granules and bundles of intermediate filaments. After more than 20 years of structural investigations, we hope that the reviewers will believe us that we can identify astrocytic processes at the high-resolution TEM level. In some of our publications (Rollenhagen et al. 2007; 2015; 2018; Yakoubi et al. 2019a) we have used glutamine synthetase pre-embedding immunohistochemistry to identify astrocytic processes, but a disadvantage of this method is the reduced ultrastructural preservation of the tissue. We have included the criteria used to identify astrocytic processes for the analysis of glial coverage in our manuscript, together with the two citations (see improved version of the manuscript).
(3) The authors state that the total number of SVs was very variable. How was the distribution of the number of SVs? Homogenous distribution suggests that different types of synapses cannot be distinguished based on their morphological features, whereas distribution with more than one peak would suggest that different types of synapses are present in L1, and that they can be differentiated by their appearance (number of SVs, for example). This might be also related to the type of synapse (i.e., excitatory or inhibitory). The same applies to the number of RP and resting pool SVs.
To look for differences in structural and synaptic parameters that could further classify synaptic boutons, we have performed a hierarchical cluster and multivariance analysis. However, it turned out that, based on structural and functional parameters, no further classification into subtypes could be made.
(4) The authors should check and review extensively for improvements to the use of English. The Results and Discussion sections contain many sentences which are not easy to understand. They have either a too complicated structure, or they are incomplete and hard to follow. Few examples: "The RRP/PreAZ at p20 nm criterium was on average 19.05 ± 17.23 SVs (L1a: 25.04 ± 21.09 SVs and L1b: 13.07 ± 13.87 SVs) and thus nearly 2-fold larger for L1a." If you take out the parenthesis, the sentence has no meaning. "The majority of SBs in L1 of the human TLN had a single at most three AZs that could be of the non-perforated macular or perforated type comparable with results for other layers in the human TLN but by ~1.5-fold larger than in rodent and non-human primates." Rephrase these types of sentences, please.
We partially agree with the reviewer. We have improved our manuscript by rephrasing and shortening sentences.
Other suggestions:
(1) Put the synaptic density part after the description of the neuronal and synaptic composition part, it is more logical this way (i.e., first qualitative description, the distinction between sublayers, then quantitative data). Please write down in the description of the neuronal and synaptic composition part how L1a and L1b were differentiated (see also my comment on Figure 1).
We agree with the reviewer and did the change according to the suggestion. For a better understanding, we have also expanded the neuronal and synaptic description of the two sublaminae in L1.
(2) Introduce a list of abbreviations at the beginning, that would help.
It is quite unusual to provide a list of abbreviations in eLife. However, when used first the full meaning of the abbreviations is now given.
(3) What is cleft width? Usually, it refers to the distance between the pre- and the postsynaptic membrane, but here, I think it refers to the size (diameter) of the active zone. Please, clarify in the Result section (as it appears earlier than the Methods section, where it is explained). I would probably use the expression "synaptic cleft size" instead of "synaptic cleft width" to avoid misunderstanding.
We thank the reviewer for the suggestion and used synaptic cleft size for better clarity and have transferred the sentence from the Material and Methods to the Results section.
(4) The description of the different SVs (RRP, RP, etc.) is not clear in lines 236-242. What does it mean, that RRP vesicles are located ≤10 nm and ≤20 nm from the active zone? Explain, why the two different distance criteria were used. Furthermore, how were the vesicles located at p20-p60 defined? Why were these vesicles not considered in the determination of the different pools?
As stated in the public review in response to the reviewer's concern, we have tried to define a morphological correlate to the three functionally defined pools. After thorough discussions with leading scientists in the field of synaptic neuroscience, we decided to use the distance of individual vesicles from the PreAZ and to sort vesicles according to these criteria. One can argue that this approach is arbitrary; however, these distance criteria were described by Rizzoli and Betz (2004, 2005) and Denker and Rizzoli (2010). As also stated in the public review, there is still a controversial discussion about whether only docked or omega-shaped SVs constitute the RRP. We decided that SVs located within 10 and 20 nm of the PreAZ, which is less than one SV diameter, may also contribute to the RRP, since it was shown that SVs are quite mobile.
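The distance criteria discussed here can be written down as a short sorting rule. The following is only a minimal sketch, not the authors' analysis code: the function names, the treatment of the boundary values as inclusive, and the label "unassigned" for vesicles between 20 and 60 nm (the range the reviewer asks about) are assumptions made for illustration.

```python
from collections import Counter

# Distance thresholds in nm, taken from the perimeter criteria in the text.
# Whether each boundary is inclusive is an assumption of this sketch.
RRP_MAX_NM = 20.0   # p20 criterion for the readily releasable pool (RRP)
RP_MIN_NM = 60.0    # lower bound of the recycling pool (RP)
RP_MAX_NM = 200.0   # upper bound of the recycling pool (RP)


def classify_vesicle(distance_nm: float) -> str:
    """Assign one vesicle to a pool based on its distance from the PreAZ."""
    if distance_nm < 0:
        raise ValueError("distance must be non-negative")
    if distance_nm <= RRP_MAX_NM:
        return "RRP"
    if RP_MIN_NM <= distance_nm <= RP_MAX_NM:
        return "RP"
    if distance_nm > RP_MAX_NM:
        return "resting"
    # 20-60 nm: not covered by the pool definitions quoted in the text.
    return "unassigned"


def pool_sizes(distances_nm):
    """Count vesicles per pool for one synaptic bouton."""
    return Counter(classify_vesicle(d) for d in distances_nm)


# Hypothetical distances (nm) for a single bouton, for illustration only.
example = pool_sizes([5, 15, 40, 60, 150, 200, 250])
```

Applying the stricter p10 criterion would only require lowering `RRP_MAX_NM` to 10.0, which makes the dependence of the RRP estimate on the chosen perimeter explicit.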
(5) Please, explain how the number of docked vesicles can be 3x larger in L1b, than the number of vesicles located at p10? Docked vesicles are the closest (with the membrane touching the PreAZ)... if this comes from the fact that another pool of boutons was used for the EM tomography analysis than the entire pool of boutons analyzed, then it means that the selection of boutons for the EM tomography is highly biased. This also implies that EM tomography data are most probably not valid for the entire L1b. The difference might also come from the different ratios of dendrite/spine synapses included in the two different analyses. In this case, it would be helpful to distinguish between synapses terminating on dendrites/spines and analyse them separately (same as for inhibitory/excitatory, which is not exactly the same as dendrite/spine!). Different n numbers of synapses are given in the text (n=25, 25, 25, 25) and in Table 2 (n=91, 98, 87, and 84) for the analysis of the docked vesicles, please, correct this.
This is a correct value and thus there is a nearly 3-fold difference. The TEM tomography was carried out on the same blocks that have been used for our 3D-volume reconstructions. To carry out TEM tomography we had to use thicker sections (250 nm) to look for complete SBs as we also did in our serial sections, but of course, we could not quantify the same SBs. The completeness of SBs was one of our main criteria to reconstruct structural and synaptic parameters. The second was that the synaptic cleft was cut perpendicular. Only SBs that met these criteria were chosen for further quantitative analysis. In this respect we are of course biased in both methods.
Secondly, as already stated we did not quantify inhibitory terminals in serial sections. However, we did not find significant differences between shaft vs. spine synapses.
Finally, Table 2 gives the total number of ‘docked’ SVs counted across all SBs analyzed.
Discussion:
Please include the recent findings of human L1 neurons, including the "rosehip" cells in the L1 neuronal network, see Boldog et al., Nat Neurosci 2018. It would be also useful to consider in the discussion the human-specific cortical synchrony and integration phenomena derived from in vitro data (Mansvelder, Lein, Tamas, Wittner, Larkum, Huberfeld labs, etc.), and how the synaptic morphology can be related to these.
We thank the reviewer and have included the reference in our section on functional significance.
Figures and Tables:
Figure 1: In the legend, it is written that CR cells are marked by an asterisk, but on the figure it is marked by arrowheads. H: I would put the dashed line slightly lower, just above the two neuronal cell bodies. Now it looks like in the middle of the astrocytic layer. One of the asterisks marking the CR cell is not above the nucleus of that cell. I: the gabaergic neuron is outside of the framed area. I would delete the frame, anyway, the arrowheads and the asterisk are enough to show what the authors want to show.
We have changed the Figure according to the suggestions raised by the reviewer.
Figure 3: The transparent yellow is not visible. It is a bit disturbing that the contours of the boutons are not visible, I would make the transparent yellow stronger (less transparent). The SVs in green/magenta will be still visible.
We wanted to highlight the internal subelements of SBs and thus made the covering transparent but we think it is still visible.
Figure 6C: The data concerning other layers than L1 are most probably taken from other publications of the research group. One is cited (for L6), but not the others. Please correct this, or if not, then write this in the Results and Methods.
We changed the citation in the improved version of the manuscript. We overlooked that the values for L4 and L5 were already published in Schmuhl-Giesen et al. 2022.
Table 1: What does central and lateral cleft width mean in Table 1? Furthermore, please, give the name for abbreviations CV and IQR in Tables 1 and 2.
The measurements of the synaptic cleft are now described in detail in the Results section. We now have given the full names for CV and IQR in the legends of tables 1 and 2.
Supplemental Figures 1 and 2: Why Hu01 and Hu02 are twice? What is the difference? Based on the figure legend, it is L1a and L1b? If yes, please, indicate on the figure or in the legend.
Supplemental Table 1: What is TLE in the case of Hu_04? If it is temporal lobe epilepsy, then why age at epilepsy onset is missing?
Yes, Hu01 and Hu02 were each used for both L1a and L1b, in separate serial section preparations. We have now indicated this in the figure legend. Concerning Hu_04, unfortunately we do not have any further information about the medical background of the patient.
Supplemental Table 1 (Patient table): there are many abbreviations explained which do not appear in the table (lBAZ: Brivaracetam CBZ: Carbamazepine; CLB: Clobazam; ESL: Eslicarbazepin; GGL: Ganglioglioma, etc.), please check and correct.
We have removed the unnecessary abbreviations.
Other minor suggestions:
What is Pr? Please, give the name at first appearance (line 368).
We explained Pr (release probability) when used for the first time.
Give the name for t-LDT, please (lines 442-443).
We explained t-LTD (timing-dependent long-term depression) when used for the first time.
Typo in line 169: DCW instead of DCV (dense core vesicle), DCV is used in the figure legends.
We changed DCW to DCV.
Typo in line 190: Yokoubi instead of Yakoubi (reference).
We changed Yokoubi to Yakoubi.
Typo in line 237: Rizzoloi instead of Rizzoli (reference).
We changed Rizzoloi to Rizzoli.
Line 229-230: One reference is not inserted properly - Piccolo and Bassoon.
The references to Schoch and Gundelfinger and to Mukherjee on the build-up of the active zone and the role of DCVs containing Piccolo and Bassoon are now properly cited in the text.
Typo in line 398: exit instead of exist.
Corrected
Typo in line 700: Reynolds (1063) instead of 1963.
Corrected
Reviewer #2 (Recommendations for the authors):
Abstract:
The last sentence seems far-fetched, and unrelated to the manuscript. How mostly single active zone boutons can "mediate, integrate and synchronize contextual and cross-modal information, enabling flexible and state-dependent processing of feedforward sensory inputs from other layers of the cortical column"? Which of the anatomical findings of the manuscript led to these conclusions?
According to the review by Schuman et al. (2021), layer 1 is regarded as a layer that mediates, integrates and synchronizes contextual and cross-modal information, enabling flexible and state-dependent processing of feedforward sensory inputs from other layers of the cortical column. The quantitative structural 3D models of SBs contribute to this view, since SBs are an integral element connecting neurons and building networks.
I am also puzzled by the authors' statement in more than one place of the manuscript that "L1a can be characterized as a predominantly astrocytic sublamina". If the L1 contains the lowest measured volume ratio of glial processes (Figure 6), then this description does not seem to hold. Please rephrase.
The reviewer is right and we rephrased the sentences for more clarity in the improved version of our manuscript.
Results:
The authors find large inter-patient variability in the synapse density at L1, which raises the issue of what were the criteria to include certain patients in the analyses. Apparently, these are different from the ones analysed in their previous papers, and all the provided parameters were different (sex, age, medication, onset of epilepsy), and any of them can result in altered synapse density.
First, we have not used all patients for this study. Secondly, it was not possible to use all patients for all six layers.
It would be useful to add a panel for Figure 1 with synapse density across the different layers, as they provide this data in the Discussion.
We implemented a Supplementary Table 1 with the synaptic density values over all layers compared in the Discussion.
I cannot find Source Data 1 in the manuscript although it is referred to in more than 1 place (e.g. page 5 line 100).
Source data were uploaded as Supplemental Material when our manuscript was submitted directly to eLife. However, as stated by bioRxiv, ‘any Supplemental Materials associated with this manuscript have not been transferred to bioRxiv to avoid the posting of potentially sensitive information’; therefore, the source data were not uploaded to the preprint server.
Page 5 line 100 the correct value is 7.3*10^7 or rather 10^8?
We corrected the value in the improved version of the MS.
It would be nice to put the synapse density values into context by comparing them to e.g. mouse, rat, or monkey data.
Since we are working on the human temporal lobe neocortex, we avoided comparing our data with those estimated in experimental animals. In addition, as discussed by DeFelipe et al. (1999), different methods were used to quantify synaptic density in experimental animals, so these results are difficult to compare.
Page 5 Line 117 CR-cells stands for Cayal-Retzius cells?
CR-cells is the abbreviation for Cajal-Retzius cells.
Page 6 Line 146 repeated sentence.
We deleted the repeated sentence.
Page 7 Line 154 "file-scale TEM" ??
We replaced file-scale by fine-scale.
Page 7 Line 164 "GABAergic synapses identified by the smaller more spherical SVs". With this fixation condition, GABAergic vesicles are more ovoid than glutamatergic ones. What were the criteria to distinguish them?
To our knowledge, numerous publications using the same fixation have by now shown that inhibitory terminals contain smaller and more spherical, rather than roundish, synaptic vesicles and show no clear, prominent PSDs, as described in our paper. We have addressed this more clearly in the Results section of the improved version of the MS.
Page 8 line 197 "The majority (~98%) of SBs in L1a and L1b had only a single (Figures 2C-E, 3A-C, E) at most two or three AZs" is in striking contrast with the other statement from page 7 Line 163 "Numerous SBs in both sublaminae were seen to establish either two or three synaptic contacts on the same spine or dendrite". Which of these statements is valid? Please provide exact quantification for this statement and decide which one is true.
It is true that the majority of synaptic boutons had a single active zone. However, for example on a spine not only a single but also two or three SBs can be found. We have rephrased this sentence for more clarity.
Page 9 Line 206 "L1 AZs did not show a large variability in size as indicated by the low SD, CV, and variance (Table 1)" Is this inter-patient variance of mean values? As in Supplementary Figure 1, both the SBs volume and PreAZ area show large variability in a given patient sample. Only the inter-patient variability of mean values seems low. Please state it clearly throughout the MS for other datasets as well.
For clarity concerning the variability between patients and structural parameters we have generated box plots (Suppl. Figures 1 and 2).
Page 9 Line 208 data is on Figure 5A and not 8A.
We thank the reviewer and corrected the citation of the Figure.
Page 12 Line 295 how can the number of docked vesicles for L1b be larger than the one measured by the perimeter p10 nm? This later should contain the docked and PreAZ membrane proximal pool as well. This difference is even larger if we assume, that at EM tomography only partial AZs were analysed in a 200 nm thick section, not the entire AZ as for the perimeter measurement. Can the authors provide density estimates by dividing the docked / p10 nm vesicle numbers with the AZ area and comparing them?
This is a result of comparing both methods. To the second concern: As stated in the text, only synaptic boutons where the active zone could be followed from its beginning to its end and where the synaptic cleft was cut perpendicularly were included in the TEM tomography sample, as was also the case in our 3D-volume reconstructions.
Methods:
Page 25 Line 624 While the PSD area can be equivocally measured, due to the dense appearance of the PSD on the EM images, the PreAZ is more difficult to outline due to lack of evident anatomical markers except the synaptic cleft (the dense material is much thinner). That is why in many publications the PreAZ area is considered to be identical to the PSD area. What are the anatomical criteria used here for the PreAZ? Why do the authors correct the PSD area, which is easy to measure with the PreAZ area that is much less certain to outline?
As stated in Material and Methods, neither the pre- nor the postsynaptic density was defined by placing a closed contour around it, because one cannot be certain about the extent of the dense accumulation of particles defining both areas: the impregnation (staining) and contrast of both structures critically depend on the uranyl and lead staining, and different staining results could lead to misinterpretation. That is why we have drawn a contour line from the beginning to the end of the presynaptic density and extrapolated it to the postsynaptic density (for details see Material and Methods). In our samples, both the pre- and postsynaptic densities were always clearly visible in the boutons that were analyzed further.
Page 26 Line 640 vesicle density measurement: All the synaptic vesicles that are in the 50 nm thick section in their entirety are missed, and there are methods based on EM tomography to correct these estimations. One can not assume, that the error caused by "double counts" of vesicles cancels for the lost ones. There are stereological methods to estimate both types of error please include them and correct the values.
We would like to point out that the whole body of our work on the structural analysis of vesicle pools is based on image data from transmission electron microscopy (TEM), which generates a projection of the entire volume of the ultra-thin section, and NOT from scanning electron microscopy (SEM), where only a small volume close to the surface of the section would be captured. Operating in TEM mode ensures that no vesicle is missed only because it is embedded in its entirety in the section, as postulated by the reviewer. Hence, EM tomography, which is basically TEM operated at different incident angles in relation to the specimen or section, does not provide any advantage in detecting these vesicles. It does, however, help to better position a 3D object within the section volume itself and therefore allows objects that overlap from one viewing angle to be detected from another angle. As the average vesicle diameter is of similar size to the section thickness, the probability of a complete overlap is, however, almost zero. And as we only count clear ring-like structures, a stereological correction factor calculated according to Abercrombie (1946) would underestimate real counts (see also Saetzler et al. 2002). If there is, however, relevant literature on "methods based on EM tomography" and "stereological methods to estimate both types of error" (over- and underestimates) that we are missing out on, we would appreciate the reviewer providing us with the corresponding references so that we can include such calculations in our paper.
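For readers unfamiliar with the correction mentioned here, the following minimal sketch shows the classical Abercrombie (1946) factor, which scales a raw profile count by T / (T + D), where T is the section thickness and D the mean object diameter. The function name and the example numbers are illustrative assumptions and are not values from the manuscript; the sketch only shows why the factor becomes large when D is close to T.

```python
def abercrombie_corrected_count(raw_count: float,
                                section_thickness_nm: float,
                                mean_diameter_nm: float) -> float:
    """Classical Abercrombie correction: N = n * T / (T + D).

    raw_count            -- number of profiles counted in the section (n)
    section_thickness_nm -- section thickness (T)
    mean_diameter_nm     -- mean diameter of the counted objects (D)
    """
    if section_thickness_nm <= 0:
        raise ValueError("section thickness must be positive")
    if mean_diameter_nm < 0:
        raise ValueError("mean diameter must be non-negative")
    factor = section_thickness_nm / (section_thickness_nm + mean_diameter_nm)
    return raw_count * factor


# Hypothetical example: 100 counted profiles in a 50 nm section, with an
# assumed mean vesicle diameter equal to the section thickness.
corrected = abercrombie_corrected_count(100, 50.0, 50.0)
```

With an object diameter equal to the section thickness the factor is 0.5, so the correction would halve the raw count. This illustrates the size of the adjustment at stake in the exchange above, without settling whether it is appropriate for counts restricted to clear ring-like profiles.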
Page 27 Line 664 and 665 "sections" are still tissue blocks, as sectioning comes after if the process is correctly written. Please correct.
We have corrected this according to the reviewer’s comment.
Page 43 Figure 4 D Data for L1b is missing, only the correlation line is visible.
Corrected in a new Figure.
Page 44 Figure 5 C arrowheads are in the correct places? Some of them do not seem to point to the edge of the synapse.
We carefully checked the Figure and adjusted the arrowheads.
Figure 5 E lower arrowhead labels something, that is difficult to identify but does not seem to be a vesicle.
We agree with the reviewer on this point and changed the figure accordingly.
Figure 5 F, the upper vesicle is at least 10 nm apart from the PreAZ membrane. Did the authors consider it as docked (indicated with arrowhead, according to the legend it labels docked vesicles)?
We agree with the reviewer on this point and changed the figure accordingly.
Page 45 Figure 6 B one of the 2 synaptic boutons (sb), sb2 has a tangential active zone that precludes the identification of the pre- and post-synaptic membranes, still 2 "docked vesicles" are labeled. How were they classified as docked? Please remove these tangential synapses from the dataset, as membranes can not be identified.
The reviewer is right that the active zone is tangentially cut, however, the two vesicles are associated with the AZ. In addition, we did not use this AZ for vesicle data analysis.
Page 46 Line 1124 interneuron axon labelled in green not brown.
Corrected as suggested by the reviewer.
Line 1129 SStC is missing.
Changed according to the reviewer’s comment.
Page 48 Table 2 Number of docked vesicles Median values are rounded to integer values? If yes why?
The statistics package used rounded the medians to the given values.
Page 51 Supplementary Table 1 Hu_04 Histopathology, what does TLE stands for?
TLE: temporal lobe epilepsy. We included the abbreviation in the legend of Supplementary Table 1, which is now Table 2.
Reviewer #3 (Recommendations for the authors):
(1) Reanalysis of astrocytic coverage based on the % of synapses that form tripartite synapses.
We have reanalyzed the data concerning this point (new Figure 6D).
(2) Segregation of excitatory and inhibitory synapses.
We have now included a paragraph in our results section to distinguish between excitatory and inhibitory synapses.
(3) Better explanation of the limits of the study to assess functional parameters.
We disagree with the reviewer on this point and have not included an explanation concerning the limits of this study.
-
-
www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This useful study uses high-field fMRI to test the hypothesized involvement of subcortical structure, particularly the striatum, in WM updating. It overcomes limitations in prior work by applying high-field imaging with a more precise definition of ROIs. Thus, the empirical observations are of use to specialists interested in working memory gating or the reference back task specifically. However, evidence to support the broader implications, including working memory gating as a construct, is incomplete and limited by the ambiguities in this task and its connection to theory.
We would like to express our gratitude to the editor and the reviewers for their time and effort in providing insightful and valuable comments. We greatly value the critical perspective on the relationship between fMRI contrasts and the PBWM model. We hope to have addressed all the last critical points and changed the manuscript according to the reviewers’ suggestions. Furthermore, we would like to point out that the behavioral results section was edited, as a double-check of the results section revealed some erroneous descriptive statistics.
Public Reviews:
Reviewer #1:
Summary:
Trutti and colleagues used 7T fMRI to identify brain regions involved in subprocesses of updating the content of working memory. Contrary to past theoretical and empirical claims that the striatum serves a gating function when new information is to be entered into working memory, the relevant contrast during a reference-back task did not reveal significant subcortical activation. Instead, the experiment provided support for the role of subcortical (and cortical) regions in other subprocesses.
Strengths:
The use of high-field imaging optimized for subcortical regions in conjunction with the theory-driven experimental design mapped well to the focus on a hypothetical striatal gating mechanism.
Consideration of multiple subprocesses and the transparent way of identifying these, summarized in a table, will make it easy for future studies to replicate and extend the present experiment.
Weaknesses:
The reference-back paradigm seems to only require holding a single letter in working memory (X or O; Figure 1). It remains unclear how such low demand on working memory influences associated fMRI updating responses. It is also not clear whether reference-switch trials with 'same' response truly tax working-memory updating (and gate opening), as the working-memory content/representation does not need to be updated in this case. These potential design issues, together with the rather low number of experimental trials, raise concerns about the demonstrated absence of evidence for striatal gate opening.
We acknowledge that a limitation of our study is that the task involved relatively low working memory demands. It remains to be clarified whether the same neural mechanisms would be engaged under a higher working memory load, and this is an important consideration for future research.
We also fully agree that it is uncertain whether reference-switch trials requiring a ‘same’ (or ‘match’) response truly engage working memory updating (or gate opening), as the working memory content or representation does not need to be altered in these cases. This concern is addressed in detail in the discussion section titled “No Support for Striatal Gate Opening” (see second paragraph).
Regarding our references to dopamine, we completely agree with the reviewer about the speculative nature of these discussions. In response, we thoroughly reviewed the manuscript and made revisions where necessary to ensure that we consistently emphasize the speculative nature of our commentary on dopamine and dopaminergic pathways.
Finally, we acknowledge the concerns about the design and the relatively low number of trials. However, our fMRI analyses of other reference-back task contrasts did reveal activity in the striatum and other subcortical ROIs. This suggests that our scanning protocol and task design are sufficiently sensitive to detect striatal activity, even with the limited number of trials.
The authors provide a motivation for their multi-step approach to fMRI analyses. Still, the three subsections of fMRI results (3.2.1; 3.2.2; 3.3.3) for 4 subprocesses each (gate opening, gate closing, substitution, updating mode) made the Results section complex and it was not always easy to understand why some but not other approaches revealed significant effects (as the midbrain in gate opening).
We thank the reviewer for this important remark and the opportunity to clarify our approach. We conducted whole-brain general linear models (GLMs) to generate a comprehensive whole-brain map of brain activity for each contrast. However, the whole-brain statistical parametric maps (SPMs) involve data smoothing, which, while improving signal detection, reduces spatial precision. This is especially problematic in smaller or closely adjacent regions, where spatial blurring can merge distinct activations or make localized signals appear more widespread.
Additionally, the statistical thresholds in whole-brain analyses may detect weak or borderline significant effects, whereas ROI-wise GLMs, which assume uniform behavior across the entire region, may miss the same effects if the signal is weak or inconsistent across the ROI.
Since our primary focus was on the subcortex, we relied more heavily on ROI-wise GLMs, which were limited to subcortical regions. We prioritized findings that were supported by either the ROI-wise GLMs or by both GLM analyses. For instance, the midbrain activations found in our whole-brain analysis but not in the ROI analysis may result from smoothing (where activation from neighboring regions spreads into midbrain voxels) or from functional heterogeneity within the ROI, which can obscure localized activations when averaged in the ROI-wise GLMs. Inferences from each GLM approach, along with their discrepancies, are discussed for each contrast throughout the discussion, with additional details on the cluster-based ROI analysis in the discussion section titled “Dopaminergic involvement in working memory substitution” (see third paragraph).
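To make the two mechanisms described above concrete, the following is a minimal toy sketch, not the analysis pipeline used in the study. It assumes a one-dimensional strip of 20 hypothetical voxels, an arbitrary Gaussian kernel width, and made-up activation values; the function names are illustrative only. Scenario A shows how smoothing can carry signal from a neighboring region into a silent ROI, and scenario B shows how averaging across an ROI can dilute a focal activation.

```python
import math

def gaussian_kernel(sigma, radius):
    """Return normalized Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma=2.0, radius=6):
    """Smooth a 1D signal with a Gaussian kernel, clamping indices at the edges."""
    kernel = gaussian_kernel(sigma, radius)
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

# Hypothetical layout: voxels 0-9 are a neighboring region, voxels 10-19 are the ROI.
roi = range(10, 20)

# Scenario A (spillover): only the neighboring region is truly active.
neighbor_only = [1.0] * 10 + [0.0] * 10
smoothed = smooth(neighbor_only)
leak_into_roi = smoothed[10]  # nonzero although the true ROI signal is zero

# Scenario B (dilution): only 2 of the 10 ROI voxels are truly active.
focal = [0.0] * 14 + [1.0] * 2 + [0.0] * 4
roi_mean = sum(focal[i] for i in roi) / len(roi)  # ROI average
roi_peak = max(focal[i] for i in roi)             # voxel-wise peak

print(leak_into_roi, roi_mean, roi_peak)
```

In this toy setting a voxel-wise map of the smoothed data would show apparent activity at the ROI border in scenario A, while an ROI-averaged estimate in scenario B is much smaller than the peak voxel, which is one way the two GLM approaches can disagree.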
We acknowledge that the results section may seem complex, and we apologize for any inconvenience this may cause.
Reviewer #2:
Summary:
The study reported by Trutti et al. uses high-field fMRI to test the hypothesized involvement of subcortical structure, particularly striatum, in WM updating. Specifically, participants were scanned while performing the Reference Back task (e.g., Rac-Lubashevsky and Kessler, 2016), which tests constructs like working memory gate opening and closing and substitution. While striatal activation was involved in substitution, it was not observed in gate opening. This observation is cited as a challenge to cortico-striatal models of WM gating, like PBWM (Frank and O'Reilly, 2005).
Strengths:
While there have been prior fMRI studies of the reference back task (Nir-Cohen et al., 2020), the present study overcomes limitations in prior work, particularly with regard to subcortical structures, by applying high-field imaging with a more precise definition of ROIs. And, the fMRI methods are careful and rigorous, overall. Thus, the empirical observations here are useful and will be of interest to specialists interested in working memory gating or the reference back task specifically.
Weaknesses:
I am less persuaded by the more provocative points regarding the challenge it presents to models like PBWM, made in several places by the paper. As detailed below, issues with conceptual clarity of the main constructs and their connection to models, like PBWM, along with some incomplete aspects of the results, make this stronger conclusion less compelling.
(1) The relationship of the Nir-Cohen et al. (2020) task analysis of the reference back task, with its contrasts like gate opening and closing, and the predictions of PBWM is far from clear to me for several reasons.
First, contrasts like gate opening and gate closing make strong finite state assumptions. As far as I know, this is not an assumption of PBWM, certainly not for gate opening. At a minimum, PBWM is default closed because of the tonic inhibition of cortico-thalamic dynamics by the globus pallidus. Indeed, this was even noted in the discussion of this paper, which seems to acknowledge this discrepancy, but then goes on to conclude that they have challenged the PBWM model anyway.
We thank the reviewer for this remark and agree that the reference-back task contrasts do not perfectly align with the predictions of the PBWM model. In the discussion section "No support for striatal gate opening," we note that our data support the PBWM model by emphasizing the central role of the basal ganglia in working memory processes. However, we acknowledge that it may not have been sufficiently clear in the manuscript that the way the reference-back task is operationalised does not allow for a precise test of the PBWM's gating predictions. To address this, we have revised the manuscript to shift focus away from framing it as a direct challenge to the PBWM model. Below, some edits are highlighted.
‘This contrasts with the findings of Nir-Cohen et al. (2020) and raises questions about the relationship between the gate opening process in the reference back task and the indirect striatal gating mechanism described in the PBWM model (Frank et al., 2001; Hazy et al., 2007; O’Reilly & Frank, 2006) and other neurocomputational theories (Hazy et al., 2007; Jongkees, 2020). According to these models, a dopaminergic signal in the striatum is required to trigger gating. Although the orthogonal contrasts in the reference-back task are intended to isolate working memory subprocesses inspired by models of working memory, the two gating contrasts do not fully capture the gating mechanism as originally proposed in neurocomputational models (Frank et al., 2001; Hazy et al., 2007; O’Reilly & Frank, 2006).’ (line 721-730)
‘Another explanation for the lack of enhanced striatal activity in gate opening challenges the conceptualization of the gating mechanism in the reference-back task, which does not accurately map onto the PBWM predictions.’ (line 746)
‘Moreover, despite the lack of striatal involvement during gate opening, our findings do not rule out the possibility that the PBWM model's predictions about striatal gating in working memory are correct, given the misalignment between the gate opening contrast and the PBWM’s proposal regarding striatal gating. It remains unclear whether the absence of striatal activation during gate opening trials is specific to low-demand tasks, like the reference-back task, which does not require as much gating compared to high working memory-demand tasks involving preparation for updating. Or whether the gate opening contrast does not sufficiently capture the PBWM proposed gating mechanism. Further investigation is needed to determine whether (dopamine-driven) striatal gating occurs in high-demand working memory tasks, where the gating process plays a more critical role.’
Second, as far as I know, PBWM emphasizes go/no-go processes around constructs of input- and output-gating, rather than state shifts between gate opening and closing. While this relationship is less clear in reference back, substituting task-relevant items into working memory does appear to be an example of input gating, as modeled by PBWM. Thus, it is not clear to me why the substitution contrast would not be more of a test of input gating than the gate opening contrast, which requires assumptions that are not clear are required by the model, as noted above.
We fully agree with the reviewer, which is why we proposed that neural mechanisms involving the midbrain and striatum are more likely to be observed in the substitution contrast rather than the gate opening contrast.
Third, PBWM relies on striatal mechanisms to solve the problem of selective gating, inputting, or outputting items in memory while also holding on to others. Selective gating contrasts with global gating, in which everything in memory is gated or nothing. The reference back task is a test of global gating. It is an important distinction because non-striatal mechanisms that can solve global gating, cannot solve selective gating. Indeed, this limitation of non-striatal mechanisms was the rationale for PBWM adding striatum. The connectivity of the striatum with the cortex permits this selectivity. It is not clear that the reference back task tests these selective demands in the first place. That limitation in this task was the rationale behind the recent Rac-Lubashevsky and Frank (2022) paper using the reference back 2 procedure that modifies the original reference back for selective gating.
We thank the reviewer for highlighting this excellent reference. We believe it holds exciting potential for future high-field fMRI studies that explore the neural mechanisms underlying selective gating.
So, if the primary contribution of the paper is to test PBWM, as suggested by the first line of the abstract, then it is not clear that the reference back task in general, or the gate opening contrast in particular, is the best test of these predictions. Other contrasts (substitution), or indeed, tasks (reference back 2) would have been better suited.
We agree with the reviewer that the gate opening contrast may not be the optimal test for the PBWM model predictions. However, previous studies have found evidence of striatal gate-opening mechanisms using the reference-back task, which cannot be overlooked. We hypothesized that striatal mechanisms are likely active only when working memory content requires replacement, as seen in the substitution contrast in line with the PBWM model. Additionally, the reference-back 2 task (Rac-Lubashevsky & Frank, 2021) had not yet been published when we began data collection. Exploring this task in future studies, particularly with a 7 T fMRI protocol optimized for subcortical regions, would be an exciting avenue for further investigation.
Finally, in response to the reviewer’s remark, we have revised the abstract to remove the emphasis on challenging the PBWM model.
(2) In general, observations of univariate activity in the striatum have been notoriously variable in the context of WM. Indeed, Chatham et al. (2014) who tested working memory output gating - notably in a direct test of the predictions of PBWM - noted this variability. They too did not observe univariate activation in the striatum associated with selective output gating. Rather they found evidence of increased connectivity between the striatum and cortex during selective output gating. They argued that one account of this difference is that striatal gating dynamics emerge from the balance between the firing of both Go and NoGo cell populations that decide whether to gate or not. It is not always clear how this balance should relate to univariate activation in the striatum. Thus, the present study might also test cortico-striatal connectivity, rather than relying exclusively on univariate activation, in their test of striatal involvement in these WM constructs.
We appreciate the reviewer’s insightful observation regarding the variability of univariate activity in the striatum, particularly in the context of working memory and the challenges noted by Chatham et al. (2014). We agree that striatal gating dynamics likely reflect a balance between Go and NoGo cell populations, which may not always manifest in univariate activation alone. In line with the reviewer’s suggestion, examining cortico-striatal connectivity could provide a more comprehensive understanding of striatal involvement in working memory processes, particularly selective gating.
While our current study focused primarily on univariate activity, we recognize the importance of connectivity-based approaches and plan to incorporate functional connectivity analyses in future studies to further explore these dynamics. Such an approach, especially when combined with ultra-high-field fMRI, may offer valuable insights into the interaction between the striatum and cortex during working memory tasks.
(3) It is concerning that there was no behavioral cost for comparison switch vs. repeat trials. This differs from prior observations from the reference back (e.g., Nir-Cohen et al., 2020), and in general, is odd given the task switch/cue interpretation component. This failure to observe a basic behavioral effect raises a concern about how participants approached this task and how that might differ from prior reports of the reference back. If they were taking an unusual strategy, it further complicates the interpretation of these results and the implications they hold for theory.
We understand the reviewer’s concern regarding the lack of behavioral response time costs for comparison switch versus repeat trials, which does indeed differ from previous findings in studies such as Nir-Cohen et al. (2020). It is possible that this results from our fMRI task design, such as increased inter-trial intervals compared to behavioral studies. While this is certainly a point of concern, we believe that the neural data still provide valuable insights into the mechanisms underlying working memory gating despite the absence of a clear behavioral effect.
In future studies, we aim to increase the number of trials and more closely align our task design with previous studies to mitigate this issue. We agree that further investigation is necessary to ensure the robustness of these effects and their theoretical implications.
In summary, the present observations are useful, particularly for those interested in the reference back task. For example, they might call into question verbal theories and task analyses of the reference back task that tie constructs like gate-opening to striatal mechanisms. However, given the ambiguities noted above, the broader implications for models like PBWM, or indeed, other models of working memory gating, are less clear.
-
-
www.medrxiv.org
-
Author response:
We would like to express our sincere gratitude to the editor and reviewers for their thoughtful comments and suggestions on our manuscript. Below is our interim response to the reviewers’ public review:
Reviewer 1:
(1) We appreciate the reviewer’s insightful comment on the consideration of RAS mutation type and lesion metastasis site in our study. We will undertake a more comprehensive review of the literature and conduct a detailed analysis to assess how these factors influence treatment efficacy in our cohort.
(2) Regarding the radiotherapy planning process, we will provide further clarification in the revised manuscript. Specifically, we select the target lesion using CT imaging and delineate it by marking the 50% isodose line to define the planning target volume (PTV). In assessing treatment efficacy, we differentiate between target lesions (within the PTV) and off-target lesions (outside the PTV). We will update the figures to include the isodose line display for better clarity.
(3 & 4) We acknowledge the limitations of our study, particularly with respect to the sample size, which may hinder the statistical power required for a comprehensive analysis of treatment effect markers and subgroup variations. Nonetheless, we will continue to refine our analyses in the revised manuscript to provide additional insights and strengthen the conclusions where possible.
(5) During the early stages of our research, our team conducted a series of investigations into the impact of tumor fibrosis and angiogenesis on treatment outcomes. We have accumulated a substantial body of data, and we will summarize these findings in the revised manuscript to provide further context and support for our current study.
Reviewer 2:
(1, 4 & 5) We greatly appreciate the reviewer’s careful reading of the manuscript. We will revise the abstract, methods, and results sections to improve clarity and precision. Additionally, we will refine the overall wording of the manuscript to enhance its scientific rigor and professionalism.
(2) We also appreciate the reviewer’s suggestions regarding the methods and results. These will be incorporated into the revised manuscript, with additional detail in the methods section to clarify our experimental approach and strengthen the discussion of our findings.
(3) This is an intriguing point raised by the reviewer. We agree that the upregulation of PD-L1 expression following SBRT treatment could potentially enhance the efficacy of subsequent immunotherapy. To explore this further, we will conduct a detailed literature review and provide a more in-depth analysis of our data to elucidate the underlying mechanisms.
We trust that the clarifications provided above partially address the reviewers' concerns. We are committed to fully resolving the raised issues through more comprehensive revisions in the subsequent manuscript update.
-
-
www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
This study uses a variety of approaches to explore the role of the cerebellum, and in particular Purkinje cells (PCs), in the development of postural control in larval zebrafish. A chemogenetic approach is used to either ablate PCs or disrupt their normal activity and a powerful, high-throughput behavioural tracking system then enables quantitative assessment of swim kinematics. Using this strategy, convincing evidence is presented that PCs are required for normal postural control in the pitch axis. Calcium imaging further shows that PCs encode tilt direction. Evidence is also presented that suggests the role of the cerebellum changes over the course of early development, although this claim is rather less robust in the current version of the paper. Finally, the authors build on their prior work showing that both axial muscles and pectoral fins contribute to "climbs" and show evidence that suggests PCs are required for correct engagement of the fins during this behaviour. Overall, establishing a role for the cerebellum in postural control is not very surprising. However, a clear motivation of this study was to establish a robust experimental platform to investigate the changing role of cerebellar circuits in the development of postural control in the highly experimentally accessible zebrafish larvae, and in this regard, the authors have certainly succeeded.
Overall, I consider this an excellent paper, with some room for improvement in aspects of presentation, discussion, and some aspects of the data analysis.
We thank the reviewer for their kind comments and support. In the revision we have addressed their concerns regarding data presentation and analysis. Additionally, we have expanded our introduction and discussion to address questions of presentation.
Reviewer #2 (Public Review):
Summary:
Franziska Auer et al. investigate the role of cerebellar Purkinje cells in controlling posture in larval zebrafish using the chemogenetic tool TRPV1/capsaicin to bidirectionally manipulate (i.e., activate or ablate) these cells. This tool has been developed for zebrafish previously but has not been applied to Purkinje cells.
High-throughput behavioral experiments are presented to monitor how body posture is affected by these perturbations. The analysis of postural control focuses on a specific subaspect of posture: the body tilt-angle relative to horizontal just before a swim bout is executed, quantified separately for pre-ascent and pre-dive bouts. They report a broad bimodal distribution of pre-ascent bout posture ranging from -20 to +40 degrees, while the pre-dive bout posture was more Gaussian, ranging between -40 and 0 degrees. The treatment effect is quantified as the change in the median of these distributions.
Purkinje cell activation and ablation in 7 days post-fertilization (dpf) fish shifted the median of the ascending bout posture distributions to positive values. The authors hypothesize that the stochastic nature of the activation process might desynchronize Purkinje cell activity, thus abolishing Purkinje cells' role in postural control, similar to ablation. However, this does not explain why dive bout posture decreased upon activation but was unaffected by ablation.
To test whether the role of Purkinje cells in postural control matures over development, the authors repeated the ablation experiments at 14 dpf. They state that "at 14 dpf, the effects of Purkinje cell lesions on posture were more widespread than at 7 dpf." However, this effect size is comparable to that observed at 7 dpf, suggesting no further maturation of the role of Purkinje cells in pre-ascending bout postural control. The median pre-dive bout posture decreased at 14 dpf, contrasting with no effect at 7 dpf, yet this change was comparable in effect size to the activation effect on Purkinje cells at 7 dpf. The current data breadth may not be sufficient to conclude that signatures of emerging cerebellar control of posture across early development were uncovered.
The study's exploration of activating Purkinje cells in freely swimming fish using TRPV1/capsaicin is of special interest, but the practicability of this method is unclear from the current presentation. It would be beneficial to present the distribution of the percentage of activatable Purkinje cells across animals and time points to provide insight into the method's efficiency. Discussing this limitation and potential improvements would aid in evaluating the method, especially since the authors report that the activation experiments were labor-intensive, limiting repeat experiments. This may explain why the activation experiment at 7 dpf is the only data presented with cell activation, with other analyses performed using the cell ablation capabilities of the TRPV1/capsaicin method.
Another data point at 14 dpf would significantly strengthen the conclusions.
The authors analyze Purkinje cell-controlled fin-trunk coordination by examining ascending bout posture across different swim bout speeds. They make the important finding that pectoral fin movements contribute significant lift for median and fast swim bouts but not for slow ones, and that Purkinje cell ablation disrupts lift generation at all speeds.
Finally, the authors examined whether Purkinje cell activity encodes postural tilt-angle by performing calcium imaging on 31 cells from 8 fish using their Tilt In Place Microscope (TIPM). They report that they could decode the tilt-angle from individual neurons with a highly tuned response, and also from neurons that were not obviously tuned when pooling them and analyzing the population response. However, due to the non-simultaneous recordings across animals, definitive conclusions about population-level encoding should be made cautiously; it might be better to suggest potential population encoding that needs confirmation with more targeted experiments involving simultaneous recordings.
Strengths:
- The study introduces a novel application of the chemogenetic tool TRPV1/capsaicin to study cerebellar function in zebrafish.
- High-throughput behavioral experiments provide detailed analysis of postural control.
- The further investigation of Purkinje cell-controlled fin-trunk coordination offers new insights into motor control mechanisms.
- The use of calcium imaging to decode postural tilt-angle from Purkinje cell activity presents interesting preliminary results on neuronal population encoding.
Weaknesses:
- The term "disruption" for postural control effects may lead to misleading expectations.
- The supporting data show only subtle median shifts in postural angle, raising questions about the significance of observed effects. Statistical methods that account for the hierarchical structure of the data might be required to support the conclusions.
- The study's data breadth may not be sufficient to conclude emerging cerebellar postural control across early development.
- The current presentation does not adequately detail the practicability and efficiency of the TRPV1/capsaicin method for activating Purkinje cells, and the labor-intensive nature of these experiments constrains the ability to replicate and validate the findings.
- Non-simultaneous recordings in calcium imaging necessitate cautious interpretation of population-level encoding results.
We appreciate the reviewer's thoughtful and detailed feedback. In response, we have made several changes to highlight key points in our manuscript. We have adjusted our wording to more accurately reflect the scope of our findings. Finally, we have clarified and expanded the methods used.
Reviewer #3 (Public Review):
Summary:
This paper uses a new chemogenetic tool to investigate the role of cerebellar Purkinje cells in postural control. Using a high-throughput behavioral assay, they show that activation or ablation of Purkinje cells affects various aspects of postural control in zebrafish larvae during spontaneous swimming and that the effects are more pronounced at later developmental time points, where the Purkinje cell number is much greater. Using a sophisticated imaging assay, they record Purkinje cell activity in response to the tilt of the fish and show that some Purkinje cells are tuned to tilt direction and that the direction can even be decoded from untuned neurons.
Strengths:
Overall the study is nice, using a range of tools to address a fundamental question about the role of the cerebellum in postural control in fish.
Weaknesses:
(1) The data in Figure 1 that establishes the method seems to be based on a very small number of experiments and lacks some statistical analysis.
(2) The choice and presentation of the statistical and analysis methods used in Figures 2-5 could be improved.
We thank the reviewer for their comments. We have added additional statistical analyses for the activation experiments, and improved data presentation.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Overall I think this is a great paper.
* Introduction and Discussion.
The Introduction (and Discussion) do little to explain what is understood about cerebellar control of posture and what major outstanding questions remain. The first paragraph of the Introduction seems to argue that the role of the cerebellum in control of posture is well established and line 24 attempts to motivate the present study by virtue of the fact that terrestrial locomotion is "complex". This might be true but is not necessarily a major obstacle given the suite of powerful approaches available in rodent neuroscience. What are the major challenges that are hard to tackle in rodents and what specific questions can the larval zebrafish help to answer? What about development (which gets no mention at all)? I'm not suggesting a comprehensive review of every aspect of cerebellar physiology, but I think the Introduction should attempt to outline the current hypotheses in a little more detail and highlight what we still need to understand.
We take the Reviewer’s point that there is more to say in the Introduction. We feel that multi-dimensional limb biomechanics and proprioception are two aspects of terrestrial locomotion that support our use of the word “complexity.” However, we don’t dwell on this point because, as the reviewer correctly states, the suite of tools for rodent neuroscience & behavior is expansive and, in our opinion, not a limiting factor. Instead, we said what we felt we could regarding the potential contribution of the larval zebrafish in the last paragraph of the Discussion. In the revision, we have added details about the development of cerebellum to the introduction (though this, of course, is an expansive topic and well-beyond the scope of the Introduction), highlighted some of the historical limitations in rodent posture analysis, and set up the .
* Figure 2: 'Arrows denote the shift towards more nose-up postures'. I think the distribution is quite easy to interpret without these arrows; I suggest removing them.
We have removed the arrows.
* IQR is sometimes stated as a single number and sometimes as a range. It should be consistent and unless eLife has guidance to the contrary, I suggest that it be the latter.
Thank you for pointing that out. We now report all IQRs as the values at the 25th and 75th percentiles.
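As an illustration of this reporting convention only, here is a small sketch that returns the median together with the IQR as a (25th, 75th) percentile pair rather than as a single width. The posture values are hypothetical, the helper name `median_and_iqr` is an assumption for this example, and the "inclusive" quantile method is one of several reasonable choices; it is not necessarily the one used in the manuscript.

```python
import statistics

def median_and_iqr(values):
    """Return (median, (25th percentile, 75th percentile)) for a sample."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, (q1, q3)

# Hypothetical pre-bout posture angles in degrees (made-up values).
postures = [-5.0, 2.0, 10.0, 14.0, 21.0]
median, (p25, p75) = median_and_iqr(postures)
print(f"median {median} deg, IQR [{p25}, {p75}] deg")
```

Reporting both percentile values keeps any asymmetry of the distribution visible, which a single IQR width would hide.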
* Figure S2: For 14 dpf fish the axes are labelled PC2/3 - is this an error?
We have changed it to a 3-dimensional plot for both 7 and 14 dpf data to show comparable plots for both ages (now Figure S5 F and G). For the analysis in the 14 dpf fish, the clearest separation was in the space defined by the 2nd and 3rd principal components.
* In the methods, there is insufficient detail given about fluorescent imaging.
We added information on how the fluorescent imaging was performed to the ‘Confocal imaging’ and ‘Functional imaging’ sections.
* Abstract
In my opinion, the statement "Here, we used a powerful chemogenetic tool (TRPV1/capsaicin) to *define the role of Purkinje cells*..." is too strong. Whilst the evidence that PCs are required for postural control is certainly strong, what exactly these cells do in the service of postural control is far from clear (as the authors indeed acknowledge in the Discussion). As such, I wouldn't say their role has been "defined".
We changed the word to “describe” to better reflect our findings.
* aldoca transgenic.
This appears to be a beautiful transgenic line but the data showing the extent of its expression and evidence that in the cerebellum it exclusively labels PCs isn't clear enough.
(i) Ideally Figure 1A would show an image of a whole animal to provide an overview of transgene expression but instead it seems to be (the legend is unclear) a cartoon with a confocal projection of part of the brain overlaid.
We have updated the figure legend to be clearer that we show a cartoon of a larval zebrafish with the confocal image overlaid. The aldoca promoter has been previously described and exclusively labels Purkinje cells (10.1523/JNEUROSCI.3352-10.2010).
(ii) Figure 1B shows expression in the cerebellum, but how are we to understand that all the labelled cells are PCs? Are all PCs labelled, or only a subset? Perhaps a double labelling with a PC in situ marker could be done to demonstrate colocalisation?
As above, the aldoca promoter has been previously described; to the best of our knowledge, in the Hibi lab’s hands (and ours) it labels Purkinje cells exclusively, and it labels all of them (10.1523/JNEUROSCI.3352-10.2010).
* Chemogenetic validation.
Overall, the chemogenetic approach to abrogate PC function looks to be very powerful. The authors state in several places that a contribution of this paper is in its "establishing the validity of TRPV1/capsaicin-mediated perturbations". However, the data in Figure 1, along with various comments in other parts of the paper raise some questions:
(i) For experiments depolarising PCs with 1µM CSn, the sample size is tiny: Two transgenic animals and one control. Moreover, it is stated "in one fish ... we observed a small number of neurons at the 9h timepoint with bright, speckled fluorescence suggestive of cell death". Was this one out of two transgenics?! In the discussion, I didn't understand the statement "ensure adequate brightness levels *to achieve sufficient depolarization without excitotoxicity*". Does this "excitotoxicity" relate to the speckled fluorescence observation?
Overall, the very small sample size and comments about excitotoxicity and cell death raise concerns about the approach that I think warrant clearer treatment in the results (including information about the assessment of transgene expression, % embryos judged to have suitable expression), especially as this paper is seeking to establish the validity of the method.
We note first that the method has been previously validated (https://doi.org/10.1038/nmeth.3691) and that we build on this work. For the experiment described, the point was to identify an acceptable duration for exposure. To that end, we analyzed 6 animals for up to 6h (including the washout experiments in Figure S1B) where we never observed any speckled fluorescence; we limited our behavioral experiments to 6h accordingly. We thought it would be worth including the observation of speckled fluorescence at the 9h timepoint for future reference. To directly address the comment, we have increased the number of analyzed cells and fish for the 1 µM capsaicin experiments and added statistical analysis (lines 65-67).
When screening for transgene expression we selected for fish that had clearly visible expression, but that did not look overly bright, and used the same criteria when screening fish for the GCaMP imaging and for behavior. Around a quarter of the fish that had aldoca:TRPV1-tagRFP expression had a usable expression level for the activation experiment. We have added this information to the Results (line 62) and Methods (lines 369-372).
(ii) The authors note "capsaicin could sporadically activate subsets of Purkinje cells" and further speculate about PC activity and synchrony in the discussion. Figure 1 seems to rely on single images at widely spaced time points but given that they are set up to do 2-photon calcium imaging, why didn't they collect continuous time series data and analyse the temporal patterns of activity across the transgenic PC population?
We have added time series data for calcium imaging after 1 µM capsaicin in TRPV1- and TRPV1+ cells to Supplementary Figure S1A. Here too we see sporadic increases in calcium levels at similar rates: 0% for TRPV1- and 15-19% for TRPV1+ (see also Figure S1 legend).
(iii) The axonopathy and cell death resulting from 10 µM Csn is quite dramatic.
However, here the authors do not appear to have included a TRPV1 negative control (although oddly they did for 1 µM treatment) so it is currently unclear whether or not a high conc of Csn alone might be cytotoxic.
Chen et al (https://doi.org/10.1038/nmeth.3691) have established the TRPV1/capsaicin method in zebrafish with broad neuronal label and did not see any effect with high doses of capsaicin in TRPV1 negative fish.
* Behavioural assessment - stats
Overall, the disruption of postural stability after PC manipulations is convincing.
However, I have a few queries about the statistics:
(i) In this section, the statistical unit was not clear. The tables, which are otherwise very useful, give no indication of N. The legend text does report "8 repeats/149 control fish" and "across experimental repeats" suggesting the statistical unit might be the repeats rather than animals, but this should be clarified. In Figure 2G, individual data points should be plotted if N=8, or a representation of the distribution (eg violin or box and whisker plots) if N = 149.
We apologize for the confusion. Given the variable numbers of bouts, a single experimental repeat does not allow for an accurate estimate of expected value. Below we simulated how accurately the median can be estimated based on increasing sample sizes (Author response image 1). Given that large numbers of bouts are necessary to accurately estimate the median we pool the data for all experiments and use resampling statistics to estimate bias in our estimate.
Author response image 1.
Median estimation based on increasing sample size
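The kind of simulation summarized in this image can be sketched as follows. This is only an illustration under stated assumptions, not the analysis code used for the figure: the bimodal mixture, its parameters, and the function names `draw_posture` and `median_error_by_sample_size` are hypothetical and chosen to mimic a broad, non-Gaussian distribution of postural angles.

```python
import random
import statistics


def draw_posture(rng):
    """Draw one hypothetical postural angle (deg) from a bimodal mixture."""
    if rng.random() < 0.5:
        return rng.gauss(-10.0, 8.0)
    return rng.gauss(20.0, 8.0)


def median_error_by_sample_size(sample_sizes, n_repeats=200, seed=0):
    """Mean absolute error of the sample median for each sample size."""
    rng = random.Random(seed)
    # Approximate the "true" median with one very large sample.
    reference = statistics.median(draw_posture(rng) for _ in range(50_000))
    errors = {}
    for n in sample_sizes:
        abs_errors = []
        for _ in range(n_repeats):
            sample = [draw_posture(rng) for _ in range(n)]
            abs_errors.append(abs(statistics.median(sample) - reference))
        errors[n] = statistics.mean(abs_errors)
    return errors


errors = median_error_by_sample_size([20, 200, 2000])
for n, err in errors.items():
    print(f"n = {n:5d}: mean |median error| = {err:.2f} deg")
```

In this sketch the error of the median shrinks as the number of bouts grows, which is the argument for pooling bouts across repeats before estimating it.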
(ii) Related to the above, I hope it might be easier to interpret the unexpected change in climb posture in ablation controls once the data for individual repeats is shown.
When we analyze the data as single repeats we see considerable variability between different repeats due to undersampling. We tested the medians for the single repeats for outliers to ensure that the shift is not due to a single repeat skewing the distribution. We did not detect any outliers in the pre-lesion control or in the post-lesion control group. (Outliers were determined as deviating more than 3 times the scaled median absolute deviation (MAD) from the median. A scaling factor of 1.4826 was used to ensure that MAD-based outlier detection is consistent with other methods like Z-scores.) We added this information to lines 133-134 and the Methods section under Statistics.
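The outlier rule described above (more than 3 scaled MADs from the median, scaling factor 1.4826) can be written compactly. The function name `mad_outliers` and the example values are hypothetical; the values are illustrative numbers, not data from the study.

```python
import statistics

MAD_SCALE = 1.4826  # makes the MAD comparable to a standard deviation for normal data


def mad_outliers(values, threshold=3.0):
    """Flag values lying more than `threshold` scaled MADs from the median."""
    med = statistics.median(values)
    scaled_mad = MAD_SCALE * statistics.median(abs(v - med) for v in values)
    return [abs(v - med) > threshold * scaled_mad for v in values]


# Hypothetical per-repeat medians (deg); the last one is deliberately extreme.
repeat_medians = [9.8, 10.4, 11.1, 10.0, 9.5, 10.7, 10.2, 25.0]
flags = mad_outliers(repeat_medians)
print(flags)
```

Because both the center and the spread are medians, a single extreme repeat barely changes the threshold that is used to judge it.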
(iii) In some parts of this section, including the Tables, the authors report the 95% CI of the median, rather than IQR. In this case, they should report the z-value used for 95% CI estimation.
Because we use resampling to estimate the 95% confidence interval of the median, there is no z-value as there would be for a traditional confidence interval based on the normal distribution. Instead, we explicitly take the 2.5th and 97.5th percentiles of the bootstrapped distribution of medians; these bound the middle 95% of the resampled estimates and define the 95% confidence interval.
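A percentile bootstrap of this kind can be sketched as below. It is a minimal version under assumptions: the number of resamples, the nearest-rank percentile lookup, the function name `bootstrap_median_ci`, and the synthetic angles are all illustrative choices rather than details taken from the manuscript.

```python
import random
import statistics


def bootstrap_median_ci(data, n_boot=2000, seed=0):
    """Percentile-bootstrap 95% confidence interval for the median."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement and record the median of each resample.
    boot_medians = sorted(
        statistics.median(rng.choices(data, k=n)) for _ in range(n_boot)
    )
    # Nearest-rank 2.5th and 97.5th percentiles of the bootstrapped medians.
    lower = boot_medians[int(0.025 * n_boot)]
    upper = boot_medians[int(0.975 * n_boot) - 1]
    return lower, upper


# Synthetic postural angles (deg), used only to exercise the function.
rng = random.Random(1)
postures = [rng.gauss(10.0, 5.0) for _ in range(300)]
ci_low, ci_high = bootstrap_median_ci(postures)
print(f"median = {statistics.median(postures):.2f}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}]")
```

No distributional assumption enters the interval: its bounds are read directly off the sorted resampled medians, which is why no z-value is involved.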
* It is stated that "fish adopted more nose-up postures before *and throughout* climb bouts". Figure 2F seems to show posture before the climb, but where is the "throughout" data? It would be useful if Figure 2E, J could be extended to make a bit clearer these two phases of postural assessment.
We removed the phrase ‘throughout climb bouts’ as we are not showing the posture throughout the bout and to avoid over complicating the interpretation.
* Why were PCs not activated at 14 dpf (eg using 1 µM Csn)?
Due to shifts in priorities, the first author will not be continuing this series of experiments, and so this additional experiment will have to wait for someone to pick up this line of inquiry.
* The authors appear to claim that the difference in phenotype in 7 versus 14 dpf animals following high conc Csn treatment is indicative of a changing role for cerebellar PCs over this developmental period. For instance, in reference to the 14 dpf ablation phenotype, the authors write "reveals the functional emergence of Purkinje cell control of dives" and in the abstract they talk about "emerging control of posture across early development". However, can they rule out that the phenotypic differences might instead reflect differential sensitivity of the relevant PC (sub)populations to CSn at the two ages? If this caveat cannot be discounted then I suggest it is acknowledged e.g. in the discussion.
As previously established, all Purkinje cells are labeled in the aldoca line (10.1523/JNEUROSCI.3352-10.2010). Fluorescence is brighter at 14 dpf compared to 7 dpf, suggesting higher levels of TRPV1. We therefore assume that at 14 dpf, the high concentration of Csn is sufficient to ablate Purkinje cells. At 14 dpf, cerebellar damage is visible under a standard dissecting microscope. The preponderance of evidence therefore speaks against a previously undiscovered subpopulation of TRPV1-expressing Purkinje cells that are, by mechanisms yet unknown, resistant to high doses of capsaicin.
* Fin-body "coordination"
The ideas and data around fin-body coordination are very intriguing.
(i) The statement "fin engagement is speed-dependent" would benefit from a stats test to show this is indeed significant. The data in Figure 4B suggest a rather high degree of variance.
This is an important point; we appreciate the Reviewer’s attention. We have added statistics showing that fin engagement is speed-dependent to lines 167-169 and show the corresponding plot in the supplement in Figure S4: "Here, we observed that fin engagement is speed-dependent, with faster bouts producing greater lift for a given axial rotation (Spearman correlation coefficient: control 0.2193; 10 µM capsaicin: 0.0397; Z-test after z-transformation: p < 0.001)."
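A Z-test on z-transformed correlation coefficients is commonly done with the Fisher transformation, as sketched here. The correlation values are those quoted above, but the sample sizes are placeholders (the response does not state them), the function name `compare_correlations` is hypothetical, and the standard error 1/(n - 3) is the usual approximation, which is only approximate for Spearman coefficients.

```python
import math


def compare_correlations(r1, n1, r2, n2):
    """Two-sided Z-test for a difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)  # Fisher z-transformation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return z, p


# Placeholder sample sizes: the real number of bouts per group is an assumption.
z, p = compare_correlations(0.2193, 5000, 0.0397, 5000)
print(f"z = {z:.2f}, p = {p:.3g}")
```

With small coefficients like these, the outcome depends heavily on the number of bouts, so the placeholder sample sizes should be replaced with the actual counts before drawing any conclusion.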
(ii) The statement "After capsaicin exposure, the slopes of the medium fast speed bins were significantly lower (Figure 4C), reflecting *a loss of speed-dependent modulation*" is not convincing. The slope is likely a function of both speed and Csn treatment, and the comparisons in Figure 4C appear to be testing the latter, not the former.
We understand the reviewer’s point. However, the slope for the slow bouts remains unchanged. We therefore conclude that the reduction in fin-body slope is speed dependent and not a speed independent reduction of slope overall.
We have made this clearer by adding Supplementary Figure S4 and changing the text in lines 177-179.
(iii) I'd like to understand more about the phenotype of the fin-amputated animals. Were any "bout" parameters changed? Did the animals still attempt climbs and was the distribution of the upward rotation parameter similar to controls? The text states "the slope of the relationship between upward rotation and lift was indistinguishable from zero" but the stats reported in the text are comparisons between groups while Table 5 shows 95% CIs that don't span zero. Some clarification would be useful here.
We appreciate the Reviewer’s interest. We’ve studied climbing in fin-amputated animals at length here: https://doi.org/10.7554/eLife.45839 and here: https://doi.org/10.1016/j.celrep.2023.112573 and have added these references in line 183.
(iv) The authors repeatedly refer to fin-body *coordination* but it is not clear whether the loss of lift after PC ablation is a result of an explicit coordination defect (i.e. changes in the relative timing and/or kinematics between fins and axial motion components), versus a simple reduction in pectoral fin engagement. Either result could be interesting, but this should be clarified.
Thank you for pointing that out. In the fastest speed bin, we observed an increase in upward rotation and a decrease in average fin lift. In contrast, the medium speed bin showed no significant changes in average fin lift or upward rotation (see Author response image 2 and Tables 4 and 5), yet already displayed coordination deficits. Based on these observations, we argue that Purkinje cell lesions primarily affect coordination, rather than simply reducing one specific parameter such as lift or rotation (lines 293-298).
We have added fin lift and rotation values from Author response image 2 for all speed bins to tables 4 and 5.
Author response image 2.
Fin lift and rotation for slow, medium and fast bouts
* PC activity and decoding of pitch direction.
The clever TIPM method is used to collect calcium data that convincingly shows that individual PCs can encode pitch-tilt direction. However, a population of "not tuned" cells are also identified, and here I found the analysis of their responses and the argument that they encode pitch direction at a population level difficult to follow.
(i) First, although the naming of the cells implies that individual neurons do not encode pitch direction, I did not find this convincing. Figures 5F/G suggest that several "not tuned" cells in fact show quite consistent differences in activity across trial types and indeed in terms of their average responses sit as far from the unity line as do several "tuned" cells.
The Reviewer’s comment helped us clarify some key points. First, tuned and untuned cells were categorized based on a Directionality Index threshold of 0.35; some cells might look similar in Figure 5F/G, but Purkinje cell responses are highly variable, so overall there was no consistent tuning. We have clarified this in the text in lines 203-207. Below we have plotted the Up versus Down responses for the 10 least tuned cells (sorted by directionality index). While some cells have higher responses on average to one direction, we think that the variability makes it difficult to support a claim for “tuning.” We have also tested the support vector machine on the least tuned cells to confirm that the chosen cutoff for tuned/untuned does not affect our claim that untuned cells can encode position (see also Author response image 4).
Author response image 3.
Trial-by-trial variability
(ii) It is therefore not very surprising that PCA (and the SVM decoder) distinguishes trial type. I would guess that PCA assigns the largest weights to these most tuned of the "not tuned" cells, and the 3-5 cell decoders do well when these cells happen to be sampled.
Author response image 4.
Decoding accuracy of the 3/5/7 least tuned cells
This was an interesting idea. To rule out that it is only the most tuned cells that contain the information, we tested the decoder on the 3/5/7 least tuned cells; here too, decoders with 5 or more cells decode the direction more accurately. We have added the decoding accuracy to the text in lines 221-224.
(iii) As I understand the analysis, Figure 5G shows responses for "not tuned" cells over 21 trials (of each type) but these are not the same trials for the different cells? How then is population coding being assessed?
We have updated the text and refer to this data as a “pseudo-population” in lines 216 and 218 for all experiments where we combined cells from different fish. For technical reasons, when we perform TIPM at eccentric angles we must use sparsely labelled fish to ensure that we can find the same cells over a 60 degree range. We have repeated our analyses for TIPM centered at the horizon, where we can record from entire populations from a single fish.
(iv) Furthermore, Figure S2 shows a somewhat different analysis with decoding accuracy measured on a fish-by-fish basis. In this case, are these decoders for simultaneously imaged neurons? Is this a cross-validated measure of decoding accuracy?
Yes, as above, Figure S4 (former S2) examines decoding on a fish-by-fish basis using simultaneously recorded neurons. Yes, it was 5-fold cross-validated. We have updated the text in lines 490-494.
Reviewer #2 (Recommendations For The Authors):
- Postural control involves various aspects such as balance, coordination, relative body part orientations, and stability. Discussing these and presenting in this context the specific subaspect characterized in this study would help clarify which aspect of postural control the work focuses on.
The Reviewer makes an interesting point, but we think their description of what constitutes postural control is overly broad. Specifically, control of “relative body part orientations in space” by definition requires coordination, and subserves balance and stability. We acknowledge, of course, that different aspects can be and often are treated independently. While interesting, a full treatment of what comprises “postural control” is beyond the scope of the paper, as it would require reconciling the terms across taxa, effectors, environments and well over a century of experiments.
We contend that posture — particularly underwater — is best defined as the relative orientation of body parts in space. For fish, those parts consist of predominantly axial muscles and secondarily fins. We present these definitions in the Introduction and thank the Reviewer for encouraging us to more clearly shape our findings.
- Disruption of posture or postural control: The use of the word "disruption" could lead to misleading expectations. While it may not be incorrect, it suggests a significant loss of equilibrium, an obvious increase in postural variability, or at least a noticeable effect when observing an individual animal's behavior. However, the supporting data show only a subtle median shift in postural angle within a very broad distribution averaged over many individuals. This effect was only significant when comparing fish with a control group, not when comparing fish posture before and after the treatment.
Replacing "disruption" with "modification" would be more cautious.
We take the Reviewer’s point and have adjusted our wording to “modifies postural control” in lines 137, 266, and 283.
- Statistical significance: Consider aligning the asterisk notation with conventional standards (e.g., * for p < 0.05, ** for p < 0.01, *** for p < 0.001) to enhance clarity for readers. On the other hand, the individual measurements might not be independent (e.g., measurements from the same fish, or the same tank are likely to be correlated), so using the Wilcoxon rank-sum test (Mann-Whitney U test) on pooled data might lead to incorrect conclusions. Methods that account for the hierarchical structure of the data might be required to support the conclusions.
We take the Reviewer’s point about the importance of conventions; however, we have never found “more stars = more significant” to be all that helpful in evaluating claims. Instead, we’ve opted to have both a significance and an effect size criterion; a “star” here reflects our considered confidence in the difference we observe.
We agree that the hierarchical nature of pooled data is worth considering/presenting.
We performed a two-way analysis of variance (ANOVA) on the interquartile ranges (IQRs) of the single experimental repeats for the 7 days post-fertilization (dpf) activation, 7dpf lesion, and 14dpf lesion experiments. The ANOVA revealed no significant main effects, supporting the strategy of pooling experimental repeats to estimate distributions.
The results of the ANOVA, along with the IQRs for all experimental repeats, are presented in Tables 6-11. We have also clarified this in the methods section in lines 505-509.
- Data representation: All data of postural angles should be represented in the form of violin plots to show the underlying distributions of the postural angles, especially given that the effect size is small relative to the dispersion of the distribution of the postural angle and that this distribution is also not Gaussian but bimodal, and different before and after the treatments.
We take the Reviewer’s point that seeing the full distribution can be useful. We have added plots of the raw distributions for the data in Figure 3 as supplemental Figure S3.
- Showing the distributions will provide the necessary information for the reader to evaluate the importance of the effect. For all data shown in Table 1, the distributions should be presented in the supplementary information.
As requested, we have added the distributions of the data in Table 1 to the supplement (Figure S2).
- Roll posture: A statement about whether roll posture is perturbed by Purkinje cell manipulation would be a piece of important additional information helping to understand how strong the 'disruption' of posture is.
We haven’t assessed roll posture, as this is not practical in the current version of the SAMPL apparatus. We have added this limitation to the results (line 116) but also note that as our manipulations are bilateral, we don’t anticipate any systematic changes to roll.
- Comparison with other methods: Add a discussion on how the TRPV1/capsaicin method compares with other methods, such as using nitroreductase (Ntr) for targeted pharmaco-genetic ablation of cells by treatment with metronidazole or the possibility to ablate Purkinje cells by KillerRed as the author lab has done previously. Both methods have been applied to ablate Purkinje cells in larval zebrafish. What are the advantages of the TRPV1 method compared to these when neglecting the activation possibility?
Thank you for that suggestion. We have added a section to the discussion where we compare the TRPV1/capsaicin lesion to other lesion methods (lines 334-336).
- Describe the decoding algorithm: The decoding algorithm used could be described more in detail in the methods section.
We have described the decoding algorithm in more detail in the methods under ‘Functional GCaMP imaging in Purkinje cells.’ Line 488+
We used a support vector machine (SVM) with a linear kernel. The SVM model was trained using k-fold cross-validation, which splits the data into k subsets (folds). At each iteration, the model was trained on k-1 folds and tested on the remaining fold, ensuring that the model performance was evaluated on unseen data in each fold. Permutations were performed on randomized trial identity as a null hypothesis (5-fold cross-validation; 100 shuffles for randomization). Accuracy was calculated as 1 minus the classification loss.
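The evaluation scheme described here (k-fold cross-validation, accuracy as 1 minus classification loss, and a null distribution from shuffled trial identities) can be sketched without external libraries. Two caveats apply: a nearest-centroid classifier is swapped in for the linear SVM purely to keep the example self-contained, and the synthetic "pseudo-population" (42 trials, 5 cells, a modest mean difference) and all function names are assumptions for illustration.

```python
import random


def nearest_centroid_predict(train_x, train_y, test_x):
    """Stand-in linear classifier (NOT an SVM): pick the closer class mean."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return [min(centroids, key=lambda c: dist2(x, centroids[c])) for x in test_x]


def kfold_accuracy(x, y, k=5, seed=0):
    """Fraction of held-out trials classified correctly across k folds."""
    order = list(range(len(x)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        pred = nearest_centroid_predict(
            [x[i] for i in train_idx],
            [y[i] for i in train_idx],
            [x[i] for i in test_idx],
        )
        correct += sum(p == y[i] for p, i in zip(pred, test_idx))
    return correct / len(x)


def shuffled_null(x, y, n_shuffles=100, k=5, seed=0):
    """Cross-validated accuracies after randomizing trial identity."""
    rng = random.Random(seed)
    null = []
    for _ in range(n_shuffles):
        y_perm = y[:]
        rng.shuffle(y_perm)
        null.append(kfold_accuracy(x, y_perm, k=k, seed=seed))
    return null


# Synthetic pseudo-population: 21 "up" and 21 "down" trials, 5 cells each.
rng = random.Random(1)
labels = ["up"] * 21 + ["down"] * 21
responses = [
    [rng.gauss(1.0 if lab == "up" else 0.0, 1.0) for _ in range(5)]
    for lab in labels
]
accuracy = kfold_accuracy(responses, labels)
null = shuffled_null(responses, labels)
print(f"accuracy = {accuracy:.2f}, mean shuffled accuracy = {sum(null) / len(null):.2f}")
```

Comparing the observed accuracy against the shuffled distribution, rather than against a fixed 50%, accounts for any bias the fold structure itself introduces.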
- Availability of code: The link to the data and code repository is not working.
Thank you for pointing that out, we have fixed it now. In the lower right of the page you can see the history of all changes to the repository, including the entry on 2023-09-08 where the corresponding author set it to “public.” When we checked thanks to your comment, it had been set to “private,” without any record of when/why. We have reset it 2024-10-17. We will continue to check it periodically in the future and apologize in advance if it is unavailable; this is the first time we’ve seen that happen.
- Electrophysiological Control: Including an electrophysiological characterization of the activation of Purkinje cells by the TRPV1/capsaicin would significantly strengthen the validity of the method.
We take the Reviewer’s point that electrophysiological characterization is a way to strengthen the validity of the method. However, Chen et al (https://doi.org/10.1038/nmeth.3691) have performed electrophysiology during neuronal activation and concluded that TRPV1 activation with capsaicin indeed increases neuronal activity and firing rates. Our calcium imaging and lesion experiments amply demonstrate that Purkinje cells are sensitive to TRPV1-mediated currents. We therefore do not believe that the additional information gained by arduous electrophysiological evaluation is merited here.
- Describe more in detail how climb and dive bouts are defined. The height difference between consecutive bouts measured 250ms before the bout of executions.
Climb and dive bouts are split by the angle of their trajectory. If the fish moves up (i.e. trajectory greater than 0), the bout is considered a climb bout, and vice versa for dive bouts. The fish initiate a bout roughly 250 ms prior to the time of maximum speed, so the pre-bout posture is measured at this point. The time-courses of bouts are dissected extensively in Zhu et al. 2023. We have added a definition for climb and dive bouts to the Methods section under ‘Behavior analysis’ (lines 453 and 454).
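The two definitions above translate into a few lines of code. This is a sketch under assumptions: the function names, the sampled time series, and the choice to treat a trajectory of exactly 0 as a dive are illustrative and are not taken from the analysis pipeline.

```python
PRE_BOUT_OFFSET_S = 0.250  # pre-bout posture is read 250 ms before peak speed


def classify_bout(trajectory_deg):
    """Label a bout by the sign of its trajectory angle.

    Assumption: a trajectory of exactly 0 is grouped with dives.
    """
    return "climb" if trajectory_deg > 0 else "dive"


def pre_bout_posture(times_s, pitch_deg, peak_speed_time_s):
    """Pitch at the sample closest to 250 ms before the time of peak speed."""
    target = peak_speed_time_s - PRE_BOUT_OFFSET_S
    idx = min(range(len(times_s)), key=lambda i: abs(times_s[i] - target))
    return pitch_deg[idx]


# Hypothetical bout sampled every 50 ms, with peak speed at 0.5 s.
times = [i * 0.05 for i in range(11)]
pitch = [float(i) for i in range(11)]
print(classify_bout(12.0), pre_bout_posture(times, pitch, 0.5))
```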
- Figure 1H: Why can't you ablate all Purkinje cells but only about 80%?
This is an excellent question. We opted for an extremely conservative count, and included everything that still resembled a cell, even if it might not be functional or was already dying. Our counts are therefore likely an underestimate of the percentage of cells that were lost. We have added this point to the text in lines 393-395.
- Figure 2C: The method is not fully clear. At 8dpf 0.1uM capsaicin is added to the chamber. At what time after the application of capsaicin did the behavioral recording start?
We began recording about 10-15 min after adding the 1 µM Csn to the chambers. The fish were fed after the 6h in capsaicin. We have added this information to the Methods section (lines 404-408).
- Figure 2F: What indicates the shown confidence interval? Also median with a 95% confidence interval calculated over the experiments in parallel?
The distributions shown in Figure 2F pool data from all experiments. We use resampling methods to determine the variability in our estimates. The distribution plots show the median and the 25th and 75th percentiles of the resampled distribution. We have added this information to the figure legends.
- Figure 3: Subtitles on panel D and E indicating <climb bout posture> and would facilitate reading.
We have added the subtitles to those panels.
- Figure 4: Describe in the methods how recordings from individual fish were mapped onto each other to superimpose the Purkinje cell locations recorded from the 8 fish.
We have added the respective section to the Methods (lines 481-483):
“To map the anatomical locations of the recorded cells, we imaged overview stacks for each fish. These stacks were manually aligned in Illustrator, and the cells included in the analysis were reidentified and color-coded according to their tuning properties.”
Reviewer #3 (Recommendations For The Authors):
Major points:
(1) Lines 74-81. The data presented here and in later experiments to argue for an effect of capsaicin on neural activity lacks statistical rigor because of the apparently very small numbers of animals/cells assessed. For example, the control appears to involve 4 cells assessed from 1 animal, and the experimental group is just 2 animals. Given that the interpretation of the paper depends upon this result, it is worthwhile to show the result more clearly, and with some statistical analysis. They argue in the discussion that "Our imaging assay established that 1 µM of capsaicin would stochastically activate subsets of Purkinje cells" which seems a stretch from the data as presented.
We appreciate this point, which was shared by Reviewer 1. We have added more data and performed statistical analysis (lines 63-67 as well as Figure S1A).
(2) I found the practice of sorting effects by a mixture of effect size and p-value to be a little arbitrary, although in this case, it seems likely that it identified the most relevant effects. I would have preferred to see some attempt to correct for multiple comparisons (e.g. by resampling with the identities of fish shuffled to estimate the distribution of each measurement for this population size), followed by filtering for effect size after establishing a corrected threshold for significance.
We take the Reviewer’s point, though we note that critical values for effect size and p-value are inevitably “a little arbitrary.” We can’t do the exact analysis the Reviewer suggests as we do not measure data from individual fish for these experiments. However, we did calculate new critical p-values (added to the Tables) that account for multiple comparisons using Šidák’s method.
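For reference, Šidák’s method sets the per-comparison critical value so that the family-wise error rate across m independent comparisons stays at the chosen alpha. The function name `sidak_alpha` and the example of 10 comparisons are illustrative; the number of comparisons behind the values in the Tables is not stated here.

```python
def sidak_alpha(alpha, n_comparisons):
    """Per-comparison critical p-value under the Šidák correction."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_comparisons)


# Example with a hypothetical family of 10 comparisons.
critical_p = sidak_alpha(0.05, 10)
print(f"critical p-value for 10 comparisons: {critical_p:.5f}")
```

The correction is slightly less conservative than Bonferroni (alpha / m) and is exact when the comparisons are independent.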
(3) Figure 4. The data here is a little strange in that the slope in the control condition for medium speed is given as much larger than for slow, but the data in the two cases appears largely overlapping for most of the range of behavior, only diverging for the most extreme rotations. It seems perhaps that the measurement of slope is strongly dependent on these most extreme values. The authors might want to consider the use of robust regression methods which might mitigate these effects.
This is an interesting observation and we appreciate the Reviewer’s thoughtful suggestion. We now use a robust regression method (bisquare weighting of residuals).
We have adjusted all values in lines 175 - 177 and added the regression method to the Methods section line 520.
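Robust regression with bisquare weighting is typically fit by iteratively reweighted least squares, as sketched below. This is a simplified illustration and not the fitting routine used for the manuscript: the tuning constant 4.685, the scale estimate (median absolute residual divided by 0.6745), the absence of a leverage adjustment, the function name `bisquare_fit`, and the synthetic data are all assumptions.

```python
import statistics


def bisquare_fit(x, y, tuning=4.685, n_iter=50):
    """Slope and intercept from iteratively reweighted least squares
    with Tukey bisquare weights (the first pass is ordinary least squares)."""
    w = [1.0] * len(x)
    slope = intercept = 0.0
    for _ in range(n_iter):
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx
        intercept = my - slope * mx
        resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        # Robust scale: median absolute residual, scaled for normal errors.
        scale = statistics.median(abs(r) for r in resid) / 0.6745
        if scale == 0:
            break
        u = [r / (tuning * scale) for r in resid]
        # Bisquare weights: smooth down-weighting, zero beyond the cutoff.
        w = [(1 - ui ** 2) ** 2 if abs(ui) < 1 else 0.0 for ui in u]
    return slope, intercept


# Synthetic data: y = 2x + 1 with small noise and one extreme point at the edge.
x = [float(i) for i in range(20)]
y = [2.0 * xi + 1.0 + 0.1 * ((i * 7) % 5 - 2) for i, xi in enumerate(x)]
y[19] += 30.0

ols_slope, _ = bisquare_fit(x, y, n_iter=1)
robust_slope, robust_intercept = bisquare_fit(x, y)
print(f"OLS slope = {ols_slope:.2f}, bisquare slope = {robust_slope:.2f}")
```

In this example a single extreme value at the edge of the range inflates the ordinary least-squares slope, while the bisquare fit assigns it zero weight and recovers the slope of the remaining points.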
(4) Figure 5. The 'principal component analysis' description is extremely unclear. The text says that PCA 'showed near-complete segregation of trial types' but it is not explained how this was achieved with PCA or how this was quantified. Figure panels show the data plotted using different pairs of PCs showing visual evidence of segregation. In the methods, it is stated that "We performed principal component analysis" and that "cells were used for principal component analysis and subsequent support vector machine decoding analysis". What is meant exactly by 'performed PCA'? Was PCA used in a dimensionality reduction step? And if so, how many and which PCs were chosen and why? For visualization of the separation, the authors show arbitrary pairs of PCs. Could it be better to use a method more suited to that purpose such as linear discriminant analysis?
PCA was used to define a subspace to qualitatively evaluate if different trials could be separated. Once it became clear that they could, we next trained a binary decoder on the complete dataset (i.e. no dimensionality reduction). We did not perform linear discriminant analysis as the unsupervised PCA already showed separation of trial types. We have made this clearer in lines 212-214.
(5) Why does the decoding analysis use only untuned cells? Isn't it equally, or more, interesting to know how well tilt can be encoded using all cells? It is unclear to me what we learn by selecting only untuned cells for this analysis (although I agree it is interesting that this does work).
We focused exclusively on untuned cells because including even a single highly tuned cell in the population code would lead to excellent results. By using untuned cells we test if there is some directionality information that is not visible just by looking at the up/down responses of single cells. We have made this clear in lines 217-218.
Minor points and corrections:
(1) Maybe consider losing the words 'powerful' (I think it is overused and not well defined) and 'reagent'. Reagent is normally used for something that participates in a reaction. It is a bit odd to use it to refer to a transgenic animal. Later it is called a 'tool' which seems better.
We have changed the wording and refer to it as a tool throughout the paper.
(2) Figure 1D. Please use a color bar to indicate the scale.
We have added a color scale to the panel.
(3) Saying that 'posture' increases is confusing, although the meaning can be inferred from the overall context and the definitions in the Methods - could Posture be capitalized to indicate a specific definition is being used rather than the general meaning?
This suggestion agrees with those made by Reviewer 2. We have changed the wording to “postural angle.”
(4) The arrowheads in Figure 2FHK are unnecessary and confusing (why are some horizontal and some vertical?).
Thank you for that suggestion, we have removed the arrowheads.
(5) Figure 3 The legend should indicate that the image is shown with an inverted lookup table.
We have updated the legend.
(6) Figure 3 D and E Titles would be helpful, so it is not necessary to refer to the legend to understand the difference.
We have added titles to the figure panels.
(7) The dwell time for the 2-photon experiments is given in the manuscript, but I think the authors meant microseconds?
Thank you for pointing that out. We have corrected it to microseconds.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Weaknesses (Reviewer 1):
The role of Fgf signaling in gliogenesis and Foxg1 in neurogenesis is well known. It is not clear if Fgf18 is a direct target of Foxg1.
We agree with the reviewer: Fgf signaling is an established pro-gliogenic pathway (Duong et al 2019) and Foxg1 overexpression is known to promote neurogenesis in cultured neural stem cells (Branacaccio et al 2019). Our study links these two mechanisms, as the Reviewer has summarized: (a) we demonstrate that FOXG1 works via modulating Fgf signaling cell-autonomously within progenitors by regulating the levels of Fgfr3. (b) Loss of Foxg1 in postmitotic neurons results in the upregulation of Fgf ligand expression (possibly via indirect mechanisms) and this non-cell autonomously increases Fgf signaling in progenitors. Our study is entirely performed in vivo.
Revision: We have revised the manuscript to reflect that Fgf18 may be an indirect target of FOXG1 in postmitotic neurons.
Weaknesses (Reviewer 2):
It wasn't clear to me why the authors chose postnatal day 14 to examine the effects of Foxg1 deletion at E15 - this is a long time window, giving time for indirect consequences of Foxg1 deletion to influence development and thereby potentially complicating the interpretation of findings. For example, the authors show that there is no increased proliferation of astrocytes or death of neurons lacking Foxg1 shortly after cre-mediated deletion, but it remains formally possible (if perhaps unlikely) that these processes could be affected later during the time window. The rationale underlying the choice of this time point should be explained.
I don't agree with the statement in the very last sentence of the results section that "neurogenesis is not possible in the absence of [Foxg1]" as there are multiple reports in the literature demonstrating the presence of neurons in Foxg1-/- mice (eg: Xuan et al., 1995; Hanashima et al., 2002, Martynoga et al., 2005, Muzio and Mallamaci 2005). Perhaps the statement refers specifically to late-born cortical neurons. This point also arises in the discussion section.
Revisions:
(a) We have revised the manuscript to explain why we chose postnatal day 14 to examine the effects of Foxg1 deletion at E15.
● We have examined the transcriptomic dysregulation after Foxg1 deletion at E17.5, which is a reasonable period to identify potential direct targets. Furthermore, FOXG1 occupies the Fgfr3 locus in ChIP-seq performed at E15.5. Together, these support the interpretation that Fgfr3 is a direct target of Foxg1.
● As the Reviewer notes, we have investigated the possibility of increased proliferation of astrocytes and death of neurons and found no evidence suggesting these phenomena occur in the 3 days after loss of Foxg1. Cortical neurons are postmitotic and differentiated by E18.5, the stage at which we examined CC3 staining and found no difference in cell death in control and mutants (Supplementary Figure S2C, C’). The majority of progenitors (PAX6+ve cells) that lose Foxg1 at E15.5 express the gliogenic transcription factor NFIA by E18.5 (Figure 2C, C’), but hardly any express intermediate (neurogenic) progenitor marker TBR2 (Supplementary Figure S2B, B’). It is therefore unlikely that neurons are born from Foxg1 mutant progenitors and then die at a later stage.
● The cellular consequences of loss of Foxg1 require additional time to detect, e.g., it takes ~5 days for GFAP to be detected in astrocytes once they are born. The P14 timepoint permits the assessment of oligogenesis, which begins after astrogliogenesis, and therefore permits a comprehensive assessment of the lineage of E15.5 Foxg1 null progenitors.
(b) Thank you for pointing out that the last sentence of the results section implied (incorrectly) that ALL neurogenesis is not possible in the absence of Foxg1. We have modified this (and the discussion) to reflect that this applies to E14/15 progenitors and late-born cortical neurons.
Recommendations for the authors (Reviewer 2):
(c) We thank the reviewer for this suggestion. We will modify the schematic (Figure 7) to remove any ambiguity regarding Foxg1 expression.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the current reviews.
These comments are all valuable and very helpful for revising and improving our paper, and they provide important guidance for our research. We have studied the comments carefully and have made corrections, which we hope will meet with approval.
Reviewer #3 (Public review):
Summary:
The manuscript by Ma et al. describes a multi-model (pig, mouse, organoid) investigation into how fecal transplants protect against E. coli infection. The authors identify A. muciniphila and B. fragilis as two important strains and characterize how these organisms impact the epithelium by modulating host signaling pathways, namely the Wnt pathway in lgr5 intestinal stem cells.
Strengths:
The strengths of this manuscript include the use of multiple model systems and follow up mechanistic investigations to understand how A. muciniphila and B. fragilis interacted with the host to impact epithelial physiology.
Weaknesses:
As in previous revisions, there remains concerning ambiguity in the methodology used for microbiota sequence analysis and it would be difficult to replicate the analysis in any meaningful way. In this revision, concerns about the rigor and reproducibility of this component of the manuscript have been increased. Readers should be cautious with interpretation of this data.
(1) In previous versions of the manuscript it would appear the correct bioproject accession was listed, but the actual link went to an unrelated project. The updated accession link appears to contain raw data; however, the authors state they used an Illumina HiSeq 2500. This would be an unusual choice for V3-V4 as it would not have read lengths long enough to overlap. Inspection of the first sample (SRR19164796) demonstrates that this is absolutely not the raw data, as there is a ~400 nt forward read, and a 0 length reverse read. All quality scores are set to 30. There is no logical way to go from HiSeq 2500 raw data and read lengths to what was uploaded to the SRA and it was certainly not described in the manuscript.
What we uploaded to the SRA were the contig files for each sample; we have modified the description on line 694.
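The reviewer's observation that every quality score equals 30 is a property that can be checked directly on a FASTQ file. The following is a minimal sketch in Python, assuming 4-line FASTQ records and Phred+33 encoding; both are assumptions, the function names are hypothetical, and the example record is invented rather than taken from the deposited data.

```python
def phred_scores(quality_line, offset=33):
    """Decode one FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def distinct_quality_scores(fastq_text):
    """Collect the set of Phred scores across all records in 4-line FASTQ text."""
    lines = fastq_text.strip().splitlines()
    scores = set()
    # In 4-line FASTQ, the quality string is the fourth line of each record.
    for i in range(3, len(lines), 4):
        scores.update(phred_scores(lines[i]))
    return scores


# Hypothetical record: '?' is ASCII 63, i.e. Phred 30 under the +33 offset.
example = "@read1\nACGTACGT\n+\n????????\n"
print(distinct_quality_scores(example))
```

A set containing a single value across all reads would be consistent with placeholder qualities attached to assembled contigs rather than with raw instrument output.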
(2) No multiple testing correction was applied to the microbiome data.
The alpha diversity indexes were tested using the t-test and the Wilcoxon test, and we showed the result of the t-test in Figure S1B. The p-values were corrected for multiple testing using the Benjamini-Hochberg method; we have modified the description on line 322.
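For readers unfamiliar with the procedure, the Benjamini-Hochberg adjustment named above can be sketched as follows. This is a minimal illustration in plain Python, not the code used in the study; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, e.g. one per alpha-diversity index.
raw = [0.01, 0.04, 0.03, 0.20]
print(benjamini_hochberg(raw))
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values non-decreasing with rank.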
---------
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #2 (Public Review):
Ma X. et al proposed that A. muciniphila was a key strain that promotes the proliferation and differentiation of intestinal stem cells through acting on the Wnt/β-catenin signaling pathway. They used various models, such as a piglet model, a mouse model and intestinal organoids, to address how A. muciniphila and B. fragilis offer protection against ETEC infection. They showed that FMT with fecal samples, A. muciniphila or B. fragilis protected piglets and/or mice from ETEC infection, and this protection is manifested as reduced intestinal inflammation/bacterial colonization, increased tight junction/Muc2 proteins, as well as proper Treg/Th17 cells. Additionally, they demonstrated that A. muciniphila protected basal-out and/or apical-out intestinal organoids against ETEC infection via Wnt signaling.
Comments on revised version:
Please add proper references to indicate the invasion of ETEC into organoids after 1 h of infection.
We have added references on line 211.
References:
Xiao K, Yang Y, Zhang Y, Lv QQ, Huang FF, Wang D, Zhao JC, Liu YL. 2022. Long-chain PUFA ameliorate enterotoxigenic Escherichia coli-induced intestinal inflammation and cell injury by modulating pyroptosis and necroptosis signaling pathways in porcine intestinal epithelial cells. Br. J. Nutr. 128(5):835-850.
Qian MQ, Zhou XC, Xu TT, Li M, Yang ZR, Han XY. 2023. Evaluation of Potential Probiotic Properties of Limosilactobacillus fermentum Derived from Piglet Feces and Influence on the Healthy and E. coli-Challenged Porcine Intestine. Microorganisms. 11(4).
Reviewer #3 (Public Review):
Summary:
The manuscript by Ma et al. describes a multi-model (pig, mouse, organoid) investigation into how fecal transplants protect against E. coli infection. The authors identify A. muciniphila and B. fragilis as two important strains and characterize how these organisms impact the epithelium by modulating host signaling pathways, namely the Wnt pathway in lgr5 intestinal stem cells.
Strengths:
The strengths of this manuscript include the use of multiple model systems and follow up mechanistic investigations to understand how A. muciniphila and B. fragilis interacted with the host to impact epithelial physiology.
Weaknesses:
After an additional revision, the bioinformatics section of the methods has changed significantly from previous versions and now indicates a third sequencer was used instead: Ion S5 XL. Important parameters required to replicate analysis have still not been provided. Inspection of the SRA data indicates a mix of Illumina MiSeq and Illumina HiSeq 2500. It is now unclear which sequencing technology was used as authors have variably reported 4 different sequencers for these samples. Appropriate metadata was not provided in the SRA, although some groups may be inferred from sample names. These changing descriptions of the methodologies and ambiguity in making the data available create concerns about rigor of study and results.
We apologize for the multiple incorrect modifications of the method description, which arose because we confused the sequencing method of this experiment with that used for samples from other experiments. We have modified the method for microbiome sequencing technology on line 304. The sequencing technology is Illumina HiSeq 2500. The SRA metadata can be viewed at https://www.ncbi.nlm.nih.gov/sra/PRJNA837047. The sample names ep1-6 and ef1-6 correspond to the EP and EF groups, respectively.
Recommendations For the Authors:
As in the previous revision:
-provide important parameters required to replicate analysis
-ensure that reporting of sequencing technology is correct as data listed on SRA appears to be derived from Illumina sequencers, and was deposited indicating as such.
-update SRA metadata such that experimental groups are clear and match the nomenclature used in the manuscript (particularly for samples which are labelled [A-Z][0-9])
- The multiple testing correction wasn’t applied.
-We apologize for the multiple incorrect modifications of the method description, which arose because we confused the sequencing method of this experiment with that used for samples from other experiments. We have modified the method for microbiome sequencing technology on line 304. The sequencing technology is Illumina HiSeq 2500.
- The SRA metadata can be viewed at https://www.ncbi.nlm.nih.gov/sra/PRJNA837047. The sample names ep1-6 and ef1-6 correspond to the EP and EF groups, respectively.
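The naming rule stated in the response (ep1-6 for the EP group, ef1-6 for the EF group) can be written down as a small lookup, which also makes explicit that names of the form [A-Z][0-9] are not covered by it. A minimal sketch in Python; the function name and the dictionary are hypothetical and only encode the mapping given above.

```python
import re

# Mapping from sample-name prefix to group label, as stated in the response.
PREFIX_TO_GROUP = {"ep": "EP", "ef": "EF"}


def group_for_sample(name):
    """Return the group for names such as 'ep3' or 'ef6', or None if unrecognized."""
    match = re.fullmatch(r"(ep|ef)([1-6])", name.lower())
    return PREFIX_TO_GROUP[match.group(1)] if match else None


for sample in ["ep1", "ef6", "A1"]:
    print(sample, "->", group_for_sample(sample))
```

Samples that return None here would still need group labels in the SRA metadata to match the nomenclature of the manuscript.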
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
The authors investigate the effects of aging on auditory system performance in understanding temporal fine structure (TFS), using both behavioral assessments and physiological recordings from the auditory periphery, specifically at the level of the auditory nerve. This dual approach aims to enhance understanding of the mechanisms underlying observed behavioral outcomes. The results indicate that aged animals exhibit deficits in behavioral tasks for distinguishing between harmonic and inharmonic sounds, which is a standard test for TFS coding. However, neural responses at the auditory nerve level do not show significant differences when compared to those in young, normal-hearing animals. The authors suggest that these behavioral deficits in aged animals are likely attributable to dysfunctions in the central auditory system, potentially as a consequence of aging. To further investigate this hypothesis, the study includes an animal group with selective synaptic loss between inner hair cells and auditory nerve fibers, a condition known as cochlear synaptopathy (CS). CS is a pathology associated with aging and is thought to be an early indicator of hearing impairment. Interestingly, animals with selective CS showed physiological and behavioral TFS coding similar to that of the young normal-hearing group, contrasting with the aged group's deficits. Despite histological evidence of significant synaptic loss in the CS group, the study concludes that CS does not appear to affect TFS coding, either behaviorally or physiologically.
We agree with the reviewer’s summary.
Strengths:
This study addresses a critical health concern, enhancing our understanding of mechanisms underlying age-related difficulties in speech intelligibility, even when audiometric thresholds are within normal limits. A major strength of this work is the comprehensive approach, integrating behavioral assessments, auditory nerve (AN) physiology, and histology within the same animal subjects. This approach enhances understanding of the mechanisms underlying the behavioral outcomes and provides confidence in the actual occurrence of synapse loss and its effects. The study carefully manages controlled conditions by including five distinct groups: young normal-hearing animals, aged animals, animals with CS induced through low and high doses, and a sham surgery group. This careful setup strengthens the study's reliability and allows for meaningful comparisons across conditions. Overall, the manuscript is well-structured, with clear and accessible writing that facilitates comprehension of complex concepts.
Weaknesses:
The stimulus and task employed in this study are very helpful for behavioral research, and using the same stimulus setup for physiology is advantageous for mechanistic comparisons. However, I have some concerns about the limitations in auditory nerve (AN) physiology. Due to practical constraints, it is not feasible to record from a large enough population of fibers that covers a full range of best frequencies (BFs) and spontaneous rates (SRs) within each animal. This raises questions about how representative the physiological data are for understanding the mechanism in behavioral data. I am curious about the authors' interpretation of how this stimulus setup might influence results compared to methods used by Kale and Heinz (2010), who adjusted harmonic frequencies based on the characteristic frequency (CF) of recorded units. In contrast, the harmonic frequencies in this study are fixed across all CFs, meaning that many AN fibers may not be tuned closely to the stimulus frequencies.
We chose the stimuli for the AN recordings to be identical to the stimuli used in the behavioral evaluation of the perceptual sensitivity. Only with this approach can we directly compare the response of the population of AN fibres with perception measured in behaviour. We will address this more clearly in the revision.
If units are not responsive to the stimulus, further clarification on detecting mistuning and phase locking to TFS effects within this setup would be valuable.
It is unclear to us what the reviewer alludes to. We ask the reviewer to rephrase the question.
Given the limited number of units per condition (sometimes as few as three for certain conditions), I wonder if CF-dependent variability might impact the results of the AN data in this study; discussing this factor could help with better understanding the results. While the use of the same stimuli for both behavioral and physiological recordings is understandable, a discussion on how this choice affects interpretation would be beneficial. In addition, a 60 dB stimulus could saturate high spontaneous rate (HSR) AN fibers, influencing neural coding and phase-locking to TFS. Separating SR groups could potentially help address these issues and improve interpretive clarity.
In the discussion of a revised version of the manuscript, we will point out the pros and cons of using fixed-level stimuli that were not adjusted in frequency to the BF.
A deeper discussion on the role of fiber spontaneous rate could also enhance the study. How might considering SR groups affect AN results related to TFS coding? While some statistical measures are included in the supplement, a more detailed discussion in the main text could help in interpretation.
We do not think that it will be necessary to conduct any statistical analysis in addition to that already reported in the supplement. We will consider moving some supplementary information back into the main manuscript when revising.
Although Figure S2 indicates no change in median SR, the high-dose treatment group lacks LSR fibers, suggesting a different distribution based on SR for different animal groups, as seen in similar studies on other species. A histogram of these results would be informative, as LSR fiber loss with CS, whether induced by ouabain in gerbils or noise in other animals, is well documented (e.g., Furman et al., 2013).
We will add information on the distribution when revising.
Although ouabain effects on gerbils have been explored in previous studies, since these data already seem to have been recorded for the animals in this study, a brief description of changes in auditory brainstem response (ABR) thresholds, wave 1 amplitudes, and tuning curves for animals with cochlear synaptopathy (CS) in this study would be beneficial. This would confirm that ouabain selectively affects synapses without impacting outer hair cells (OHCs). For aged animals, since ABR measurements were taken, comparing hearing differences between normal and aged groups could provide insights into the pathologies besides CS in aged animals. Additionally, examining subject variability in treatment effects on hearing and how this correlates with behavior and physiology would yield valuable insights. If space is limited, a brief clarification or inclusion in the supplement may be sufficient.
We do indeed have data on ABR amplitudes and the wave 1 growth functions but only in response to broadband clicks. For more frequency-specific information, mass-potential recordings are available, obtained before and after ouabain treatment. Regarding neural tuning, we did not obtain full frequency-threshold curves but do have bandwidths for response curves recorded close to threshold. We are in the process of analyzing all these data further and will consider how to best incorporate them into the manuscript, to address the reviewer’s concerns.
Another suggestion is to discuss the potential role of the MOC efferent system and the effect of anesthesia in reducing efferent effects in AN recordings. This is particularly relevant for aged animals, as CS might affect LSR fibers, potentially disrupting the medial olivocochlear (MOC) efferent pathway. Anesthesia could lessen MOC activity in both young and aged animals, potentially masking efferent effects that might be present in behavioral tasks. Young gerbils with functional efferent systems might perform better behaviorally, while aged gerbils with impaired MOC function due to CS might lack this advantage. A brief discussion on this aspect could potentially enhance mechanistic insights.
Our provisional response below will be integrated in similar form into the Discussion.
Olivocochlear efferent activity is a potential modulator of OHC gain (by medial olivocochlear neurons, MOC) and afferent activity (by lateral olivocochlear neurons, LOC). Beyond this general observation it is, however, difficult to speculate about its specific role in the TFS1 test, as almost nothing is known about efferent activity under naturalistic conditions in a behaving animal (reviewed by Lauer et al., 2022). We note, however, that efferent activity is believed to be reduced under general anesthesia (reviewed by Guinan, 2011, DOI 10.1007/978-1-4419-7070-1_3) and possibly abnormal in other ways, considering the potential top-down inputs to the efferent neurons from extensive brain networks (reviewed by Schofield, 2011, DOI 10.1007/978-1-4419-7070-1_9; Romero and Trussell, 2022, DOI: 10.1016/j.heares.2022.108516). Thus, it is reasonable to assume a reduced efferent influence in our auditory-nerve data, compared to the behavioral test situation. In contrast, we assume more comparable efferent influences in young-adult and old gerbils. It was recently shown that, despite age-related losses in both MOC and LOC cochlear innervation, this basically reflected the loss of efferent target structures (OHC and type-I afferents), with the surviving cochlear circuitry remaining largely normal (Steenken et al., 2024, DOI: 10.3389/fnsyn.2024.1422330). The main difference was an increased proportion of OHC without any efferent innervation, predominantly in low-frequency cochlear regions (Steenken et al., 2024). Such OHC are thus not under efferent control, and they are more numerous (about 10 – 30%) in old gerbils.
Lastly, although synapse counts did not differ between the low-dose treatment and NH I sham groups, separating these groups rather than combining them with the sham might reveal differences in behavior or AN results, particularly regarding the significance of differences between aged/treatment groups and the young normal-hearing group.
To maximize statistical power, we combined those groups in the statistical analysis. These two groups did not differ in synapse number and had quite similar ABR wave 1 growth functions.
Reviewer #2 (Public review):
Summary:
Using a gerbil model, the authors tested the hypothesis that loss of synapses between sensory hair cells and auditory nerve fibers (which may occur due to noise exposure or aging) affects behavioral discrimination of the rapid temporal fluctuations of sounds. In contrast to previous suggestions in the literature, their results do not support this hypothesis; young animals treated with a compound that reduces the number of synapses did not show impaired discrimination compared to controls. Additionally, their results from older animals showing impaired discrimination suggest that age-related changes aside from synaptopathy are responsible for the age-related decline in discrimination.
We agree with the reviewer’s summary.
Strengths:
(1) The rationale and hypothesis are well-motivated and clearly presented.
(2) The study was well conducted with strong methodology for the most part, and good experimental control. The combination of physiological and behavioral techniques is powerful and informative. Reducing synapse counts fairly directly using ouabain is a cleaner design than using noise exposure or age (as in other studies), since these latter modifiers have additional effects on auditory function.
(3) The study may have a considerable impact on the field. The findings could have important implications for our understanding of cochlear synaptopathy, one of the most highly researched and potentially impactful developments in hearing science in the past fifteen years.
Weaknesses:
(1) My main concern is that the stimuli may not have been appropriate for assessing neural temporal coding behaviorally. Human studies using the same task employed a filter center frequency that was (at least) 11 times the fundamental frequency (Marmel et al., 2015; Moore and Sek, 2009). Moore and Sek wrote: "the default (recommended) value of the centre frequency is 11F0." Here, the center frequency was only 4 or 8 times the fundamental frequency (4F0 or 8F0). Hence, relative to harmonic frequency, the harmonic spacing was considerably greater in the present study. By my calculations, the masking noise used in the present study was also considerably lower in level relative to the harmonic complex than that used in the human studies. These factors may have allowed the animals to perform the task using cues based on the pattern of activity across the neural array (excitation pattern cues), rather than cues related to temporal neural coding. The authors show that mean neural driven rate did not change with frequency shift, but I don't understand the relevance of this. It is the change in response of individual fibers with characteristic frequencies near the lowest audible harmonic that is important here.
The auditory filter bandwidth of the gerbil is about double that of human subjects. Because of this, the masking noise within an auditory filter has a larger overall level than in the human studies. This precludes the gerbils from using excitation patterns, especially in the condition with a center frequency of 1600 Hz and a fundamental of 200 Hz and in the condition with a center frequency of 3200 Hz and a fundamental of 400 Hz.
The case against excitation pattern cues needs to be better made in the Discussion. It could be that gerbil frequency selectivity is broad enough for this not to be an issue, but more detail needs to be provided to make this argument. The authors should consider what is the lowest audible harmonic in each case for their stimuli, given the level of each harmonic and the level of the pink noise. Even for the 8F0 center frequency, the lowest audible harmonic may be as low as the 4th (possibly even the 3rd). In human, harmonics are thought to be resolvable by the cochlea up to at least the 8th.
Because of the gerbil’s broader auditory filters, harmonics are not resolved, with the exception of the condition with a center frequency of 1600 Hz and a fundamental of 400 Hz. We will expand the topic of potential excitation pattern cues in the discussion of the revised version and add results on modeled excitation patterns to the supplement.
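The ratios at issue in this exchange can be made explicit with a short calculation. The sketch below, in Python, expresses each center frequency as a multiple of F0 and the harmonic spacing as a fraction of the center frequency, using only the frequency pairs named above and the 11F0 default quoted from Moore and Sek. The function names are hypothetical, and the calculation says nothing about auditory filter bandwidths or masking levels.

```python
RECOMMENDED_RANK = 11  # "11F0", the default quoted from Moore and Sek (2009)


def center_rank(f0_hz, center_hz):
    """Filter center frequency expressed as a multiple of F0."""
    return center_hz / f0_hz


def relative_spacing(f0_hz, center_hz):
    """Harmonic spacing (F0) as a fraction of the center frequency."""
    return f0_hz / center_hz


# (F0 in Hz, center frequency in Hz) pairs named in the responses above.
conditions = [(200, 1600), (400, 3200), (400, 1600)]
for f0, fc in conditions:
    print(f"F0={f0} Hz, center={fc} Hz: {center_rank(f0, fc):g}F0, "
          f"spacing {100 * relative_spacing(f0, fc):.1f}% of center")
print(f"Recommended: {RECOMMENDED_RANK}F0, "
      f"spacing {100 / RECOMMENDED_RANK:.1f}% of center")
```

The first two conditions correspond to 8F0 and the third to 4F0, so in every case the harmonics are spaced more widely relative to the center frequency than at 11F0.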
(2) The synapse reductions in the high ouabain and old groups were relatively small (mean of 19 synapses per hair cell compared to 23 in the young untreated group). In contrast, in some mouse models of the effects of noise exposure or age, a 50% reduction in synapses is observed, and in the human temporal bone study of Wu et al. (2021, https://doi.org/10.1523/JNEUROSCI.3238-20.2021) the age-related reduction in auditory nerve fibres was ~50% or greater for the highest age group across cochlear location. It could be simply that the synapse loss in the present study was too small to produce significant behavioral effects. Hence, although the authors provide evidence that in the gerbil model the age-related behavioral effects are not due to synaptopathy, this may not translate to other species (including human). This should be discussed in the manuscript.
Our provisional response below will be integrated in similar form into the Discussion.
The observed extent of age-related or noise-induced loss of type-I afferent synapses on IHC varies widely between species and studies. For example, in ageing CBA/CaJ mice, mean losses of between 20 and 50% of afferent synapses (depending on cochlear location and precise age) were reported (Sergeyenko et al., 2013, DOI: 10.1523/JNEUROSCI.1783-13.2013; Kobrina et al., 2020, DOI: 10.1016/j.neurobiolaging.2020.08.012). Humans showed more pronounced losses of peripheral axons, of 40–100%, again depending on cochlear location, precise age, and noise history (Wu et al., 2019, DOI: 10.1016/j.neuroscience.2018.07.053; 2021, DOI: 10.1523/JNEUROSCI.3238-20.2021). The age-related and induced synapse losses in our gerbils were in a more moderate range, around 20% (Steenken et al., 2021, DOI: 10.1016/j.neurobiolaging.2021.08.019; this study). Thus, it is possible that a more severe, induced synaptopathy would have resulted in behavioral deficits in young-adult gerbils. However, in the absence of additional noise or pharmacologically induced damage, our study provides strong evidence for other factors causing temporal processing problems with advancing age. Our 3-year-old gerbils are approximately comparable to a 60-year-old human (Castano-Gonzalez et al., 2024, DOI: 10.1016/j.heares.2024.108989) with beginning but not yet clinically relevant hearing loss (Hamann et al., 2002, DOI: 10.1016/S0378-5955(02)00454-9).
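The group means quoted by the reviewer (19 versus 23 synapses per hair cell) can be converted into a percentage loss for comparison with the figures cited for other species. A small sketch in Python; the function name is hypothetical and the inputs are the rounded means from the review, not the underlying counts.

```python
def percent_loss(reference_mean, reduced_mean):
    """Relative reduction of reduced_mean with respect to reference_mean, in percent."""
    return 100.0 * (reference_mean - reduced_mean) / reference_mean


# Means quoted in the review: 23 synapses per hair cell (young untreated)
# and 19 (high-ouabain and old groups).
print(f"{percent_loss(23, 19):.1f}% loss")  # prints "17.4% loss"
```

This is in line with the "around 20%" stated above and well below the roughly 50% losses reported in some mouse models.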
It would be informative to provide synapse counts separately for the animals who were tested behaviorally, to confirm that the pattern of loss across the group was the same as for the larger sample.
Yes, the pattern was the same for the subgroup of behaviorally tested animals. We will add this information to the revised version of the manuscript.
(3) The study was not pre-registered, and there was no a priori power calculation, so there is less confidence in replicability than could have been the case. Only three old animals were used in the behavioral study, which raises concerns about the reliability of comparisons involving this group.
The results for the three old subjects differed significantly from those of young subjects and young ouabain-treated subjects. This indicates sufficient statistical power, since otherwise no significant differences would be observed.
Reviewer #3 (Public review):
This study is a part of the ongoing series of rigorous work from this group exploring neural coding deficits in the auditory nerve, and dissociating the effects of cochlear synaptopathy from other age-related deficits. They have previously shown no evidence of phase-locking deficits in the remaining auditory nerve fibers in quiet-aged gerbils. Here, they study the effects of aging on the perception and neural coding of temporal fine structure cues in the same Mongolian gerbil model.
They measure TFS coding in the auditory nerve using the TFS1 task which uses a combination of harmonic and tone-shifted inharmonic tones which differ primarily in their TFS cues (and not the envelope). They then follow this up with a behavioral paradigm using the TFS1 task in these gerbils. They test young normal hearing gerbils, aged gerbils, and young gerbils with cochlear synaptopathy induced using the neurotoxin ouabain to mimic synapse losses seen with age. In the behavioral paradigm, they find that aging is associated with decreased performance compared to the young gerbils, whereas young gerbils with similar levels of synapse loss do not show these deficits. When looking at the auditory nerve responses, they find no differences in neural coding of TFS cues across any of the groups.
However, aged gerbils show an increase in the representation of periodicity envelope cues (around f0) compared to young gerbils or those with induced synapse loss. The authors hence conclude that synapse loss by itself doesn't seem to be important for distinguishing TFS cues, and rather the behavioral deficits with age likely have to do with the misrepresented envelope cues instead.
We agree with the reviewer’s summary.
The manuscript is well written, and the data presented are robust. Some of the points below will need to be considered while interpreting the results of the study, in its current form. These considerations are addressable if deemed necessary, with some additional analysis in future versions of the manuscript.
Spontaneous rates - Figure S2 shows no differences in median spontaneous rates across groups. But taking the median glosses over some of the nuances there. Ouabain (in the Bourien study) famously affects low spont rates first, and to a higher degree than medium or high spont rates. It seems to be the case (qualitatively) in Figure S2 as well, with almost no units in the low spont region in the ouabain group, compared to the other groups. Looking at distributions within each spont rate category and comparing differences across the groups might reveal some of the underlying causes for these changes. Given that overall, the study reports that low-SR fibers had a higher ENV/TFS log-z-ratio, the distribution of these fibers across groups may reveal specific effects of TFS coding by group.
As the reviewer points out, our sample from the group treated with a high concentration of ouabain showed very few low-spontaneous-rate auditory-nerve fibers, as expected from previous work. However, this was also true, e.g., for our sample from sham-operated animals, and may thus well reflect a sampling bias. We are therefore reluctant to attach much significance to these data distributions. We will consider moving some supplementary information back into the main manuscript when revising.
Threshold shifts - It is unclear from the current version if the older gerbils have changes in hearing thresholds, and whether those changes may be affecting behavioral thresholds. The behavioral stimuli appear to have been presented at a fixed sound level for both young and aged gerbils, similar to the single unit recordings. Hence, age-related differences in behavior may have been due to changes in relative sensation level. Approaches such as using hearing thresholds as covariates in the analysis will help explore if older gerbils still show behavioral deficits.
Unfortunately, we did not obtain behavioral thresholds that could be used here. The ABR thresholds, although not directly comparable to behavioral thresholds, suggest that our old animals had at most a moderate threshold increase in quiet. Furthermore, we want to point out that the TFS 1 stimuli had an overall level of 68 dB SPL, and the pink noise masker would have increased the threshold more than expected from the moderate, age-related hearing loss in quiet. Thus, the masked thresholds for all gerbil groups are likely similar and should have no effect on the behavioral results.
Task learning in aged gerbils - It is unclear if the aged gerbils really learn the task well in two of the three TFS1 test conditions. The d' of 1 which is usually used as the criterion for learning was not reached in even the easiest condition for aged gerbils in all but one condition for the aged gerbils (Fig. 5H) and in that condition, there doesn't seem to be any age-related deficits in behavioral performance (Fig. 6B). Hence dissociating the inability to learn the task from the inability to perceive TFS 1 cues in those animals becomes challenging.
Even in the group of gerbils with the lowest sensitivity, the animals achieved an average d’ above 1 for the 400/1600 condition. Furthermore, stimuli were well above threshold and audible, even when no discrimination could be observed. Finally, as explained in the methods, different stimulus conditions were interleaved in each session, so that stimuli that were easy to discriminate were presented together with those that were difficult to discriminate. This approach ensures that the gerbils were under stimulus control, meaning properly trained to perform the task. Thus, an inability to discriminate does not indicate a lack of proper training.
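For readers less familiar with the sensitivity index, the following is a minimal sketch of the textbook definition of d' (the z-transformed hit rate minus the z-transformed false-alarm rate). It is not taken from the manuscript, and the function name and the example rates are assumptions chosen only for illustration.

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Both rates must lie strictly between 0 and 1; rates of exactly
    0 or 1 have no finite z-score and would need a correction.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)

# Hypothetical rates, used only to show the call
print(round(d_prime(0.69, 0.31), 2))
```

Under this definition, a d' of 1 means that the hit and false-alarm distributions are separated by one standard deviation.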
Increased representation of periodicity envelope in the AN - the mechanisms for increased representation of periodicity envelope cues are unclear. The authors point to some potential central mechanisms, but given that these are recordings from the auditory nerve, what central mechanisms these may be is unclear. If the authors are suggesting some form of efferent modulation only at the f0 frequency, no evidence for this is presented. It appears more likely that the enhancement may be due to outer hair cell dysfunction (widened tuning, distorted tonotopy). Given this increased envelope coding, the potential change in sensation level for the behavior (from the comment above), and no change in neural coding of TFS cues across any of the groups, a simpler interpretation may be: TFS coding is not affected in remaining auditory nerve fibers after age-related or ouabain-induced synapse loss, but behavioral performance is affected by altered outer hair cell dysfunction with age.
A similar point is made by Reviewer #1. As indicated above, we do have limited data on neural bandwidths and will explore if these are sufficient to address the reviewers’ questions about potential, age-related changes in neural tuning in our sample. Previous work found no substantial OHC losses (Tarnowski et al., 1991, DOI: 10.1016/0378-5955(91)90142-V; Adams and Schulte, 1997, DOI: 10.1016/S0378-5955(96)00184-0; Steenken et al., 2024, DOI: 10.3389/fnsyn.2024.1422330) nor any deterioration in neural frequency tuning (Heeringa et al., 2020, DOI: 10.1523/JNEUROSCI.2784-18.2019), in quiet-aged gerbils of similar age as the ones used here.
Emerging evidence seems to suggest that cochlear synaptopathy and/or TFS encoding abilities might be reflected in listening effort rather than behavioral performance. Measuring some proxy of listening effort in these gerbils (like reaction time) to see if that has changed with synapse loss, especially in the young animals with induced synaptopathy, would make an interesting addition to explore perceptual deficits of TFS coding with synapse loss.
This is an interesting suggestion that we will explore in the revision of the manuscript. Reaction times were recorded for all responses and can be used as a proxy for listening effort.
-
-
www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
This study provides convincing evidence on the infraslow oscillation of DG cells during NREM sleep, and how serotonergic innervation modulates hippocampal activity pattern during sleep and memory.
Strengths and Weaknesses:
The authors used state-of-the-art techniques to carry out these experiments. Given that the functional role of infraslow rhythm still remains to be studied, this study provides convincing evidence of the role of DG cells in regulating infraslow rhythm, sleep microarchitecture, and memory.
I have a few minor comments.
(1) Decreased infraslow rhythm during NREMs in the 5ht1a KO mice is striking. It would be helpful to know whether sleep-wake states, MAs, and transitions to REMs are changed.
We agree with the reviewer that serotonin receptors may be involved in sleep regulation; it is therefore important to analyze the effect of their manipulation. We would also like to point out that in this case we restricted the 5ht1a manipulation to the hippocampus, which does not have a known impact on sleep-wake regulation. The analysis of our recorded dataset from these mice confirmed this notion: we did not see any changes in sleep metrics (see supplementary figure 6A).
(2) It would be interesting to discuss whether the magnitude in changes of infraslow rhythm strength is correlated with memory performance (Figure 6).
We agree with the reviewer that this could be an interesting point. In our experiments we wanted to minimize the impact of the surgical procedures on behavior, so we used separate cohorts for the photometry recordings and for the behavior experiments. We are therefore unable to correlate behavior and infraslow oscillatory amplitudes in our dataset.
However, a similar experiment was carried out in a recent paper in which the authors discovered that the norepinephrine system also displays infraslow oscillatory cycles during NREM sleep (Kjaerby et al 2022). The authors of that paper gradually decreased the magnitude of the NE pulses during NREM by optogenetic manipulation of the locus coeruleus, which led to a fragmented sleep phenotype characterized by increased microarousal occurrence, decreased REM and reduced spindle activity. They also tested the memory performance of the mice in a novel object recognition task and found a diminished performance level in the opto group. Serotonin has multiple roles in the brain, many of which overlap with proposed functions of the noradrenergic system, including regulation of plasticity and signaling of rewarding or fearful stimuli. Therefore, we speculate that the modification of serotonin dynamics during sleep will most likely interfere with memory performance.
We inserted this paragraph in the discussion part of our paper.
(3) The authors should cite the Oikonomou Neuron paper that describes slow oscillatory activity of DRN SERT neurons during NREM sleep.
Thank you for the suggestion, we inserted this paper in the manuscript.
(4) The authors should clarify how they define the phasic pattern of the photometry signal.
We have added the details in the Methods.
Reviewer #2 (Public review):
Summary:
The authors investigated DG neuronal activity at the population and single-cell level across sleep/wake periods. They found an infraslow oscillation (0.01-0.03 Hz) in both granule cells (GC) and mossy cells (MC) during NREM sleep.
The important findings are:
(1) The antiparallel temporal dynamics of DG neuron activities and serotonin neuron activities/extracellular serotonin levels during NREM sleep, and
(2) The GC Htr1a-mediated GC infraslow oscillation.
Strengths:
(1) The combination of polysomnography, Ca-fiber photometry, two-photon microscopy, and gene depletion is technically sound. The coincidence of microarousals and dips in DG population activity is convincing. The dip in activity in upregulated cells is responsible for the dip at the population level.
(2) DG GCs express excitatory Htr4 and Htr7 in addition to inhibitory Htr1a, but deletion of Htr1a is sufficient to disrupt DG GC infraslow oscillation, supporting the importance of Htr1a in DG activity during NREM sleep.
Weaknesses:
(1) The current data set and analysis are insufficient to interpret the observation correctly.
a. In Figure 1A, during NREM, the peaks and troughs of GC population activities seem to gradually decrease over time. Please address this point.
Thank you for the suggestion. We have analyzed and compared the magnitude of the oscillatory signals in the first and last minute of the NREM sleep epochs in Dock10-Cre mice and found no significant difference. However, we did observe that the ISO amplitude is smaller in the early stage of the first NREM epochs, defined as those preceded by more than 5 minutes of wakefulness (new supplementary figure 1).
b. In Figure 1F, about 30% of Ca dips coincided with MA (EMG increase) and 60% of Ca dips did not coincide with EMG increase. If this is true, the readers can find 8 Ca dips which are not associated with MAs from Figure 1E. If MAs were clustered, please describe this properly.
We did not find evidence that MAs were clustered in our dataset (see a representative example in supplementary figure 1A). We replaced the example trace with a new one which shows calcium dips with and without MAs. We believe this new trace better represents the data.
c. In Figure 1F, the legend stated the percentage during NREM. If the authors want to include the percentage of wake and REM, please show the traces with Ca dips during wake and REM. This concern applies to all pie charts provided by the authors.
Figure 1F (and all other pie charts) shows the outcome of brain states following a calcium-dip episode. That is, we found that the Ca-dips during NREM were followed by MAs in 30% of the cases, 59% of the Ca-dips led to the maintenance of NREM (no MAs), while in 2% and 9% of the cases we detected either a REM state or awakening of the animal. These numbers correspond very well with a similar analysis in a recent paper that looked at the infraslow oscillatory behavior of the norepinephrine system (Kjaerby et al 2022) during NREM sleep. We apologize if the wording in the manuscript was misleading; we have modified the figure legends to clarify what the pie charts represent.
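The logic behind such a pie chart can be sketched as follows. This is only an illustration of the idea of classifying each calcium dip by what follows it, not the analysis code used for the manuscript. The function names, the state labels, the `state_at` lookup, and the 10-second window are all hypothetical.

```python
from collections import Counter

def outcome_after_dip(dip_time, ma_onsets, state_at, window_s=10.0):
    """Classify one calcium dip by what follows it.

    dip_time  : time of the dip in seconds
    ma_onsets : iterable of microarousal onset times in seconds
    state_at  : hypothetical callable returning the scored state
                ("NREM", "REM" or "WAKE") at a given time
    window_s  : assumed look-ahead window, not taken from the manuscript
    """
    if any(dip_time <= t <= dip_time + window_s for t in ma_onsets):
        return "MA"
    state = state_at(dip_time + window_s)
    return state if state in ("REM", "WAKE") else "NREM"

def dip_outcome_percentages(outcomes):
    """Convert a list of outcome labels into percentages per label."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}

# Toy example with three dips, all scored during ongoing NREM
labels = [outcome_after_dip(t, [105.0], lambda _t: "NREM") for t in (100.0, 300.0, 500.0)]
print(dip_outcome_percentages(labels))
```

Each slice of the pie then corresponds to one key of the returned dictionary.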
d. In Figure 1C, please provide line plots connecting the same session. This request applies to all related figures.
We have replaced the dot plots in all related figures with the line plots.
e. In Figure 2C, the significant increase during REM and the same level during NREM are not convincing. In Figure 2A, the several EMG increasing bouts do not appear to be MA, but rather wakefulness, because the duration of the EMG increase is greater than 15 seconds. Therefore, it is possible that the wake bouts were mixed with NREM bouts, leading to the decrease of Ca activity during NREM. In fact, In Figure 2E, the 4th MA bout seems to be the wake bout because the EMG increase lasts more than 15 seconds.
We have replaced the Figure 2C with line plots as suggested above. It is clear that MC activity during REM sleep is higher, compared to that in NREM sleep, whereas the overall difference between wake and NREM is not significant (some increased, some decreased). Regarding the MAs, we have added a trace of averaged EMG signals in Figure 2G, showing that the averaged EMG bursts during MA are shorter than 5 seconds.
f. Figure 5D REM data are interesting because the DRN activity is stably silenced during REM. The varied correlation means the varied DG activity during REM. The authors need to address it.
We thank the reviewer for this suggestion. We have added this point to the discussion. We speculate that inputs from the supramammillary nucleus or entorhinal cortex to the DG during REM sleep may both contribute to this variability.
g. In Figure 6, the authors should show the impact of DG Htr1a knockdown on sleep/wake structure including the frequency of MAs. I agree with the impact of Htr1a on DG ISO, but possible changes in sleep bout may induce the DG ISO disturbance.
As suggested, we have performed sleep analysis in the Htr1a knockdown experiments including MA quantification. We have found no significant difference between Htr1a-knockdown and control mice in any of the sleep metrics (see: supplemental figure 6). Our interpretation is that the lack of changes in sleep/wake cycles is likely due to the hippocampus not being directly involved in regulating these brain states.
(2) It is acceptable that DG Htr1a KO induces the reduced freezing in the CFC test (Figure 6E, F), but it is too much of a stretch that the disruption of DG ISO causes impaired fear memory. There should be a correlation.
We have modified the discussion accordingly.
(3) It is necessary to describe the extent of AAV-Cre infection. The authors injected AAV into the dorsal DG (AP -1.9 mm), but the histology shows the ventral DG (Supplementary Figure 4), which reduces the reliability of this study.
The histology image shown in the manuscript was taken from the -2.5 mm anteroposterior level, which we still consider to be part of the dorsal DG. For additional clarity, we have replaced the figure with new histology images from a slightly more anterior position (AP ~2.0 mm).
Reviewer #3 (Public review):
Summary:
The authors employ a series of well-conceived and well-executed experiments involving photometric imaging of the dentate gyrus and raphe nucleus, as well as cell-type specific genetic manipulations of serotonergic receptors that together serve to directly implicate serotonergic regulation of dentate gyrus (DG) granule (GC) and mossy cell (MC) activity in association with an infraslow oscillation (ISO) of neural activity that has been previously linked to general cortical regulation during NREM sleep and microarousals.
Strengths:
There are a number of novel and important results, including the modulation of dentate granule cell activity by the infraslow oscillation during NREM sleep, the selective association of different subpopulations of granule cells to microarousals (MA), and the anticorrelation of raphe activity with infraslow dentate activity.
The discussion includes a general survey of ISOs and recent work relating to their expression in other brain areas and other potential neuromodulatory system involvement, as well as possible connections with infraslow oscillations, micro-arousals, and sensory sensitivity.
Weaknesses:
(1) The behavioral results showing contextual memory impairment resulting from 5-HT1a knockdown are fine but are over-interpreted. The term memory consolidation is used several times, as well as references to sleep-dependence. This is not what was tested. The receptor was knocked down, and then 2 weeks later animals were found to have fear conditioning deficits. They can certainly describe this result as indicating a connection between 5-HT1a receptor function and memory performance, but the connection to sleep and consolidation would just be speculation. The fact that 5-HT1a knockdown also impacted DG ISOs does not establish dependency. Some examples of this are:
a. The final conclusion asserts "Together, our study highlights the role of neuromodulation in organizing neuronal activity during sleep and sleep-dependent brain functions, such as memory.". However, the reported memory effects (impairment of fear conditioning) were not shown to be explicitly sleep-dependent.
We thank the reviewer for this comment. We have revised the sentence.
b. Earlier in the discussion it mentions "Finally, we showed that local genetic ablation of 5-HT1a receptors in GCs impaired the ISO and memory consolidation". The effect shown was on general memory performance - consolidation was not specifically implicated.
We have revised the sentence.
(2) The assertion on page 9 that the results demonstrate "that the 5-HT is directly acting in the DG to gate the oscillations" is a bit strong given the magnitude of effect shown in Figure 6D, and the absence of demonstration of negative effect on cortical areas that also show ISO activity and could impact DG activity (see requested cortical sigma power analysis).
We have revised the sentence.
(3) Recent work has shown that abnormal DG GC activity can result from the use of the specific Ca indicator being used (GCaMP6s). (Teng, S., Wang, W., Wen, J.J.J. et al. Expression of GCaMP6s in the dentate gyrus induces tonic-clonic seizures. Sci Rep 14, 8104 (2024). https://doi.org/10.1038/s41598-024-58819-9). The authors of that study found that the effect seemed to be specific to GCaMP6s and that GCaMP6f did not lead to abnormal excitability. Note this is of particular concern given similar infraslow variation of cortical excitability in epilepsy (cf Vanhatalo et al. PNAS 2004). While I don't think that the experiments need to be repeated with a different indicator to address this concern, you should be able to use the 2p GCaMP7 experiments that have already been done to provide additional validation by repeating the analyses done for the GCaMP6s photometry experiments. This should be done anyway to allow appropriate comparison of the 2p and photometry results.
We would like to thank the reviewer for this comment. We also analyzed the two-photon data in the same manner as the photometry data. However, the only supportive evidence that might be related to ISO in the two-photon data, recorded at the somatic level, was decreased fluorescence during MAs in the NREM-upregulated cell group (see Figure 3 D, E). We are unsure why this discrepancy exists, but we have discussed it in the manuscript and offered some alternative explanations. One hypothesis we are currently exploring relates to the different subcellular compartments sampled by the two imaging techniques. The photometry probe was implanted above the dentate gyrus, and since light collection efficiency declines sharply with distance from the probe tip (Pisano et al., 2019), we hypothesize that ISO is stronger at the dendritic level, where the cells directly receive inputs from the entorhinal cortex and which is closest to the probe's tip. We are now conducting multiplane two-photon imaging experiments in our labs to test this hypothesis.
(4) While the discussion mentions previous work that has linked ISOs during sleep with regulation of cortical oscillations in the sigma band, oddly no such analysis is performed in the current work even though it is presumably available and would be highly relevant to the interpretation of a number of primary results including the relationship between the ISOs and MAs observed in the DG and similar results reported in other areas, as well as the selective impact of DG 5-HT1a knockdown on DG ISOs. For example, in the initial results describing the cross-correlation of calcium activity and EMG/EEG with MA episodes (paragraph 1, page 4), similar results relating brief arousals to the infraslow fluctuation in sleep spindles (sigma band) have been reported also at 0.02 Hz associated with variation in sensory arousability (cf. Cardis et al., "Cortico-autonomic local arousals and heightened somatosensory arousability during NREMS of mice in neuropathic pain", eLife 2021). It would be important to know whether the current results show similar cortical sigma band correlations. Also, in the results on ISO attenuation following 5-HT1a knockdown on page 7 (Figure 6), how is cortical EEG affected? Is ISO still seen in EEG but attenuated in DG?
Thank you for this valuable comment. We performed the analysis and found a positive correlation between cortical sigma band activity and DG activity during NREM sleep (see supplementary figure 1C-1E). Additionally, we conducted further analyses using the local 5-HT1a KO mouse model but did not observe significant changes in sleep architecture or MA frequency (see supplementary figure 6A). It is also important to note that ISO was only analyzed using calcium signals, not EEG signals. The standard filtering settings in our EEG data collection (0.5-500 Hz) do not allow us to analyze signals in such a low-frequency range.
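To give a sense of why a 0.5 Hz lower band edge rules out direct EEG analysis in the infraslow range, the sketch below evaluates the magnitude response of an idealized analog Butterworth high-pass filter at 0.02 Hz. The filter type and order of the acquisition system are not stated here, so both are assumptions; the numbers only illustrate the order of magnitude of the attenuation.

```python
import math

def butter_highpass_gain(f_hz, cutoff_hz, order=1):
    """Magnitude response |H(f)| of an ideal analog Butterworth high-pass.

    |H(f)| = 1 / sqrt(1 + (cutoff / f) ** (2 * order))
    """
    return 1.0 / math.sqrt(1.0 + (cutoff_hz / f_hz) ** (2 * order))

def gain_db(gain):
    """Convert a linear amplitude gain to decibels."""
    return 20.0 * math.log10(gain)

# Assumed filter orders; 0.02 Hz lies within the infraslow band, 0.5 Hz is the lower EEG edge
for order in (1, 2, 4):
    g = butter_highpass_gain(0.02, 0.5, order)
    print(f"order {order}: gain at 0.02 Hz = {gain_db(g):.1f} dB")
```

Even for a first-order filter, a 0.02 Hz component would be reduced to a few percent of its original amplitude, and steeper filters suppress it further.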
(5) The illustrations of the effect of 5-HT1a knockdown shown in Figure 6 are somewhat misleading. The examples in panels B and C show an effect that is much more dramatic than the overall effect shown in panel D. Panels B and C do not appear to be representative examples. Which of the sample points in panel D are illustrated in panels B and C? It is not appropriate to arbitrarily select two points from different animals for comparison, or worse, to take points from the extremes of the distributions. If the intent is to illustrate what the effect shown in D looks like in the raw data, then you need to select examples that reflect the means shown in panel D. It is also important to show the effect on cortical EEG, particularly in sigma band to see if the effects are restricted to the DG ISOs. It would also be helpful to show that MAs and their correlations as shown in Figure 1 or G as well as broader sleep architecture are not affected.
We agree with the reviewer that the chosen example may appear somewhat exaggerated. However, we must point out that visually assessing missing or downregulated frequency components can be challenging. To provide a more objective presentation, we included Supplementary Figure 6B-C, in which we performed an analysis similar to that in Fig. 1G in 5-HT1a KO mice. These figures show a significant decrease in ISO amplitude, though the blockade is not complete, due to the incomplete nature of genetic manipulation with viral injection (see Suppl Fig 5). Furthermore, recent studies (Dong et al., 2023; Zhang et al., 2024; Kjaerby et al., 2022) have identified several other neuromodulatory and peptidergic systems that might affect DG activity during MAs.
To explore this further, we conducted pharmacological experiments. We administered 8-hydroxy-DPAT, a 5-HT1a agonist (i.p. 1 mg/kg), in Dock10-Cre mice injected with AAV-FLEX-GCaMP6s in the DG. Since 5-HT1a receptors act as autoreceptors on raphe 5-HT neurons, this treatment effectively silences the serotonergic system, thereby “removing” 5-HT signaling from the brain. The results, shown in Author response image 1, indicate that pharmacological suppression of 5-HT dampens the ISO in the DG during subsequent sleep intervals, with ISO recovering after the drug is washed out. These findings are consistent with the results obtained with the more specific local genetic manipulation. We have not included this result in the manuscript because we believe that the local downregulation is a cleaner experiment whose interpretation is more straightforward.
Author response image 1.
Finally, we also performed sleep analysis in 5-HT1a KO mice, showing that the local downregulation of 5-HT1a receptors had no significant effect on sleep metrics (Suppl Fig 6A). The hippocampus is not typically involved in regulating sleep-wake cycles, so we believe this result is consistent with that understanding.
(6) On page 9 of the results it states that GCs and MCs are upregulated during NREM and their activity is abruptly terminated by MAs through a 5-HT mediated mechanism. I didn't see anything showing the 5-HT dependence of the MA activity correlation. The results indicate a reduction in ISO modulation of GC activity but not the MA-correlated activity. I would like to see the equivalent of Figure 1,2 G panels with the 5-HT1a manipulation.
We agree with the reviewer on this point. We did not conduct any pharmacological or genetic manipulation in 2-photon calcium imaging experiments. We have removed that statement. As for the suggested analysis, please see our explanation above (Suppl Fig 6B-C).
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
(1) Since the authors did not monitor DG neuronal activity with an electrophysiological tool, please rephrase the following sentence: "In this study, we investigated the neuronal activity of the dentate gyrus (DG) with electrophysiological and optical imaging tools during sleep-wake cycles." in the Abstract.
We have rephrased the sentence as suggested.
(2) Since the authors did not manipulate the serotonin release during sleep to investigate whether serotonin release modulates DG ISO, please edit the following sentence: "Further experiments revealed that the infraslow oscillation in the DG is modulated by rhythmic serotonin release during sleep" in the Abstract.
We have rephrased the sentence as suggested.
(3) Single-cell recording in DG with two-photon microscopy may address the issue raised in the 4th paragraph of the Discussion. In addition, in Fig 6C, the photometry has only captured the diminished oscillation in Htr1a KO, but cannot distinguish whether the activity levels of GCs remain high or low, which is a clear disadvantage of photometry.
We agree with the reviewer, and have added text to the discussion.
Reviewer #3 (Recommendations for the authors):
(1) Some of the figures are missing labels in the spectrogram panels (e.g. no freq units in Figures 4 and 6).
We have added information in those figures.
(2) Missing specific locations for EEG electrodes/screws. The text states "we predrilled 2 holes on the right side of the skull (1.5 mm posterior of the Bregma) for implanting recording electrodes". 2 holes on the right side of the skull are pretty vague.
We have added this information in the Methods.
(3) Some additional work that could be cited particularly when discussing the serotonergic impact on hippocampal function as it might relate to sleep and memory would include work linking mesopontine activity (both serotonergic and non-serotonergic) to memory-associated hippocampal sharp-wave ripple activity (e.g. Jelitai et al. Front. Neural Circ. 2021, Wang et al Nat. Neuro. 2015).
We have cited these papers.
(4) The work cited at the beginning of the Results describing higher population calcium activity during sleep states (15,18,30) is generally appropriate but not explicitly related to GCaMP imaging. Pilz et al. "Functional Imaging of Dentate Granule Cells in the Adult Mouse Hippocampus", J.Neurosci. 2016 might be a more relevant citation.
We have added the citation.
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #2 (Public Review):
Summary:
This computational modeling study addresses the observation that variable observations are interpreted differently depending on how much uncertainty an agent expects from its environment. That is, the same mismatch between a stimulus and an expected stimulus would be less significant, and specifically would represent a smaller prediction error, in an environment with a high degree of variability than in one where observations have historically been similar to each other. The authors show that if two different classes of inhibitory interneurons, the PV and SST cells, (1) encode different aspects of a stimulus distribution and (2) act in different (divisive vs. subtractive) ways, and if (3) synaptic weights evolve in a way that causes the impact of certain inputs to balance the firing rates of the targets of those inputs, then pyramidal neurons in layer 2/3 of canonical cortical circuits can indeed encode uncertainty-modulated prediction errors. To achieve this result, SST neurons learn to represent the mean of a stimulus distribution and PV neurons its variance.
The impact of uncertainty on prediction errors is an understudied topic, and this study provides an intriguing and elegant new framework for how this impact could be achieved and what effects it could produce. The ideas here differ from past proposals about how neuronal firing represents uncertainty. The developed theory is accompanied by several predictions for future experimental testing, including the existence of different forms of coding by different subclasses of PV interneurons, which target different sets of SST interneurons (as well as pyramidal cells). The authors are able to point to some experimental observations that are at least consistent with their computational results. The simulations shown demonstrate that if we accept its assumptions, then the authors’ theory works very well: SSTs learn to represent the mean of a stimulus distribution, PVs learn to estimate its variance, firing rates of other model neurons scale as they should, and the level of uncertainty automatically tunes the learning rate, so that variable observations are less impactful in a high uncertainty setting.
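As a rough illustration of the computation summarized above, the sketch below tracks a running mean and variance of a stimulus and divides the rectified mismatch by the variance. It is an abstract, rate-free caricature rather than the authors' circuit model: the delta-rule updates, the exact normalization by the variance, the learning rate, and all names are assumptions made for this example.

```python
def update_estimates(mean, var, s, lr=0.05):
    """One delta-rule step for the running mean and variance of stimulus s.

    In the summary above these two quantities are attributed to SST
    (mean) and PV (variance) interneurons; here they are plain numbers.
    """
    new_mean = mean + lr * (s - mean)
    new_var = var + lr * ((s - mean) ** 2 - var)
    return new_mean, new_var

def uncertainty_modulated_pe(s, mean, var, eps=1e-9):
    """Return (positive, negative) prediction errors scaled by 1/variance.

    Subtracting the mean stands in for subtractive inhibition and
    dividing by the variance for divisive inhibition (assumed form).
    """
    err = s - mean
    positive = max(err, 0.0) / (var + eps)
    negative = max(-err, 0.0) / (var + eps)
    return positive, negative

# The same mismatch (s - mean = 2) under low and high uncertainty
print(uncertainty_modulated_pe(5.0, 3.0, 1.0))
print(uncertainty_modulated_pe(5.0, 3.0, 4.0))
```

With these assumptions, an identical mismatch yields a smaller error signal when the estimated variance is larger, which is the qualitative behavior described in the summary.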
Strengths:
The ideas in this work are novel and elegant, and they are instantiated in a progression of simulations that demonstrate the behavior of the circuit. The framework used by the authors is biologically plausible and matches some known biological data. The results attained, as well as the assumptions that go into the theory, provide several predictions for future experimental testing. The authors have taken into account earlier review comments to revise their paper in ways that enhance its clarity.
Weaknesses:
One weakness could be that the proposed theory does rely on a fairly large number of assumptions. However, there is at least some biological support for these. Importantly, the authors do lay out and discuss their key assumptions in the Discussion section, so readers can assess their validity and implications for themselves.
Thank you very much, we are very satisfied with this public review.
Reviewer #4 (Public Review):
Summary:
Wilmes and colleagues develop a model for the computation of uncertainty modulated prediction errors based on an experimentally inspired cortical circuit model for predictive processing. Predictive processing is a promising theory of cortical function. An essential aspect of the model is the idea of precision weighting of prediction errors. There is ample experimental evidence for prediction error responses in cortex. However, a central prediction of the theory is that these prediction error responses are regulated by the uncertainty of the input. Testing this idea experimentally has been difficult due to a lack of concrete models. This work provides one such model and makes experimentally testable predictions.
Strengths:
The model proposed is novel and well-implemented. It has sufficient biological accuracy to make useful and testable predictions.
Weaknesses:
One key idea the model hinges on is that stimulus uncertainty is encoded in the firing rate of parvalbumin positive interneurons. This assumption, however, is rather speculative and there is no direct evidence for this.
Thank you very much for this nice description. With regard to the weakness: it is true that the key idea hinges on uncertainty being encoded in the firing of inhibitory neurons. If it turns out that these inhibitory neurons are not PV neurons, however, the theory does not break down. The suggestion of PV neurons is fueled by the observation that PV neurons implement shunting and hence divisive inhibition and by the connectivity of PVs in the circuit. We discuss this in the discussion section: "To provide experimental predictions that are immediately testable, we suggested specific roles for SSTs and PVs, as they can subtractively and divisively modulate pyramidal cell activity, respectively. In principle, our theory more generally posits that any subtractive or divisive inhibition could implement the suggested computations. With the emerging data on inhibitory cell types, subtypes of SSTs and PVs or other cell types may turn out to play the proposed role."
Recommendations for the authors:
Reviewer #4 (Recommendations For The Authors):
(1) Line numbers would simplify reviewing.
We will add line numbers to our next submission.
(2) The existence of positive and negative PE was already suggested by Rao & Ballard.
We added the citation to the sentence "Because baseline firing rates are low in layer 2/3 pyramidal cells () positive and negative prediction errors were suggested to be represented by distinct neuronal populations [44,66],[...]" in the section "Computation of UPEs in cortical microcircuits".
(3) wekk should probably read well.
Indeed, thank you. We fixed it.
(4) Figure 4: legends A-C are mixed up. What are the two values of |s-u| in F and I - the same as in D and F?
Thank you, we fixed this.
(5) "representation neurons, the activity of which reflects the internal model". For consistency with the original definitions this should read "the activity of which reflects the internal representation". The internal "model" is the synaptic weights (or transformation between areas) - the activity of representation neurons (as the name implies) is the internal "representation".
Thank you, we changed it.
(6) "Mice trained in a predictable environment [...] [4]." This should read "reared" in an unpredictable environment, etc. Relatedly, the problem with this argument is that the referenced paper argues that the mice never learned to predict and the reduced PE responses are a consequence of a reduction in prediction strength (these mice never - in life - had experience of visuomotor coupling). Better evidence might be the acute changes observed in normal mice (see e.g. Figure 3B in https://pubmed.ncbi.nlm.nih.gov/22681686/). However, another finding from the paper referenced is that in mice reared without visuomotor coupling, MM responses of SST interneurons are unchanged, while those in PV interneurons are completely absent. Would the authors' model come to similar results if trained in an environment with (very) high uncertainty and then tested in a low uncertainty environment?
Thank you for pointing us to Figure 3B of Keller et al. 2012. We are now citing this result as it is indeed better evidence.
Thank you very much for your illuminating question and for pointing out that a mouse that never experienced a predictable visual flow may not have formed a model of the visual flow, and hence may not have any prediction about its visual experience. We had not considered this scenario in our paper before. So far, we had only considered scenarios in which it is possible to learn a prediction, i.e. to infer the mean from the sensory input. We now consider this other scenario, in which the mouse that was reared in an unpredictable environment did not form a prediction, and compare SST (1) and PV (2) activity in this mouse to one that learned to form a prediction. We added this comparison to the section "Predictions for different cell types":
"Second, prediction error activity seems to decrease in less predictable, and hence more uncertain, contexts: in mice reared in a predictable environment [where locomotion and visual flow match, 42], error neuron responses to mismatches in locomotion and visual flow decreased with each day of experiencing these unpredictable mismatches. Third, the responses of SSTs and PVs to mismatches between locomotion and visual flow [4] are in line with our model (note that in this experiment the mismatches are negative prediction errors as visual flow was halted despite ongoing locomotion): In this study, SST responses decreased during mismatch, i.e. when the visual flow was halted, and there was no difference between mice reared in a predictable or unpredictable environment. In line with these observations, the authors concluded that SST responses reflected the actual visual input. In our model negative PE circuit, SSTs also reflect the actual stimulus input, which in our case was a whisker stimulus (SST rates in Fig. 6C and I reflect the stimuli (black and grey bar) in A and G, respectively) and SST rates are the same for high and low uncertainty (corresponding to mice reared in a predictable or unpredictable environment). In the same study, PV responses were absent towards mismatches in animals reared in an unpredictable environment [4]. The authors argued that mice reared in an unpredictable environment did not learn to form a prediction. In our model, the missing prediction corresponds to missing predictive input from the auditory domain (e.g. due to undeveloped synapses from the predictive auditory input). If we removed the predictive input in our model, PVs in the negative PE circuit would also be silent as they would not receive any of the excitatory predictive inputs."
(7) "Our model further posits the existence of two distinct subtypes of SSTs in positive and negative error circuits." There is some evidence for this: Figure 5a in https://pubmed.ncbi.nlm.nih.gov/36747710/
Thank you, we added this citation to the corresponding section.
www.biorxiv.org
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Summary:
The focus of this manuscript was to investigate the role of Cldn9 in the development of the mammalian cochlea. The main rationale of the study is the fact that cochlear hair cells do not regenerate, so when damaged they are lost forever, causing irreparable hearing loss. The authors have attempted to address this problem by inducing the ectopic production of additional hair cells and testing whether they acquire the morphological and functional characteristics of native hair cells. They show that downregulation of Cldn9 using a well-established genetic manipulation of transgenic mice led to the production of extra numerary inner hair cells, which were able to survive for several months. By performing a large battery of experiments, the authors were able to determine that the native and ectopic inner hair cells have comparable morphological and physiological characteristics. There are several conclusions highlighted by the authors in different parts of the manuscript, including the key role of Cldn9 in coordinating embryonic and postnatal development, the differentiation of supporting cells into inner hair cells, and the possible use of Cldn9 to induce inner hair cell differentiation following deafness induced by hair cell loss.
Strengths:
Several of the conclusions in this study are well supported by the experimental work.
Weaknesses:
Some aspects of the data and its interpretation need better explanation and require further investigation.
(1) The Results section is the most difficult part to read and understand. It contains a very limited, and in some places confusing and repetitive, description of the data. Statistical analysis is missing for some of the key data (e.g., ABRs), and in some places the text contradicts the data presented in the figures (e.g., Figure 8). I am sure carefully revising the text would clarify some of these issues.
We thank the reviewer for the suggestion. We revised parts of the Results section and added the statistical analysis for the ABRs and DPOAEs (lines 151-159; Page 29, lines 846-880).
(2) One puzzling finding that is not addressed in the manuscript is the lack of functional benefit from these additional inner hair cells. In fact, it appears to be detrimental based on the increased ABR thresholds. Maybe it would be useful to analyze the wave 1 characteristics.
We thank the reviewer for the suggestion. We added the wave 1 characteristics as S8.
(3) It is not clear what direct evidence there is, apart from some immunostaining, indicating that the ectopic inner hair cells derive from the supporting cells. This part would benefit from a more careful consideration and maybe an attempt at a more direct experimental approach.
We thank the reviewer for the suggestion. We intend to investigate the origin of the ectopic inner hair cells in a future study, using approaches such as qRT-PCR and smFISH.
(4) One point that should be made clear throughout the manuscript is that the ectopic inner hair cells are generated in a cochlea that is undergoing normal maturation. Thus, there is no guarantee that modulating the expression levels of Cldn9 in a deaf mouse lacking hair cells would produce the same result as that shown in this study. My guess is that it probably won't, but I am sure this could be tested (maybe in the future) using the excellent experimental approach applied in this study.
That is a great point. We will explore it in our future experiments.
Reviewer #2 (Public Review):
Summary:
The generation of functional extranumerary inner hair cells (IHCs) in postnatal mice, particularly with virus-mediated knockdown of Cldn9 mRNA expression in the neonatal cochlear duct, is an important observation. It is significant because not many studies exist that report molecular manipulations of the neonatal organ of Corti that result in the generation of new hair cells that remain functional and appear to be intact for an extended time, here more than one year. Overall, this is a carefully conducted study; the observations are clear, and the methods are solid. Two independent methods for reducing the expression of Cldn9 mRNA were used: a conditional transgenic model and AAV-mediated knockdown with shRNA. The lack of a functional explanation of how the reduced expression of Cldn9 specifically leads to the formation of extranumerary IHCs leaves open questions. For example, it is not clear whether there is indeed a fate change happening and whether Cldn9 reduction affects developmental processes. The discussion of how Cldn9 reduction potentially affects Notch signaling, without hard evidence, is handwaving.
Strengths:
It is a very interesting observation and somewhat unexpected in its specificity for inner hair cells. Using two different approaches to manipulate Cldn9 expression provides a strong experimental foundation. The study is conducted quantitatively and with care.
Weaknesses:
The lack of mechanistic insight results in an open-ended story where at least the potential interaction of Cldn9 reduction with known and well-characterized signaling pathway components should have been investigated. This missed opportunity limits the scope of the study and should be addressed: How does Cldn9 downregulation affect the expression levels of other known genes linked to hair cell production and cell fate decisions? Quantitative RT-PCR works well for the authors, and comparing the expression of Notch or other known pathway components could provide mechanistic insight.
We thank the reviewer for the suggestion. We will use quantitative RT-PCR to compare the expression of Notch or other known pathway components in our future work. In addition, we used smFISH with ccnd1 and cdkn1b probes to detect cyclin D1 and cyclin-dependent kinase inhibitor 1B (p27), respectively, in the mouse cochlea. GAPDH was selected as a reference gene. The quantification results showed no significant difference between Cldn9<sup>+/T</sup> mice and Cldn9<sup>+/+</sup> mice at P2, P7, and P14.
It is unclear how P21 inner hair cells were identified for the patch-clamp experiments shown in Fig 4E-H. This is a challenging endeavor without the possibility of using specific markers.
We did not have a specific marker for IHCs. However, an experimenter with experience in hair bundle morphology and knowledge of hair cell location in the epithelium can identify IHCs under an upright microscope.
Please also address the numerous minor points outlined below; it will improve the paper's readability.
Thanks. Please find the point-by-point answers below.
Please include page numbers and line numbers in a revised manuscript.
We have included page numbers and line numbers in the revised manuscript.
Reviewer #3 (Public Review):
This important study by Chen et al. helps advance our knowledge about the regulation of inner hair cell (IHC) development and reveals the role of Cldn9 in IHC embryonic and postnatal induction by transdifferentiation from the supporting cells. The authors developed an inducible doxycycline (dox)-tet-OFF-Cldn9 transgenic mouse to regulate expression levels of Cldn9 and show that downregulation of Cldn9 resulted in an additional, although incomplete, row of IHCs immediately adjacent to the original IHC row. These induced extra IHCs had similarly well-developed hair bundles, were able to mechanotransduce, and were innervated by auditory neurons, resembling wild-type IHCs. In addition, the authors knocked down Cldn9 postnatally using shRNA injections in P1-7 mice, with similar induction of extranumerary IHCs next to the original row of IHCs. The conclusions of this paper are mostly well supported by the data, but some data analysis needs to be clarified and some crucial controls should be provided to improve the confidence in the presented results. There is great potential for practical use of these valuable findings and new knowledge on IHC developmental regulation to design Cldn9 gene therapy in the future.
The mechanisms of extra hair cell generation by suppression of the tight junction protein Cldn9 expression level described by Chen et al. are very interesting and previously unknown. In particular, the generation of extra IHCs postnatally using downregulation of Cldn9 by shRNA could potentially be very useful as a replacement of HCs lost after noise-induced trauma, ototoxic agents, or other environmental trauma. On the other hand, the replacement of hair cells lost due to various genetic mutations by inducing supernumerary IHCs with the same abnormalities would not be reasonable.
The authors show that postnatally generated ectopic IHCs are viable and mechanotransducive, but it would be nice to show the maturation steps of ectopic IHCs during this postnatal period. For example, stereocilia bundles of the ectopic hair cells should mature later than the original IHCs. A few days after viral delivery of shRNA, you should be able to observe immature IHC bundles that unequivocally will define newly generated IHCs. Unfortunately, the authors show only examples of already mature ectopic IHCs at P21 and in 5-6-week-old mice and at relatively low resolution. Also, during maturation, IHCs usually have transient axo-somatic synapses that are not present in mature IHCs. It would be great to see if, in 5-6-week-old mice, the ectopic IHCs still have axo-somatic synapses or not, and if the majority of the ectopic IHCs have innervation. Some of the data in this study would benefit from showing corresponding controls and some from higher resolution imaging.
We appreciate the reviewer's suggestion. The objective of the paper is to report the phenomenon and present the coarse features of the Cldn9-mediated induced ectopic hair cells. The systematic details are for future studies, which are ongoing and out of the current scope.
In the mammalian cochlea, each HC is separated from the next by intervening supporting cells, forming an invariant and alternating mosaic along the cochlea's length. Cochlear supporting cells in some conditions can divide and trans-differentiate into HCs, serving as a potential resource for HC differentiation, using transcription and other developmental signaling factors.
However, when ectopic hair cells are generated from supporting cell trans-differentiation, the intricate mosaic of the organ of Corti is altered, which could by itself lead to hearing issues. In case of downregulation of Cldn9, the extra row of IHCs seems to be positioned immediately adjacent to the original IHC row. It is not clear if the newly formed unusual junctions between the ectopic and original IHCs are sufficiently tight to prevent leakage of the endolymph to the basolateral surface of IHCs. Also, it is not clear if the other organ of Corti tight junctions could lose their tightness due to the downregulation of Cldn9, which could over time affect the endocochlear potential as shown by this study and hearing abilities.
There was a slightly increased ABR threshold (5-15 dB) (Fig. 4A), and the decrease in the magnitude of the EP and the rise in the K<sup>+</sup> concentration in the endolymph and perilymph of Cldn9+/T mice compared to age-matched littermates (S10) indicated that there might be a compromised epithelial tight junction. The downregulation of Cldn9 affected the endocochlear potential and hearing abilities (Fig. 4A, S10) after 2 months, suggesting an age-dependent effect. The effective downregulation of Cldn9 would require proper titration of Cldn9 levels to induce extra hair cells with intact epithelial integrity; this work may require additional studies.
Importantly, CLDN9 immunofluorescence staining data that show cytoplasmic staining of supporting cells should be revisited and the organ of Corti schematics showing CLDN9 expression should be corrected, considering that CLDN9 localizes to the tight junctions of the reticular lamina as was shown by immunoEM in this study and described in previous publications (Kitajiri et al., 2004; Nakano et al., 2009, Ramzan et al., 2021). While the current version of the manuscript will interest scientists working in the inner ear development and regeneration field, it could be more valuable to hearing researchers outside this immediate field and perhaps developmental biologists and cell biologists after proper revision.
We appreciate the reviewer's comments. We were concerned about the observation, but the results were consistent. Indeed, that was the motivation for performing the immunoEM (S3). A follow-up report may address it further.
Recommendations for the authors:
Reviewer #2 (Recommendations For The Authors):
Please address the points I made about the presentation (word choice, inconsistencies in labeling, etc). It ultimately helps a reader to understand and to follow your logic. This is an important observation.
We corrected the inconsistencies in labeling and addressed the points you suggested.
Making the extra effort to investigate a possible interaction between Cldn9 and Notch signaling would substantially increase the significance of the work.
Thanks for the suggestions. We will explore it in our future work.
Minor points:
Some sentences would benefit from revision:
- The abstract argues that hearing loss is incurable because mammalian hair cells are terminally differentiated (3rd sentence). This is not accurate.
Mammalian HCs are terminally differentiated by birth, making HC loss challenging to replace.
- The second sentence of the second paragraph of the introduction, "Cochlear SCs can divide and trans-differentiate into HCs, serving as a potential resource for HC differentiation, using transcription and developmental signaling factors (White et al., 2006)," should be referenced in the context of the animal's age. This feature of supporting cells is transient and only observed in neonatal mice. The following sentences in the same paragraph would also benefit from being placed into the same context when appropriate.
We thank the reviewer for the suggestion. These sentences have been corrected.
- Introduction: "But functional features of the newly developed HC are circumspect." The authors probably meant "circumspect," but is this the appropriate word? Also, please use the plural of HC = HCs.
The sentence has been corrected to “but the functional features of the newly developed HCs are circumspect”.
- Introduction: Isn't an essential function of tight junctions in the organ of Corti the separation of fluid-filled spaces? Perhaps additional functions of tight junction proteins are unclear, but at least this one function appears clear.
We thank the reviewer for the suggestion. We added "additional" before "function" in this sentence.
- Introduction: "using shRNA injection in postnatal (P) days (P1-7) mice." This is a rather vague statement that could be better defined. Perhaps mention that the injections targeted the round window and that an AAV-based method was used. Also, it is not clear from the methods whether the injection needle pierced the round window. Please clarify. Likewise, the methods state that these experiments were conducted in P1-P15 mice, but the main text says P1-P7. Later, in the results section and in the figure legend for Fig 7, the mice are between P1-P7 and P14; the figure itself is labeled with P1 and P14. However, data is presented (Fig 6) for injections at P2, P4, P7, and P14. In the text referring to Fig 6B in the results section, it is stated, "By contrast, the P14-21 inner ear transfected with Cldn9-shRNA produced no detectable increase..." Only data for P2, P4, P7, and P14 injections are presented. These are minor issues, but please check the inconsistencies because they make it difficult to follow.
We corrected this sentence to “Analogous additional putative IHCs differentiation was observed when Cldn9-shRNA was injected through the round window to postnatal (P) days (P2-7, and P14) mice…”. The label in Fig 7A has been changed to P2-7, and the text referring to Fig 6B in the result section has been changed to “the P14 inner ear transfected with Cldn9-shRNA produced no detectable increase...".
- Last statement of the Introduction: "making Cldn9 a viable target for generating transformed IHCs." It is not clear what transformed IHCs are.
We replaced "transformed" with "supernumerary".
- To understand the Southern Blot analysis in Fig 1E, the location of BstAPI and BamHI restriction sites and the probe need to be illustrated in Fig 1D.
The restriction sites BstAPI (Bst) and BamHI (Bam) are now indicated in Fig. 1D.
- Please define the purple arrows and arrowheads in Fig 1D. What do the different colors for the backbone mean? I see red and green, but also orange and yellow in the floxed allele. In Fig 1F, is "Knock-in" synonymous with homozygote? Would it be clearer to use the nomenclature Cldn9(T/T), Cldn9(T/+), and Cldn9(+/+), which is used later in the text?
We have made the changes as requested.
- Results, first paragraph: "Results of RT-PCR..." This refers to quantitative RT-PCR; please add the word "quantitative."
Thanks. We added “quantitative” to the sentence.
- Results and Fig S1. Is the strong upregulation of Cldn9 mRNA (S1A) also reflected in stronger Cldn9 immunoreactivity?
Yes, the strong upregulation of Cldn9 mRNA is reflected in stronger Cldn9 immunoreactivity.
- Results, Fig 1. Please add a schematic drawing showing all elements of the inducible gene expression cassette in the final transgenic allele, and please illustrate how the system works. This helps the reader to understand the strong Cldn9 mRNA upregulation in Cldn9(T/T) mice, where expression is likely driven by the CMV promoter and reciprocally, in the presence of doxycycline, the suppression of transcription by binding of the tTA-dox protein to the TRE elements of the modified CMV promoter. Is this a correct assumption?
Yes, this is a correct assumption.
- Results, about Fig S3. Why is it important to investigate Cldn6 and ILDR1 levels in the context of Cldn9 downregulation? Also, what is meant by "no comparative differences in others?". If a potential compensatory effect is suspected, why are the authors not systematically characterizing the expression of other tight junction proteins with quantitative RT-PCR? The results shown in S3 are anecdotal, without proper quantification, and lack context.
The goal is to examine the potential compensatory changes in other TJ proteins. It was not to examine all possible TJ proteins localized in the inner ear.
Results, section headed with "Downregulation of..." First sentence. Fig. 2A-C -> Fig. 2A-E.
Thanks. We corrected the sentence “5-week-old mice Cldn9<sup>+/T</sup> cochleae displayed a notable row of ectopic HCs (Fig. 2A-C).” to “5-week-old mice Cldn9<sup>+/T</sup> cochleae displayed a notable row of ectopic HCs (Fig. 2A-E).”
The same section: "were negatively labeled with anti-prestin antibody." Consider "were not labeled with antibody to prestin." Likewise, a few sentences below, please consider rephrasing "the ectopic HCs ... reacted positively to otoferlin antibodies". Also, "...expressed multiple CtBP2 labeling..." - this reads like an incomplete sentence.
Thanks for the suggestions. We have corrected the three sentences mentioned.
The phrase "putative ectopic" lacks clarity because "putative" could refer to "ectopic" (like an adverb). Consider swapping the two words and writing "ectopic putative IHCs" or simply "ectopic IHCs."
Thanks for the suggestions. We replaced the “putative ectopic IHCs” with “ectopic IHCs” in all contexts.
Please use more precise figure labels when referring to a specific figure panel. For example, "Additionally, the ectopic HCs show IHC bundle features (Fig. 2)," - Bundles are shown in Fig 2D and Fig 2E. Please check all instances where a full figure is mentioned, but the specific reference is to a panel of the figure. Another example, "... using quantitative RT-PCR (S7)..." would be more specific if Fig S7A is referred to.
Thanks for the suggestions. We checked all instances and corrected the labels.
"IHC counts at different ages (P2-P21) and the cochlear frequency segments (4-32 kHz) demonstrate..."- the figure shows data for 8 kHz and 32 kHz; please revise: "segments (8 kHz and 32 kHz) demonstrate."
This sentence has been revised based on your suggestion. Thanks!
Please add a legend to Fig. 3C (like the one shown in Fig. 2F).
Thanks for the reminder. The legend for Fig. 3C was modified.
Fig 4A and Fig 4B. It is impossible to distinguish the open/closed circles and the many lines. Please consider a different format or an extended supplemental figure. Also, drawing a line connection between the 32 kHz and click data points in 4A is inappropriate.
Instead of open/closed circles, dashed lines now represent Cldn9<sup>+/+</sup> mice and solid lines represent Cldn9<sup>+/T</sup> mice. We added the line labels. The line connecting the 32 kHz and click data points was removed.
Fig 4, legend. Please define BHB and BHC levels.
BHB and BHC are defined.
The paragraph "Synaptic features of PE IHCs match original IHCs" is confusing because it states the following: "The synapses between the IHCs and auditory neurons at the apical, middle, and basal cochlear locations from 5-week-old Cldn9+/+ and Cldn9+/T mice show substantial differences." The meaning of the heading, therefore, does not match what is ultimately shown and discussed.
We have changed the title to “Synaptic features of ectopic IHCs and original IHCs”.
Moreover, no actual features of synapses are investigated; CtBP2/Homer pairs were used to identify afferent synapses, which this reviewer would argue provides a reasonable estimate of the number of synapses where pre- and post-synaptic markers are detected in close vicinity. It would be helpful to describe the method for counting juxtaposed CtBP2 and Homer-labeled puncta with more detail.
The Methods section now includes more information about the synapse count, which provides a reasonable estimate of the number of synapses where pre- and post-synaptic markers are detected in close proximity.
The final concluding sentence of the section also suggests that synaptic transmission from PE IHCs might be compromised because significant differences in synapse numbers were identified. It would be important to mention this.
Thanks for the reminder. We added this information to the final concluding sentence.
Fig. 5C, 5D; legend. Is "co-expressed" the right word choice? Consider "colocalized" or "juxtaposed".
"Co-expressed" has been replaced with "colocalized".
Voltage-clamp recordings of P21 inner hair cell mechanoelectrical transduction currents. This reviewer cannot identify a previous publication describing the details of this method on P21 cochlear inner hair cells; this seems like an excellent methodological advance.
Yes, we can record data from older mice. Thanks for pointing it out.
"Transfection in vivo of Cldn9 shRNA," "the P14-21 inner ear transfected with Cldn9-shRNA." Plus, additional use of the word "transfection." Transfection generally means the introduction of plain nucleic acid into cells. The word refers to methods that do not use viruses. In contrast, "transduction" is the term used for virus-mediated gene transfer. The authors used AAVs. Please correct for appropriate scientific terminology.
Thanks for the clarification. This information has been corrected accordingly.
"A slight decline in the amplitude of the EP and a substantial rise in perilymph K+ was detected in 8-month-old Cldn9+/T (S7)." Probably Fig. S8A,B is meant.
Yes, it referred to Fig. S8A, B. We corrected it in the Results section. Thanks!
Heading "Discussions" -> "Discussion"
The focus of the second part of the discussion on potential interactions between Cldn9 suppression and known signaling pathways is essential. The logic that is presented with respect to Notch signaling, however, is not clear and misleading. For example, it is not obvious what is meant by "Cldn9 subserves the signaling catalyst to activate NICD cascades" and whether this statement is supported by any published data.
The statement was a suggestion and has been qualified with a “may” clause (line 299).
The authors might consider discussing whether the observed effect caused by Cldn9 elimination is a specific role of the Cldn9 protein itself or is an epiphenomenon resulting from cytomechanical changes in the developing and maturing organ of Corti. This would add a potential Notch-independent component for a possible interpretation of the observations.
We now state in lines 302-304: “Alternatively, Cldn9 levels disruption may alter the mechanical properties of the developing and maturing organ of Corti that may trigger ectopic IHC differentiation, an epiphenomenon independent of the Notch signaling”.
Methods:
"Deletion of the selection marker in the tTA cassette by crossing the F1 mouse with the embryonic Cre line (B6.129S4-Meox2tm1(cre)Sor/J)." This sentence seems to be incomplete.
Thanks for pointing it out. This sentence has been rewritten.
"Images were captured under a confocal microscope." Consider writing "with a confocal microscope".
This sentence has been corrected. Thanks!
RNA extraction and... How many mice were used per experiment? 10-15 or just 10?
Between 10 and 15 mice were used for the RNA extraction. Thanks.
Reviewer #3 (Recommendations For The Authors):
Below are my suggestions, questions, and criticisms.
(1) The red outline on Fig1A schematic does not correspond to the previously published expression pattern of CLDN9 in the organ of Corti reticular lamina tight junctions (Kitajiri et al, 2004, Nakano et al., 2009, Ramzan et al., 2021). Also, there are no tight junctions all around the pillar cells. The tight junctions are restricted to the sites of tight attachments between two cells. The immunofluorescence staining using CLDN9 antibody looks rather cytoplasmic (Fig 1 and Fig S1) than associated with the tight junctions as it was shown by immunoEM data here and reported previously (Kitajiri et al, 2004; Nakano et al, 2009; Ramzan et al, 2021). Please correct the schematic and explain your data.
We have redrawn the diagram (Fig. 7).
(2) The CLDN9 staining in Figure 1, B and C, highlights the cytoplasm of the supporting cells, and hair cells devoid of the staining. From the images in Fig. S1C, it also looks like CLDN9 is present only in supporting cells and not in hair cells? How would the authors reconcile their data with Cldn9 expression data from the gEAR database and Ramzan et al.'s 2021 RNAscope data? Please provide the validation of the antibody used in this study.
We recognize the reviewer’s concern, but RNA and protein levels do not always change in parallel.
(3) Figure 1D. The dash lines from the targeting vector to the wt allele seem to indicate a recombination event. Please do not show the recombination event, instead just show what part of the targeting vector was incorporated to replace wt Cldn9. There is no description in the figure 1 legend what purple arrows and arrowheads mean and what yellow and orange line segments in the floxed allele schematic indicate. Please also show where the BstAPI and BamHI restriction enzyme sites are.
We have provided Supplementary Fig. 1 and have marked the BstAPI and BamHI restriction enzyme sites in Fig. 1D.
(4) What does the organ of Corti that has a 40-to-55-fold increase in Cldn9 mRNA expression look like before dox treatment? Are there any abnormalities at all? How does CLDN9 protein localization look in the Cldn9+/T untreated mice? Do they have a normal number of IHCs? Cldn9+/T untreated mice should be used as another control, at least in Figure S1.
The untreated Cldn9<sup>+/T</sup> mice can grow normally but are not fertile. So, we used a very low concentration of dox water (0.1 mg/ml) instead of normal water to keep the breeding pairs. The protein level increased in the Cldn9<sup>+/T</sup> mice compared with Cldn9<sup>+/+</sup> mice. With 0.1 mg/ml dox water, they also showed ectopic IHCs.
(5) It is interesting that decline of 0.4-0.6-fold in mRNA level leads to about 8-fold decrease in protein level based on your immunoEM data on tight junctions of IHC with supporting cells. Do you observe the same effect in OHC-SC tight junctions, or the decrease was observed selectively around IHCs?
The reviewer is alluding to matching RNA and protein levels. It appears that for Cldn9 one cannot expect a closely matched relationship.
(6) The quality of the immunoEM data is great, but a control of secondary antibody alone staining in wt and Cldn9+/T dox treated should be shown and compared to the Cldn9+/T treated sample.
We thank the reviewer for raising the issue. Secondary antibodies are used as a control in all immunoEMs in the laboratory. We opted not to show negative results.
(7) The authors observed a decrease in Cldn6 expression, albeit not quantitative, in response to Cldn9 downregulation. How were the immunofluorescence signals compared and evaluated? Please provide a detailed description of the method used. Did the authors use the same image acquisition parameters? Was the Cldn9 and Cldn6 immunostaining done using the same protocol with the same aliquot and dilution of the secondary antibodies, etc.? The staining for CLDN6 seems to be concentrated in the cytoplasm of supporting cells, and not in the tight junctions, similar to CLDN9 immunoreactivity shown in Fig. S1C and to the ILDR1 pattern of staining in Fig. S3. How can the authors explain this? How were the antibodies validated?
The Cldn9 and Cldn6 immunostaining were done using the same protocol with the same aliquot and dilution of the secondary antibodies.
(8) CLDN14 is also expressed in the organ of Corti tight junctions. What happened to this TJ protein during CLDN9 downregulation?
We detected Cldn14 by immunostaining in the Cldn9+/T mice and Cldn9+/+ mice fed with 0.25 mg/ml dox water, and the results showed increased expression of Cldn14 in Cldn9+/T mice. Detailed alterations of other TJ proteins have been reserved for future studies.
(9) When supernumerary IHCs were observed in Cldn9+/T mice, have the authors noticed a corresponding decrease in supporting cells surrounding IHCs? Quantification of the IHCs supporting cells would be useful. Do the ectopic IHCs have apical tight junctions with original IHCs or they are surrounded by supporting cells?
We quantified the SCs around the IHCs but did not detect significant differences among the groups.
(10) The authors indicated that viable PE IHCs were observed in 15 months old Cldn9+/T dox treated mice. How stereocilia bundles look in these ectopic hair cells? Are they preserved similar to the original IHCs or degenerated? It is hard to see this in Fig 3, phalloidin panel. High-resolution SEM would show this better.
For the ectopic IHCs remaining at 15 months, we did not detect apparent differences in hair bundles compared with the original IHCs.
(11) Interestingly, the authors indicate that the highest number of the ectopic IHCs were developed in the apical turn and the higher elevation of ABR threshold was also observed at low frequencies end. This may indicate that extra IHCs do not help hearing function.
The extra IHCs appeared along the whole cochlea, although they were more evident in the apical turn. The decline in hearing may have resulted from leakage of endolymph K+ into the perilymph and a decline in EP.
(12) No age-matched wt control is shown for decreased expression of Cldn9 after shRNA injection at P2 (Fig. 6A).
As indicated earlier, we opted to state but did not show negative results.
(13) Figure 6C. The better- quality SEM images showing a longer stretch of IHCs are needed to convince readers that there are ectopic IHCs that are well preserved in 5-6 weeks old mice in all cochlear turns after GFP-Cldn9 shRNA treatment at P2-P7.
In S4, we showed that there are ectopic IHCs along the cochlear axis.
(14) Do scrambled shRNA control samples had some ectopic IHCs? This control is missing in Fig.6D.
No, the scrambled shRNA controls did not show ectopic IHCs. We have stated this.
(15) Figure 7B, lower schematic. There are no known continuous tight junctions and CLDN9 expression around the OHCs and IHCs. CLDN9 is known to be concentrated at the reticular lamina tight junctions which separate the endolymph from perilymph. Please, correct all schematics accordingly.
We have made the changes as requested.
Minor comments:
(1) Page 1, Abstract. I would not say "making HC loss incurable" since recent gene therapy results show some advances in this direction. Please rephrase more accurately.
We have made the changes as requested.
(2) Page 4, Results, line 5; please rephrase "PCR of tail tissue samples performed genotyping."
It has been corrected to “The genotyping was performed by the PCR with the tail tissue.”
(3) Fig. 1 legend, panel B, replace "showing IHC stained myosin7a" with "showing IHC stained by myosin7a". Also, in the same sentence, "phalloidin, actin (green) antibodies," Phalloidin is not an antibody; please change this.
Thanks. We have corrected this information.
(4) Fig 2C, IHC label obscures the view of IHCs, please move this label out and use an arrow to point to IHCs.
We have made the changes as requested.
(5) Figure 4, title. Replace "currents elicited original" with "current elicited from original".
This sentence has been corrected. Thanks.
(6) Figure 4, panel A. It is hard to see the open symbols on the graph. Are they associated with the dash lines? Please make them more visible or indicate what dash lines are. "ABR threshold for (n=12)" should be "ABR threshold for Cldn9+/+(n=12)"?
Yes, they are associated with the dashed lines. We added labels for the solid lines and dashed lines. "ABR threshold for (n=12)" was corrected to "ABR threshold for Cldn9+/+(n=12)."
(7) Figure 4, legend. "Within each wt and heterozygote mice, there was no significant shift...". Do you mean within each group of mice? Also "Mean DPOAE threshold for 2-8 mos (n=9) was tested,..." Do you mean (n=9) for each group or what group?
Yes, "Within each wt and heterozygote mice, there was no significant shift..." has been revised. The number of mice in each group for the DPOAE test was clarified in the Fig. 4B legend. Thanks.
(8) Please label the X axis in Figure 4D.
The X-axis has been labeled (Time (s))
(9) Figure 4 B, do the colors of the lines indicate the same age groups as in Fig 4A? Do the dash lines associate with open symbols? Please state this clearly in the figure's legend.
Yes. We added this information in Fig. 4B legend.
(10) Figure 4D. Please label the X axis of the fluorescence intensity graph.
The X-axis has been labeled (Time (s))
(11) Figure 4G, legend. Replace "(mean +std)" with "(mean +SD)" for consistency here and in Figure 5 legend.
Thanks. We replaced "(mean +std)" with "(mean +SD)" in the legends of Fig. 4G, Fig. 5 and Fig. 6.
(12) Figure 5B, legend. Replace "makers" with "markers".
Thanks. This information was corrected.
(13) Figure 6A, legend. There is no downregulation of Cldn9 by shRNA shown in "S5". Do the authors mean Figure S7? Please, correct "S5" to "Fig. S7".
This information was corrected. Thanks.
(14) Figure 6A, legend. There is no reduced CLDN9 protein expression shown in Fig. 1C. Do the authors mean Fig. 6A, third panel? Please correct the phrase "reduced protein expression (Fig. 1C) is shown in the 3rd Panel (Cldn9, red)" accordingly, and do not capitalize "p" in the "3rd Panel".
This information was corrected. Thanks (line 917-918).
(15) Also there, replace "The right Panel shows two rows of IHCs (marked HC marker, Myo7a (cyan), and the merged photomicrograph" with "The right panel shows the merged image with two rows of IHCs stained with HC marker Myo7a (cyan) and the expression of Ad-GFP-mCldn9 shRNA (green) in the adjacent row of supporting cells". Please indicate in what cells Ad-GFP-mCldn9 shRNA (green) is expressed. It looks like only one row of supporting cells has this green signal.
This information was corrected.
(16) Figure 6B, legend. Replace "Examples of photomicrographs of sections of the whole-mount cochlea of P2, P4, P7, and P14 Cldn9 shRNA injected mice" with "Examples of phalloidin stained whole-mount organ of Corti samples from cochleae of the wild-type mice injected at P2, P4, P7 and P14 with Cldn9 shRNA"
This sentence has been modified based on your suggestions. Thanks!
(17) Replace "action labeling" with "actin labeled."
Thanks! The "action labeling" has been replaced with "actin labeled" (line 924).
(18) Figure 6C. Insert "C" before SEM images description in the legend. The authors stated that SEM images of "5-6-wks-old mice" are shown. Please indicate the exact age of mice shown on each image and at what age these mice received the virus injection.
Thanks! The “C” has been added. We have noted in the legend that the SEM images are from 5-week-old mice and that the virus was injected at P2.
(19) Figure 6D, legend. Last sentence: move "are significantly different" and insert this between "IHCs" and "at P2 apex".
This information was corrected.
(20) Figure S7, legend. Replace "(sram)" with "(scram)" as in the figure itself. Also, Indicate the age of samples at the harvesting time for imaging and the age at injection of Cldn9 shRNA.
"(sram)" has been replaced with "(scram)". The age of samples at the harvesting time for imaging and the age at injection of Cldn9 shRNA are indicated.
(21) Figure S8. Replace "4 mos-old" and "8 mos-old" with "4 months-old" and "8 months-old" everywhere in the legend and in the figure labels.
We have made the changes as suggested.
(22) Page 8, 5th lane from the bottom. Change "EP and K+ concentration endolymph" to "EP and K+ concentration of the endolymph".
It has been corrected. Thanks.
(23) Page 8, next to the last sentence before the Discussion. Wrong figure number, please replace "(S7)" with "Fig. S8".
It has been corrected. Thanks.
-
-
-
Author response:
The following is the authors’ response to the previous reviews.
Joint Public Review:
Summary:
The authors aimed to identify the neural sources of behavioral variation in fruit flies deciding between odor and air, or between two odors.
Strengths:
- The question is of fundamental importance.
- The behavioral studies are automated, and high-throughput.
- The data analyses are sophisticated and appropriate.
- The paper is clear and well-written aside from some initially strong wording.
- The figures beautifully illustrate their results.
- The modeling efforts mechanistically ground observed data correlations.
Weaknesses:
- The correlations between behavioral variations and neural activity/synapse morphology are relatively weak, and sometimes overstated in the wording that describes them.
We sincerely thank the reviewers for these evaluations.
Recommendations for the authors:
Line 56: "We hypothesize that as sensory cues are encoded and transformed to produce motor outputs, their representation in the nervous system becomes increasingly idiosyncratic and predictive of individual behavioral responses". This seems obvious a priori. The sensory stimuli are the same, but the motor responses are different. Along the way there has to be a progression from same to different. Is there an alternative hypothesis? If so, perhaps state the alternative.
We added text to the first paragraph of the introduction (lines 58-60) laying out an alternative hypothesis that individuality emerges through biomechanical differences and environmental interactions, and we have altered our motivating question to assess whether circuit elements in which activity is predictive of individual behavior exist, and if so, where (lines 60-62).
Line 157: typo "remaining"
We changed “remaining” to “remain” (line 160).
Line 163: why report r sometimes and R^2 other times? Better to use R^2 throughout.
We changed all instances of r to R<sup>2</sup>, notably when reporting combined train/test statistics for calcium - behavior models (line 162). We also reframed the outputs (medians + 90% confidence intervals) of the supplemental analysis inferring the strength of the latent calcium-behavior relationship to be in terms of R<sup>2</sup> (lines 166, 173-175, 241, 252; modified text in Inference of correlation between latent calcium and behavior states in Materials and Methods; adjusted figure and caption for Figure 1 – figure supplement 9).
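As a minimal sketch of the conversion described above, the snippet below squares a correlation coefficient to obtain R<sup>2</sup> and maps an interval on r to the corresponding interval on R<sup>2</sup>. The function names and the numeric inputs are hypothetical illustrations, not code or results from the study; the interval mapping assumes that an interval on r spanning zero has an R<sup>2</sup> lower bound of 0.

```python
from typing import Tuple


def r_to_r2(r: float) -> float:
    """Square a correlation coefficient r to get R^2 (simple linear case)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    return r * r


def r_interval_to_r2(lo: float, hi: float) -> Tuple[float, float]:
    """Map an interval [lo, hi] on r to the matching interval on R^2."""
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    lo2, hi2 = r_to_r2(lo), r_to_r2(hi)
    # If the interval on r contains zero, the smallest attainable R^2 is 0.
    if lo <= 0.0 <= hi:
        return 0.0, max(lo2, hi2)
    return min(lo2, hi2), max(lo2, hi2)


# Hypothetical example values, chosen only to show the call pattern.
example_r2 = r_to_r2(0.5)
example_interval = r_interval_to_r2(-0.2, 0.6)
```

Because squaring discards the sign of r, the direction of a relationship has to be reported separately when only R<sup>2</sup> is given.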
Line 182: "odorant". Should be "odorant receptors"?
We respectfully disagree – our ORN and PN calcium data are responses to odorants in 5 glomerulus/odorant receptor types. When we group PCA loadings by glomerulus for both ORN and PN calcium, the consistency within groups is much stronger than when we group the loadings by odorant (Figure 1 – figure supplement 8). Additionally, “odorant receptor organization” would mean the same thing as “glomerular organization,” since all ORNs expressing the same odorant receptor project to a single glomerulus.
Line 331: "harbor". Maybe more modestly "contribute to"?
We changed “harbor” to “contribute to” (line 334) and added additional moderating language that the difference in DC2 and DM2 activations in PNs explains a large portion of the individuality signal (lines 337-339).
Line 403: typo "is"
We retained “is” as the corresponding verb for “the net effect,” but we adjusted the position of the reference to Gomez-Marin and Ghazanfar, 2019 for more clarity (lines 406-408).
-
-
www.medrxiv.org www.medrxiv.org
-
Author response:
Reviewer #1(Public review):
Summary:
This manuscript details the results of a small pilot study of neoadjuvant radiotherapy followed by combination treatment with hormone therapy and dalpiciclib for early-stage HR+/HER2-negative breast cancer.
Strengths:
The strengths of the manuscript include the scientific rationale behind the approach and the inclusion of some simple translational studies.
Weaknesses:
The main weakness of the manuscript is that overly strong conclusions are made by the authors based on a very small study of twelve patients. A study this small is not powered to fully characterize the efficacy or safety of a treatment approach, and can, at best, demonstrate feasibility. These data need validation in a larger cohort before they can have any implications for clinical practice, and the treatment approach outlined should not yet be considered a true alternative to standard evidence-based approaches.
I would urge the authors and readers to exercise caution when comparing results of this 12-patient pilot study to historical studies, many of which were much larger, and had different treatment protocols and baseline patient characteristics. Cross-trial comparisons like this are prone to mislead, even when comparing well powered studies. With such a small sample size, the risk of statistical error is very high, and comparisons like this have little meaning.
We greatly appreciate your evaluation of our study and fully agree with the limitations you have pointed out. We have clearly stated the limitations of the small sample size and emphasized the need for a larger population to validate our preliminary findings in the discussion section (Lines 311-316).
We acknowledge that this small sample size is not powered to characterize this regimen as a promising alternative regimen in the treatment of patients with HR-positive, HER2-negative breast cancer. Therefore, we have revised the description of this regimen to serve as a feasible option for neoadjuvant therapy in HR-positive, HER2-negative breast cancers both in the discussion (Lines 317-320) and the abstract (Lines 71-72).
We agree with you that cross-trial comparisons should be approached with caution due to differences in study designs and patient populations. In our discussion section, we acknowledge that small sample size limited the comparison of our data with historical data in the literature due to the potential bias (Lines 312-313). We clearly state that such comparisons hold limited significance (Lines 313-314) and suggest a larger population to validate our preliminary findings.
• Why was dalpiciclib chosen, as opposed to another CDK4/6 inhibitor?
Thank you for your comments. The rationale for selecting dalpiciclib over other CDK4/6 inhibitors in our study is primarily based on the following considerations:
(1) Clinical Efficacy: In several clinical trials, including DAWNA-1 and DAWNA-2, the combination of dalpiciclib with endocrine therapies such as fulvestrant, letrozole, or anastrozole has been shown to significantly extend the progression-free survival (PFS) in patients with hormone receptor-positive, HER2-negative advanced breast cancer (1-2).
(2) Tolerability and Management of Adverse Reactions: The primary adverse reactions associated with dalpiciclib are neutropenia, leukopenia, and anemia. Despite these potential side effects, the majority of patients are able to tolerate them, and with proper monitoring and management, these reactions can be effectively mitigated (1-2).
(3) Pharmacodynamics comparable to other CDK4/6 inhibitors: The combination of CDK4/6 inhibitors, including palbociclib, ribociclib, and abemaciclib, with aromatase inhibitors has demonstrated an enhanced ability to suppress tumor proliferation and increase the rate of clinical response in neoadjuvant therapy for HR-positive, HER2-negative breast cancer (3-5). Furthermore, preclinical studies have shown that dalpiciclib has comparable in vivo and in vitro pharmacodynamic activity to palbociclib, suggesting its potential effectiveness in similar treatment regimens (6).
(4) Accessibility and Regulatory Approval: Dalpiciclib gained marketing approval in China on December 31, 2021, which makes the medication more accessible and a more convenient option when considering treatment plans.
References:
(1) Zhang P, Zhang Q, Tong Z, et al. Dalpiciclib plus letrozole or anastrozole versus placebo plus letrozole or anastrozole as first-line treatment in patients with hormone receptor-positive, HER2-negative advanced breast cancer (DAWNA-2): a multicentre, randomised, double-blind, placebo-controlled, phase 3 trial(J). The Lancet Oncology, 2023, 24(6): 646-657.
(2) Xu B, Zhang Q, Zhang P, et al. Dalpiciclib or placebo plus fulvestrant in hormone receptor-positive and HER2-negative advanced breast cancer: a randomized, phase 3 trial(J). Nature medicine, 2021, 27(11): 1904-1909.
(3) Hurvitz S A, Martin M, Press M F, et al. Potent cell-cycle inhibition and upregulation of immune response with abemaciclib and anastrozole in neoMONARCH, phase II neoadjuvant study in HR+/HER2− breast cancer(J). Clinical Cancer Research, 2020, 26(3): 566-580.
(4) Prat A, Saura C, Pascual T, et al. Ribociclib plus letrozole versus chemotherapy for postmenopausal women with hormone receptor-positive, HER2-negative, luminal B breast cancer (CORALLEEN): an open-label, multicentre, randomised, phase 2 trial(J). The lancet oncology, 2020, 21(1): 33-43.
(5) Ma C X, Gao F, Luo J, et al. NeoPalAna: neoadjuvant palbociclib, a cyclin-dependent kinase 4/6 inhibitor, and anastrozole for clinical stage 2 or 3 estrogen receptor–positive breast cancer(J). Clinical Cancer Research, 2017, 23(15): 4055-4065.
(6) Long F, He Y, Fu H, et al. Preclinical characterization of SHR6390, a novel CDK 4/6 inhibitor, in vitro and in human tumor xenograft models(J). Cancer science, 2019, 110(4): 1420-1430.
• The eligibility criteria are not consistent throughout the manuscript, sometimes saying early breast cancer, other times saying stage II/III by MRI criteria.
Thank you for pointing out the inconsistent eligibility criteria in our manuscript. We deeply apologize for any confusion caused by these inconsistencies. We have revised the term from “early-stage HR-positive, HER2-negative breast cancer” to “early or locally advanced HR-positive, HER2-negative breast cancer” (Lines 128 and 150). The term “early or locally advanced” encompasses two different stages of breast cancer, whereas “Stage II/III by MRI criteria” refers to specific stages within the TNM staging system.
• The authors should emphasize the 25% rate of conversion from mastectomy to breast conservation and also report the type and nature of axillary lymph node surgery performed. As the authors note in the discussion section, rates of pathologic complete response/RCB scores are less prognostic for hormone-receptor-positive breast cancer than other subtypes, so one of the main rationales for neoadjuvant medical therapy is for surgical downstaging. This is a clinically relevant outcome.
We appreciate your constructive comments. Based on your suggestions, we have made the following revisions and additions to the article.
The breast conservation rate serves as a secondary endpoint in our study (Line 62 and 179). We have highlighted the significant 25% conversion rate from mastectomy to breast conservation in both the results (Lines 229-230) and discussion sections (Lines 290-292).
In our study, all patients underwent lymph node surgery, including sentinel lymph node biopsy or axillary lymph node dissection. Among them, 58.3% of patients (7/12) underwent sentinel lymph node biopsies.
We agree with your point that the prognostic value of pathologic complete response/RCB score is lower for hormone receptor-positive breast cancer than for other subtypes. We have revised the discussion section to clarify that one of the principal objectives of neoadjuvant therapy in this patient population is to facilitate downstaging and enhance the rate of breast conservation (Lines 289-290). We have also emphasized that this neoadjuvant therapeutic regimen appeared to improve the likelihood of pathological downstaging and of achieving a margin-free resection, particularly for those with locally advanced and high-risk breast cancer (Lines 293-295).
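To make the small-sample caveat concrete, the sketch below computes a Wilson score interval for a proportion observed in 12 patients, using the reported 7/12 sentinel lymph node biopsy rate as the input. The function name and the choice of the Wilson method with z = 1.96 are assumptions made for illustration; the resulting interval is not a result reported in the study.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half_width, center + half_width


# 7 of 12 patients (58.3%) as quoted in the response above.
low, high = wilson_interval(7, 12)
print(f"7/12 = {7 / 12:.1%}, approximate 95% interval: {low:.1%} to {high:.1%}")
```

With only 12 patients the interval spans roughly 32% to 81%, which illustrates why proportions from a cohort of this size are best read as feasibility signals rather than precise estimates.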
Reviewer #2 (Public review):
Firstly, as this is a single-arm preliminary study, we are curious about the order of radiotherapy and endocrine therapy. In addition, given the radiotherapy, we are also concerned about wound recovery after surgery and whether related data were collected.
Thanks for the comments. The treatment sequence in this study is to first administer radiotherapy, followed by endocrine therapy. A meta-analysis has indicated that concurrent radiotherapy with endocrine therapy does not significantly impact the incidence of radiation-induced toxicity or survival rates compared to a sequential approach (1). In light of preclinical research suggesting enhanced therapeutic efficacy when radiotherapy is delivered prior to CDK4/6 inhibitors, we have opted to administer radiotherapy before the combination therapy of CDK4/6 inhibitors and hormone therapy (2).
In our study, we collected data on surgical wound recovery. All 12 patients had Class I incisions, which healed by primary intention. The wounds exhibited no signs of redness, swelling, exudate, or fat necrosis.
References:
(1) Li Y F, Chang L, Li W H, et al. Radiotherapy concurrent versus sequential with endocrine therapy in breast cancer: A meta-analysis(J). The Breast, 2016, 27: 93-98.
(2) Petroni G, Buqué A, Yamazaki T, et al. Radiotherapy delivered before CDK4/6 inhibitors mediates superior therapeutic effects in ER+ breast cancer(J). Clinical Cancer Research, 2021, 27(7): 1855-1863.
Secondly, in the methodology, please describe the sample size estimation of this study and follow up details.
Thanks for pointing out this crucial omission. Sample size estimation for this study and follow-up details have been added in the methodology section. The section on sample size estimation has been revised to state in Statistical analysis: “This exploratory study involves 12 patients, with the sample size determined based on clinical considerations, not statistical factors (Lines 210-211).” The section on follow up has been revised to state in Procedures section “A 5-year follow-up is conducted every 3 months during the first 2 years, and every 6 months for the subsequent 3 years. Additionally, safety data are collected within 90 days after surgery for subjects who discontinue study treatment (Lines 169-172).”
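The follow-up schedule quoted above can be sketched as a list of visit times. This is a minimal illustration that assumes visits are counted in months from a single baseline and that the first visit falls at month 3; the baseline event and the function name are hypothetical and are not specified in the response.

```python
from typing import List


def follow_up_months() -> List[int]:
    """Visit times in months: every 3 months for years 1-2, then every 6 months for years 3-5."""
    first_two_years = list(range(3, 25, 3))      # months 3, 6, ..., 24
    next_three_years = list(range(30, 61, 6))    # months 30, 36, ..., 60
    return first_two_years + next_three_years


schedule = follow_up_months()
print(len(schedule), "visits at months:", schedule)
```

Under these assumptions the schedule contains 14 visits, ending at month 60.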
Thirdly, in Table 1, the item HER2 expression, it's better to categorise HER2 into 0, 1+, 2+ and FISH-.
Thank you very much for pointing out this issue. The item HER2 expression in Table 1 has been revised from “negative, 1+, 2+ and FISH-” to “0, 1+, 2+ and FISH-”.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Weaknesses:
The weaknesses of the study include the following.
(1) It remains unclear whether the function described for CDK2 is regulatory, that is, it affects TBK1 levels during physiological responses such as viral infection or cell cycle progression, or if it is homeostatic, governing the basal abundance of TBK1 but not responding to signaling.
The regulation of TBK1 by CDK2 described in this article occurs during viral infection. We also investigated the effects of CDK2 overexpression and knockdown on TBK1 levels in the non-infected state and observed a slight reduction, as shown in Figure 4K and 4L. Thus, we speculate that the regulation of TBK1 by CDK2 serves, on one hand, to maintain cellular homeostasis and, on the other hand, to respond to signaling triggered by viral infection.
(2) The authors have not explored whether the catalytic activity of CDK2 is required for TBK1 ubiquitination and, if so, what its target specificity is.
We found that the ubiquitination modification of TBK1 was not affected by treatment with a CDK2 kinase activity inhibitor (SNS-032), as demonstrated in the results below (Author response image 1).
Author response image 1.
(3) Given the multitude of CDK isoforms in fish, it remains unexplored whether the identified fish CDK2 homolog is a requisite cell cycle regulator or if its action in the cell cycle is redundant with other CDKs.
A comparison of the protein sequences of fish CDK2 and human CDK2 revealed a 90% similarity (Author response image 2). It has also been reported that the kinase activity of goldfish CDK2 significantly increases during oocyte maturation (ref. 1). Furthermore, UHRF1 phosphorylation by cyclin A2/CDK2 is crucial for zebrafish embryogenesis (ref. 2). Additionally, Red grouper nervous necrosis virus (RGNNV) infection activates the p53 pathway, leading to the upregulation of p21 and downregulation of cyclin E and CDK2, which forces infected cells to remain in the G1/S replicative phase (ref. 3). All this evidence suggests that fish CDK2 plays a vital role in cell cycle regulation, and there have been no reports of other CDKs demonstrating CDK2-like functions.
References:
(1) Hirai T, et al. (1992) Isolation and Characterization of Goldfish Cdk2, a Cognate Variant of the Cell-Cycle Regulator Cdc2. Developmental biology 152(1):113-120.
(2) Chu J, et al. (2012) UHRF1 phosphorylation by cyclin A2/cyclin-dependent kinase 2 is required for zebrafish embryogenesis. Molecular biology of the cell 23(1):59-70.
(3) Mai WJ, Liu HX, Chen HQ, Zhou YJ, & Chen Y (2018) RGNNV-induced cell cycle arrest at G1/S phase enhanced viral replication via p53-dependent pathway in GS cells. Virus Res 256:142-152.
Author response image 2.
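As a rough illustration of how a percentage such as the one quoted above can be derived from an alignment, the sketch below computes percent identity between two pre-aligned sequences. The function name, the gap convention (columns with a gap in either sequence are skipped), and the toy strings are assumptions for illustration only; they are not the CDK2 sequences or the method used in the study. Note that identity counts exact matches only, whereas similarity scores, which also credit conservative substitutions, are usually higher.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over alignment columns without a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, hypothetical aligned fragments (not real CDK2 sequences).
toy_a = "ACDEFGHIKL"
toy_b = "ACDEFGHIKV"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```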
Reviewer #2 (Public Review):
Weaknesses:
(1) While the study focuses on fish, the broader implications for other lower vertebrates and higher vertebrates are not extensively discussed.
Thank you for your comment. We have added a paragraph to the Discussion section of the manuscript regarding the implications of the negative regulation of IFN expression by fish CDK2 for other vertebrates (lines 398-403). The details are as follows: first, we selected representative species from each of the six major vertebrate groups and compared their CDK2 protein sequences, finding that they are over 90% similar to one another (Author response image 3). This suggests that the function of CDK2 may be conserved to some extent across vertebrates. Additionally, CDK2 inhibition has been shown to enhance anti-tumor immunity by increasing the IFN response to endogenous retroviruses (ref. 1). Our studies provide evidence that fish CDK2 inhibits the IFN response by promoting the ubiquitination and degradation of TBK1, strongly supporting the role of CDK2 in the regulation of the immune response.
Reference:
(1) Chen Y, et al. (2022) CDK2 Inhibition Enhances Antitumor Immunity by Increasing IFN Response to Endogenous Retroviruses. Cancer Immunol Res 10(4):525-539.
Author response image 3.
(2) The study heavily relies on specific fish models, which may limit the generalizability of the findings across different species.
Thank you for your comment. First, we compared the amino acid sequences of CDK2 proteins from fish and other vertebrates, which show over 90% similarity. Moreover, the small size, low cost, and external development of zebrafish make it an excellent model for vertebrate developmental biology. It has been reported that due to the high genomic and molecular similarities between zebrafish and other vertebrates, including humans, many significant discoveries in zebrafish development are relevant to humans (ref. 2). Our study concentrated on CDK2 in zebrafish, and the findings should be valuable for other vertebrates.
Reference:
(2) Veldman MB & Lin S (2008) Zebrafish as a Developmental Model Organism for Pediatric Research. Pediatr Res 64(5):470-476.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
The following additional data/discussion could improve the manuscript.
(1) Investigate whether the catalytic activity of CDK2 is required to regulate TBK1 abundance. It is common for E3 ligases to be directed towards phosphorylated substrates, so it would be of interest to know if CDK2 phosphorylates TBK1 to facilitate its recognition for ubiquitinylation.
We examined the effect of CDK2 on the TBK1 protein after inhibiting its kinase activity with SNS-032 treatment and found that it could still affect TBK1 expression, as shown in the results below (Author response image 4). Our previous experiments investigating the effect of CDK2 on TBK1 did not show that CDK2 caused the migration of TBK1 bands (typically, proteins that undergo phosphorylation exhibit band migration). Furthermore, in this study, CDK2 did not function as an E3 ligase; instead, it recruited the E3 ligase Dtx4 to ubiquitinate TBK1.
Author response image 4.
(2) Investigate how CDK2 abundance is regulated by viral infection and whether viral infection impacts cell cycle progression in a CDK2-dependent manner.
As illustrated in Figure 1, we investigated the changes in CDK2 at both the mRNA and protein levels following viral infection. Our findings revealed that SVCV infection resulted in an increase in CDK2 mRNA and protein expression. Additionally, our earlier reports have indicated that SVCV infection can induce alterations in the cell cycle, resulting in a notable increase in the S phase (Figure 1 of ref. 1). However, whether SVCV infection impacts cell cycle progression in a CDK2-dependent manner will be explored in our upcoming study.
Reference:
(1) Li S, et al. Spring viraemia of carp virus modulates p53 expression using two distinct mechanisms. PLoS Pathog 15, e1007695 (2019).
(3) Provide data/discussion concerning the role of fish CDK2 in the regulation of cell cycle progression and whether this process is impacted by viral infection (part 1). Are TBK1 abundance and interferon production differentially regulated across the cell cycle due to the action of CDK2 (part 2).
Thank you for your advice. This concern is addressed in two parts, as follows:
For part 1: To date, there has been limited research conducted on fish CDK2 in the regulation of cell cycle progression. The details are as follows: It has been reported that the kinase activity of goldfish CDK2 significantly increases during oocyte maturation (ref. 1). Furthermore, UHRF1 phosphorylation by cyclin A2/CDK2 is crucial for zebrafish embryogenesis (ref. 2). Additionally, a novel CDK2 homolog has been identified in Japanese lamprey, which plays a crucial role in apoptosis (ref. 3). Red grouper nervous necrosis virus (RGNNV) infection activates the p53 pathway, leading to the upregulation of p21 and downregulation of cyclin E and CDK2, which forces infected cells to remain in the G1/S replicative phase (ref. 4). All this evidence suggests that fish CDK2 plays a vital role in cell cycle regulation, and this process is also impacted by viral infection. Relevant content has been added to the Discussion section in the revised manuscript (lines 389-398).
References:
(1) Hirai T, et al. (1992) Isolation and Characterization of Goldfish Cdk2, a Cognate Variant of the Cell-Cycle Regulator Cdc2. Developmental biology 152(1):113-120.
(2) Chu J, et al. (2012) UHRF1 phosphorylation by cyclin A2/cyclin-dependent kinase 2 is required for zebrafish embryogenesis. Molecular biology of the cell 23(1):59-70.
(3) Xu Y, Tian Y, Zhao H, Zheng N, Ren KX, Li QW. A novel CDK-2 homolog identified in lamprey, with roles in apoptosis. Fish Physiol Biochem 47, 189-189 (2021).
(4) Mai WJ, Liu HX, Chen HQ, Zhou YJ, & Chen Y (2018) RGNNV-induced cell cycle arrest at G1/S phase enhanced viral replication via p53-dependent pathway in GS cells. Virus Res 256:142-152.
For part 2: TBK1 plays a crucial role in regulating IFN production. Variations in CDK2 activity during different phases of the cell cycle may lead to changes in the expression and function of TBK1. Our findings suggest that heightened CDK2 activity may suppress TBK1 expression, thereby hindering the cell's capacity to produce IFN. Conversely, during the late phase of the cell cycle or in an inhibited state, TBK1 expression may rise, enhancing IFN synthesis and release. In summary, CDK2 is involved in intracellular signaling by modulating TBK1 levels and IFN production, affecting the cellular immune response and cycle regulation—two processes that are notably distinct at various stages of the cell cycle. Relevant content has been added to the Discussion section in the revised manuscript (lines 377-384).
Minor suggestions:
(1) The authors introduce their study with the consideration that knowledge of fish signaling pathways can inform mammalian biology because mammals evolved from fish. This is not strictly true, since mammals and fish both evolved from an ancient common ancestor and the diversification of signaling in each species likely occurred in response to distinct evolutionary selective pressures.
Thank you for your suggestion. We have revised the statement in the manuscript to eliminate the notion that mammals evolved from fish (lines 98-99). The immune systems of higher vertebrates (e.g., humans) and lower vertebrates (e.g., fish) generally exhibit some consistency, although there are notable differences.
(2) On line 210 and line 276, the authors appear to have misstated the data. CDK2 knockout increases not decreases TBK1 and Dtx4 knockdown abrogated rather than restored CDK2 suppression of TBK1.
Thank you for the reminder. We misstated the data in these two places (lines 204 and 267) and have corrected them as you suggested.
Reviewer #2 (Recommendations For The Authors):
The manuscript has some shortcomings that, if addressed, could improve the overall quality of the article.
(1) Line 63-72, line 77-79, line 88-90- please add additional references for these sentences.
Thank you for your comment. We have added references for these sentences (lines 63-72, 77-79, and 88-90).
(2) It is of the utmost importance to quantify the data presented in Figures 4J and 5D, as this will facilitate the visualization of the immunoblot.
Thank you for your comment. We have quantified the data presented in Figures 4J and 5D to enhance the clarity of the immunoblot.
(3) The scale in Figure 4E is difficult to discern.
Thanks for your comment. To improve the visual clarity of the image, we have enlarged the scale label in Figure 4E.
(4) In Figure 3B, shCDK2 is shown in italics, preferably in line with other standards such as Figures 3C and 3F.
Thank you for your comment. We have revised the formatting of shCDK2 in Figure 3B.
(5) The functions of CDK family members in immunity are hoped to be discussed.
Thanks for your suggestion. We have discussed the functions of CDK family members in immunity (lines 363-387). The details are as follows: Recent studies have demonstrated that CDK activity is crucial for virus-induced innate immune responses. Reports indicate that CDKs are involved in the Toll-like receptor (TLR) signaling pathway, the nuclear factor-κB (NF-κB) signaling pathway, and the JAK-STAT signaling pathway. For instance, CDK8 and/or CDK19 enhanced the transcription of inflammatory genes, such as IL-8 and IL-10, in cells following TLR9 stimulation. CDKs and NF-κB establish a remarkable paradigm where CDKs can act directly on substrate proteins rather than depending solely on transcriptional control. It has been reported that CDK1 serves as a positive regulator of the IFN-I signaling pathway, facilitating STAT1 phosphorylation, which subsequently boosts the expression of ISGs. Furthermore, inhibiting CDK activity has been shown to obstruct STAT phosphorylation, proinflammatory gene activation, and ISG mRNA induction in response to SeV infection. It is important to note that no evidence suggests the involvement of CDKs in RLR signaling pathways. This study has shown that fish CDK2 functions as a negative regulator of the key kinase TBK1, which is involved in the RLR signaling pathway. A better understanding of the relationship between CDK2 and RLR signaling pathways will enhance our grasp of the regulatory mechanisms of CDKs in antiviral innate immunity.
www.biorxiv.org
Author response:
The following is the authors’ response to the previous reviews.
Reviewer #1 (Public review):
Summary:
Lodhiya et al. demonstrate that antibiotics with distinct mechanisms of action, norfloxacin and streptomycin, cause similar metabolic dysfunction in the model organism Mycobacterium smegmatis. This includes enhanced flux through the TCA cycle and respiration as well as a build-up of reactive oxygen species (ROS) and ATP. Genetic and/or pharmacologic depression of ROS or ATP levels protect M. smegmatis from norfloxacin and streptomycin killing. Because ATP depression is protective, but in some cases does not depress ROS, the authors surmise that excessive ATP is the primary mechanism by which norfloxacin and streptomycin kill M. smegmatis. In general, the experiments are carefully executed; alternative hypotheses are discussed and considered; the data are contextualized within the existing literature.
We thank the reviewer for the very comprehensive summary of the study.
Strengths:
The authors tackle a problem that is both biologically interesting and medically impactful, namely, the mechanism of antibiotic-induced cell death.
Experiments are carefully executed, for example, numerous dose- and time-dependency studies; multiple, orthogonal readouts for ROS; and several methods for pharmacological and genetic depletion of ATP.
There has been a lot of excitement and controversy in the field, and the authors do a nice job of situating their work in this larger context.
Inherent limitations to some of their approaches are acknowledged and discussed e.g., normalizing ATP levels to viable counts of bacteria.
We thank the reviewer for the encouraging comments.
Weaknesses:
All of the experiments performed here were in the model organism M. smegmatis. As the authors point out, the extent to which these findings apply to other organisms (most notably, slow-growing pathogens like M. tuberculosis) is to be determined. To avoid the perception of overreach, I would recommend substituting "M. smegmatis" for Mycobacteria (especially in the title and abstract).
At first glance, a few of the results in the manuscript seem to conflict with what has been previously reported in the (referenced) literature. In their response to reviewers, the authors addressed my concerns. It would also be ideal to include a few lines in the manuscript briefly addressing these points. (Other readers may have similar concerns).
In the first round of review, I suggested that the authors consider removing Figs. 9 and 10A-B as I believe they distract from the main point of the paper and appear to be the beginning of a new story rather than the end of the current one. I still hold this opinion. However, one of the strengths of the eLife model is that we can agree to disagree.
We acknowledge the reviewer’s concern and have changed the title of the manuscript to specify Mycobacterium smegmatis instead of Mycobacteria. The abstract already did so.
In the discussion section of the revised manuscript, we have already analysed our results extensively in the context of the available literature, regardless of whether our findings aligned with or differed from previous studies. We still believe that this discussion suffices to explain our results to readers.
In this manuscript we also sought to assess the bacteria's ability to counteract drug-induced stresses, contributing to our understanding of how antibiotic tolerance develops in Mycobacterium smegmatis. The results presented in Figure 9 clearly demonstrate that M. smegmatis attempts to reduce respiration by decreasing flux through the complete TCA cycle, thereby mitigating ROS and ATP production in response to antibiotics. Additionally, the bacterial response included increased expression of the protein Eis, which is an exemplar of intrinsic drug resistance, with a concomitant increase in mutation frequency, thereby hinting at the development of antibiotic tolerance followed by resistance. We still believe that these data should be included because they support our observations and make the study more comprehensive.
Reviewer #2 (Public review):
Summary:
The authors are trying to test the hypothesis that ATP bursts are the predominant driver of antibiotic lethality of Mycobacteria
Strengths:
No significant strengths in the current state as it is written.
Weaknesses:
A major weakness is that M. smegmatis has a doubling time of three hours and the authors are trying to conclude that their data would reflect the physiology of M. tuberculosis, which has a doubling time of 24 hours. Moreover, the authors try to compare OD measurements with CFU counts and thus observe great variabilities.
Comments on revisions:
I am surprised that the authors simply did not repeat the study in figure one with CFU counts and repeated in triplicate. Since this is M. smegmatis, it would take no longer than two weeks to repeat this experiment and replace the figure. I understand that obtaining CFU counts is much more laborious than OD measurements but it is necessary. Your graph still says that there is 0 bacteria at time 0, yet in your legend it says you started with 600,000 CFU/ml. I don't understand why this experiment was not repeated with CFU counts measured throughout. This is not a big ask since this is M. smegmatis but it appears that the authors do not want to repeat this experiment. Minimally, fix the graph to represent the CFU.
We acknowledge the reviewer’s concern and have changed the title of the manuscript to specify Mycobacterium smegmatis instead of Mycobacteria.
It is still not clear to us what the reviewer means by OD measurements. All the data presented in the manuscript, including those in Figure 1, are based solely on CFU measurements. So, as the reviewer suggests, all experiments are already presented in terms of CFU.
www.biorxiv.org
Author response:
We thank the editors and reviewers for the constructive assessment. We plan to address the comments as follows:
Reviewer #1 (Public review):
We are generating a new cohort of Lv-TGFB2 overexpressing mice in which IOP will be compared under anesthesia conditions that are identical for the diurnal and nocturnal states. Parenthetically, we used awake (diurnal) and isoflurane-anesthetized (nocturnal) conditions to mirror those in the Patel et al. (2021) PNAS study.
Reviewer #2 (Public review):
We are not sure what the Reviewer means by the “difference between the message and transcript data” and are not sure whether providing evidence about the TRPV4-dependence of the expression of fibrotic genes and canonical TGFb2 pathway genes fits within the scope of our study (which focuses on the TGFB2-dependence of TRPV4 expression and IOP regulation). We propose to address this by including new data about the TGFb2- and TRPV4-dependence of TRPV4 and Piezo1 expression. We could share, on a confidential basis, information about the effect of TGFB2 on fibrosis-related genes from a submitted study in which we used RNASeq to investigate the TGFB2- and TGFB2 + HC067047-dependence of gene expression in TM cells, but we would not include it in the revised manuscript.
- Re: b-tubulin comment [b-tubulin associates with the plasma membrane by binding to integral membrane proteins in the plasma and organellar membranes, through palmitoylation and attachment to linker proteins and as an integral component of exocytotic vesicles (Wolff, BBA 2009; Hogerheide et al., PNAS 2017). Together with b-actin and Gapdh it is often used as a loading control to assess cellular TRPV4 protein expression (e.g., https://www.cellsignal.com/products/primary-antibodies/trpv4-antibody/65893; Grove et al., Science Signaling 2019 and Moore et al., PNAS 2013). Our qPCR and RNASeq studies show that TGFB2 does not affect b-tubulin expression]
- We will provide a higher resolution image for Fig. 4A
- We will address the Fig 5A and 6A comment [We thank the Reviewer for noticing the ambiguity and have revised the Figure Legends to clarify that “pre-injection” in Figures 5B and 6B refers to IOP measurements before the intracameral injection of HC-06, not before the injection of lentiviral constructs].
- We will address the issue of constitutive TRPV4 activity and Piezo1 involvement in the revised Discussion.
We hope this is sufficient information at this point but would be more than happy to provide more information if needed.
Thank you, we are very impressed by the eLife review protocols.
www.biorxiv.org
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment:
This study provides valuable insights, addressing the growing threat of multi-drug-resistant (MDR) pathogens by focusing on the enhanced efficacy of colistin when combined with artesunate and EDTA against colistin-resistant Salmonella strains. The evidence is solid, supported by comprehensive microbiological assays, molecular analyses, and in vivo experiments demonstrating the effectiveness of this synergic combination. However, the discussion on the clinical application challenges of this triple combination is incomplete, and it would benefit from addressing the high risk associated with using three potential nephrotoxic agents in vivo.
In our next study, we will develop novel pharmaceutical dosage forms and conduct pharmacokinetic, pharmacodynamic, and safety analyses of the triple combination to provide a theoretical basis for future clinical use. A discussion of the potential toxicity of AS, colistin, EDTA, and the triple combination has been added in lines 318 to 337.
Public Reviews:
Reviewer #1 (Public Review):
(1) The study focuses on a limited number of Salmonella strains, and broader testing on various MDR pathogens would strengthen the findings.
The number of COL-resistant clinical strains actually used to evaluate the antimicrobial activities of AS, EDTA, and COL, alone or in combination, was larger than that mentioned in our original article. However, because the results for the mcr-1-positive Salmonella strains were superfluous, we omitted them from the original article to avoid redundant data presentation; they now appear as Table supplements 7 and 8 in the revised supplementary materials. Additionally, more Gram-negative and Gram-positive MDR bacteria, such as Klebsiella pneumoniae, Pseudomonas aeruginosa, and Staphylococcus aureus, will be included in our next study, which will cover the development of novel pharmaceutical dosage forms as well as pharmacokinetic, pharmacodynamic, and safety analyses.
(2) While the study elucidates several mechanisms, further molecular details could provide deeper insights into the interactions between these drugs and bacterial targets.
Our next study will pursue further molecular details, focusing on the regulatory targets of the CheA- and SpvD-related pathways and on the precise targets through which the triple combination inhibits the MCR protein, by generating deletion or point mutations and analyzing intermolecular interactions.
(3) The time-kill experiment was conducted over 12 hours instead of the recommended 24 hours. To demonstrate a synergistic effect among the drugs, a reduction of at least 2 log10 in colony count should be shown in a 24-hour experiment. Additionally, clarifying the criteria for selecting drug concentrations is important to improve the interpretation of the results.
The time-kill experiment has been repeated over 24 hours, and the results replace Figure 1 of the original paper. The new Figure 1 has been uploaded, and the change does not affect our interpretation of the results.
In vitro studies showed that the synergistic antibacterial activity increased gradually with increasing doses of AS and EDTA, but higher doses may also cause more toxic side effects. Thus, in our study, 1/8 MICs of AS and EDTA were selected to ensure excellent antibacterial activity while minimizing potential toxicity. An explanation of the selection of drug concentrations has been added in lines 323 to 326.
(4) While the combination of EDTA, artesunate, and colistin shows promising in vitro results against Salmonella strains, the clinical application of this combination warrants careful consideration due to potential toxicity issues associated with these compounds.
In our next study, we will develop novel pharmaceutical dosage forms and conduct pharmacokinetic, pharmacodynamic, and safety analyses of the triple combination to provide a theoretical basis for future clinical use.
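The synergy criterion raised in comment (3) above (a reduction of at least 2 log10 in colony count at 24 hours, which the reviewer's later recommendation defines relative to the most active antimicrobial alone) can be illustrated with a minimal sketch. This is not the analysis used in the manuscript; the function names, the detection-limit floor, and all CFU values are hypothetical and chosen only for illustration.

```python
import math


def log10_reduction(cfu_reference: float, cfu_combo: float,
                    detection_limit: float = 1.0) -> float:
    """Return log10(reference) - log10(combination), in log10 CFU/mL.

    Counts below `detection_limit` (an assumed floor) are clamped to it so
    that a count of zero does not break the logarithm.
    """
    ref = max(cfu_reference, detection_limit)
    combo = max(cfu_combo, detection_limit)
    return math.log10(ref) - math.log10(combo)


def is_synergistic(cfu_single_agents: dict, cfu_combo: float,
                   threshold: float = 2.0) -> bool:
    """Apply the >= 2 log10 rule against the most active single agent.

    The most active single agent is the one leaving the fewest survivors
    at the 24 h time point.
    """
    most_active = min(cfu_single_agents.values())
    return log10_reduction(most_active, cfu_combo) >= threshold


# Hypothetical 24 h counts (CFU/mL); not data from the study.
single_agents_24h = {"COL": 5e7, "AS": 8e8, "EDTA": 6e8}

print(is_synergistic(single_agents_24h, 2e5))  # about 2.4 log10 below COL alone
print(is_synergistic(single_agents_24h, 1e7))  # about 0.7 log10 below COL alone
```

With these illustrative numbers the first combination meets the threshold and the second does not, which is why a curve that resumes rising before 24 hours can change the classification.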
Reviewer #2 (Public Review):
(1) The study by Zhai et al describes repurposing of artesunate, to be used in combination with EDTA to resensitize Salmonella spp. to colistin. The observed effect applied both to strains with and without mobile colistin resistance determinants (MCR). It was already known that EDTA in combination with colistin has an inhibitory effect on MCR-enzymes, but at the same time, both colistin and EDTA can contribute to nephrotoxicity, something which is also true for artesunate. Thus, the triple combination of three nephrotoxic agents has significant challenges in vivo, which is not particularly discussed in this paper.
A discussion of the potential toxicity of the triple combination has been added in lines 318 to 337.
(2) The selection of strains is not very clear. Nothing is known about the sequence types of the strains or how representative they are for strains circulating in general. Thus, it is difficult to generalize from this limited number of isolates, although the studies done in these isolates are comprehensive.
The strains tested in this study were all COL-resistant clinical isolates; genome sequencing and comparative analysis of these strains have not been performed. The antibacterial activities of different antimicrobial drugs against the S16 and S30 strains have been measured and are listed in Table supplement 9 of the revised supplementary materials. Considering that the number of COL-resistant clinical strains actually used was larger than that mentioned in our original article (see response No. 1 to Reviewer #1's public review), we think that the results obtained in this study are representative to some extent.
(3) Nothing is known about the susceptibility of the strains to other novel antimicrobial agents. Colistin has a limited role in the treatment of gram-negative infections, and although it can be used sometimes in combination, it is not clear why it would be combined with two other nephrotoxic agents and how this could have relevance in a clinical setting.
The antibacterial activities of different antimicrobial drugs against the S16 and S30 strains have been measured and are listed in Table supplement 9 of the revised supplementary materials. Additionally, a discussion of the potential toxicity of the triple combination has been added in lines 318 to 337.
(4) It is not clear whether their transcriptomics analysis should at least be carried out in duplicate for reasons of being able to assess reproducibility. It is also not clear why the samples were incubated for 6 hours - no discussion is presented on the selection of a time point for this.
As the time-kill curves show, the number of surviving bacteria started to decrease after 4 h of incubation with the drug combinations. If the incubation time is too short (for example, less than 4 h), the differentially expressed genes cannot be fully revealed, whereas too long an incubation (such as 8 h or 12 h) may lead to a significant reduction in CFU and yield inaccurate sequencing results. Therefore, we selected an incubation time of 6 h, at which point the drugs exhibited significant antibacterial effects and enough surviving bacteria remained in the sample for transcriptome analysis. Each sample had three replicates to ensure the accuracy of the results.
(5) Discussion is lacking on the reproducibility and selection of details for the methodology.
The results in this paper have been repeated several times, which indicates that the detailed operating steps described in the Materials and Methods section are reproducible. To avoid redundancy, we did not include too many details in the Discussion section.
Reviewer #3 (Public Review):
(1) Number of strains tested.
The number of COL-resistant clinical strains actually used was larger than that mentioned in our original article (see response No. 1 to Reviewer #1's public review).
(2) Response to comment: Lack of data on cytotoxicity.
Pharmacokinetic, pharmacodynamic, and safety analyses of the triple combination will be conducted in our next study to provide a theoretical basis for future clinical use.
Recommendations For The Authors:
Reviewer #1 (Recommendations For The Authors):
(1) Introduction:
The introduction should provide more context about the pathogen Salmonella, its significance in both human and veterinary medicine, and the impact of colistin resistance in these pathogens. Salmonella is a leading cause of foodborne illnesses worldwide, resulting in substantial morbidity and mortality. It can cause a range of diseases, from gastroenteritis to more severe systemic infections like typhoid fever and invasive non-typhoidal salmonellosis. In veterinary medicine, Salmonella infections can lead to significant economic losses in livestock industries due to illness and death among animals, as well as through the contamination of animal products.
This description has been added to the Introduction in lines 47 to 53.
(2) Results and Discussion:
(1) While the combination of EDTA, artesunate, and colistin shows promising in vitro results against Salmonella, the clinical application of this combination warrants careful consideration due to potential toxicity issues associated with these compounds. Colistin is known for nephrotoxicity and neurotoxicity, limiting its use to severe cases where the benefits outweigh the risks. EDTA, as a chelating agent, can disrupt essential metal ions in the body, posing risks of metabolic imbalances. Although it has clinical applications, primarily in cases of heavy metal poisoning, its use as an adjuvant in antibiotics may present risks. Although generally well-tolerated for malaria, interactions of artesunate with other drugs and long-term safety in combined therapies require thorough investigation.
A discussion of the potential toxicity of the triple combination has been added in lines 318 to 337.
(2) Table 1: The manuscript mentions that some strains used in the study are mcr-positive and mcr-negative. It is important to indicate in Table 1, in addition to the identification of Salmonella species, which strains are mcr-positive or mcr-negative.
The relevant information has been added in Table 1.
(3) Figure 2: What is the authors' hypothesis regarding the growth curves labeled "a" and "e" where strains JS and S16 resume growth 12 hours after treatment with AS? In the legend of Figure 2, describe what was used as the "positive control group."
The growth curves labeled “a” and “e” are in Figure 1. After incubation with AC for 8 h, the surviving CFUs of the JS and S16 strains were slightly reduced, but living cells remained. Because the bactericidal activity of AC is not strong enough to be sustained, these remaining living cells resume growth by 12 h of treatment with AC. The “positive control group” in the legend of Figure 2 is now specified in line 724.
(4) What is the authors' hypothesis for the differences observed in the transcriptome and metabolome?
Changes in gene transcription levels may cause corresponding changes in protein levels, but not all of these proteins are involved in bacterial metabolic processes. For example, the MCR protein, encoded by the COL resistance gene mcr, mediates the modification of lipid A but is not involved in cellular metabolic processes. Therefore, a change in mcr transcription may affect MCR protein production but not bacterial metabolic processes, which explains the differences observed between the transcriptome and the metabolome.
(5) In some parts of the text, the authors state that artesunate and EDTA potentiate the action of colistin, which is a bacteriostatic drug. However, in other parts, the authors describe the effect of the AEC combination as bacteriostatic (Abstract: line 32; Results: line 179). How do the authors explain this inconsistency?
Artesunate and EDTA can be regarded as “adjuvants” for the bacteriostatic drug colistin. Adjuvants themselves have no or only a weak antibacterial effect. In the context of antimicrobial drugs, “adjuvants” are compounds generally used in combination with antibacterial drugs to re-sensitize bacteria that have developed drug resistance. Thus, in this paper the AEC combination can be regarded as bacteriostatic.
(6) According to Brennan & Kirby (2019; doi: 10.1016/j.cll.2019.04.002), to evaluate the synergism between different drug combinations, bacterial growth curves need to be assessed over 24 hours. If the colony count is ≥ 2 log10 lower than that of the most active antimicrobial alone, the combination is considered synergistic. Based on the growth curve results shown in Figure 1, the experiment was conducted for 12 hours, and in some cases, only a small reduction in growth was observed, even at the maximum concentration of colistin. Moreover, in some cases, the curve resumes rising between 8 and 12 hours. What is the authors' hypothesis in this case? It is important to conduct the assay over 24 hours to confirm the synergism between these drugs.
The time-kill experiment has been repeated over 24 hours, and the results replace Figure 1 of the original paper. Additionally, the observation that “the curve resumes rising between 8 and 12 hours” is explained in our response to “Reviewer #1 (Recommendations For The Authors), Results and Discussion, (3) Figure 2”.
(7) To prove that CheA and SpvD play a critical role in the effect of the AEC combination, deletion of these genes should be performed, and the mutant strains should be tested.
The deletion of cheA and spvD will be carried out in our next study.
(8) To demonstrate that the flagellum is no longer assembled, a transmission electron microscopy image using antibodies against flagellin should be performed, along with motility tests.
The motility assays have been performed and displayed as Figure supplement 5 in the revised supplement materials.
(9) Figure 7: In the X-axis legend, specify what "model" refers to.
The “model” refers to the PBS control group, in which mice were treated with PBS after intraperitoneal injection of 100 µL of bacterial solution (1.31 × 10⁵ CFU).
(10) Figure 8 Legend: In the legend of Figure 8 (line 717), are the authors referring to E. coli or Salmonella?
It refers to Salmonella, which is now indicated in the title of Figure 8 in the revised manuscript.
(3) Materials and Methods:
(1) Bacterial Strains and Agents: It would be beneficial to include in the table the species of the strains used in the study, as well as the concentrations of colistin, artesunate, and EDTA utilized (lines 321 - 332).
We tried adding this information to Table 1, but it made the table too large to fit within the margins, which hampered the table layout. We therefore present this information in the Materials and Methods section instead of in the table.
(2) Antibacterial Activity In Vitro: Ensure clarity and well-defined ranges for the concentrations of colistin, EDTA, and artesunate used separately and in combinations (lines 335 - 344).
The drug concentrations are listed in lines 369 to 371.
(3) Time-Kill Assays: Clarify the criteria for selecting concentrations, whether based on MICs or peak and trough concentrations relevant to human and animal treatments with colistin (lines 345 - 351).
In vitro studies showed that the synergistic antibacterial activity increased gradually with increasing doses of AS and EDTA, but higher doses may also cause more toxic side effects. Thus, in our study, 1/8 MICs of AS and EDTA were selected to ensure excellent antibacterial activity while minimizing potential toxicity. An explanation of the selection of drug concentrations has been added in lines 323 to 326.
(4) General Corrections: Throughout the manuscript, correct typographical errors and consistently include the concentration values in mg/L alongside the MIC fractions. Specify the strains used for all experiments to ensure clarity. In the manuscript, the term "medication regimens" is used to describe the experimental setups involving different combinations of drugs tested in vitro. To improve accuracy and clarity, it is recommended to use the term "drug combination" instead. This term is more appropriate for in vitro experiments and will help avoid confusion with clinical treatment protocols.
Typographical errors have been checked and corrected throughout the manuscript, and the term “medication regimens” has been replaced by “drug combinations”.
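The request in comment (4) to report concentrations in mg/L alongside the MIC fractions is simple arithmetic, sketched below. The helper names and the MIC values are hypothetical placeholders, not values from Table 1; only the 1/8 MIC fraction comes from the responses above.

```python
from fractions import Fraction


def fraction_of_mic(mic_mg_per_l: float, fraction: Fraction) -> float:
    """Convert a fraction of the MIC into an absolute concentration (mg/L)."""
    return float(mic_mg_per_l * fraction)


def label(drug: str, mic_mg_per_l: float, fraction: Fraction) -> str:
    """Build a label that shows the MIC fraction and the mg/L value together."""
    conc = fraction_of_mic(mic_mg_per_l, fraction)
    return f"{drug} {str(fraction)} MIC ({conc:g} mg/L)"


# Hypothetical MICs (mg/L), for illustration only.
example_mics = {"AS": 512.0, "EDTA": 64.0}

for drug, mic in example_mics.items():
    print(label(drug, mic, Fraction(1, 8)))
```

Printing both forms in every figure legend and table keeps the sub-inhibitory fraction and the absolute dose visible at the same time.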
Reviewer #2 (Recommendations For The Authors):
Please see above for recommendations on what can be done to improve the manuscript.
While other omics analyses have been conducted herein, the authors do not comment on the genomic analysis of their own strains. It would have been a natural step to sequence all the strains used in the experiments.
Owing to limited program funding, genome sequencing and comparative analysis of these strains have not been performed. The antibacterial activities of different antimicrobial drugs against the S16 and S30 strains have been measured and are listed in Table supplement 9 of the revised supplementary materials.
Some minor comments:
(1) There are some spelling errors - e.g. "bacteria strains" instead of "bacterial strains".
The grammar and spelling errors have been corrected throughout the manuscript.
(2) I would avoid words like "unfortunately".
The word “unfortunately” has been changed.
(3) Some MIC-values in Table 1 seem incorrect - e.g. 24 mg/L. This is not a 2-log value - the value should be 32 mg/L if the dilution series has been carried out correctly.
We apologize for the mistake. The value has been corrected, and we have rechecked the other data.
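The reviewer's point in comment (3) is that MICs read from a two-fold dilution series can only take values on that series. A minimal sketch of such a check follows, assuming the conventional series anchored at 1 mg/L (that is, powers of two such as 0.5, 1, 2, ..., 32); a series started from a different concentration would need a different test. The function names are hypothetical.

```python
import math


def is_twofold_dilution_value(mic_mg_per_l: float, tol: float = 1e-9) -> bool:
    """Return True if the value equals 2**k for some integer k."""
    if mic_mg_per_l <= 0:
        return False
    k = math.log2(mic_mg_per_l)
    return abs(k - round(k)) < tol


def next_series_value_at_or_above(mic_mg_per_l: float) -> float:
    """Return the smallest power of two that is >= the given value."""
    return 2 ** math.ceil(math.log2(mic_mg_per_l))


for value in (0.5, 16, 24, 32):
    print(value, is_twofold_dilution_value(value))

print(next_series_value_at_or_above(24))
```

Under this assumption, 24 mg/L is flagged as off-series and the next series value above it is 32 mg/L, which matches the value the reviewer expected.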
Reviewer #3 (Recommendations For The Authors):
Below are some suggestions.
(1) Sentences L47 & L48 "Infections with antibiotic-resistant pathogens, especially carbapenemase-producing Enterobacteriaceae, represent an impending catastrophe of a return to the pre-antibiotic era" - this is slightly exaggerated! I also wonder if we need to use Enterobacterales instead of Enterobacteriaceae.
The sentences in L47 and L48 have been changed. We searched for “carbapenemase-producing Enterobacteriaceae” and found that it is a frequently used term in numerous reports.
(2) L48. The drying up of the antibiotic discovery pipeline is NOT necessarily the reason to use colistin as a drug of last resort!
The sentence has been revised.
(3) The manuscript requires extensive English editing but has merit based on the strong compilation of data.
We have optimized and revised the writing of the whole article.
(4) I suggest the authors have some data on the cytotoxicity of AS alone, colistin alone, and both of them against eucaryotic cells (Caco-) and if possible determine IS (index selectivity). This additional experiment will strengthen the quality of the manuscript. The authors must also explain how to put such tri-therapy into practice.
In our next study, we will develop novel pharmaceutical dosage forms and conduct pharmacokinetic, pharmacodynamic, and safety analyses of the triple combination to provide a theoretical basis for future clinical use. A discussion of the potential toxicity of AS, colistin, EDTA, and the triple combination has been added in lines 318 to 337.
www.biorxiv.org
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
SARS-CoV-2 infection induces syncytia formation, which promotes viral transmission. In this paper, the authors aimed to understand how host-derived inflammatory cytokines IL-1α/β combat SARS-CoV-2 infection.
Strengths:
First, they used a cell-cell fusion assay developed previously to identify IL-1α/β as the cytokines that inhibit syncytia formation. They co-cultured cells expressing the spike protein and cells expressing ACE2 and found that IL-1β treatment decreased syncytia formation and S2' cleavage.
Second, they investigated the IL-1 signaling pathway in detail, using knockouts or pharmacological perturbation to understand the signaling proteins responsible for blocking cell fusion. They found that IL-1 prevents cell-cell fusion through MyD88/IRAK/TRAF6 but not TAK1/IKK/NF-κB, as only knocking out MyD88/IRAK/TRAF6 eliminates the inhibitory effect on cell-cell fusion in response to IL-1β. This revealed that the inhibition of cell fusion did not require a transcriptional response and was mediated by IL-1R proximal signaling effectors.
Third, the authors identified RhoA/ROCK activation by IL-1 as the basis for this inhibition of cell fusion. By visualizing a RhoA biosensor and actin, they found a redistribution of RhoA to the cell periphery and cell-cell junctions after IL-1 stimulation. This triggered the formation of actin bundles at cell-cell junctions, preventing fusion and syncytia formation. The authors confirmed this molecular mechanism by using constitutively active RhoA and an inhibitor of ROCK.
Diverse cell types and in vivo models were used, and consistent results were shown across these models. These results were convincing and well-presented.
Weaknesses:
As the authors point out in the discussion, whether IL-1-mediated RhoA activation is specific to viral infection or regulates other RhoA-regulated processes is unclear. We would also require high-magnification images of the subcellular organization of the cytoskeleton to appreciate the effect of IL-1 stimulation.
Thanks for the suggestions. We tested the role of IL-1β in other RhoA-regulated processes, and found that IL-1β-mediated RhoA activation also reduced cell migration in a cell scratch assay (see Author response image 1). We also provided high-magnification images in the revised Figures 4 and 5, as well as their respective figure supplements.
Author response image 1.
(A) Cell scratch assay images of HEK293T cells treated with PBS or IL-1β. (B) Quantification of cell migration in (A).
Reviewer #2 (Public Review):
Summary:
In this study, Zheng et al. investigated the role of inflammatory cytokines in protecting cells against SARS-CoV-2 infection. They demonstrate that soluble factors in the supernatants of TLR-stimulated THP1 cells reduce fusion events between HEK293 cells expressing SARS-CoV-2 S protein and the ACE2 receptor. Using qRT-PCR and ELISA, they demonstrate that IL-1 cytokines are (not surprisingly) upregulated by TLR treatment in THP1 cells. Further, they convincingly demonstrate that recombinant IL-1 cytokines are sufficient to reduce cell-to-cell fusion mediated by the S protein. Using chemical inhibitors and CRISPR knock-out of key IL-1 receptor signaling components in HEK293 cells, they demonstrate that components of the myddosome (MYD88, IRAK1/4, and TRAF6) are required for fusion inhibition, but that downstream canonical signaling (i.e., TAK1 and NFKB activation) is not required. Instead, they provide evidence that IL-1-dependent non-canonical activation of RhoA/Rock is important for this phenotype. Importantly, the authors demonstrate that expression of a constitutively active RhoA alone is sufficient to inhibit fusion and that chemical inhibition of Rock could reverse this inhibition. The authors followed up these in vitro experiments by examining the effects of IL-1 on SARS-CoV-2 infection in vivo, and they demonstrate that recombinant IL-1 can reduce viral burden and lung pathogenesis in a mouse model of infection. However, the contribution of the RhoA/Rock pathway and inhibition of fusion to IL-1-mediated control of SARS-CoV-2 infection in vivo remains unclear.
Strengths:
(1) The bioluminescence cell-cell fusion assay provides a robust quantitative method to examine cytokine effects on viral glycoprotein-mediated fusion.
(2) The study identifies a new mechanism by which IL-1 cytokines can limit virus infection.
(3) The authors tested IL-1 mediated inhibition of fusion induced by many different coronavirus S proteins and several SARS-CoV-2 strains.
Weaknesses:
(1) The qualitative assay demonstrating S2 cleavage and IL-1 mediated inhibition of this phenotype is extremely variable across the data figures. Sometimes it appears like S2 cleavage (S2') is reduced, while in other figures immunoblots show that total S2 protein is decreased. Based on the proposed model the expectation would be that S2 abundance would be rescued when cleavage is inhibited.
In our present manuscript, IL-1-mediated changes in the full-length spike showed some variation between the authentic SARS-CoV-2 infection model and the HEK293T-S + HEK293T-ACE2 co-culture model, whereas in both models IL-1 inhibited S2' cleavage, accompanied by a reduction in the S2 subunit.
In the authentic SARS-CoV-2 infection model, we observed that IL-1 inhibited S2' cleavage, accompanied by a reduction in both the S2 subunit and the full-length spike protein. This is likely because the S2 subunit and full-length spike protein in this model come not only from infected cells but also from intracellular viral particles. IL-1 inhibited SARS-CoV-2-induced cell-cell fusion and reduced the viral load in host cells; therefore, the abundance of both the S2 subunit and the full-length spike protein was reduced.
In the HEK293T-based co-culture model, IL-1 inhibited S2' cleavage, accompanied by a reduction in the S2 subunit, while the full-length spike protein was more or less rescued. In our previous study, the R685A and ΔRRAR spike mutants could not generate the S2 subunit but still generated the S2' fragment to induce cell-cell fusion, and the S2' fragment produced from the R685A and ΔRRAR spike mutants was only slightly reduced compared with that from the wild-type spike protein, suggesting that the S2' fragment is mainly derived directly from the full-length spike, and only to a minimal extent from the S2 subunit (Fig. 4B and 4G, PMID: 34930824). Thus, inhibition of S2' cleavage by IL-1 mainly rescued the full-length spike protein.
(2) The text referencing Figure 1H suggests that TLR-stimulated THP-1 cell supernatants "significantly" reduce syncytia, but image quantification and statistics are not provided to support this statement.
Thanks for pointing out this issue. We have provided fluorescence image quantification and statistics in the revised version of our manuscript (Figure 1D, Figure 1-figure supplement 1A, Figure 1H-1I, Figure 2H-2I, Figure 1-figure supplement 1D-1E, Figure 1-figure supplement 1H-1I, Figure 2-figure supplement 1C-1D, Figure 2-figure supplement 2B-2E, Figure 2-figure supplement 2G-2H, Figure 2-figure supplement 6A-6B, Figure 2-figure supplement 7F-7G).
(3) The authors conclude that because IL-1 accumulates in TLR2-stimulated THP1 monocyte supernatants, this cytokine accounts for the ability of these supernatants to inhibit cell-cell fusion. However, they do not directly test whether IL-1 is required for the phenotype. Inhibition of the IL-1 receptor in supernatant-treated cells would help support their conclusion.
Thanks for the suggestion. Accordingly, we performed this experiment and found that IL-1RA treatment reduced the inhibitory effect of PGN-stimulated THP-1 cell culture supernatant on cell-cell fusion, suggesting that IL-1 is required for the inhibition. This result has been added to our revised manuscript (Figure 2J and Figure 2-figure supplement 4C).
(4) Immunoblot analysis of IL-1 treated HEK293 cells suggests that this cytokine does not reduce the abundance of ACE2 or total S protein in cells. However, it is possible that IL-1 signaling reduces the abundance of these proteins on the cell surface, which would result in a similar inhibition of cell-cell fusion. The authors should confirm that IL-1 treatment of their cells does not change ACE2 or S protein on the cell surface.
Thanks for the suggestion. Accordingly, we applied Wheat Germ Agglutinin (WGA) to stain the cell surface of HEK293T cells and observed that IL-1β treatment did not change ACE2 or Spike protein on the cell surface. This result has been added to our revised manuscript (Figure 5-figure supplement 3A-D).
(5) In Figure 5A, expression of constitutively active RhoA appears to have profound effects on how ACE2 runs by SDS-PAGE, suggesting that RhoA may have additional effects on ACE2 biology that might account for the decreased cell-cell fusion. This phenotype should be addressed in the text and explored in more detail.
Thanks for pointing this out. We also noticed that the occurrence of cell-cell fusion reduced the amount of ACE2, whereas inhibition of cell-cell fusion restored ACE2 abundance. Taking the original Figure 5A (revised Figure 4-figure supplement 2B) as an example, the increased ACE2 protein should be attributed to the decreased cell-cell fusion upon RhoA-CA transfection, as Spike binding to ACE2 leads to clathrin- and AP2-dependent endocytosis, resulting in ACE2 degradation in the lysosome (PMID: 36287912).
In addition, we have examined the potential effect of RhoA-CA on ACE2, and found that RhoA-CA did not affect ACE2 expression, nor Spike binding to ACE2 (revised Figure 5-figure supplement 2E); it did not affect ACE2 distribution on cell surface either (revised Figure 5-figure supplement 2F and G).
(6) The experiments linking IL-1 mediated restriction of SARS-CoV-2 fusion to the control of virus infection in vivo are incomplete. The reported data demonstrate that recombinant IL-1 can restrict virus replication in vivo, but they fall short of confirming that the in vitro mechanism described (reduced fusion) contributes to the control of SARS-CoV-2 replication in vivo. A critical piece of data that is missing is the demonstration that the ROCK inhibitor phenocopies IL-1RA treatment of SARS-CoV-2 infected mice (viral infection and pathology).
Thanks for this suggestion. Accordingly, we applied the ROCK inhibitor in vivo to confirm its role in SARS-CoV-2-infected mice and found a phenotype similar to that of the IL-1RA treatment experiment. That is, Y-27632 treatment prevented the formation of IL-1β-induced actin bundles at cell-cell junctions, thereby promoting syncytia formation and further viral transmission in vivo (revised Figure 7).
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
I suggest providing single-channel images in a supplementary figure for the live-cell images in Figures 4 and 5. Higher magnification images would also help distinguish the subcellular details of the cytoskeleton organization.
Thanks for the suggestion. We have provided the single channel images and higher magnification images in the revised Figures 4 and 5, as well as their respective figure supplements.
In Figure 4, the authors showed that IL-1 activates RhoA and induces the accumulation of activated RhoA at the cell-cell junctions. They also showed that IL-1 promotes the formation of actin bundles at cell-cell junctions. However, the authors have not shown any connection between RhoA and actin yet, but in lines 263-264, they claim that actin bundle formation is induced by RhoA. Evidence for this part was shown in later results, but at this moment, it is lacking. The same applies to lines 282-284; I think this conclusion that IL-1-induced actin bundle formation is through the RhoA-ROCK pathway should come after showing how RhoA affects actin bundle formation at cell-cell junctions. To this end, I suggest moving Supplementary Figures S12B and S12D to the main figure, as they provide strong evidence of the IL-1-RhoA-ROCK-actin pathway.
We appreciate these valuable comments. As suggested, we have moved the respective supplementary figures to the main figures to support our findings in the revised manuscript (Figure 4E and Figure 4-figure supplement 2B; Figure 5C and Figure 5-figure supplement 2A); the text has also been adjusted accordingly.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Summary:
In this study, Bu et al. examined the dynamics of the TRPV4 channel under cell overcrowding in carcinoma conditions. They investigated how cell crowding (or high cell confluence) triggers a mechano-transduction pathway involving TRPV4 channels in high-grade ductal carcinoma in situ (DCIS) cells that leads to large cell volume reduction (or cell volume plasticity) and a pro-invasive phenotype.
In vitro, this pathway is highly selective for highly malignant invasive cell lines derived from a normal breast epithelial cell line (MCF10CA) compared to the parent cell line, but not present in another triple-negative invasive breast epithelial cell line (MDA-MB-231). The authors convincingly showed that enhanced TRPV4 plasma membrane localization correlates with high-grade DCIS cells in patient tissue samples.
Specifically in invasive MCF10DCIS.com cells, they showed that overcrowding or over-confluence leads to a decrease in cell volume and intracellular calcium levels. This condition also triggers the trafficking of TRPV4 channels from intracellular stores (nucleus and potentially endosomes) to the plasma membrane (PM). When these over-confluent cells are incubated with a TRPV4 activator, there is an acute and substantial influx of calcium, attesting to the fact that there are a high number of TRPV4 channels present on the PM. Long-term incubation of these over-confluent cells with the TRPV4 activator results in the internalization of the PM-localized TRPV4 channels.
In contrast, cells plated at lower confluence primarily have TRPV4 channels localized in the nucleus and cytosol. Long-term incubation of these cells at lower confluence with a TRPV4 inhibitor leads to the relocation of TRPV4 channels to the plasma membrane from intracellular stores and a subsequent reduction in cell volume. Similarly, incubation of these cells at low confluence with PEG 300 (a hyperosmotic agent) promotes the trafficking of TRPV4 channels from intracellular stores to the plasma membrane.
Strengths:
The study is elegantly designed and the findings are novel. Their findings on this mechanotransduction pathway involving TRPV4 channels, calcium homeostasis, cell volume plasticity, motility, and invasiveness will have a great impact in the cancer field and are potentially applicable to other fields as well. Experiments are well-planned and executed, and the data are convincing. The authors investigated TRPV4 dynamics using multiple different strategies (overcrowding, hyperosmotic stress, and pharmacological means) and showed a good correlation between different phenomena.
Weaknesses:
A major emphasis in the study is on pharmacological means to relate TRPV4 channel function to the phenotype. I believe the use of genetic means would greatly enhance the impact and provide compelling proof for the involvement of TRPV4 channels in the associated phenotype.
In this regard, I wonder if siRNA-mediated knockdown of TRPV4 in over-confluent cells (or knockout) would lead to an increase in cell volume and normalize the intracellular calcium levels back to normal, thus ultimately leading to a decrease in cell invasiveness.
We greatly appreciate the positive feedback regarding the design of our study and the novelty of our findings. We also acknowledge the valuable suggestion to complement our pharmacological approaches with genetic manipulation of TRPV4.
In response to the comment regarding siRNA-mediated knockdown or knockout of TRPV4, we fully agree that this would further substantiate our findings. In the revised manuscript, we implemented shRNA targeting TRPV4 to investigate its functional effects on intracellular calcium level changes, cell volume plasticity, and invasiveness phenotypes, assessed through single-cell motility assays under cell crowding or hyperosmotic stress. These results have been incorporated into the revised manuscript, and detailed descriptions of these findings are included below.
Using the shRNA approach that resulted in ~50% reduction of TRPV4 expression (Supplementary Figure 6A and 6B show TRPV4 expression levels via IF and immunoblots, respectively), we examined the effect of reduced TRPV4 on intracellular calcium levels in MCF10DCIS.com cells under normal density (ND) and stress conditions (confluent; Con and hyperosmotic; PEG) using Fluo-4 AM imaging (Fig. 4S-X). We found that shRNA TRPV4 slightly decreased calcium levels in ND cells, likely due to fewer active calcium channels at the plasma membrane resulting from lower TRPV4 expression (as shown in the summary plot in Fig. 4W). With fewer active calcium channels, cells treated with shRNA TRPV4 exhibited less reduction in intracellular calcium levels under cell crowding conditions compared to control cells. Additionally, hyperosmotic stress using PEG 300 induced smaller calcium spikes in shRNA cells compared to the significant spike observed in control cells. This reduced calcium response to Con and hyperosmotic stress in shRNA cells was reflected in the decreased cell volume reduction by PEG 300 shown in Fig. 4Y. Consequently, shRNA-mediated TRPV4 reduction impaired cell volume plasticity in MCF10DCIS.com cells and abolished the pro-invasive mechanotransduction capability involving cell volume reduction, as evidenced by no increase in cell motility (both cell diffusivity and directionality) under hyperosmotic conditions (Fig. 5H-J). These findings demonstrate the critical role of TRPV4 in conferring pro-invasive mechanotransduction capability to MCF10DCIS.com cells through cell volume reduction.
Reviewer #2 (Public review):
Summary:
Metastasis poses a significant challenge in cancer treatment. During the transition from non-invasive cells to invasive metastatic cells, cancer cells usually experience mechanical stress due to a crowded cellular environment. The molecular mechanisms underlying mechanical signaling during this transition remain largely elusive. In this work, the authors utilize an in vitro cell culture system and advanced imaging techniques to investigate how non-invasive and invasive cells respond to cell crowding.
Strengths:
The results clearly show that pre-malignant cells exhibit a more pronounced reduction in cell volume and are more prone to spreading compared to non-invasive cells. Furthermore, the study identifies that TRPV4, a calcium channel, relocates to the plasma membrane both in vitro and in vivo (patient samples). Activation and inhibition of the TRPV4 channel can modulate the cell volume and cell mobility. These results unveil a novel mechanism of mechanical sensing in cancer cells, potentially offering new avenues for therapeutic intervention targeting cancer metastasis by modulating TRPV4 activity. This is a very comprehensive study, and the data presented in the paper are clear and convincing. The study represents a very important advance in our understanding of the mechanical biology of cancer.
Weaknesses:
However, I do think that there are several additional experiments that could strengthen the conclusions of this work. A critical limitation is the absence of genetic ablation of the TRPV4 gene to confirm its essential role in the response to cell crowding.
We are deeply grateful for the positive assessment of our study and its contribution to advancing our understanding of mechanical signaling in cancer progression. We also greatly appreciate the suggestion to incorporate genetic ablation experiments to further validate the role of TRPV4 in cell crowding responses.
As noted in our response to Reviewer #1, we employed an shRNA approach to investigate the functional effects of TRPV4 knockdown on intracellular calcium level changes, cell volume plasticity, and invasiveness phenotypes. We assessed these effects using a Fluo-4 AM calcium assay, single-cell volume measurements, and single-cell motility assays under cell crowding or hyperosmotic stress. These results have been incorporated into the revised manuscript and are described in detail in our response to Reviewer #1's "weaknesses" comment.
Reducing TRPV4 expression levels by shRNA diminished the mechanosensing intracellular calcium changes under cell crowding and under hyperosmotic conditions induced by PEG 300 treatment. Furthermore, significantly reduced cell volume plasticity was observed under hyperosmotic conditions in shRNA-treated cells compared to control cells (Fig. 4S-X). This diminished mechanosensing capability abolished the pro-invasive mechanotransduction effect, as assessed by single-cell motility under hyperosmotic conditions (Fig. 5H-J). These findings demonstrate the critical role of TRPV4 in conferring pro-invasive mechanotransduction capability to MCF10DCIS.com cells through cell volume reduction.
Reviewer #1 (Recommendations for the authors):
The way the results and discussion sections are written made it a little confusing for me to relate some phenomena to one another. For example, it is not clear how TRPV4 inhibition (due to overcrowding) leads to a decrease in intracellular calcium levels, especially when TRPV4 channels were intracellular (not on the PM) to begin with (in normal density (ND) conditions). Along the same lines, how does GSK219 cause a dip in calcium levels in ND cells when TRPV4 channels are primarily intracellular (Figure 4E)? If most of the TRPV4 channels that are translocated to the PM in response to cell crowding are in an inactive state, how do they confer enhanced cell volume plasticity relative to non-invasive cell lines?
Thank you very much for raising this important point. We fully agree with your concern and have significantly revised the manuscript to clarify this aspect. Specifically, we have emphasized that a modest number of TRPV4 channels are constitutively active at the plasma membrane in normal density (ND) cells. This is now discussed in detail in the context of Fig. 4:
Page 14: “Considering these factors, we hypothesized that cell crowding might inhibit calcium-permeant ion channels that are constitutively active at the plasma membrane, including TRPV4, which would then lower intracellular calcium levels and subsequently reduce cell volume via osmotic water movement.”
Page 16-17: “… However, the temporal profile of Fluo-4 intensity in Fig. 4E, which corresponds to the time points marked in Fig. 4D (t1: baseline and t2: dip), clearly shows the dip at t2, indicated by ΔCa (the vertical dashed line between the dip and baseline). This modest Fluo-4 dip at t2 represents the inhibition of activity by GSK219 on a small population of constitutively active TRPV4 channels at the plasma membrane under ND conditions.
In Con cells, 1 nM GSK219 caused a smaller dip in Fluo-4 intensity compared to the one observed in ND cells, with no subsequent changes. This is likely due to fewer constitutively active TRPV4 at the plasma membrane in Con cells than in ND cells. …These findings suggest that a substantial portion of TRPV4 channels relocated to the plasma membrane under cell crowding was inactive, and some constitutively active TRPV4 channels already present in the membrane became inactive as a result of cell crowding.”
'Internalization' might be a better word than 'uptake' in the following line in the results section
"...activating TRPV4 under cell crowding conditions triggered channel uptake, indicating that TRPV4 trafficking depended on the channel's activation status."
Thank you very much for this suggestion. As recommended, we replaced ‘uptake’ with ‘internalization’ on page 18:
“However, in Con cells, where a large number of inactive TRPV4 channels are likely located at the plasma membrane, GSK101 treatment notably reduced plasma membrane-associated TRPV4 in a dose-dependent manner through internalization (Fig. 4O, 4Q), consistent with previous findings (ref. 65). These data suggest that plasma membrane TRPV4 levels were largely regulated by the channel activity status. Specifically, channel activation led to the internalization of TRPV4, while channel inhibition promoted the relocation of TRPV4 to the plasma membrane.”
Out of curiosity:
Is there any information on what the intracellular TRPV4 channels are doing in the cytosol and in the nucleus? Is there any role of intracellular calcium stores in the proposed pathway?
We greatly appreciate this insightful question. Although we were unable to find studies specifically exploring the roles of cytosolic TRPV4, a recent study (Reference 74) identified a role for nuclear TRPV4 in regulating calcium within the nucleus. We speculate that when TRPV4 activity is severely impaired, such as with additional TRPV4 inhibition under cell crowding conditions, some TRPV4 channels may be redirected to the nucleus. This redistribution could help maintain nuclear calcium homeostasis.
This discussion is included on page 18 of the manuscript:
“These findings suggest that further TRPV4 inhibition under crowding conditions triggers a distinct trafficking alteration. Recent studies have implicated nuclear TRPV4 in regulating nuclear Ca²⁺ homeostasis and Ca²⁺-regulated transcription (ref. 74). In light of this study and our findings, TRPV4 may relocate to the nucleus as a compensatory mechanism to maintain nuclear calcium regulation. This relocation could reflect an adaptive response to preserve calcium-dependent transcriptional programs or other nuclear processes essential for cell survival under mechanical stress.”
One recommendation is to add some explanation or some minor details for the convenience of the reader. For example:
At normal or lower confluence, cells show an acute large dip in intracellular calcium when an inhibitor is applied, implying that there are a few TRPV4 channels on the PM and that they are constitutively active.
Thank you very much for highlighting this important point and for the helpful suggestion to improve clarity. We have significantly revised the text associated with Fig. 4 to ensure this point is clear. Specifically, we have added the following explanation on page 16:
"This modest Fluo-4 dip at t2 represents the inhibition of activity by GSK219 on a small population of constitutively active TRPV4 channels at the plasma membrane under ND conditions."
Reviewer #2 (Recommendations for the authors):
(1) Figure 1. The authors frequently change the medium to prevent acidification in over-confluent cultures. A cell viability assay should be performed to ensure that the over-confluent cells remain healthy and viable during the experiments. There are commercial kits that can be easily used to quantify the number of viable cells and the extent of cell toxicity. The number of viable cells would provide a more reliable basis for comparison between normal density and over-confluent conditions.
Thank you very much for raising this important point. We have consistently observed that cell crowding does not induce significant cell death in MCF10DCIS.com cells. To address your recommendation, we performed a viability assay using propidium iodide (PI) to selectively stain dead cells and WGA-488 to stain all live cells. Cell death was quantified under normal density (ND) conditions and at 1, 3, 5, 7, and 10 days post-confluence.
Our results indicate that cells remain similarly viable post-confluence, with minimal cell death (~1.5%) compared to ND cells (~0.75%). These findings are summarized in Supplementary Figure 2, demonstrating that over-confluent cultures remain healthy and viable during the experiments.
(2) Figure 2. To determine whether the reduction in cell volume is reversible, over-confluent cells can be further diluted back to normal density. Additionally, the reversibility of TRPV4 channel trafficking to the plasma membrane should be assessed under these conditions in IF experiments and cell surface biotinylation.
Thank you for this suggestion. We reseeded the previously overcrowded (OC) cells at normal density and observed that their TRPV4 distribution predominantly returned to being intracellular, with only modest plasma membrane localization, as shown by line analysis (Supplementary Figure 10A-C, page 13). Furthermore, their invasiveness decreased to levels comparable to the original normal density (ND) cells (Supplementary Figure 3C and 3E, page 6). These results demonstrate the reversibility of TRPV4 trafficking changes and the increase in invasiveness under mechanical stress.
Page 6. "The enhanced invasiveness of MCF10DCIS.com cells under cell crowding was largely reversible. When OC cells were reseeded at normal density for invasion assays, their invasive cell fraction decreased to approximately 15%, slightly lower (p = 0.012) than the initial value of around 24% (Suppl. Fig. 3C, 3E)."
Page 13. “We investigated whether TRPV4 relocation to the plasma membrane induced by cell crowding is reversible, as suggested by its impact on invasiveness (Suppl. Fig. 3E). To test this, previously OC MCF10DCIS.com cells were reseeded under ND conditions. We then assessed TRPV4 localization via immunofluorescence (IF) imaging to determine if most channels returned to the cytoplasm and could be relocated to the plasma membrane under mechanical stress, such as hyperosmotic conditions. Consistent with their initial ND state, reseeded ND MCF10DCIS.com cells displayed intracellular TRPV4 distribution (Suppl. Fig. 10A). Upon exposure to hyperosmotic stress (74.4 mOsm/kg PEG 300), TRPV4 was again relocated to the plasma membrane (Suppl. Fig. 10B). These findings, quantified through line analysis (Suppl. Fig. 10C), demonstrate that the mechanosensing response of MCF10DCIS.com cells is reversible.”
(3) Figure 3B. A control using intracellular proteins such as GAPDH or Tubulin is missing. Including this control would help exclude the possibility of cell rupture or compromised cell membranes, both of which are very common in crowded environments.
Thank you very much for pointing this out. The control lanes (GAPDH) were already included in the full gel results shown in Supplementary Figure 5. For the immunoprecipitation and immunoblotting of surface-biotinylated cell lysates, we did not expect to detect GAPDH; however, some GAPDH signals were still observed. As shown for MCF10DCIS.com cells, less GAPDH was detected under OC conditions, but the immunoprecipitated samples displayed significantly higher levels of TRPV4 on the cell surface compared to ND cells (Supplementary Figure 5A). For the whole cell lysates, TRPV4 protein levels were comparable across different cell lines based on the immunoblot results, with consistent GAPDH signals serving as a loading control (Supplementary Figure 5B).
(4) Figure 4. To convincingly demonstrate TRPV4 relocation to the plasma membrane, IF should be performed under non-permeable conditions (i.e., without detergents like saponin). This approach ensures that only plasma membrane proteins are accessible to antibodies, reducing intracellular background. The same approach should be applied to Piezo1 and TfR.
Thank you for this suggestion. We observed that under non-permeable conditions, primary antibodies could still access intracellular proteins. To address this issue, we employed extracellular-binding TRPV4 antibodies to selectively detect TRPV4 relocation to the plasma membrane under hyperosmotic conditions (74.4 mOsm/kg PEG 300) in live MCF10DCIS.com cells, as shown in Supplementary Figure 9. These results clearly demonstrate the plasma membrane relocation of TRPV4 under hyperosmotic conditions, distinguishing it from control conditions. Unfortunately, we were unable to identify high-affinity extracellular-binding antibodies for Piezo1 and TfR. Nevertheless, our findings strongly support the mechanosensing plasma membrane relocation of TRPV4.
Essential Weakness:
Throughout the study, only TRPV4 inhibitors and activators were used to show that TRPV4 relocation is associated with intracellular calcium concentration and cell size changes. It is crucial to use TRPV4 KO or KD cells to confirm that the observed effects are specific to TRPV4 and not due to off-target effects on other proteins. Additionally, fusing a plasma membrane targeting sequence to TRPV4 to make a constitutive plasma membrane-localized construct could demonstrate the opposite effect.
Thank you very much for this important comment. As noted in our response to Reviewer #1, we employed an shRNA approach to investigate the functional effects of TRPV4 knockdown on intracellular calcium level changes, cell volume plasticity, and invasiveness phenotypes. We assessed these effects using Fluo-4 AM calcium assay, single-cell volume measurements, and single-cell motility assays under cell crowding or hyperosmotic stress. These results have been incorporated into the revised manuscript and are described in detail in our response to Reviewer #1's "weaknesses" comment.
Reducing TRPV4 expression levels by shRNA diminished mechanosensing intracellular calcium changes under cell crowding and hyperosmotic conditions using PEG 300 treatment. Furthermore, a significantly reduced cell volume plasticity was observed under hyperosmotic conditions in shRNA treated cells compared to control cells (Fig. 4S-X). This diminished mechanosensing capability abolished the pro-invasive mechanotransduction effect, as assessed by single cell motility under hyperosmotic conditions (Fig. 5H-J). These findings demonstrate the critical role of TRPV4 in conferring pro-invasive mechanotransduction capability to MCF10DCIS.com cells through cell volume reduction.
Minor Points:
The introduction section is poorly written; many results currently included in the introduction would be more appropriately placed in the discussion section. The long redundant introduction makes the article hard to read through.
Thank you very much for pointing this out. In the revised introduction, we have significantly reduced references to the results, streamlining the section to make it more concise and focused. This adjustment ensures the introduction is clearer and avoids redundancy, improving the readability of the manuscript.
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
In an important fMRI study with an elegant experimental design and rigorous cross-decoding analyses, this work shows a solid dissociation between two parietal regions in visually processing actions. Specifically, aIPL is found to be sensitive to the causal effects of observed actions, while SPL is sensitive to the patterns of body motion involved in those actions. Additional analysis and explanation would help to determine the strength of evidence and the mechanistic underpinnings would benefit from closer consideration. Nevertheless, the work will be of broad interest to cognitive neuroscientists, particularly vision and action researchers.
We thank the editor and the reviewers for their assessment and their excellent comments and suggestions. We really believe they helped us to provide a stronger and more nuanced paper. In our revision, we addressed all points raised by the reviewers. Most importantly, we added a new section on a series of analyses to characterize in more detail the representations isolated by the action-animation and action-PLD cross-decoding. Together, these analyses strengthen the conclusion that aIPL and LOTC represent action effect structures at a categorical rather than specific level, that is, the type of change (e.g., of location or configuration) rather than the specific effect type (e.g. division, compression). SPL is sensitive to body-specific representations, specifically manuality (unimanual vs. bimanual) and movement kinematics. We also added several other analyses and addressed each point of the reviewers. Please find our responses below.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
The authors report a study aimed at understanding the brain's representations of viewed actions, with a particular aim to distinguish regions that encode observed body movements, from those that encode the effects of actions on objects. They adopt a cross-decoding multivariate fMRI approach, scanning adult observers who viewed full-cue actions, pantomimes of those actions, minimal skeletal depictions of those actions, and abstract animations that captured analogous effects to those actions. Decoding across different pairs of these actions allowed the authors to pull out the contributions of different action features in a given region's representation. The main hypothesis, which was largely confirmed, was that the superior parietal lobe (SPL) more strongly encodes movements of the body, whereas the anterior inferior parietal lobe (aIPL) codes for action effects of outcomes. Specifically, region of interest analyses showed dissociations in the successful cross-decoding of action category across full-cue and skeletal or abstract depictions. Their analyses also highlight the importance of the lateral occipito-temporal cortex (LOTC) in coding action effects. They also find some preliminary evidence about the organisation of action kinds in the regions examined.
Strengths:
The paper is well-written, and it addresses a topic of emerging interest where social vision and intuitive physics intersect. The use of cross-decoding to examine actions and their effects across four different stimulus formats is a strength of the study. Likewise, the a priori identification of regions of interest (supplemented by additional full-brain analyses) is a strength.
Weaknesses:
I found that the main limitation of the article was in the underpinning theoretical reasoning. The authors appeal to the idea of "action effect structures (AES)", as an abstract representation of the consequences of an action that does not specify (as I understand it) the exact means by which that effect is caused, nor the specific objects involved. This concept has some face validity, but it is not developed very fully in the paper, rather simply asserted. The authors make the claim that "The identification of action effect structure representations in aIPL has implications for theories of action understanding" but it would have been nice to hear more about what those theoretical implications are. More generally, I was not very clear on the direction of the claim here. Is there independent evidence for AES (if so, what is it?) and this study tests the following prediction, that AES should be associated with a specific brain region that does not also code other action properties such as body movements? Or, is the idea that this finding -- that there is a brain region that is sensitive to outcomes more than movements -- is the key new evidence for AES?
Thank you for raising this important issue. We reasoned that AES should exist to support the recognition of perceptually variable actions, including those that we have never experienced before. To the best of our knowledge, there is only indirect evidence for the existence of AES, namely that humans effortlessly and automatically recognize actions (and underlying intentions and feelings) in movements of abstract shapes, as in the famous Heider and Simmel (1944) animations. As these animations do not contain any body posture or movement information at all, the only available cues are the spatiotemporal relations between entities and entity parts in the perceived scene. We think that the effortless and automatic attribution of actions to these stimuli points toward an evolutionarily optimized mechanism to capture action effect structures from highly variable action instantiations (so general that it even works for abstract animations). Our study thus aimed to test for the existence of such a level of representation in the brain. We clarified this point in the introduction.
In our revised manuscript, we also revised our discussion of the implications of the finding of AES representations in the brain:
"The identification of action effect structure representations in aIPL and LOTC has implications for theories of action understanding: Current theories (see for review e.g. Zentgraf et al., 2011; Kemmerer, 2021; Lingnau and Downing, 2024) largely ignore the fact that the recognition of many goal-directed actions requires a physical analysis of the action-induced effect, that is, a state change of the action target. Moreover, premotor and inferior parietal cortex are usually associated with motor- or body-related processing during action observation. Our results, together with the finding that premotor and inferior parietal cortex are similarly sensitive to actions and inanimate object events (Karakose-Akbiyik et al., 2023), suggest that large parts of the 'action observation network' are less specific for body-related processing in action perception than usually thought. Rather, this network might provide a substrate for the physical analysis and predictive simulation of dynamic events in general (Schubotz, 2007; Fischer, 2024). In addition, our finding that the (body-independent) representation of action effects substantially draws on right LOTC contradicts strong formulations of a 'social perception' pathway in LOTC that is selectively tuned to the processing of moving faces and bodies (Pitcher and Ungerleider, 2021). The finding of action effect representation in right LOTC/pSTS might also offer a novel interpretation of a right pSTS subregion thought to be specialized for social interaction recognition: Right pSTS shows increased activation for the observation of contingent action-reaction pairs (e.g. agent A points toward object; agent B picks up object) as compared to two independent actions (i.e., the action of agent A has no effect on the action of agent B) (Isik et al., 2017). Perhaps the activation reflects the representation of a social action effect - the change of an agent's state induced by someone else's action.
Thus, the representation of action effects might not be limited to physical object changes but might also comprise social effects not induced by a physical interaction between entities. Finally, not all actions induce an observable change in the world. It remains to be tested whether the recognition of, e.g., communication (e.g. speaking, gesturing) and perception actions (e.g. observing, smelling) similarly relies on structural action representations in aIPL and LOTC."
On a more specific but still important point, I was not always clear that the significant, but numerically rather small, decoding effects are sufficient to support strong claims about what is encoded or represented in a region. This concern of course applies to many multivariate decoding neuroimaging studies. In this instance, I wondered specifically whether the decoding effects necessarily reflected a fully five-way distinction amongst the action kinds, or instead (for example) a significantly different pattern evoked by one action compared to all of the other four (which in turn might be similar). This concern is partly increased by the confusion matrices that are presented in the supplementary materials, which don't necessarily convey a strong classification amongst action kinds. The cluster analyses are interesting and appear to be somewhat regular over the different regions, which helps. However, it is hard to assess these findings statistically, and it may be that similar clusters would be found in early visual areas too.
We agree that in our original manuscript, we did not statistically test what precisely drives the decoding, e.g., specific actions or rather broader categories. In our revised manuscript, we included a representational similarity analysis (RSA) that addressed this point. In short, we found that the action-animation decoding was driven by categorical distinctions between groups of actions (e.g. hit/place vs. the remaining actions) rather than a fully five-way distinction amongst all action kinds. The action-PLD decoding was mostly driven by body-specific representations, specifically manuality (unimanual vs. bimanual) and movement kinematics; in left and right LOTC we found additional evidence for action-specific representations.
Please find below the new paragraph on the RSA:
"To explore in more detail what types of information were isolated by the action-animation and action-PLD cross-decoding, we performed a representational similarity analysis.
We first focus on the representations identified by the action-animation decoding. To inspect and compare the representational organization in the ROIs, we extracted the confusion matrices of the action-animation decoding from the ROIs (Fig. 5A) and compared them with different similarity models (Fig. 5B) using multiple regression. Specifically, we aimed at testing at which level of granularity action effect structures are represented in aIPL and LOTC: Do these regions encode the broad type of action effects (change of shape, change of location, ingestion) or do they encode specific action effects (compression, division, etc.)? In addition, we aimed at testing whether the effects observed in EVC can be explained by a motion energy model that captures the similarities between actions and animations that we observed in the stimulus-based action-animation decoding using motion energy features. We therefore included V1 in the ROI analysis. We found clear evidence that the representational content in right aIPL and bilateral LOTC can be explained by the effect type model but not by the action-specific model (all p < 0.005; two-sided paired t-tests between models; Fig. 5C). In left V1, we found that the motion energy model could indeed explain some representational variance; however, in both left and right V1 we also found effects for the effect type model. We assume that there were additional visual similarities between the broad types of actions and animations that were not captured by the motion energy model (or other visual models; see Supplementary Information). A searchlight RSA revealed converging results, and additionally found effects for the effect type model in the ventral part of left aIPL and for the action-specific model in the left anterior temporal lobe, left dorsal central gyrus, and right EVC (Fig. 5D). 
The latter findings were unexpected and should be interpreted with caution, as these regions (except right EVC) were not found in the action-animation cross-decoding and therefore should not be considered reliable (Ritchie et al., 2017). The motion energy model did not reveal effects that survived the correction for multiple comparisons, but a more lenient uncorrected threshold of p = 0.005 revealed clusters in left EVC and bilateral posterior SPL.
To characterize the representations identified by the action-PLD cross-decoding, we used a manuality model that captures whether the actions were performed with both hands vs. one hand, an action-specific model as used in the action-animation RSA above, and a kinematics model that was based on the 3D kinematic marker positions of the PLDs (Fig. 6B). Since pSTS is a key region for biological motion perception, we included this region in the ROI analysis. The manuality model explained the representational variance in the parietal ROIs, pSTS, and LOTC, but not in V1 (all p < 0.002; two-sided paired t-tests between V1 and other ROIs; Fig. 6C). By contrast, the action-specific model revealed significant effects in V1 and LOTC, but not in pSTS and parietal ROIs (but note that effects in V1 and pSTS did not differ significantly from each other; all other two-sided paired t-tests between mentioned ROIs were significant at p < 0.0005). The kinematics model explained the representational variance in all ROIs. A searchlight RSA revealed converging results, and additionally found effects for the manuality model in bilateral dorsal/medial prefrontal cortex and in right ventral prefrontal cortex and insula (Fig. 6D).”
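The multiple-regression RSA described in the quoted paragraphs can be illustrated with a minimal Python sketch that regresses a flattened confusion matrix on two model matrices. This is not the authors' analysis code. The assignment of five actions to effect types, the toy confusion values, the use of the full 5 x 5 matrix including the diagonal, and the helper names `solve` and `regress` are all assumptions made for this example.

```python
# Hypothetical assignment of five actions to broad effect types (assumption).
effect_types = ["shape", "shape", "location", "location", "ingestion"]
n = len(effect_types)

# Effect-type model: 1 where two actions share an effect type, else 0.
effect_model = [1.0 if effect_types[i] == effect_types[j] else 0.0
                for i in range(n) for j in range(n)]
# Action-specific model: 1 only on the diagonal (each action matches itself).
specific_model = [1.0 if i == j else 0.0 for i in range(n) for j in range(n)]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    size = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(size):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(size):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][size] / m[i][i] for i in range(size)]


def regress(y, predictors):
    """Ordinary least squares of y on an intercept plus the given predictors."""
    cols = [[1.0] * len(y)] + predictors
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)  # [intercept, beta_effect, beta_specific]


# Toy "neural" confusion matrix built from known weights (not real data).
toy_confusions = [0.1 + 0.3 * e + 0.05 * s
                  for e, s in zip(effect_model, specific_model)]
betas = regress(toy_confusions, [effect_model, specific_model])
```

Because the toy confusion matrix is an exact linear combination of the two models, the regression recovers the generating weights. With real data, one beta per model would be estimated for each participant and ROI and then compared across models.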
We also included an ROI covering early visual cortex (V1) in our analysis. While there was significant decoding for action-animation in V1, the representational organization did not substantially match the organization found in aIPL and LOTC: A cluster analysis revealed much higher similarity between LOTC and aIPL than between these regions and V1:
(please note that in this analysis we included the action-PLD RDMs as reference, and to test whether aIPL shows a similar representational organization in action-anim and action-PLD; see below)
Given these results, we think that V1 captured different aspects in the action-animation cross-decoding than aIPL and LOTC. We address this point in more detail in our response to the "Recommendations for The Authors".
Reviewer #2 (Public Review):
Summary:
This study uses an elegant design, using cross-decoding of multivariate fMRI patterns across different types of stimuli, to convincingly show a functional dissociation between two sub-regions of the parietal cortex, the anterior inferior parietal lobe (aIPL) and superior parietal lobe (SPL) in visually processing actions. Specifically, aIPL is found to be sensitive to the causal effects of observed actions (e.g. whether an action causes an object to compress or to break into two parts), and SPL to the motion patterns of the body in executing those actions.
To show this, the authors assess how well linear classifiers trained to distinguish fMRI patterns of response to actions in one stimulus type can generalize to another stimulus type. They choose stimulus types that abstract away specific dimensions of interest. To reveal sensitivity to the causal effects of actions, regardless of low-level details or motion patterns, they use abstract animations that depict a particular kind of object manipulation: e.g. breaking, hitting, or squashing an object. To reveal sensitivity to motion patterns, independently of causal effects on objects, they use point-light displays (PLDs) of figures performing the same actions. Finally, full videos of actors performing actions are used as the stimuli providing the most complete, and naturalistic information. Pantomime videos, with actors mimicking the execution of an action without visible objects, are used as an intermediate condition providing more cues than PLDs but less than real action videos (e.g. the hands are visible, unlike in PLDs, but the object is absent and has to be inferred). By training classifiers on animations, and testing their generalization to full-action videos, the classifiers' sensitivity to the causal effect of actions, independently of visual appearance, can be assessed. By training them on PLDs and testing them on videos, their sensitivity to motion patterns, independent of the causal effect of actions, can be assessed, as PLDs contain no information about an action's effect on objects.
These analyses reveal that aIPL can generalize between animations and videos, indicating that it is sensitive to action effects. Conversely, SPL is found to generalize between PLDs and videos, showing that it is more sensitive to motion patterns. A searchlight analysis confirms this pattern of results, particularly showing that action-animation decoding is specific to right aIPL, and revealing an additional cluster in LOTC, which is included in subsequent analyses. Action-PLD decoding is more widespread across the whole action observation network.
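The cross-decoding logic summarized above (train a classifier on response patterns from one stimulus format, test it on another) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated, the action labels are placeholders, and a nearest-centroid classifier stands in for the linear classifiers used in the study. All function names are hypothetical.

```python
ACTIONS = ["break", "squash", "hit", "place", "drink"]  # placeholder labels
N_VOXELS = 10


def prototype(k):
    """Toy class pattern: two simulated voxels respond to action k."""
    return [1.0 if v % len(ACTIONS) == k else 0.0 for v in range(N_VOXELS)]


def simulate(offset, n_trials=2, jitter=0.1):
    """Simulated trials for one stimulus format with a format-specific offset."""
    patterns, labels = [], []
    for k, action in enumerate(ACTIONS):
        for t in range(n_trials):
            noise = jitter * (-1) ** t  # deterministic stand-in for noise
            patterns.append([p + offset + noise for p in prototype(k)])
            labels.append(action)
    return patterns, labels


def train_centroids(patterns, labels):
    """Average the training patterns of each class."""
    grouped = {}
    for pattern, label in zip(patterns, labels):
        grouped.setdefault(label, []).append(pattern)
    return {label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in grouped.items()}


def predict(centroids, pattern):
    """Assign the class whose centroid is closest in Euclidean distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(pattern, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def cross_decode(train, test):
    """Train on one stimulus format, return accuracy on the other."""
    centroids = train_centroids(*train)
    test_x, test_y = test
    hits = sum(predict(centroids, p) == y for p, y in zip(test_x, test_y))
    return hits / len(test_y)


animations = simulate(offset=0.5)  # e.g. abstract animations
videos = simulate(offset=2.0)      # e.g. naturalistic action videos

# Evaluate both decoding directions and average them.
acc_anim_to_video = cross_decode(animations, videos)
acc_video_to_anim = cross_decode(videos, animations)
mean_accuracy = (acc_anim_to_video + acc_video_to_anim) / 2
chance = 1 / len(ACTIONS)
```

In this toy setting the class structure is identical across formats, so generalization is perfect. In real data, above-chance accuracy in a region is taken as evidence that the information shared by the two formats is represented there.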
This study provides a valuable contribution to the understanding of functional specialization in the action observation network. It uses an original and robust experimental design to provide convincing evidence that understanding the causal effects of actions is a meaningful component of visual action processing and that it is specifically localized in aIPL and LOTC.
Strengths:
The authors cleverly managed to isolate specific aspects of real-world actions (causal effects, motion patterns) in an elegant experimental design, and by testing generalization across different stimulus types rather than within-category decoding performance, they show results that are convincing and readily interpretable. Moreover, they clearly took great care to eliminate potential confounds in their experimental design (for example, by carefully ordering scanning sessions by increasing realism, such that the participants could not associate animation with the corresponding real-world action), and to increase stimulus diversity for different stimulus types. They also carefully examine their own analysis pipeline, and transparently expose it to the reader (for example, by showing asymmetries across decoding directions in Figure S3). Overall, this is an extremely careful and robust paper.
Weaknesses:
I list several ways in which the paper could be improved below. More than 'weaknesses', these are either ambiguities in the exact claims made, or points that could be strengthened by additional analyses. I don't believe any of the claims or analyses presented in the paper show any strong weaknesses, problematic confounds, or anything that requires revising the claims substantially.
(1) Functional specialization claims: throughout the paper, it is not clear what the exact claims of functional specialization are. While, as can be seen in Figure 3A, the difference between action-animation cross-decoding is significantly higher in aIPL, decoding performance is also above chance in right SPL, although this is not a strong effect. More importantly, action-PLD cross-decoding is robustly above chance in both right and left aIPL, implying that this region is sensitive to motion patterns as well as causal effects. I am not questioning that the difference between the two ROIs exists - that is very convincingly shown. But sentences such as "distinct neural systems for the processing of observed body movements in SPL and the effect they induce in aIPL" (lines 111-112, Introduction) and "aIPL encodes abstract representations of action effect structures independently of motion and object identity" (lines 127-128, Introduction) do not seem fully justified when action-PLD cross-decoding is overall stronger than action-animation cross-decoding in aIPL. Is the claim, then, that in addition to being sensitive to motion patterns, aIPL contains a neural code for abstracted causal effects, e.g. involving a separate neural subpopulation or a different coding scheme? Moreover, if sensitivity to motion patterns is not specific to SPL, but can be found in a broad network of areas (including aIPL itself), can it really be claimed that this area plays a specific role, similar to the specific role of aIPL in encoding causal effects? There is indeed, as can be seen in Figure 3A, a difference between action-PLD decoding in SPL and aIPL, but based on the searchlight map shown in Figure 3B I would guess that a similar difference would be found by comparing aIPL to several other regions. The authors should clarify these ambiguities.
We thank the reviewer for this careful assessment. The observation of action-PLD cross-decoding in aIPL is indeed not straightforward to interpret: It could mean that aIPL encodes both body movements and action effect structures by different neural subpopulations. Or it could mean that representations of action effect structures were also activated by the PLDs, which led to successful decoding in the action-PLD cross-decoding. Our revision allows a more nuanced view on this issue:
First, we included the results of a behavioral test showing that PLDs at least weakly allow for recognition of the specific actions (see our response to the second comment), which in turn might activate action effect structure representations. Second, the finding that the cross-decoding between animations and PLDs also revealed effects in left and right aIPL (as pointed out by the reviewer in the second comment) supports the interpretation that PLDs have activated, to some extent, action effect structure representations.
On the other hand, if aIPL encodes only action effect structures that were also captured in the action-PLD cross-decoding, we would expect the RDMs in aIPL to be similar for the action-PLD and action-animation cross-decoding. However, the cluster analysis (see our response to Reviewer 1 above) does not show this; rather, all action-PLD RDMs are representationally more similar to each other than to the action-animation RDMs, specifically with regard to aIPL. In addition, the RSA revealed sensitivity to manuality and kinematics also in aIPL. This suggests that the action-PLD decoding in aIPL was at least partially driven by representations related to body movements.
Taken together, these findings suggest that aIPL also encodes body movements. In fact, we didn't want to make the strong claim that aIPL selectively represents action effect structures. Rather, we think that our results show that aIPL and SPL are disproportionally sensitive to action effects and body movements, respectively. We added this in our revised discussion:
"The action-PLD cross-decoding revealed widespread effects in LOTC and parietal cortex, including aIPL. What type of representation drove the decoding in aIPL? One possible interpretation is that aIPL encodes both body movements (isolated by the action-PLD cross-decoding) and action effect structures (isolated by the action-animation cross-decoding). Alternatively, aIPL selectively encodes action effect structures, which have been activated by the PLDs. A behavioral test showed that PLDs at least weakly allow for recognition of the specific actions (Tab. S2), which might have activated corresponding action effect structure representations. In addition, the finding that aIPL revealed effects for the cross-decoding between animations and PLDs further supports the interpretation that PLDs have activated, at least to some extent, action effect structure representations. On the other hand, if aIPL encodes only action effect structures, we would expect that the representational similarity patterns in aIPL are similar for the action-PLD and action-animation cross-decoding. However, this was not the case; rather, the representational similarity pattern in aIPL was more similar to SPL for the action-PLD decoding, which argues against distinct representational content in aIPL vs. SPL isolated by the action-PLD decoding. In addition, the RSA revealed sensitivity to manuality and kinematics also in aIPL, which suggests that the action-PLD decoding in aIPL was at least partially driven by representations related to body movements. Taken together, these findings suggest that aIPL encodes not only action effect structures, but also representations related to body movements. Likewise, also SPL shows some sensitivity to action effect structures, as demonstrated by effects in SPL for the action-animation and pantomime-animation cross-decoding. 
Thus, our results suggest that aIPL and SPL are not selectively but disproportionally sensitive to action effects and body movements, respectively."
A clarification to the sentence "aIPL encodes abstract representations of action effect structures independently of motion and object identity": Here we are referring to the action-animation cross decoding only; specifically, the fact that because the animations did not show body motion and concrete objects, the representations isolated in the action-animation cross decoding must be independent of body motion and concrete objects. This does not rule out that the same region encodes other kinds of representations in addition.
And another side note to the RSA: It might be tempting to test the "effects" model (distinguishing change of shape, change of location and ingest) also in the action-PLD multiple regression RSA in order to test whether this model explains additional variance in aIPL, which would point towards action effect structure representations. However, the "effect type" model is relatively strongly correlated with the "manuality" model (VIF=4.2), indicating that multicollinearity might exist. We therefore decided to not include this model in the RSA. However, we nonetheless tested the inclusion of this model and did not find clear effects for the "effects" model in aIPL (but in LOTC). The other models revealed largely similar effects as the RSA without the "effects" model, but the effects appeared overall noisier. In general, we would like to emphasize that an RSA with just 5 actions is not ideal because of the small number of pairwise comparisons, which increases the chance for coincidental similarities between model and neural RDMs. We therefore marked this analysis as "exploratory" in the article.
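To make the multicollinearity argument concrete, the following minimal Python sketch computes a variance inflation factor (VIF) for two model RDMs. It rests on several assumptions: the manuality and effect-type labels below are hypothetical and not the study's actual stimulus coding, the function names are made up for this example, and the sketch uses the two-predictor special case in which VIF = 1 / (1 - r^2), with r the Pearson correlation between the predictors. The reported VIF of 4.2 presumably comes from regressing one model on all other models in the regression.

```python
from itertools import combinations

# Hypothetical labels for five actions (assumptions for illustration only).
effect_labels = ["shape", "shape", "location", "location", "ingestion"]
manuality_labels = ["bi", "bi", "uni", "uni", "uni"]


def category_rdm(labels):
    """Vectorized model RDM: 0 if a pair shares a label, 1 otherwise."""
    return [0.0 if a == b else 1.0 for a, b in combinations(labels, 2)]


def pearson(x, y):
    """Pearson correlation between two equally long vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def vif_two_predictors(x, y):
    """VIF when a regression contains exactly two predictors."""
    r = pearson(x, y)
    return 1.0 / (1.0 - r * r)


def r_squared_from_vif(vif):
    """Share of a predictor's variance explained by the other predictors."""
    return 1.0 - 1.0 / vif


effect_rdm = category_rdm(effect_labels)        # 10 pairwise entries
manuality_rdm = category_rdm(manuality_labels)  # 10 pairwise entries
toy_vif = vif_two_predictors(effect_rdm, manuality_rdm)

# A VIF of 4.2 means roughly 76% of the model's variance is shared.
shared_variance = r_squared_from_vif(4.2)
```

With only five actions, each model RDM has just ten pairwise entries, which is why coincidental correlations between models, and between models and neural RDMs, are a real concern.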
(2) Causal effect information in PLDs: the reasoning behind the use of PLD stimuli is to have a condition that isolates motion patterns from the causal effects of actions. However, it is not clear whether PLDs really contain as little information about action effects as claimed. Cross-decoding between animations and PLDs is significant in both aIPL and LOTC, as shown in Figure 4. This indicates that PLDs do contain some information about action effects. This could also be tested behaviorally by asking participants to assign PLDs to the correct action category. In general, disentangling the roles of motion patterns and implied causal effects in driving action-PLD cross-decoding (which is the main dependent variable in the paper) would strengthen the paper's message. For example, it is possible that the strong action-PLD cross-decoding observed in aIPL relies on a substantially different encoding from, say, SPL, an encoding that perhaps reflects causal effects more than motion patterns. One way to exploratively assess this would be to integrate the clustering analysis shown in Figure S1 with a more complete picture, including animation-PLD and action-PLD decoding in aIPL.
With regard to the suggestion to behaviorally test how well participants can grasp the underlying action effect structures: We indeed ran a behavioral experiment to assess the recognizability of actions in the PLD stick figures (as well as in the pantomimes). In short, this experiment revealed that participants could not reliably recognize the actions in the PLD stick figures and often confused them with kinematically similar but conceptually different actions (e.g. breaking --> shaking, hitting --> swiping, squashing --> knitting). However, the results also show that it was not possible to completely eliminate information about action effects from the PLDs.
Because we considered this behavioral experiment a standard assessment of the quality of the stimuli, we did not report it in the original manuscript. We have now added a section to the methods that describes the behavioral experiments in detail:
"To assess how much the animations, PLD stick figures, and pantomimes were associated with the specific action meanings of the naturalistic actions, we performed a behavioral experiment. 14 participants observed videos of the animations, PLDs (without stick figures), and pantomimes in three separate sessions (in that order) and were asked to describe what kind of actions the animations depict and give confidence ratings on a Likert scale from 1 (not confident at all) to 10 (very confident). Because the results for PLDs were unsatisfying (several participants did not recognize human motion in the PLDs), we added stick figures to the PLDs as described above and repeated the rating for PLD stick figures with 7 new participants, as reported below.
A general observation was that almost no participant used verb-noun phrases (e.g. "breaking a stick") in their descriptions for any stimulus type. For the animations, the participants used more abstract verbs or nouns to describe the actions (e.g. dividing, splitting, division; Tab. S1). These abstract descriptions matched the intended action structures quite well, and participants were relatively confident about their responses (mean confidences between 6 and 7.8). These results suggest that the animations were not substantially associated with specific action meanings (e.g. "breaking a stick") but captured the coarse action structures. For the PLD stick figures (Tab. S2), responses were more variable and actions were often confused with kinematically similar but conceptually different actions (e.g. breaking --> shaking, hitting --> turning page, squashing --> knitting). Confidence ratings were relatively low (mean confidences between 3 and 5.1). These results suggest that PLD stick figures, too, were not substantially associated with specific action meanings and additionally did not clearly reveal the underlying action effect structures. Finally, pantomimes were recognized much better, which was also reflected in high confidence ratings (mean confidences between 8 and 9.2; Tab. S3). This suggests that, unlike PLD stick figures, pantomimes allowed much better access to the underlying action effect structures."
We also agree with the second suggestion to investigate in more detail the representational profiles in aIPL and SPL. We think that the best way to do so is the RSA that we reported above. However, to provide a complete picture of the results, we also added the whole brain maps and RDMs for the animation-pantomime, animation-PLD, pantomime-PLD, and action-pantomime to the supplementary information.
(3) Nature of the motion representations: it is not clear what the nature of the putatively motion-driven representation driving action-PLD cross-decoding is. While, as you note in the Introduction, other regions such as the superior temporal sulcus have been extensively studied, with the understanding that they are part of a feedforward network of areas analyzing increasingly complex motion patterns (e.g. Giese & Poggio, Nature Reviews Neuroscience 2003), it doesn't seem like the way in which SPL represents these stimuli is similarly well understood. While the action-PLD cross-decoding shown here is a convincing additional piece of evidence for a motion-based representation in SPL, an interesting additional analysis would be to compare, for example, RDMs of different actions in this region with explicit computational models. These could be, for example, classic motion energy models inspired by the response characteristics of regions such as V5/MT, which have been shown to predict cortical responses and psychophysical performance both for natural videos (e.g. Nishimoto et al., Current Biology 2011) and PLDs (Casile & Giese, Journal of Vision 2005). A similar cross-decoding analysis between videos and PLDs as that conducted on the fMRI patterns could be done on these models' features, obtaining RDMs that could directly be compared with those from SPL. This would be a very informative analysis that could enrich our knowledge of a relatively unexplored region in action recognition. Please note, however, that action recognition is not my field of expertise, so it is possible that there are practical difficulties in conducting such an analysis that I am not aware of. In this case, I kindly ask the authors to explain what these difficulties could be.
Thank you for this very interesting suggestion. We conducted a cross-decoding analysis that was based on the features of motion energy models as described in Nishimoto et al. (2011). Control analyses within each stimulus type revealed high decoding accuracies (animations: 100%, PLDs: 100%, pantomimes: 65%, actions: 55%), which suggests that the motion energy data generally contains information that can be detected by a classifier. However, the cross-decoding between actions and PLDs was at chance (20%), and the classification matrix did not resemble the neural RDMs. We also tested optical flow vectors as input to the decoding, which revealed similarly high decoding for the within-stimulus-type decoding (animations: 75%, PLDs: 100%, pantomimes: 65%, actions: 40%), but again at-chance decoding for action-PLD (20%), notably with a very different classification pattern:
Author response image 1.
Given these mixed results, we decided not to use these models for a statistical comparison with the neural action-PLD RDMs.
It is notable that the cross-decoding generally worked less well for decoding schemes that involve PLDs, which is likely due to the very different feature complexity of actions and PLDs: Naturalistic actions have much richer visual details, texture, and more complex motion cues. Therefore, motion energy features extracted from these videos likely capture a mixture of both fine-grained and broad motion information across different spatial frequencies. By contrast, motion energy features of PLDs are sparse and might not match the features of naturalistic actions. In a way, this was intended, as we were interested in higher-level body kinematics rather than lower-level motion features. We therefore decided to use a different approach to investigate the representational structure found in the action-PLD cross-decoding: As the PLDs were based on kinematic recordings of actions that were carried out in exactly the same manner as the naturalistic actions, we computed the dissimilarity of the 5 actions based on the kinematic marker positions. Specifically, we averaged the kinematic data across the 2 exemplars per PLD, vectorized the 3D marker positions of all time points of the PLDs (3 dimensions x 13 markers x 200 time points), computed the pairwise correlations between the 5 vectors, and converted the correlations into dissimilarity values by computing 1 - r. This RDM was then compared with the neural RDMs extracted from the action-PLD cross-decoding. This was done using a multiple regression RSA (see also our response to Reviewer 1's public comment 2), which allowed us to statistically test the kinematic model against other dissimilarity models: a categorical model of manuality (uni- vs. bimanual) and an action-specific model that discriminates each specific action from each other with equal distance.
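The kinematic dissimilarity computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the nested-list data layout, and the random stand-in data are hypothetical. Only the steps themselves (average the 2 exemplars, vectorize 3 dimensions x 13 markers x 200 time points, correlate pairwise, take 1 - r) follow the description in the text.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def kinematic_rdm(recordings):
    """Build a dissimilarity matrix (1 - r) between actions.

    recordings[action][exemplar] is assumed to be a flat list of marker
    coordinates (3 dimensions x 13 markers x 200 time points, flattened).
    """
    # Step 1: average the exemplars of each action element-wise.
    vectors = []
    for exemplars in recordings:
        n = len(exemplars)
        vectors.append([sum(vals) / n for vals in zip(*exemplars)])

    # Step 2: pairwise correlation, converted to dissimilarity as 1 - r.
    k = len(vectors)
    rdm = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            d = 1.0 - pearson_r(vectors[i], vectors[j])
            rdm[i][j] = rdm[j][i] = d
    return rdm


# Random stand-in data (NOT real kinematics): 5 actions x 2 exemplars.
random.seed(0)
n_values = 3 * 13 * 200
fake = [
    [[random.gauss(0, 1) for _ in range(n_values)] for _ in range(2)]
    for _ in range(5)
]
rdm = kinematic_rdm(fake)
```

The resulting 5 x 5 matrix is symmetric with zeros on the diagonal, which is the form a model RDM needs before its off-diagonal entries are entered as a predictor in a regression-based RSA.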
This analysis revealed interesting results: the kinematic model explained the representational variance in bilateral SPL and (particularly right) pSTS as well as in right fusiform cortex and early visual cortex. The action-specific model revealed effects restricted to bilateral LOTC. The manuality model revealed widespread effects throughout the action observation network but not in EVC.
(4) Clustering analysis: I found the clustering analysis shown in Figure S1 very clever and informative. However, there are two things that I think the authors should clarify. First, it's not clear whether the three categories of object change were inferred post-hoc from the data or determined beforehand. It is completely fine if these were just inferred post-hoc, I just believe this ambiguity should be clarified explicitly. Second, while action-anim decoding in aIPL and LOTC looks like it is consistently clustered, the clustering of action-PLD decoding in SPL and LOTC looks less reliable. The authors interpret this clustering as corresponding to the manual vs. bimanual distinction, but for example "drink" (a unimanual action) is grouped with "break" and "squash" (bimanual actions) in left SPL and grouped entirely separately from the unimanual and bimanual clusters in left LOTC. Statistically testing the robustness of these clusters would help clarify whether it is the case that action-PLD in SPL and LOTC has no semantically interpretable organizing principle, as might be the case for a representation based entirely on motion pattern, or rather that it is a different organizing principle from action-anim, such as the manual vs. bimanual distinction proposed by the authors. I don't have much experience with statistical testing of clustering analyses, but I think a permutation-based approach, wherein a measure of cluster robustness, such as the Silhouette score, is computed for the clusters found in the data and compared to a null distribution of such measures obtained by permuting the data labels, should be feasible. In a quick literature search, I have found several papers describing similar approaches: e.g. Hennig (2007), "Cluster-wise assessment of cluster stability"; Tibshirani et al. (2001) "Estimating the Number of Clusters in a Data Set Via the Gap Statistic". 
These are just pointers to potentially useful approaches, the authors are much better qualified to pick the most appropriate and convenient method. However, I do think such a statistical test would strengthen the clustering analysis shown here. With this statistical test, and the more exhaustive exposition of results I suggested in point 2 above (e.g. including animation-PLD and action-PLD decoding in aIPL), I believe the clustering analysis could even be moved to the main text and occupy a more prominent position in the paper.
With regard to the first point, we clarified in the methods that we inferred the 3 broad action effect categories after the stimulus selection: "This categorization was not planned before designing the study but resulted from the stimulus selection."
Thank you for your suggestion to test more specifically the representational organization in the action-PLD and action-animation RDMs. However, after a careful assessment, we decided to replace the cluster analysis with an RSA. We did this for two reasons:
First, we think that RSA is a better (and more conventional) approach to statistically investigate the representational structure in the ROIs (and in the whole brain). The RSA allowed us, for example, to specifically test the mentioned distinction between unimanual and bimanual actions, and to test it against other models, i.e., a kinematic model and an action-specific model. This indeed revealed interesting distinct representational profiles of SPL and LOTC.
Second, we learned that the small number of items (5) is generally not ideal for cluster analyses (absolute minimum for meaningful interpretability is 4, but to form at least 2-3 clusters a minimum of 10-15 items is usually recommended). A similar rule of thumb applies to methods to statistically assess the reliability of cluster solutions (e.g., Silhouette Scores, Cophenetic Correlation Coefficient, Jaccard Coefficient). Finally, the small number of items is not ideal to run a permutation test because the number of unique permutations (for shuffling the data labels: 5! = 120) is insufficient to generate a meaningful null distribution. We therefore think it is best to discard the cluster analysis altogether. We hope you agree with this decision.
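To make the size of an exhaustive label-permutation null distribution concrete, the orderings of the 5 action labels can simply be enumerated. This is a small illustrative sketch, not part of the reported analyses; the variable names are hypothetical, and the labels are the five actions named elsewhere in this response.

```python
from itertools import permutations

labels = ["break", "squash", "hit", "place", "drink"]

# Every distinct reordering of the 5 labels. An exhaustive permutation test
# cannot draw more null samples than this.
all_orders = list(permutations(labels))
n_orders = len(all_orders)

# The observed labelling is one of these orderings, so the smallest p-value
# an exhaustive test can return is 1 / n_orders.
min_p = 1 / n_orders
print(n_orders, min_p)
```

Note that many of these reorderings can yield identical cluster solutions, so the number of distinct values in the null distribution may be smaller still.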
(5) ROI selection: this is a minor point, related to the method used for assigning voxels to a specific ROI. In the description in the Methods (page 16, lines 514-24), the authors mention using the MNI coordinates of the center locations of Brodmann areas. Does this mean that then they extracted a sphere around this location, or did they use a mask based on the entire Brodmann area? The latter approach is what I'm most familiar with, so if the authors chose to use a sphere instead, could they clarify why? Or, if they did use the entire Brodmann area as a mask, and not just its center coordinates, this should be made clearer in the text.
We indeed used a sphere around the center coordinate of the Brodmann areas. This was done to keep the ROI sizes / number of voxels constant across ROIs. Since we aimed at comparing the decoding accuracies between aIPL and SPL, we thereby minimized the possibility that differences in decoding accuracy between ROIs are due to ROI size differences. The approach of using spherical ROIs is a quite well established practice that we are using in our lab by default (e.g. Wurm & Caramazza, NatComm, 2019; Wurm & Caramazza, NeuroImage, 2019; Karakose, Caramazza, & Wurm, NatComm, 2023). We clarified that we used spherical ROIs to keep the ROI sizes constant in the revised manuscript.
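The rationale for spherical ROIs (equal voxel counts across regions) can be illustrated with a short sketch. The center coordinates and radius below are hypothetical placeholders, not the values used in the study, and the function works in voxel index space without checking image bounds.

```python
from itertools import product


def spherical_roi(center_vox, radius_vox):
    """Return voxel indices (i, j, k) within radius_vox of an integer center."""
    r = int(radius_vox)
    ci, cj, ck = center_vox
    return [
        (ci + di, cj + dj, ck + dk)
        for di, dj, dk in product(range(-r, r + 1), repeat=3)
        # Keep offsets whose Euclidean distance from the center is <= radius.
        if di * di + dj * dj + dk * dk <= radius_vox ** 2
    ]


# Two hypothetical ROI centers with the same radius (in voxels).
roi_a = spherical_roi((30, 40, 50), 3)
roi_b = spherical_roi((60, 35, 55), 3)

# Same radius on the same grid gives the same number of voxels per ROI.
same_size = len(roi_a) == len(roi_b)
```

Because the voxel count depends only on the radius, differences in decoding accuracy between such ROIs cannot be attributed to differences in the number of features, which is the point made in the response above. A mask based on an entire Brodmann area would not have this property.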
Reviewer #3 (Public Review):
This study tests for dissociable neural representations of an observed action's kinematics vs. its physical effect in the world. Overall, it is a thoughtfully conducted study that convincingly shows that representations of action effects are more prominent in the anterior inferior parietal lobe (aIPL) than the superior parietal lobe (SPL), and vice versa for the representation of the observed body movement itself. The findings make a fundamental contribution to our understanding of the neural mechanisms of goal-directed action recognition, but there are a couple of caveats to the interpretation of the results that are worth noting:
(1) Both a strength of this study and ultimately a challenge for its interpretation is the fact that the animations are so different in their visual content than the other three categories of stimuli. On one hand, as highlighted in the paper, it allows for a test of action effects that is independent of specific motion patterns and object identities. On the other hand, the consequence is also that Action-PLD cross-decoding is generally better than Action-Anim cross-decoding across the board (Figure 3A) - not surprising because the spatiotemporal structure is quite different between the actions and the animations. This pattern of results makes it difficult to interpret a direct comparison of the two conditions within a given ROI. For example, it would have strengthened the argument of the paper to show that Action-Anim decoding was better than Action-PLD decoding in aIPL; this result was not obtained, but that could simply be because the Action and PLD conditions are more visually similar to each other in a number of ways that influence decoding. Still, looking WITHIN each of the Action-Anim and Action-PLD conditions yields clear evidence for the main conclusion of the study.
The reviewer is absolutely right: Because the PLDs are more similar to the actions than the animations, a comparison of the effects of the two decoding schemes is not informative. As we also clarified in our response to Reviewer 2, we cannot rule out that the action-PLD decoding picked up information related to action effect structures. Thus, the only firm conclusion that we can draw from our study is that aIPL and SPL are disproportionally sensitive to action effects and body movements, respectively. We clarified this point in our revised discussion.
(2) The second set of analyses in the paper, shown in Figure 4, follows from the notion that inferring action effects from body movements alone (i.e., when the object is unseen) is easier via pantomimes than with PLD stick figures. That makes sense, but it doesn't necessarily imply that the richness of the inferred action effect is the only or main difference between these conditions. There is more visual information overall in the pantomime case. So, although it's likely true that observers can more vividly infer action effects from pantomimes vs stick figures, it's not a given that contrasting these two conditions is an effective way to isolate inferred action effects. The results in Figure 4 are therefore intriguing but do not unequivocally establish that aIPL is representing inferred rather than observed action effects.
We agree that higher decoding accuracies for Action-Pant vs. Action-PLD and Pant-PLD could also be due to visual details (in particular of hands and body) that are more similar in actions and pantomimes relative to PLDs. However, please note that for this reason we included also the comparison of Anim-Pant vs. Anim-PLD. For this comparison, visual details should not influence the decoding. We clarified this point in our revision.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
It struck me that there are structural distinctions amongst the 5 action kinds that were not highlighted and may have been unintentional. Specifically, three of the actions are "unary" in a sense: break(object), squash(object), hit(object). One is "binary": place(object, surface), and the fifth (drink) is perhaps ternary - transfer(liquid, cup, mouth)? Might these distinctions be important for the organization of action effects (or actions generally)?
This is an interesting aspect that we had not yet considered. We agree that for the organization of actions (and perhaps action effects) this distinction might be relevant. One issue we noticed, however, is that for the animations the suggested organization might be less clear, in particular for "drink" as ternary, and perhaps also for "place" as binary. Thus, in the action-animation cross-decoding, this distinction - if it exists in the brain - might be harder to capture. We nonetheless tested this distinction. Specifically, we constructed a dissimilarity model (using the proposed organization, valency model hereafter) and tested it in a multiple regression RSA against an effect type model and two other models for specific actions (discriminating each action from each other with the same distance) and motion energy (as a visual control model). This analysis revealed no effects for the "valency" model in the ROI-based RSA. A searchlight analysis also revealed no effects for this model. Since we think that the valency model is not ideally suited to test representations of action effects (using data from the action-animation cross-decoding), and to avoid unnecessarily complicating the description of the RSA, we decided not to include this model in the final RSA reported in the manuscript.
In general, I found it surprising that the authors treated their LOTC findings as surprising or unexpected. Given the long literature associating this region with several high-level visual functions related to body perception, action perception, and action execution, I thought there were plenty of a priori reasons to investigate the LOTC's behaviour in this study. Looking at the supplementary materials, indeed some of the strongest effects seem to be in that region.
(Likewise, classically, the posterior superior temporal sulcus is strongly associated with the perception of others' body movements; why not also examine this region of interest?)
One control analysis that would considerably add to the strength of the authors' conclusions would be to examine how actions could be cross-decoded (or not) in the early visual cortex. Especially in comparisons of, for example, pantomime to full-cue video, we might expect a high degree of decoding accuracy, which might influence the way we interpret similar decoding in other "higher level" regions.
We agree that it makes sense to also look into LOTC and pSTS, and also EVC. We therefore added ROIs for these regions: For EVC and LOTC we used the same approach based on Brodmann areas as for aIPL and SPL, i.e., we used BA 17 for V1 and BA 19 for LOTC. For pSTS, we defined the ROI based on a meta-analysis contrast for human vs. non-human body movements (Grosbras et al., HBM 2012). Indeed, we find that the strongest effects (for both action effect structures and body movements) can be found in LOTC. We also found effects in EVC that, at least for the action-animation cross-decoding, are more difficult to interpret. To test for a coincidental visual confound between actions and animations, we included a control model for motion energy in the multiple regression RSA, which could indeed explain some of the representational content in V1. However, the effect type model also revealed effects in V1, suggesting that there were additional visual features that caused the action-animation cross-decoding in V1. Notably, as pointed out in our response to the Public comments, the representational organization in V1 was relatively distinct from the representational organization in aIPL and LOTC, which argues against the interpretation that effects in aIPL and LOTC were driven by the same (visual) features as in V1.
Regarding the analyses reported in Figure 4: wouldn't it be important to also report similar tests for SPL?
In the analysis of implied action effect structures, we focused on the brain regions that revealed robust effects for action-animation decoding in the ROI and the searchlight analysis, that is, aIPL and LOTC. However, we performed a whole brain conjunction analysis to search for other brain regions that show a profile for implied action effect representation. This analysis (that we forgot to mention in our original manuscript; now corrected) did not find evidence for implied action effect representations in SPL.
However, for completeness, we also added a ROI analysis for SPL. This analysis revealed a surprisingly complex pattern of results: We observed stronger decoding for Anim-Pant vs. Anim-PLD, whereas there were no differences for the comparisons of Action-Pant with Action-PLD and Pant-PLD:
This pattern of results is not straightforward to explain: First, the equally strong decoding for Action-Pant, Action-PLD, and Pant-PLD suggests that SPL is not substantially sensitive to body part details. Rather, the decoding relied on the coarse body part movements, independently of the specific stimulus type (action, pantomime, PLD). However, the stronger decoding for Anim-Pant than for Anim-PLD suggests that SPL is also sensitive to implied action effect structures. This appears unlikely, because no effects (in left SPL) or only weak effects (in right SPL) were found for the more canonical Action-Anim cross-decoding. The Anim-Pant cross-decoding was even stronger than the Action-Anim cross-decoding, which is counterintuitive because naturalistic actions contain more information than pantomimes, specifically with regard to action effect structures. How can this pattern of results be interpreted? Perhaps, for pantomimes and animations, not only aIPL and LOTC but also SPL is involved in inferring (implied) action effect structures. However, this conclusion would also require differences for the comparison of Action-Pant with Action-PLD and for Action-Pant with Pant-PLD. Another, not mutually exclusive, interpretation is that both animations and pantomimes are more ambiguous in terms of the specific action, as opposed to naturalistic actions. For example, the squashing animation and pantomime are both ambiguous in terms of what is squashed/compressed, which might require additional load to infer both the action and the induced effect. The increased activation of action-related information might in turn increase the chance for a match between neural activation patterns of animations and pantomimes.
In any case, these additional results in SPL do not question the effects reported in the main text, that is, disproportionate sensitivity for action effect structures in right aIPL and LOTC and for body movements in SPL and other AON regions. The evidence for implied action effect structures representation in SPL is mixed and should be interpreted with caution.
We added this analysis and discussion as supplementary information.
Statistical arguments that rely on "but not" are not very strong, e.g. "We found higher cross-decoding for animation-pantomime vs. animation-PLD in right aIPL and bilateral LOTC (all t(23) > 3.09, all p < 0.0025; one-tailed), but not in left aIPL (t(23) = 0.73, p = 0.23, one-tailed)." Without a direct statistical test between regions, it's not really possible to support a claim that they have different response profiles.
Absolutely correct. Notably, we did not make claims about different profiles of the tested ROIs with regard to implied action effect representations. But of course it makes sense to test for differential profiles of left vs. right aIPL, so we have added a repeated measures ANOVA to test for an interaction between TEST (animation-pantomime, animation-PLD) and ROI (left aIPL, right aIPL), which, however, was not significant (F(1,23)=3.66, p = 0.068). We included this analysis in the revised manuscript.
Reviewer #2 (Recommendations for The Authors):
(1) I haven't found any information about data and code availability in the paper: is the plan to release them upon publication? This should be made clear.
Stimuli, MRI data, and code are deposited at the Open Science Framework (https://osf.io/am346/). We included this information in the revised manuscript.
(2) Samples of videos of the stimuli (or even the full set) would be very informative for the reader to know exactly what participants were looking at.
We have uploaded the full set of stimuli on OSF (https://osf.io/am346/).
(3) Throughout the paper, decoding accuracies are averaged across decoding directions (A->B and B->A). To my knowledge, this approach was proposed in van den Hurk & Op de Beeck (2019), "Generalization asymmetry in multivariate cross-classification: When representation A generalizes better to representation B than B to A". I believe it would be fair to cite this paper.
Absolutely, thank you very much for the hint. We included this reference in our revised manuscript.
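The averaging across decoding directions mentioned in this point can be written out in a few lines. This is a minimal sketch with hypothetical function names and made-up accuracies, not the authors' pipeline; the asymmetry measure is included only because it is the quantity the cited paper's title refers to.

```python
def cross_decoding_accuracy(acc_a_to_b, acc_b_to_a):
    """Mean of the two directions (train on A, test on B; and the reverse)."""
    return (acc_a_to_b + acc_b_to_a) / 2.0


def generalization_asymmetry(acc_a_to_b, acc_b_to_a):
    """Signed difference between directions; 0 means symmetric generalization."""
    return acc_a_to_b - acc_b_to_a


# Made-up accuracies for one participant and ROI (chance is 0.20 for 5 actions).
mean_acc = cross_decoding_accuracy(0.30, 0.26)
asym = generalization_asymmetry(0.30, 0.26)
```

Averaging yields a single direction-independent estimate per participant and ROI, at the cost of hiding any asymmetry between the two directions.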
(4) Page 3, line 70: this is a very nitpicky point, but "This suggests that body movements and the effects they induce are at least partially processed independently from each other." is a bit of an inferential leap from "these are distinct aspects of real-world actions" to "then they should be processed independently in the brain". The fact that a distinction exists in the world is a prerequisite for this distinction existing in the brain in terms of functional specialization, but it's not in itself a reason to believe that functional specialization exists. It is a reason to hypothesize that the specialization might exist and to test that hypothesis. So I think this sentence should be rephrased as "This suggests that body movements and the effects they induce might be at least partially processed independently from each other.", or something to that effect.
Your reasoning is absolutely correct. We revised the sentence following your suggestion.
(5) Page 7, line 182: the text says "stronger decoding for action-animation vs. action-PLD" (main effect of TEST), which is the opposite of what can be seen in the figure. I assume this is a typo?
Thanks for spotting this, it was indeed a typo. We corrected it: “…stronger decoding for action-PLD vs. action-animation cross-decoding..”
(6) Page 7, Figure 3B: since the searchlight analysis is used to corroborate the distinction between aIPL and SPL, it would be useful to overlay the contours of these ROIs (and perhaps LOTC as well) on the brain maps.
We found that overlaying the contours of the ROIs onto the decoding searchlight maps would make the figure too busy, and the contours would partially hide effects. However, we added a brain map with all ROIs in the supplementary information.
(7) Page 9, Figure 4A: since the distinction between the significant difference between anim-pant and anim-PLD is quite relevant in the text, I believe highlighting the lack of difference between the two decoding schemes in left aIPL (for example, by writing "ns") in the figure would help guide the reader to see the relevant information. It is generally quite hard to notice the absence of something.
We added “n.s.” to the left aIPL in Fig. 4A.
(8) Page 11, line 300: "Left aIPL appears to be more sensitive to the type of interaction between entities, e.g. how a body part or an object exerts a force onto a target object": since the distinction between this and "the effect induced by that interaction" is quite nuanced, I believe a concrete example would clarify this for the reader: e.g. I guess the former would involve a representation of the contact between hand and object when an object is pushed, while the latter would represent only the object's displacement following the push?
Thank you for the suggestion. We added a concrete example: “Left aIPL appears to be more sensitive to the type of interaction between entities, that is, how a body part or an object exerts a force onto a target object (e.g. how a hand makes contact with an object to push it), whereas right aIPL appears to be more sensitive to the effect induced by that interaction (the displacement of the object following the push).”
(9) Page 12, line 376: "Informed consent, and consent to publish, was obtained from the participant in Figure 2." What does this refer to? Was the person shown in the figure both a participant in the study and an actor in the stimulus videos? Since this is in the section about participants in the experiment, it sounds like all participants also appeared in the videos, which I guess is not the case. This ambiguity should be clarified.
Right, the statement sounds misleading in the “Participants” section. We rephrased it and moved it to the “Stimuli” section: “actions…were shown in 4 different formats: naturalistic actions, pantomimes, point light display (PLD) stick figures, and abstract animations (Fig. 2; informed consent, and consent to publish, was obtained from the actor shown in the figure).”
(10) Page 15, line 492: Here, "within-session analyses" are mentioned. However, these analyses are not mentioned in the text (only shown in Figure S2) and their purpose is not clarified. I imagine they were a sanity check to ensure that the stimuli within each stimulus type could be reliably distinguished. This should be explained somewhere.
We clarified the purpose of the within session decoding analyses in the methods section: "Within-session decoding analyses were performed as sanity checks to ensure that for all stimulus types, the 5 actions could be reliably decoded (Fig. S2)."
(11) Page 20, Figure S1: I recommend using the same color ranges for the two decoding schemes (action-anim and action-PLD) in A and C, to make them more directly comparable.
Ok, done.
Reviewer #3 (Recommendations For The Authors):
(1) When first looking at Figure 1B, I had a hard time discerning what action effect was being shown (I thought maybe it was "passing through"). Figure 2 later clarified it for me, but it would be helpful to note in the caption that it depicts breaking.
Thank you for the suggestion. Done.
(2) It would be helpful to show an image of the aIPL and SPL ROIs on a brain to help orient readers - both to help them examine the whole brain cross-decoding accuracy and to aid in comparisons with other studies.
We added a brain map with all ROIs in the supplementary information.
(3) Line 181: I'm wondering if there's an error, or if I'm reading it incorrectly. The line states "Moreover, we found ANOVA main effects of TEST (F(1,24)=33.08, p=7.4E-06), indicating stronger decoding for action-animation vs. action-PLD cross-decoding..." But generally, in Figure 3A, it looks like accuracy is lower for Action-Anim than Action-PLD in both hemispheres.
You are absolutely right, thank you very much for spotting this error. We corrected the sentence: “…stronger decoding for action-PLD vs. action-animation cross-decoding..”
(4) It might be useful to devote some more space in the Introduction to clarifying the idea of action-effect structures. E.g., as I read the manuscript I found myself wondering whether there is a difference between action effect structures and physical outcomes in general... would the same result be obtained if the physical outcomes occurred without a human actor involved? This question is raised in the discussion, but it may be helpful to set the stage up front.
We clarified this point in the introduction:
In our study, we define action effects as induced by intentional agents. However, the notion of action effect structures might be generalizable to physical outcomes or object changes as such (e.g. an object's change of location or configuration, independently of whether the change is induced by an agent or not).
(5) Regarding my public comment #2, it would perhaps strengthen the argument to run the same analysis in the SPL ROIs. At least for the comparison of Anim-Pant with Anim-PLD, the prediction would be no difference, correct?
The prediction would indeed be that there is no difference for the comparison of Anim-Pant with Anim-PLD, but also for the comparison of Action-Pant with Action-PLD and for Action-Pant with Pant-PLD, there should be no difference. As explained in our response to public comment #2, we ran a whole brain conjunction (Fig. 4B) to test for the combination of these effects and did not find SPL in this analysis. However, we did find differences for Anim-Pant vs. Anim-PLD, which is not straightforward to interpret (see our response to your public comment #2 for a discussion of this finding).
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
Shen et al. conducted three experiments to study the cortical tracking of the natural rhythms involved in biological motion (BM), and whether these involve audiovisual integration (AVI). They presented participants with visual (dot) motion and/or the sound of a walking person. They found that EEG activity tracks the step rhythm, as well as the gait (2-step cycle) rhythm. The gait rhythm specifically is tracked superadditively (power for the A+V condition is higher than the sum of the A-only and V-only conditions, Experiments 1a/b), which is independent of the specific step frequency (Experiment 1b). Furthermore, audiovisual integration during tracking of gait was specific to BM, as it (that is, the audiovisual congruency effect) was absent when the walking dot motion was vertically inverted (Experiment 2). Finally, the study shows that an individual's autistic traits are negatively correlated with the BM-AVI congruency effect.
Strengths:
The three experiments are well designed and the various conditions are well controlled. The rationale of the study is clear, and the manuscript is pleasant to read. The analysis choices are easy to follow, and mostly appropriate.
Weaknesses:
There is a concern of double-dipping in one of the tests (Experiment 2, Figure 3: interaction of Upright/Inverted X Congruent/Incongruent). I raised this concern on the original submission, and it has not been resolved properly. The follow-up statistical test (after channel selection using the interaction contrast permutation test) still is geared towards that same contrast, even though the latter is now being tested differently. (Perhaps not explicitly testing the interaction, but in essence still testing the same.) A very simple solution would be to remove the post-hoc statistical tests and simply acknowledge that you're comparing simple means, while the statistical assessment was already taken care of using the permutation test. (In other words: the data appear compelling because of the cluster test, but NOT because of the subsequent t-tests.)
We are sorry that we did not explain this issue clearly before, which might have caused some misunderstanding. When performing the cluster-based permutation test, we only tested whether the audiovisual congruency effect (congruent vs. incongruent) between the upright and inverted conditions was significantly different [i.e., (UprCon – UprInc) vs. (InvCon – InvInc)], without conducting extra statistical analyses on whether the congruency effect was significant in each orientation condition. Such an analysis yielded a cluster with a significant interaction between audiovisual integration and BM orientation for the cortical tracking effect at 1Hz (but not at 2Hz). However, this does not provide valid information about whether the audiovisual congruency effect at this cluster is significant in each orientation condition, given that a significant interaction effect may result from various patterns of data across conditions: such as significant congruency effects in both orientation conditions (Author response image 1a), a significant congruency effect in the upright condition and a non-significant effect in the inverted condition (Author response image 1b), or even non-significant yet opposite effects in the two conditions (Author response image 1c). Here, our results conform to the second pattern, indicating that cortical tracking of the high-order gait cycles involves a domain-specific process exclusively engaged in the AVI of BM. In a similar vein, the non-significant interaction found at 2Hz does not necessarily indicate that the congruency effect is non-significant in each orientation condition (Author response image 1f&e). Indeed, the congruency effect was significant in both the upright and inverted conditions at 2Hz in our study despite the non-significant interaction, suggesting that neural tracking of the lower-order step cycles is associated with a domain-general AVI process mostly driven by temporal correspondence in physical stimuli.
Therefore, we need to perform subsequent t-tests to examine the significance of the simple effects in the two orientation conditions, which do not duplicate the cluster-based permutation test (for interaction only) and cause no double-dipping. Results from interaction and simple effects, put together, provide solid evidence that the cortical tracking of higher-order and lower-order rhythms involves BM-specific and domain-general audiovisual processing, respectively.
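The distinction between the interaction contrast and the simple effects can be illustrated with a minimal sketch. It assumes one amplitude value per participant and condition, uses plain paired t statistics in place of the cluster-based permutation test, and runs on made-up toy numbers; the function names `paired_t` and `congruency_effects` are hypothetical and not taken from the manuscript.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """Paired t statistic for x - y (degrees of freedom = len(x) - 1)."""
    d = [a - b for a, b in zip(x, y)]
    return mean(d) / (stdev(d) / sqrt(len(d)))


def congruency_effects(upr_con, upr_inc, inv_con, inv_inc):
    """Return t statistics for the interaction and for each simple effect."""
    # Congruency effect (congruent minus incongruent) within each orientation
    upr_diff = [c - i for c, i in zip(upr_con, upr_inc)]
    inv_diff = [c - i for c, i in zip(inv_con, inv_inc)]
    return {
        # Interaction: (UprCon - UprInc) vs. (InvCon - InvInc)
        "interaction": paired_t(upr_diff, inv_diff),
        # Simple effects: congruency effect tested separately per orientation
        "simple_upright": paired_t(upr_con, upr_inc),
        "simple_inverted": paired_t(inv_con, inv_inc),
    }


# Toy per-participant amplitudes (hypothetical values, NOT data from the study)
upr_con = [1.9, 2.1, 2.4, 1.8, 2.2, 2.0]
upr_inc = [1.2, 1.5, 1.6, 1.3, 1.4, 1.5]
inv_con = [1.4, 1.6, 1.5, 1.3, 1.7, 1.5]
inv_inc = [1.5, 1.5, 1.6, 1.2, 1.6, 1.6]

effects = congruency_effects(upr_con, upr_inc, inv_con, inv_inc)
for name, t in effects.items():
    print(f"{name}: t = {t:.2f}")
```

In these toy numbers the interaction statistic is large, yet it does not by itself say which orientation carries the congruency effect; only the two simple-effect statistics do, which is the reason for reporting both.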
To avoid ambiguity, we have removed the sentence “We calculated the audiovisual congruency effect for the upright and the inverted conditions” (line 194, which referred to the calculation of the indices rather than any statistical tests) from the manuscript. We have also clarified the meanings of the findings based on the interaction and simple effects together at the two temporal scales, respectively (Lines 205-207; Lines 213-215).
Author response image 1.
Examples of different patterns of data yielding a significant or nonsignificant interaction effect.
Reviewer #2 (Public review):
Summary:
The authors evaluate spectral changes in electroencephalography (EEG) data as a function of the congruency of audio and visual information associated with biological motion (BM) or non-biological motion. The results show supra-additive power gains in the neural response to gait dynamics, with trials in which audio and visual information was presented simultaneously producing higher average amplitude than the combined average power for auditory and visual conditions alone. Further analyses suggest that such supra-additivity is specific to BM and emerges from temporoparietal areas. The authors also find that the BM-specific supra-additivity is negatively correlated with autism traits.
Strengths:
The manuscript is well-written, with a concise and clear writing style. The visual presentation is largely clear. The study involves multiple experiments with different participant groups. Each experiment involves specific considered changes to the experimental paradigm that both replicate the previous experiment's finding yet extend it in a relevant manner.
Weaknesses:
In the revised version of the paper, the manuscript better relays the results and anticipates analyses, and this version adequately resolves some concerns I had about analysis details. Still, it is my view that the findings of the study are basic neural correlate results that do not provide insights into neural mechanisms or the causal relevance of neural effects towards behavior and cognition. The presence of an inversion effect suggests that the supra-additivity is related to cognition, but that leaves open whether any detected neural pattern is actually consequential for multi-sensory integration (i.e., correlation is not causation). In other words, the fact that frequency-specific neural responses to the [audio & visual] condition are stronger than those to [audio] and [visual] combined does not mean this has implications for behavioral performance. While the correlation to autism traits could suggest some relation to behavior and is interesting in its own right, this correlation is a highly indirect way of assessing behavioral relevance. It would be helpful to test the relevance of supra-additive cortical tracking on a behavioral task directly related to the processing of biological motion to justify the claim that inputs are being integrated in the service of behavior. Under either framework, cortical tracking or entrainment, the causal relevance of neural findings toward cognition is lacking.
Overall, I believe this study finds neural correlates of biological motion, and it is possible that such neural correlates relate to behaviorally relevant neural mechanisms, but based on the current task and associated analyses this has not been shown.
Thank you for providing these thoughtful comments regarding the theoretical implications of our neural findings. Previous behavioral evidence highlights the specificity of the audiovisual integration (AVI) of biological motion (BM) and reveals the impairment of such ability in individuals with autism spectrum disorder. However, the neural implementation underlying the AVI of BM, its specificity, and its association with autistic traits remain largely unknown. The current study aimed to address these issues.
It is noteworthy that the operation of multisensory integration does not always depend on specific tasks, as our brains tend to integrate signals from different sensory modalities even when there is no explicit task. Hence, many studies have investigated multisensory integration at the neural level without examining its correlation with behavioral performance. For example, the widely known super-additivity mode for multisensory integration proposed by Perrault and colleagues was based on single-cell recording findings without behavioral tasks (Perrault et al., 2003, 2005). As we mentioned in the manuscript, the super-additive and sub-additive modes indicate non-linear interaction processing, either with potentiated neural activation to facilitate the perception or detection of near-threshold signals (super-additive) or a deactivation mechanism to minimize the processing of redundant information cross-modally (sub-additive) (Laurienti et al., 2005; Metzger et al., 2020; Stanford et al., 2005; Wright et al., 2003). Meanwhile, the additive integration mode represents a linear combination between two modalities. Distinguishing among these integration modes helps elucidate the neural mechanism underlying AVI in specific contexts, even though sometimes, the neural-level AVI effects do not directly correspond to a significant behavioral-level AVI effect (Ahmed et al., 2023; Metzger et al., 2020). In the current study, we unveiled the dissociation of multisensory integration modes between neural responses at two temporal scales (Exps. 1a & 1b), which may involve the cooperation of a domain-specific and a domain-general AVI process (Exp. 2). While these findings were not expected to be captured by a single behavioral index, they revealed the multifaceted mechanism whereby hierarchical cortical activity supports audiovisual BM integration.
They also advance our understanding of the emerging view that multi-timescale neural dynamics coordinate multisensory integration (Senkowski & Engel, 2024), especially from the perspective of natural stimuli processing.
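The three integration modes described above can be sketched as a comparison of the audiovisual response against the linear sum of the unimodal responses. This is only an illustration under stated assumptions: one amplitude per participant and condition, a paired t statistic with a caller-supplied critical value, and made-up toy numbers. The helper `integration_mode` is hypothetical and is not the analysis code used in the study.

```python
from math import sqrt
from statistics import mean, stdev


def integration_mode(av, a, v, t_crit):
    """Label AV responses as super-additive, sub-additive, or additive.

    Compares AV with the linear sum A + V per participant using a paired
    t statistic; t_crit is the two-tailed critical value chosen by the caller.
    """
    d = [x - (y + z) for x, y, z in zip(av, a, v)]
    t = mean(d) / (stdev(d) / sqrt(len(d)))
    if t > t_crit:
        return "super-additive", t  # AV reliably exceeds A + V
    if t < -t_crit:
        return "sub-additive", t    # AV reliably falls below A + V
    return "additive", t            # no reliable deviation from A + V


# Toy amplitudes (hypothetical values, NOT data from the study)
av = [3.1, 2.9, 3.4, 3.0, 3.3, 3.2]
a = [1.0, 1.1, 1.2, 0.9, 1.1, 1.0]
v = [1.2, 1.0, 1.3, 1.1, 1.2, 1.1]

# 2.571 is the two-tailed 0.05 critical t value for df = 5
mode, t_value = integration_mode(av, a, v, t_crit=2.571)
print(mode, round(t_value, 2))
```

The additive label here only means that no reliable deviation from the linear sum was detected with the chosen threshold, not that linearity has been demonstrated.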
Meanwhile, our finding that the cortical tracking of higher-order rhythmic structure in audiovisual BM specifically correlated with individual autistic traits extends previous behavioral evidence that ASD children exhibited reduced orienting to audiovisual synchrony in BM (Falck-Ytter et al., 2018), offering new evidence that individual differences in audiovisual BM processing are present at the neural level and associated with autistic traits. This finding opens the possibility of utilizing the cortical tracking of BM as a potential neural marker to assist the diagnosis of autism spectrum disorder (see more details in our Discussion Lines 334-346).
However, despite the main objective of the current study focusing on the neural processing of BM, we agree with the reviewer that it would be helpful to test the relevance of supra-additive cortical tracking on a behavioral task directly related to BM perception, for further justifying that inputs are being integrated in the service of behavior. In the current study, we adopted a color-change detection task entirely unrelated to audiovisual correspondence but only for maintaining participants’ attention. The advantage of this design is that it allows us to investigate whether and how the human brain integrates audiovisual BM information under task-irrelevant settings, as people in daily life can integrate such information even without a relevant task. However, this advantage is accompanied by a limitation: the task does not facilitate the direct examination of the correlation between neural responses and behavioral performance, since the task performance was generally high (mean accuracy >98% in all experiments). Future research could investigate this issue by introducing behavioral tasks more relevant to BM perception (e.g., Shen et al., 2023). They could also apply advanced neuromodulation techniques to elucidate the causal relevance of the cortical tracking effect to behavior (e.g., Kösem et al., 2018, 2020).
We have discussed the abovementioned points as a separate paragraph in the revised manuscript (Lines 322-333). In addition, since the scope of the current study does not involve a causal correlation with behavioral performance, we have removed or modified the descriptions related to "functional relevance" in the manuscript (Abstract; Introduction, lines 101-103; Results, line 239; Discussion, line 336; Supplementary Information, lines 794 and 803). Moreover, we have strengthened the descriptions of the theoretical implications of the current findings in the abstract.
We hope these changes adequately address your concern.
References
Ahmed, F., Nidiffer, A. R., O’Sullivan, A. E., Zuk, N. J., & Lalor, E. C. (2023). The integration of continuous audio and visual speech in a cocktail-party environment depends on attention. Neuroimage, 274, 120143. https://doi.org/10.1016/j.neuroimage.2023.120143
Falck-Ytter, T., Nyström, P., Gredebäck, G., Gliga, T., Bölte, S., & the EASE team. (2018). Reduced orienting to audiovisual synchrony in infancy predicts autism diagnosis at 3 years of age. Journal of Child Psychology and Psychiatry, 59(8), 872–880. https://doi.org/10.1111/jcpp.12863
Kösem, A., Bosker, H., Jensen, O., Hagoort, P., & Riecke, L. (2020). Biasing the Perception of Spoken Words with Transcranial Alternating Current Stimulation. Journal of Cognitive Neuroscience, 32, 1–10. https://doi.org/10.1162/jocn_a_01579
Kösem, A., Bosker, H. R., Takashima, A., Meyer, A., Jensen, O., & Hagoort, P. (2018). Neural Entrainment Determines the Words We Hear. Current Biology, 28(18), 2867-2875.e3. https://doi.org/10.1016/j.cub.2018.07.023
Laurienti, P. J., Perrault, T. J., Stanford, T. R., Wallace, M. T., & Stein, B. E. (2005). On the use of superadditivity as a metric for characterizing multisensory integration in functional neuroimaging studies. Experimental Brain Research, 166(3), 289–297. https://doi.org/10.1007/s00221-005-2370-2
Metzger, B. A., Magnotti, J. F., Wang, Z., Nesbitt, E., Karas, P. J., Yoshor, D., & Beauchamp, M. S. (2020). Responses to Visual Speech in Human Posterior Superior Temporal Gyrus Examined with iEEG Deconvolution. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 40(36), 6938–6948. https://doi.org/10.1523/JNEUROSCI.0279-20.2020
Perrault, T. J., Vaughan, J. W., Stein, B. E., & Wallace, M. T. (2003). Neuron-Specific Response Characteristics Predict the Magnitude of Multisensory Integration. Journal of Neurophysiology, 90(6), 4022–4026. https://doi.org/10.1152/jn.00494.2003
Perrault, T. J., Vaughan, J. W., Stein, B. E., & Wallace, M. T. (2005). Superior Colliculus Neurons Use Distinct Operational Modes in the Integration of Multisensory Stimuli. Journal of Neurophysiology, 93(5), 2575–2586. https://doi.org/10.1152/jn.00926.2004
Senkowski, D., & Engel, A. K. (2024). Multi-timescale neural dynamics for multisensory integration. Nature Reviews Neuroscience, 25(9), 625–642. https://doi.org/10.1038/s41583-024-00845-7
Shen, L., Lu, X., Wang, Y., & Jiang, Y. (2023). Audiovisual correspondence facilitates the visual search for biological motion. Psychonomic Bulletin & Review, 30(6), 2272–2281. https://doi.org/10.3758/s13423-023-02308-z
Stanford, T. R., Quessy, S., & Stein, B. E. (2005). Evaluating the Operations Underlying Multisensory Integration in the Cat Superior Colliculus. Journal of Neuroscience, 25(28), 6499–6508. https://doi.org/10.1523/JNEUROSCI.5095-04.2005
Wright, T. M., Pelphrey, K. A., Allison, T., McKeown, M. J., & McCarthy, G. (2003). Polysensory Interactions along Lateral Temporal Regions Evoked by Audiovisual Speech. Cerebral Cortex, 13(10), 1034–1043. https://doi.org/10.1093/cercor/13.10.1034
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Reviewer #1 (Public review):
Summary:
"Neural noise", here operationalized as an imbalance between excitatory and inhibitory neural activity, has been posited as a core cause of developmental dyslexia, a prevalent learning disability that impacts reading accuracy and fluency. This study is the first to systematically evaluate the neural noise hypothesis of dyslexia. Neural noise was measured using neurophysiological (electroencephalography [EEG]) and neurochemical (magnetic resonance spectroscopy [MRS]) methods in adolescents and young adults with and without dyslexia. The authors did not find evidence of elevated neural noise in the dyslexia group from EEG or MRS measures, and Bayes factors generally informed against including the grouping factor in the models. Although the comparisons between groups with and without dyslexia did not support the neural noise hypothesis, a mediation model that quantified phonological processing and reading abilities continuously revealed that EEG beta power in the left superior temporal sulcus was positively associated with reading ability via phonological awareness. This finding lends support for analysis of associations between neural excitatory/inhibitory factors and reading ability along a continuum, rather than with a case/control approach, and indicates the relevance of phonological awareness as an intermediate trait that may provide a more proximal link between neurobiology and reading ability. Further research is needed across developmental stages and over a broader set of brain regions to more comprehensively assess the neural noise hypothesis of dyslexia, and alternative neurobiological mechanisms of this disorder should be explored.
Strengths:
The inclusion of multiple methods of assessing neural noise (neurophysiological and neurochemical) is a major advantage of this paper. MRS at 7T confers an advantage of more accurately distinguishing and quantifying glutamate, which is a primary target of this study. In addition, the subject-specific functional localization of the MRS acquisition is an innovative approach. MRS acquisition and processing details are noted in the supplementary materials according to the experts' consensus recommended checklist (https://doi.org/10.1002/nbm.4484). Commenting on the rigor of the EEG methods is beyond my expertise as a reviewer.
Participants recruited for this study included those with a clinical diagnosis of dyslexia, which strengthens confidence in the accuracy of the diagnosis. The assessment of reading and language abilities during the study further confirms the persistently poorer performance of the dyslexia group compared to the control group.
The correlational analysis and mediation analysis provide complementary information to the main case-control analyses, and the examination of associations between EEG and MRS measures of neural noise is novel and interesting.
The authors follow good practice for open science, including data and code sharing. They also apply statistical rigor, using Bayes Factors to support conclusions of null evidence rather than relying only on non-significant findings. In the discussion, they acknowledge the limitations and generalizability of the evidence and provide directions for future research on this topic.
Weaknesses:
Though the methods employed in the paper are generally strong, the MRS acquisition was not optimized to quantify GABA, so the findings (or lack thereof) should be interpreted with caution. Specifically, while 7T MRS affords the benefit of quantifying metabolites, such as GABA, without spectral editing, this quantification is best achieved with echo times (TE) of 68 or 80 ms in order to minimize the spectral overlap between glutamate and GABA and reduce contamination from the macromolecular signal (Finkelman et al., 2022, https://doi.org/10.1016/j.neuroimage.2021.118810). The data in the present study were acquired at TE=28 ms, and are therefore likely affected by overlapping Glu and GABA peaks at 2.3 ppm that are much more difficult to resolve at this short TE, which could directly affect the measures that are meant to characterize the Glu/GABA+ ratio/imbalance. In future research, MRS acquisition schemes should be optimized for the acquisition of Glutamate, GABA, and their relative balance.
As the authors note in the discussion, additional factors such as MRS voxel location, participant age, and participant sex could influence associations between neural noise and reading abilities and should be considered in future studies.
We have modified Figure 2 and revised the paragraph discussing the MRS methodological limitations in accordance with Reviewer #1's recommendations. Additionally, we have included the CRLB and linewidth thresholds in the Results section. Furthermore, a new figure showing the correlations between EEG and MRS biomarkers has been added (Figure 3).
Appraisal:
The authors present a thorough evaluation of the neural noise hypothesis of developmental dyslexia in a sample of adolescents and young adults using multiple methods of measuring excitatory/inhibitory imbalances as an indicator of neural noise. The authors concluded that there was not support for the neural noise hypothesis of dyslexia in their study based on null significance and Bayes factors. This conclusion is justified, and further research is called for to more broadly evaluate the neural noise hypothesis in developmental dyslexia.
Impact:
This study provides an exemplar foundation for the evaluation of the neural noise hypothesis of dyslexia. Other researchers may adopt the model applied in this paper to examine neural noise in various populations with/without dyslexia, or across a continuum of reading abilities, to more thoroughly examine evidence (or lack thereof) for this hypothesis. Notably, the lack of evidence here does not rule out the possibility for a role of neural noise in dyslexia, and the authors point out that presentation with co-occurring conditions, such as ADHD, may contribute to neural noise in dyslexia. Dyslexia remains a multi-faceted and heterogeneous neurodevelopmental condition, and many genetic, neurobiological and environmental factors play a role. This study demonstrates one step toward evaluating neurobiological mechanisms that may contribute to reading difficulties.
Reviewer #2 (Public review):
Summary:
This study utilized two complementary techniques (EEG and 7T MRI/MRS) to directly test a theory of dyslexia: the neural noise hypothesis. The authors report finding no evidence to support an excitatory/inhibitory balance, as quantified by beta in EEG and Glutamate/GABA ratio in MRS. This is important work and speaks to one potential mechanism by which increased neural noise may occur in dyslexia.
Strengths:
This is a well-conceived study with in-depth analyses and publicly available data for independent review. The authors provide transparency with their statistics and display the raw data points along with the averages in figures for review and interpretation. The data suggest that an E/I balance issue may not underlie deficits in dyslexia and is a meaningful and needed test of a possible mechanism for increased neural noise.
Weaknesses:
The researchers did not include a visual print task in the EEG task, which limits analysis of reading specific regions such as the visual word form area, which is a commonly hypoactivated region in dyslexia. This region is a common one of interest in dyslexia, yet the researchers measured the I/E balance in only one region of interest, specific to the language network.
Reviewer #3 (Public review):
Summary:
This study by Glica and colleagues utilized EEG (i.e., Beta power, Gamma power, and aperiodic activity) and 7T MRS (i.e., MRS IE ratio, IE balance) to reevaluate the neural noise hypothesis in Dyslexia. Supported by Bayesian statistics, their results show convincing evidence of no differences in EI balance between groups, challenging the neural noise hypothesis.
Strengths:
Combining EEG and 7T MRS, this study utilized both the indirect (i.e., Beta power, Gamma power, and aperiodic activity) and direct (i.e., MRS IE ratio, IE balance) measures to reevaluate the neural noise hypothesis in Dyslexia.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the current reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
This study by Wang et al. identifies a new type of deacetylase, CobQ, in Aeromonas hydrophila. Notably, the identification of this deacetylase reveals a lack of homology with eukaryotic counterparts, thus underscoring its unique evolutionary trajectory within the bacterial domain.
Strengths:
The manuscript convincingly illustrates CobQ's deacetylase activity through robust in vitro experiments, establishing its distinctiveness from known prokaryotic deacetylases. Additionally, the authors elucidate CobQ's potential cooperation with other deacetylases in vivo to regulate bacterial cellular processes. Furthermore, the study highlights CobQ's significance in the regulation of acetylation within prokaryotic cells.
Weaknesses:
The problem I raised has been well resolved. I have no further questions.
Thanks for your valuable comments very much.
Reviewer #2 (Public review):
In recent years, lots of researchers tried to explore the existence of new acetyltransferase and deacetylase by using specific antibody enrichment technologies and high resolution mass spectrometry. Here is an example for this effort. Yuqian Wang et al. studied a novel Zn2+- and NAD+-independent KDAC protein, AhCobQ, in Aeromonas hydrophila. They studied the biological function of AhCobQ by using biochemistry method and MS identification technology to confirm it. These results extended our understanding of the regulatory mechanism of bacterial lysine acetylation modifications. However, I find this conclusion is a little speculative, and unfortunately it also doesn't totally support the conclusion as the authors provided.
Major concerns:
- It is a little arbitrary to come to the title "Aeromonas hydrophila CobQ is a new type of NAD+- and Zn2+-independent protein lysine deacetylase in prokaryotes." It should be modified to delete the "in the prokaryotes" except that the authors get new more evidence in the other prokaryotes for the existence of the AhCobQ.
Thank you for your suggestion. However, I believe there has been some confusion regarding the title. In the revised manuscript we have already updated the title to: "Aeromonas hydrophila CobQ is a new type of NAD+- and Zn2+-independent protein lysine deacetylase."
This title does not include the phrase "in prokaryotes," as you mentioned. We kindly suggest verifying the version of the manuscript that was reviewed to ensure you are reviewing the most recent changes.
- I was confused about the arrangement of the supplementary results. Because there are no citations for Figures S9-S19.
Thank you for your feedback. It appears there may have been a misunderstanding, possibly due to reviewing an outdated version of the manuscript. In the revised manuscript we revised the supplementary figures and now have only 12 figures, all of which are correctly cited in the manuscript on pages 12 to 15. Below is a detailed list of the updated figure citations:
Figures S1: page 8, line 148;
Figures S2: page 9, line 168;
Figures S3 and S4: page 10, line 178;
Figures S5: page 10, line 186;
Figures S6: page 10, line 189;
Figures S7: page 12, line 221;
Figures S8-S10: page 13, line 245;
Figures S11: page 11, line 282;
Figures S12: page 15, line 286.
- Same to the above, there are no data about Tables S1-S6.
Thank you for your attention to the supplementary materials. As with the figures, we have already uploaded the data for Tables S1-S6 in the revised manuscript on November 19, 2024, and properly cited Tables S1 – S6 in the manuscript. Below is the citation information:
Tables S1: page 10, line 194;
Tables S2: page 13, line 245;
Tables S3: page 21, line 438;
Tables S4: page 22, line 439;
Tables S5: page 22, line 445;
Tables S6: page 27, line 564.
Please note that Tables S3 – S4 include the chemical reagents, primers, and other experimental materials, which are not intended to be cited in the results section.
- All the load control is not integrated. Please provide all of the load controls with whole PAGE gel or whole membrane western blot results. Without these whole results, it is not convincing to come the conclusion as the authors mentioned in the context.
Thank you for your comment. Please note that the full membrane western blot results were included in the revised manuscript. We hope this satisfies your request. If you need further clarification or additional data, please do not hesitate to let us know.
- Thoroughly review the materials & methods section. It is unclear to me what exactly the authors describe in the method. All the experimental designs and protocols should be described in detail, including growth conditions, assay conditions, and purification conditions, etc.
Thank you for your valuable suggestion. In response to your comment and previous feedback, we have already revised the Materials & Methods section thoroughly in the revised manuscript. The experimental details, including growth conditions, assay protocols, and purification procedures, are described in full on pages 22 to 30 of the revised manuscript.
- Include relevant information about the experiments performed in the figure legends, such as experimental conditions, replicates, etc. Often it is not clear what was done based on the figure legend description.
Thank you very much for your detailed feedback and suggestions. We have made sure to describe what each data point represents in the figure legends, as per the previous feedback. However, we would like to clarify that while we have provided detailed descriptions in the legends, the inclusion of every specific experimental condition in the figure legends could result in redundancy, as these details are already thoroughly outlined in the Materials & Methods section.
We hope this explanation addresses your concern.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
I have no further revision comments.
Thank you very much.
Reviewer #2 (Recommendations for the authors):
I carefully read the point-to-point response from the author. Although they listed lots of the reasons for the ugly results, it still can not persuade me to accept their conclusions. While, as I know, it is impossible to reject their work in eLife as it was sent out for peer-review. I also can't accuse them of being wrong, but I have my opinion on this point. That is not the results, but the attitude.
Thank you for your feedback. However, I must express some concerns regarding the nature of your comments. Based on the issues you've raised, it seems that you may have reviewed an outdated version of the manuscript. In the updated revision we addressed all the points you've raised, including the figure and table citations, experimental methods, and data integration.
We understand that differing opinions are part of the peer-review process, but we respectfully believe that your conclusion regarding our attitude is based on a misunderstanding, possibly caused by reviewing an incorrect version of the manuscript. We have always strived to approach this manuscript with utmost professionalism and have diligently responded to each of your concerns.
We sincerely suggest reviewing the latest version of our manuscript, and we welcome any further constructive feedback. We hope this clarifies any misunderstandings and look forward to your continued support.
Thank you for your time and thoughtful consideration.
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
This study by Wang et al. identifies a new type of deacetylase, CobQ, in Aeromonas hydrophila. Notably, the identification of this deacetylase reveals a lack of homology with eukaryotic counterparts, thus underscoring its unique evolutionary trajectory within the bacterial domain.
Strengths:
The manuscript convincingly illustrates CobQ's deacetylase activity through robust in vitro experiments, establishing its distinctiveness from known prokaryotic deacetylases. Additionally, the authors elucidate CobQ's potential cooperation with other deacetylases in vivo to regulate bacterial cellular processes. Furthermore, the study highlights CobQ's significance in the regulation of acetylation within prokaryotic cells.
Weaknesses:
The problem I raised has been well resolved. I have no further questions.
Reviewer #2 (Public review):
In recent years, many researchers have tried to identify new acetyltransferases and deacetylases using specific antibody enrichment technologies and high-resolution mass spectrometry. This study is an example of that effort. Yuqian Wang et al. studied a novel Zn2+- and NAD+-independent KDAC protein, AhCobQ, in Aeromonas hydrophila. They studied the biological function of AhCobQ using biochemical methods and confirmed it by MS identification. These results extend our understanding of the regulatory mechanism of bacterial lysine acetylation modifications. However, I find this conclusion a little speculative, and unfortunately the data do not fully support the conclusion the authors provide.
Reviewer #3 (Public review):
Summary:
This study reports on a novel NAD+ and Zn2+-independent protein lysine deacetylase (KDAC) in Aeromonas hydrophila, termed as AhCobQ (AHA_1389). This protein is annotated as a CobQ/CobB/MinD/ParA family protein and does not show similarity with known NAD+-dependent or Zn2+-dependent KDACs. The authors showed that AhCobQ has NAD+ and Zn2+-independent deacetylase activity with acetylated BSA by western blot and MS analyses. They also provided evidence that the 195-245 aa region of AhCobQ is responsible for the deacetylase activity, which is conserved in some marine prokaryotes and has no similarity with eukaryotic proteins. They identified target proteins of AhCobQ deacetylase by proteomic analysis and verified the deacetylase activity using site-specific Kac proteins. Finally, they showed that AhCobQ activates isocitrate dehydrogenase by deacetylation at K388.
Strengths:
The finding of a new type of KDAC has a valuable impact on the field of protein acetylation. The characteristics (NAD+ and Zn2+-independent deacetylase activity in an unknown domain) shown in this study are very unexpected.
Weaknesses:
(1) The characteristics (NAD+ and Zn2+-independent deacetylase activity in an unknown domain) shown in this study are very unexpected. To convince readers, MSMS data are necessary to accurately detect (de)acetylation at the target site in the deacetylase activity assay. The authors showed MSMS data in the assays with acetylated BSA, but the other assays rely only on western blot.
(2) They prepared site-specific Kac proteins and used them in deacetylase activity assays. Incorporation of acetyllysine at the target site should be confirmed by MSMS and shown as supplementary data.
(3) The authors imply that the 195-245 aa region of AhCobQ may represent a new domain responsible for deacetylase activity. The feature of the region would be of interest but is not sufficiently described in Figure 5. The amino acid sequence alignments with representative proteins with conserved residues would be informative. It would be also informative if the modeled structure predicted by AlphaFold is shown and the structural similarity with known deacetylases is discussed.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
The problem I raised has been well resolved. I have no further questions.
Reviewer #2 (Recommendations for the authors):
Questions on the response to "The load control is not all integrated. All of the load controls with whole PAGE gel or whole membrane western blot results should be provided. Without these whole results, it is not convincing to come to the conclusion that the authors have."
As the authors answered, the Coomassie Blue R-350 staining of the PVDF membranes is a good control for the experiment. However, I still have several questions about it:
(1) The first is the quality of these Western blots. Why are all the bands in these Western blots so ugly? To tell the truth, it is very difficult to draw a conclusion from these poor Western blots.
We appreciate your feedback regarding the quality of the Western blots presented in Figure 7. We believe the “ugly bands” you referred to reflect our results validating the functions of CobQ through the use of recombinant site-specific Kac protein substrates.
In our study, we meticulously engineered these recombinant site-specific Kac proteins using a two-plasmid system, based on foundational research published in Nature Chemical Biology (2017, 13(12): 1253-1260), which introduced the genetic encoding of Nε-acetyllysine into recombinant proteins. However, we faced a common challenge: protein truncation due to premature translation termination at the reassigned codon. This issue not only hampers protein yields, as discussed in ChemBioChem (2017, 18(20): 1973-1983), but also contributes to the suboptimal appearance of the Western blot results.
Despite conducting at least two independent repetitions for the Western blot analysis of the site-specific Kac proteins, which yielded consistent results, we recognize that the overall quality remains less than ideal. This variability is inherently related to the characteristics of the target proteins. Nevertheless, the primary aim of our manuscript is to validate the novel deacetylase activity of CobQ. We have provided multiple lines of evidence, including mass spectrometry (MS/MS) and Western blot analyses, to substantiate this claim. In response to your comments, we have decided to remove the ambiguous Western blot results from Figure 7, retaining only four figures that demonstrate significant differences across at least two independent replicates (Author response images 1-5). Additionally, we have included four biological replicates of the Western blot results for ICD Kac388 + CobQ in the supplementary materials (Author response image 5) to further validate the deacetylase function of CobQ.
Author response image 1.
Western blot validation of the Kac26 ArcA-2 protein substrates regulated by the three KDACs in two biological replicates.
Author response image 2.
Western blot validation of the Kac48 Sun protein substrates regulated by the three KDACs in two biological replicates.
Author response image 3.
Western blot validation of the Kac103 Sun protein substrates regulated by the three KDACs in two biological replicates.
Author response image 4.
Western blot validation of the Kac195 Eno protein substrates regulated by the three KDACs in three biological replicates.
Author response image 5.
Western blot validation of the Kac388 ICD protein substrates regulated by AhCobQ in this study. Each sample was independently repeated at least three times.
(2) The second is why, when the Coomassie staining is compared with the WB results provided in the authors' response, some of the results are not from the same PVDF membrane. Examples are HrpA-K816 (ac), Eno-K195 (ac), ArcA-2-K26 (ac), ArcA-2-K26 (ac), IscS-K93(ac), A0KJ75-K81(ac), GyrB-K331(ac), GyrB-K449(ac), FtsA-K320(ac), FtsA-K409(ac), RecA-K279(ac), and RecA-K306(ac). All of them are clearly not from the staining of the same PVDF membrane but from a new PVDF membrane.
We assure you that the R-350 stained PVDF membranes originate from the same Western blot membranes. However, we acknowledge that visual discrepancies may arise due to differences in imaging techniques. The Western blot results were scanned using a ChemiDoc MP (Bio-Rad, Hercules, CA, USA), while the Coomassie R-350 stained PVDF membranes were captured using a standard camera. These differences can create a misleading appearance, making it seem as though they come from different membranes.
It is also important to note that the intensity of the protein marker cannot be directly compared between the two imaging methods. As illustrated in Author response image 6, the protein marker at 70 kDa is clearly detectable in the Coomassie R-350 image, whereas it may not be as apparent in the Western blot result due to inherent differences in detection sensitivity.
Author response image 6.
Comparison of the Western blot and Coomassie R-350 stained results for the same protein marker on the same PVDF membrane. The protein marker at 70 kDa is easily detected with Coomassie R-350 but is difficult to see in the WB result.
Additionally, we have removed some of the so-called "ugly" Western blot results in the updated manuscript and provided the original full film of the relevant images as an attachment. This documentation demonstrates that all the data you referenced originate from the same film, as shown in Author response images 1-5.
(3) The third is why there is no replication for any of these WB results. We should draw conclusions with a serious attitude, not from a single repeat, to say nothing of the poor results.
Thank you for your valuable suggestion. In the second version of the manuscript, we have included the original full film of the relevant images. While we previously explained the reasons behind the "ugly" Western blot results, we have decided to remove some, or even all, of these results from Figure 7 in the updated version. The related images will be updated in the supplementary materials (Author response images 1-5 in this response letter and Figure 7 in the revised manuscript).
Furthermore, we have provided a more detailed discussion regarding the poor results in the updated manuscript to ensure clarity and transparency. We appreciate your understanding and hope these changes meet your expectations.
Questions on the response to "L174-187, L795 (Please show the whole membrane (or PAGE gel) of the loading control of CobB, and CobQ, except for the Kac-BSA)".
(1) As we were all taught, no control, no biology. Where is the band of CobQ? Why was the same PVDF membrane not stained with R-350, rather than a new membrane?
Thank you for your insightful feedback. As noted in our previous response, the absence of visible bands for AhCobQ and AhCobB on the Coomassie R-350 stained PVDF membrane is primarily due to the low loading amounts and protein loss during the Western blotting process.
To reinforce our findings, we repeated the analysis of the protein samples via SDS-PAGE, using the same loading quantity as in the previous Western blot shown in Figure 2 of the manuscript. As illustrated in Author response image 7, the bands for CobB and CobQ are discernible, albeit with significantly lower intensities compared to the Kac-BSA bands. Upon examining the full Coomassie R-350 stained PVDF membranes provided in Supplementary Material 1, we observe that the CobB and CobQ bands are not easily visible. This aligns with your observations and can be attributed to potential protein loss during the transfer from SDS-PAGE to the PVDF membrane.
Author response image 7.
The SDS-PAGE gel displayed the loading amounts of Kac-BSA and CobB/CobQ.
To enhance the visibility of the CobQ/CobB bands, we increased the loading of CobQ/CobB in a new Western blot experiment, using 2 µg of Kac-BSA in combination with 0.8 µg of CobQ/CobB. As shown in Author response image 8, while the increased amount of Kac-BSA resulted in a more blurred signal, the bands for the recombinant CobQ and CobB proteins were clearly detectable. This indicates that both proteins were indeed involved in the in vitro protein deacetylation assay.
Author response image 8.
Western blot verified the deacetylase activity assay of AhCobQ and AhCobB on Kac-BSA.
Furthermore, we conducted a mass spectrometry analysis comparing Kac-BSA and Kac-BSA incubated with CobQ, as well as BSA without acetylation, against the A. hydrophila database with a cut-off of unique matched peptides >1. It is challenging to completely avoid contaminant detection during protein purification, especially when using high-resolution mass spectrometry. Our findings revealed that CobQ has the highest number of unique matched peptides (Author response table 1), while contaminants such as AHA_3036, AHA_0497, AHA_1279, and valS could be excluded, as they were present in Kac-BSA or BSA samples. Additionally, Tuf1, RplQ, GroEL, RpsF, RpsU, RpsB, RpsO, and RpsJ are known ribosomal subunits or chaperonins that are abundantly expressed in cells and may interact with various proteins, leading to contaminant detection.
Author response table 1.
LC MS/MS results of selected peptide quantification among Kac-BSA and Kac-BSA incubated with CobQ and BSA without acetylation against A. hydrophila database (unique matched peptides>1).
Although AceE, a pyruvate dehydrogenase E1 component, could theoretically possess deacetylase activity, we consider this unlikely. First, in the SDS-PAGE gel of the purified recombinant protein, CobQ is the major band, with other proteins present at very low levels (less than 1/10 of CobQ). This suggests that significant deacetylation by contaminants is unlikely. Second, we purified His-tagged AhCobQ and GST-fused AhCobQ separately and tested their deacetylase activities. As shown in Figure S4 of the updated manuscript, both purified AhCobQ proteins exhibited deacetylase activity, while the negative control (purified GST protein only) did not, further supporting our conclusion that enzyme activity is not attributable to contaminating proteins (Figure S5).
(2) Without the CobB and CobQ bands, it is impossible to say that CobQ functions as a new deacetylase. To avoid this confusion, it is easy to run a new gel and probe it with an anti-His antibody to show these deacetylases.
Thank you very much for your suggestion. We have performed this experiment as you suggested; please see our response to comment (1).
(3) The explanation for why the CobB/CobQ bands are not visible is not acceptable. Because the molecular weights of CobB and CobQ are smaller than that of BSA, it is impossible that these bands would be lost during membrane transfer.
Thank you for your valuable feedback. I completely agree that the loss of CobB and CobQ proteins during membrane transfer is unlikely due to their smaller molecular weight compared to BSA. As shown in Author response image 7, the bands for CobB and CobQ are detectable in the SDS-PAGE gel but not visible on the Coomassie R-350 stained PVDF membrane.
Several factors could contribute to this issue. One possibility is that the detection sensitivity of Coomassie R-350 may be lower than that of Coomassie R-250 used in the gel. Additionally, the Western blot results using an anti-His antibody further indicate low loading amounts of CobB and CobQ proteins on the PVDF membrane (Author response image 8). This suggests that the observed low levels may indeed be due to protein loss during the membrane transfer process, despite their relatively small size.
Reviewer #3 (Recommendations for the authors):
(1) I found Tables S1 and S2 in the revised manuscript. It is strange to me that the intensity of Kac-BSA+CobQ is zero, completely nothing. Typically, a portion of the acetylated peptide remains after the deacetylation reaction.
Thank you for your observation. When we report an intensity of zero, it does not imply a complete absence of signal; rather, it indicates that the signal for the target peptide is below the detectable threshold. This is likely due to the minimum cut-off setting in the MaxQuant (MQ) software, which is determined by parameters like "peptide_mass_tolerance" (as discussed in MQ user groups online, though it may not be explicitly listed in the parameters file).
In our study, we performed a deacetylase assay that demonstrated CobQ's rapid activity; for instance, it can deacetylate ICD-K388ac within just four minutes. This leads me to hypothesize that the CobQ + Kac-BSA sample may have undergone near-complete enzymatic hydrolysis during the reaction.
Furthermore, Table S1 in the manuscript presents only a selection of the mass spectrometry results to illustrate CobQ's activity. In addition to the 15 acetylated peptides shown, there are 27 more peptides that exhibit significantly reduced acetylation levels without reaching zero intensity. The overall acetylation level of BSA peptides incubated with CobQ is calculated to be only 0.13 times that of Kac-BSA (Diagnostic peak: yes, peptide score: >100, Localization probability: >0.95) (Author response image 9).
Based on these findings, we believe our mass spectrometry results are reliable and effectively support our conclusions. Thank you for your understanding.
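To make the filtering and ratio described above concrete, the following is a minimal sketch of how such an overall acetylation ratio could be computed. The peptide records, field names, and intensity values are hypothetical placeholders rather than data from the manuscript, and the sketch assumes that the ratio is the summed intensity of filtered Kac peptides in the CobQ-treated sample divided by the summed intensity in untreated Kac-BSA.

```python
# Hypothetical peptide records; none of these values come from the manuscript.
peptides = [
    {"score": 150, "loc_prob": 0.99, "diagnostic": True,  "kac_bsa": 1000.0, "with_cobq": 0.0},
    {"score": 120, "loc_prob": 0.97, "diagnostic": True,  "kac_bsa": 2000.0, "with_cobq": 600.0},
    {"score": 80,  "loc_prob": 0.99, "diagnostic": True,  "kac_bsa": 500.0,  "with_cobq": 500.0},
    {"score": 130, "loc_prob": 0.90, "diagnostic": True,  "kac_bsa": 700.0,  "with_cobq": 700.0},
    {"score": 200, "loc_prob": 1.00, "diagnostic": False, "kac_bsa": 900.0,  "with_cobq": 900.0},
]

def passes_filters(p, min_score=100, min_loc_prob=0.95):
    """Keep peptides with a diagnostic peak, score > 100, and localization probability > 0.95."""
    return p["diagnostic"] and p["score"] > min_score and p["loc_prob"] > min_loc_prob

def overall_acetylation_ratio(records):
    """Summed Kac intensity with CobQ divided by summed Kac intensity without CobQ."""
    kept = [p for p in records if passes_filters(p)]
    untreated = sum(p["kac_bsa"] for p in kept)
    treated = sum(p["with_cobq"] for p in kept)  # zero means below the detection cut-off
    return treated / untreated

ratio = overall_acetylation_ratio(peptides)
print(f"{len([p for p in peptides if passes_filters(p)])} peptides kept, ratio = {ratio:.2f}")
```

In this sketch a peptide reported with zero intensity still contributes to the untreated sum, so complete loss of signal on some peptides lowers the ratio without excluding those peptides from the comparison.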
Author response image 9.
The intensities of all Kac peptides of Kac-BSA with or without AhCobQ incubation in LC MS/MS.
(2) It would be better to provide the information about ArcA and ArcA-2 as mentioned in the authors' response. It would be helpful for readers to understand that they are different proteins.
Thank you for your suggestion. In the A. hydrophila ATCC 7966 dataset, there are indeed two distinct proteins referred to as ArcA: ArcA-1, which functions as an aerobic respiration control protein, and ArcA-2, which acts as an arginine deiminase. Importantly, these two proteins do not share any sequence homology; they are only similarly named due to their acronyms. While we believe this distinction does not require extensive explanation in the current study, we appreciate your input. Additionally, in response to Reviewer 2’s feedback, we have decided to remove the Western blot result for ArcA-2 due to its poor quality in the updated manuscript.
(3) Line 409-416. Despite my comment, the citation of related papers on ICD acetylation in E. coli is still missing.
Thank you for your suggestion. It has been added and highlighted in red. (Venkat S, et al, 2018, 430(13): 1901-1911)
(4) The image resolution of Figure 3C and 3D is still bad. I could not evaluate that Kac was exactly incorporated at the target site.
Thank you for your feedback regarding the image resolution of Figures 3C and 3D. We have now displayed these figures with improved clarity, as you suggested.
To further validate the reliability of our MS2 data, we employed Proteome Discoverer 2.4 (Thermo) to analyze the raw data and provide theoretical mass information. As shown in Author response images 10-13, the MS2 spectra and fragment match lists for both unmodified and acetylated peptides offer additional confirmation of the reliability of our mass spectrometry results.
Author response image 10.
MS2 spectrum of unmodified peptide using PD v2.4 software.
Author response image 11.
The theoretical mass of unmodified peptide by PD 2.4
Author response image 12.
MS2 spectrum of acetylated peptide using PD v2.4 software.
Author response image 13.
The theoretical mass of acetylated peptide by PD 2.4.
(5) Again, in Figure 8D, the significance between ICD-Kac388 and ICD-Kac388+AhCobB should be shown to support the authors' conclusion that AhCobQ activates ICD by deacetylation at K388.
Thank you for your suggestion. We have updated Figure 8D in the revised manuscript.
(6) It was nice that the authors presented the mass spectrum data of ICD-K388 acetylation (Figure 2 in responding letter). However, the data did not convince me that K388 is acetylated. In the figure, two b-ion peaks are detected, 285.1557 and 386.2034, which may correspond to NK (theoretical mass, 260.15) and NKT (theoretical mass, 361.20) peptides, respectively. If K388 is acetylated, an increase in the mass of 42 should be observed, but the difference between the detected and theoretical mass is 25. I also could not understand what the peak of 126.0913 mass is, indicated with acK* in red.
Thank you for your detailed observation. The data presented in the MS2 spectrum for ICD-K388 acetylation in Figure 2 of the previous response letter were generated using Proteome Discoverer 2.4 (PD, Thermo) to ensure accurate mass calculations. Similar to the results from MaxQuant, ICD-K388 was identified again (Author response image 14).
Regarding the b-ion peaks you mentioned, the values 285.1557 and 386.2034 correspond to NK<sup>ac</sup> and NK<sup>ac</sup>T peptides, respectively. The theoretical masses for these peptides are as follows: NK<sup>ac</sup> (285.15 = 115.05020 + 128.095 + 42.01) and NK<sup>ac</sup>T (386.20 = NK<sup>ac</sup> + 101.04768). The differences between the theoretical and detected masses for the relevant b-ions (b2*-NK, b52+-NH3, and b3) are minimal, at 0.00 Da and 2.1 ppm, respectively, which is consistent with the incorporation of an NH3 group (Author response image 15).
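The b-ion arithmetic above can be checked with a short calculation. The following is a minimal sketch that assumes singly charged b-ions and uses standard monoisotopic residue masses (textbook values, not taken from the manuscript); the function names are illustrative only.

```python
# Standard monoisotopic masses in Da (assumed textbook values).
RESIDUE = {"N": 114.04293, "K": 128.09496, "T": 101.04768}
ACETYL = 42.01057   # C2H2O added to the lysine side chain
PROTON = 1.00728
WATER = 18.01056

def b_ion_mz(residues, acetylated=False):
    """m/z of a singly charged b-ion: sum of residue masses plus one proton."""
    mass = sum(RESIDUE[r] for r in residues)
    if acetylated:
        mass += ACETYL
    return mass + PROTON

def neutral_peptide_mass(residues):
    """Mass of the free, unmodified peptide: sum of residue masses plus water."""
    return sum(RESIDUE[r] for r in residues) + WATER

def ppm_error(observed, theoretical):
    return (observed - theoretical) / theoretical * 1e6

b2_ac = b_ion_mz("NK", acetylated=True)
b3_ac = b_ion_mz("NKT", acetylated=True)
nk_neutral = neutral_peptide_mass("NK")
nkt_neutral = neutral_peptide_mass("NKT")

print(f"b2 (NKac):  {b2_ac:.4f}, error vs 285.1557: {ppm_error(285.1557, b2_ac):.2f} ppm")
print(f"b3 (NKacT): {b3_ac:.4f}, error vs 386.2034: {ppm_error(386.2034, b3_ac):.2f} ppm")
print(f"neutral NK: {nk_neutral:.2f}, neutral NKT: {nkt_neutral:.2f}")
print(f"b2 (NKac) minus neutral NK: {b2_ac - nk_neutral:.2f}")
```

Under these assumptions the acetylated b2 and b3 ions come out at 285.1557 and 386.2034, matching the detected peaks. The neutral masses of the unmodified NK and NKT peptides come out at 260.15 and 361.20, which appears to account for the 25 Da gap noted in the comment: it equals the acetyl mass minus water plus one proton, so the comparison was between a b-ion and a neutral peptide rather than between acetylated and unmodified b-ions.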
Author response image 14.
The MS2 of ICD-K388 peptide by PD 2.4.
Author response image 15.
The theoretical mass of ICD-K388 peptide by PD 2.4.
The peak at 126.0913 m/z, indicated as acK*, represents immonium ions of ε-N-acetyllysine, which are generated during the fragmentation of acetyllysine. This diagnostic ion is widely recognized as a marker for identifying acetylated peptides (Nakayasu et al., A method to determine lysine acetylation stoichiometries. International Journal of Proteomics. 2014;2014(1):730725; Trelle et al., Utility of immonium ions for assignment of ε-N-acetyllysine-containing peptides by tandem mass spectrometry. Analytical Chemistry. 2008;80(9):3422-30). Additionally, it is a default parameter in MaxQuant for identifying Kac peptides (Author response image 16).
Based on these findings, we believe the evidence supporting ICD-K388 acetylation is robust.
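The origin of the 126.0913 diagnostic peak can also be sketched numerically. The calculation below assumes standard monoisotopic masses and the usual description of this ion as the acetyllysine immonium ion after loss of ammonia; the variable names are illustrative.

```python
# Standard monoisotopic masses in Da (assumed textbook values).
LYS_RESIDUE = 128.09496
ACETYL = 42.01057
PROTON = 1.00728
CO = 27.99491     # carbon monoxide, lost from the residue to form an immonium ion
NH3 = 17.02655    # ammonia

# Immonium ion of acetyllysine: residue mass minus CO plus one proton.
ack_immonium = LYS_RESIDUE + ACETYL - CO + PROTON

# Diagnostic ion: the immonium ion after neutral loss of ammonia.
ack_diagnostic = ack_immonium - NH3

observed = 126.0913
ppm = (observed - ack_diagnostic) / ack_diagnostic * 1e6

print(f"acK immonium ion:      {ack_immonium:.4f}")
print(f"acK immonium minus NH3: {ack_diagnostic:.4f}")
print(f"error vs observed {observed}: {ppm:.2f} ppm")
```

Under these assumptions the immonium ion itself falls near 143.118 and the ammonia-loss product near 126.091, so the labeled acK* peak is consistent with the second species to well within 1 ppm.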
Author response image 16.
The default parameter for Kac peptide identification in MaxQuant v1.6 software.
(7) As mentioned by other reviewers, some of the figures and tables are incomplete. Some panels (ex. Figure 7C and 7D) and explanations (ex. What are lanes 1, 2, and 3 in Figure S3) are still missing.
Thank you for your suggestion. It has been added.
www.biorxiv.org
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
The authors show that SVZ-derived astrocytes respond to a middle cerebral artery occlusion (MCAO) hypoxia lesion by secreting and modulating hyaluronan at the edge of the lesion (penumbra) and that hyaluronan is a chemoattractant to SVZ astrocytes. They use lineage tracing of SVZ cells to determine their origin. They also find that SVZ-derived astrocytes express Thbs-4 but astrocytes at the MCAO-induced scar do not. Also, they demonstrate that decreased HA in the SVZ is correlated with gliogenesis. While much of the paper is descriptive/correlative they do overexpress Hyaluronan synthase 2 via viral vectors and show this is sufficient to recruit astrocytes to the injury. Interestingly, astrocytes preferred to migrate to the MCAO than to the region of overexpressed HAS2.
Strengths:
The field has largely ignored the gliogenic response of the SVZ, especially with regard to astrocytic function. These cells and especially newborn cells may provide support for regeneration. Emigrated cells from the SVZ have been shown to be neuroprotective via creating pro-survival environments, but their expression and deposition of beneficial extracellular matrix molecules are poorly understood. Therefore, this study is timely and important. The paper is very well written and the flow of results is logical.
Weaknesses:
The main problem is that they do not show that Hyaluronan is necessary for SVZ astrogenesis and/or migration to MCAO lesions. Such loss of function studies have been carried out by studies they cite (e.g. Girard et al., 2014 and Benner et al., 2013). Similar approaches seem to be necessary in this work.
We appreciate the comments by the reviewer. The article is, indeed, largely descriptive since we attempt to describe in detail what happens to newborn astrocytes after MCAO. Still, we have not attempted any modification to the model, such as amelioration of ischemic damage. This is a limitation of the study that we do not hide. However, we use several experimental approaches, such as lineage tracing and hyaluronan modification, to strengthen our conclusions.
Regarding the weaknesses found by the reviewer, we do not claim that hyaluronan is necessary for SVZ astrogenesis. Indeed, we observe that when the MCAO stimulus (i.e. inflammation) is present, the HMW-HA (AAV-Has2) stimulus is less powerful (we discuss this in line 330-332). We do claim, and we believe we successfully demonstrate, the reverse situation: that SVZ astrocytes modulate hyaluronan, not at the SVZ but at the site of MCAO, i.e. the scar. However, regarding whether hyaluronan is necessary for SVZ astrogenesis, we only show a correlation between its degradation and the time-course of astrogenesis. We suggest this result as a starting point for a follow-up study. We have included a phrase in the discussion (line 310), stating that further experiments are needed to fully establish a link between hyaluronan and astrogenesis in the SVZ.
Major points:
(1) How good of a marker for newborn astrocytes is Thbs4? Did you co-label with B cell markers like EGFr? Is the Thbs4 gene expressed in B cells? Do scRNAseq papers show it is expressed in B cells? Are they B1 or B2 cells?
We chose Thbs4 as a marker of newborn astrocytes based on published research (Beckervordersanforth et al., 2010; Benner et al., 2013; Llorens-Bobadilla et al., 2015; Codega et al., 2014; Basak et al., 2018; Mizrak et al., 2019; Kjell et al., 2020; Cebrian-Silla et al., 2021). Of those studies, at least 3 associate Thbs4 with B-type cells based on scRNAseq data (Llorens-Bobadilla et al., 2015; Cebrian-Silla et al., 2021; Basak et al., 2018). We have included a sentence about this, with the associated references, in line 92.
We co-labeled Thbs4 with EGFR, but in the context of MCAO. We observed an increase of EGFR expression with MCAO, similar to the increase in Thbs4 alongside ischemia (see Author response image 1). We did not include this figure in the manuscript since we did not have available tissue from all the time points we used (7d, 60d post-ischemia).
Author response image 1.
Thbs4 cells, in basal and ischemic conditions, only represent a small amount of IdU-positive cells (Fig 3F), suggesting that they are mostly quiescent cells, i.e., B1 cells. However, the scRNAseq literature is not consistent about this.
(2) It is curious that there was no increase in Type C cells after MCAO - do the authors propose a direct NSC-astrocyte differentiation?
Type C cells are fast-proliferating cells, and our BrdU/IdU experiment (Fig. 3) suggests that Thbs4 cells are slow-proliferating cells. Some authors (Encinas lab, Spain) suggest that when the hippocampus is challenged by a harsh stimulus, such as kainate-induced epilepsy, the NSCs differentiate directly into reactive astrocytes and deplete the DG neurogenic niche (Encinas et al., 2011, Cell Stem Cell; Sierra et al., 2015, Cell Stem Cell). We believe this might be the case in our MCAO model and the SVZ niche, since we observe a decrease in DCX labeling in the olfactory bulb (Fig S5) and an increase in astrocytes in the SVZ, which migrate to the ischemic lesion. We did not want to overcomplicate an already complicated paper by dwelling on direct NSC-astrocyte differentiation or on the reactive status of these newborn astrocytes.
(3) The paper would be strengthened with orthogonal views of z projections to show colocalization.
We thank the reviewer for this observation. We have now included orthogonal projections in the critical colocalization IF of CD44 and hyaluronan (hyaluronan internalization) in Fig S6D, and a zoomed-in inset. Hyaluronan membrane synthesis is already depicted with orthogonal projection in Fig 6F.
(4) It is not clear why the dorsal SVZ is analysed and focused on in Figure 4. This region emanates from the developmental pallium (cerebral cortex anlagen). It generates some excitatory neurons early postnatally and is thought to have differential signalling such as Wnt (Raineteau group).
We decided to analyze in depth the dorsal SVZ after the BrdU experiment (Fig S3), where we observed an increase in BrdU+/Thbs4+ cells mostly in the dorsal area. Hence, the electrodes for electroporation were oriented in such a way as to label the dorsal area. We appreciate the paper by Raineteau lab, but we assume that this region may potentially exploit other roles (apart from excitatory neurons generated early postnatally) depending on the developmental stage (our model is in adults) and/or pathological conditions (MCAO).
(5) Several of the images show the lesion and penumbra as being quite close to the SVZ. Did any of the lesions contact the SVZ? If so, I would strongly recommend excluding them from the analysis as such contact is known to hyperactivate the SVZ.
We thank the referee for the suggestion to exclude the harsher MCAO-lesioned animals from the analysis. Indeed, MCAO ischemia can, for methodological reasons, generate variable tissue damage that cannot be easily controlled. Thus, based on TTC staining, we had already excluded the animals with more severe tissue damage that contacted the SVZ.
(6) The authors switch to a rat in vitro analysis towards the end of the study. This needs to be better justified. How similar are the molecules involved between mouse and rat?
We chose the rat culture because it is already established in our lab and, in our hands, is much more reproducible than the mouse brain cell culture that we occasionally use (for transgenic animals only) (Benito-Muñoz et al., Glia. 2016; Cavaliere et al., Front Cell Neurosci. 2013). It is true that there could be differences between rat and mouse Thbs4-cell physiology, despite a 96% identity between the rat and mouse Thbs4 protein sequences (BLASTp). In vitro, we only confirm the capacity of astrocytes to internalize hyaluronan, which was a finding that we did not expect in our in vivo experiments. Indeed, these observations, notwithstanding the obvious differences between in vivo and in vitro scenarios, suggest that HA internalization by astrocytes is a cross-species event, at least in rodents. Regarding HA, hyaluronan is similar in all species, since it is a glycan (this is why there are no antibodies against HA, and one has to rely on binding proteins such as HABP to label it).
(7) Similar comment for overexpression of naked mole rat HA.
We chose the naked mole rat hyaluronan synthase (HAS) because it produces HA of very high molecular weight, similar to the HA found accumulated in the glial scar at the lesion border. The naked mole rat HAS used in mice (Gorbunova Lab) is a known tool in the ECM field (Zhang et al., 2023, Nature; Tian et al., 2013, Nature).
Reviewer 1 (Recommendation to authors):
(1) Line 22: most of the cells that migrate out of the SVZ are not stem cells but cells further along in the lineage - neuroblasts and glioblasts.
We thank the reviewer for this clarification. We have modified the abstract accordingly.
(2) In Figure 3d the MCAO group staining with GFAP looks suspiciously like ependymal cells which have been shown to be dramatically activated by stroke models.
The picture does show ependymal cells, which are located next to the ventricle and are indeed very proliferative in stroke. However, these cells do not express Thbs4 (Shah et al., 2018, Cell). In the quantifications from the SVZ of BrdU- and IdU-injected animals (Fig 3e and f), we only take into account Thbs4+ GFAP+ cells, not cells that are GFAP+ only.
(3) The TTC injury shown in Figure 5c is too low mag.
We apologize for the low mag. We have increased the magnification two-fold without compromising resolution. The problem might also have arisen from the compression of TIF into JPEG in the PDF export process. We will address this in the revised version by carefully selecting export settings. The images we used are all publication quality (300 ppi).
(4) How specific to HA is HABP?
Hyaluronic Acid Binding Protein is a canonical marker for hyaluronan that is used also in ELISA to quantify it specifically, since it does not bind other glycosaminoglycans. The label has been used for years in the field for immunochemistry, and some controls and validations have been published: Deepa et al., 2006, JBC performed appropriate controls of HABP-biotin labeling using hyaluronidase (destroys labeling) and chondroitinase (preserves labeling). Soria et al., 2020, Nat Commun checked that (i) streptavidin does not label unspecifically, and (ii) that HABP staining is reduced after hyaluronan depletion in vivo with HAS inhibitor 4MU.
(5) A number of images are out of focus and thus difficult to interpret (e.g. SFig. 4e).
This is true. We realized that the PDF conversion process for the preprint version has severely compressed the larger images, such as the one found in Fig. S4e. We have submitted a revised version in a better-quality PDF (the final paper will have the original TIFF files). We apologize for the technical problem.
(6) "restructuration" is not a word.
We apologize for the mistake and thank the reviewer for the correction. We corrected “restructuration” with “reorganization” in line 67.
(7) While much of the manuscript is well-written and logical it could use an in-depth edit to remove awkward words and phrasings.
A native English speaker has revised the manuscript to correct these awkward phrases. All changes are labeled in red in the revised version.
(8) Please describe why and how you used skeleton analysis for HABP in the methods, this will be unfamiliar to most readers. The one-sentence description in the methods is insufficient.
We have modified the text accordingly, explaining in depth the logic behind the skeleton analysis (line 204). We also added several lines of text describing the image analysis in detail (CD44/HABP spots, fractal dimension, and masks for membranal HABP, among others) in lines 484-494.
Reviewer #2 (Public Review)
Summary:
In their manuscript, Ardaya et al have addressed the impact of ischemia-induced gliogenesis from the adult SVZ and their effect on the remodeling of the extracellular matrix (ECM) in the glial scar. They use Thbs4, a marker previously identified to be expressed in astrocytes of the SVZ, to understand its role in ischemia-induced gliogenesis. First, the authors show that Thbs4 is expressed in the SVZ and that its expression levels increase upon ischemia. Next, they claim that ischemia induces the generation of newborn astrocyte from SVZ neural stem cells (NSCs), which migrate toward the ischemic regions to accumulate at the glial scar. Thbs4-expressing astrocytes are recruited to the lesion by Hyaluronan where they modulate ECM homeostasis.
Strengths:
The findings of these studies are in principle interesting and the experiments are in principle good.
Weaknesses:
The manuscript suffers from an evident lack of clarity and precision in regard to their findings and their interpretation.
We thank the reviewer for the valuable feedback. We hope the changes proposed improve clarity and precision throughout the manuscript.
(1) The authors talk about Thbs4 expression in NSCs and astrocytes, but neither of both is shown in Figure 1, nor have they used cell type-specific markers.
As we also reported to Referee #1 (major point 1), Thbs4 is widely considered in the literature to be a valid marker for newly formed astrocytes (Beckervordersanforth et al., 2010; Benner et al., 2013; Llorens-Bobadilla et al., 2015; Codega et al., 2014; Basak et al., 2018; Mizrak et al., 2019; Kjell et al., 2020; Cebrian-Silla et al., 2021). Some of the studies mentioned here and discussed in the manuscript text also associate Thbs4 with B-type cells based on scRNAseq data (Llorens-Bobadilla et al., 2015; Cebrian-Silla et al., 2021; Basak et al., 2018). Moreover, we also showed colocalization of Thbs4 with the activated stem cell marker nestin (Fig. 2), the glial marker GFAP (Fig. 3) and the dorsal NSC marker tdTOM (from electroporation, Fig. 4).
(2) Very important for all following experiments is to show that Thbs4 is not expressed outside of the SVZ, specifically in the areas where the lesion will take place. If Thbs4 was expressed there, the conclusion that Thbs4+ cells come from the SVZ to migrate to the lesion would be entirely wrong.
In Figure 1a, we show that Thbs4 is expressed in the telencephalon, exclusively in the neurogenic regions like SVZ, RMS and OB, together with cerebellum and VTA, which are likely not directly topographically connected to the damaged area (cortex and striatum). Regarding the origin of Thbs4+ cells, we demonstrated their SVZ origin by lineage tracking experiments after in vivo cell labeling (Fig. 4).
(3) Next, the authors want to confirm the expression level of Thbs4 by electroporation of pThbs4-eGFP at P1 and write that this results in 20% of total cells expressing GFP, especially in the rostral SVZ. I do not understand the benefit of this sentence. This may be a confirmation of expression, but it also shows that the GFP+ cells derive from early postnatal NSCs.
Furthermore, these cells look all like astrocytes, so the authors could have made a point here that indeed early postnatal NSCs expressing Thbs4 generate astrocytes alongside development. Here, it would have been interesting to see how many of the GFP+ cells are still NSCs.
We thank the reviewer for this useful remark. We have rephrased this paragraph in the results section (Line 99).
(4) In the next chapter, the authors show that Thbs4 increases in expression after brain injury. I do not understand the meaning of the graphs showing expression levels of distinct cell types of the neuronal lineage. Please specify why this is interesting and what to conclude from that.
Also here, the expression of Thbs4 should be shown outside of the SVZ as well.
In Fig. 2, we show the temporal expression of two markers (besides Thbs4) in the SVZ. Nestin and DCX are the gold standard markers for NSCs, with DCX present in neuroblasts. This is already explained in line 119. What we had not explained, and now state in line 124, is that Nestin and DCX decrease immediately after ischemia (7 d time point). This probably means that NSCs stop differentiating into neuroblasts to favor glioblast formation. This is also supported by the experiments in the olfactory bulb depicted in Fig. S5C-H.
(5) Next, the origin of newborn astrocytes from the SVZ upon ischemia is revealed. The graphs indicate that the authors perfused at different time points after tMCAO. Did they also show the data of the early time points? If only of the 30dpi, they should remove the additional time points indicated in the graph. In line 127 they talk about the origin of newborn astrocytes. Until now they have not even mentioned that new astrocytes are generated. Furthermore, the following sentences are imprecise: first they write that the number of slow proliferation NSCs is increased, then they talk about astrocytes. How exactly did they identify astrocytes and separate them from NSCs? Morphologically? Because both cell types express GFAP and Thbs4.
The same problem also occurs throughout the next chapter.
We thank the reviewer for this interesting comment. The experiment in Fig. 3 combines BrdU and IdU. This is a tricky experiment: chronic BrdU, which labels slow-proliferating cells, is normally analyzed after 30 d because the experimenter must wait for the BrdU to wash out. Since we also wanted to label fast-proliferating cells with IdU, we gave IP injections of this nucleotide at the different time points and perfused the day after. It would not make sense to show BrdU at earlier time points. We do so in Fig. 3e only to colocalize it with Thbs4 and read the tendency of the experiment. The quantification of BrdU (not of IdU) is done only at 30 DPI, as explained in the methods (line 407).
“In line 127, they talk about the origin of newborn astrocytes…”
Indeed, we wanted to introduce in the paragraph title that ischemia induces the generation of new astrocytes, which is described more clearly in the text. We changed the paragraph title to “Characterization of Ischemia-induced cell populations”.
“How exactly did they identify astrocytes and separate them from NSC?”
With this experiment, using two different protocols to label proliferating cells (BrdU vs IdU), we wanted to track the precursor cells that give rise to astrocytes and that already express the marker Thbs4. Indeed, the different increase and rate of proliferation relate only to the progenitor cells that will later differentiate into astrocytes. In this experiment we only referred to astrocytes in the last sentence: “These results suggest that, after ischemia, Thbs4-positive astrocytes derive from the slow proliferative type B cells.”
(6) "These results suggest that ischemia-induced astrogliogenesis in the SVZ occurs in type B cells from the dorsal region, and that these newborn Thbs4-positive astrocytes migrate to the ischemic areas." This sentence is a bit dangerous and bears at least one conceptual difficulty: if NSCs generate astrocytes under normal conditions and along the course of postnatal development (which they do), then local astrocytes (expressing the tdTom because they stem from a postnatal NSC), may also react to MCAO and proliferate locally. So the astrocytes along the scar do not necessarily come from adult NSCs upon injury but from local astrocytes. If the authors state that NSCs generate astrocytes that migrate to the lesion, I would like to see that no astrocytes inside the striatum carry the tdTom reporter before MCAO is committed.
We understand the referee’s concern about the postnatal origin of astrocytes that can also be labeled with tdTom. Our hypothesis, tested at the beginning of the paper, is that SVZ-derived astrocytes derive from slow proliferative NSCs. Thus, it is reasonable that tdTom+ cells can reach the cortical region in such a short time frame. This is why we assumed that local astrocytes cannot be positive for tdTom. We characterized the expression of tdTom in sham animals and observed few tdTom+ cells in the cortex and striatum (Author response image 2 and Figure S4). Under physiological conditions, tdTom expression remains mainly in the SVZ and the corpus callosum. However, proliferation of local astrocytes labeled with tdTom (early postnatal astrocytes) could explain the small percentage of tdTom+ cells in the ischemic regions that do not express Thbs4, although this percentage could also represent other cell types such as OPCs or oligodendrocytes.
Author response image 2.
(7) If astrocytes outside the SVZ do not express Thbs4, I would like to see it. Otherwise, the discrimination of SVZ-derive GFAP+/Thbs4+ astrocytes and local astrocytes expressing only GFAP is shaky.
Regarding Thbs4 outside the SVZ, we already answered this in point 2 (please refer to Fig 1A). We also quantified the expression of Thbs4+/GFAP+ astrocytes in the corpus callosum, cortex and striatum of sham and MCAO mice (Figure S5a-b) and we did not observe that local astrocytes express Thbs4 under physiological conditions.
(8) Please briefly explain what a Skeleton analysis and a Fractal dimension analysis is, and what it is good for.
We apologize for the brief information on the Skeleton and Fractal dimension analyses. We have included a detailed explanation of these analyses in the methods (lines 484-494).
(9) The chapter on HA is again a bit difficult to follow. Please rewrite to clarify who produces HA and who removes it by again showing all astrocyte subtypes (GFAP+/Thbs4+ and GFAP+/Thbs4-).
We apologize for the lack of clarity. We rewrote some passages of those chapters (changes in red), trying to convey the ideas more clearly. We also changed a panel in Figure S6b-c to clarify all astrocyte subtypes that internalize hyaluronan (Thbs4+/GFAP+ and Thbs4-/GFAP+). See Author response image 3.
Author response image 3.
(10) Why did the authors separate dorsal, medial, and ventral SVZ so carefully? Do they comment on it? As far as I remember, astrogenesis in physiological conditions has some local preferences (dorsal?)
We performed the electroporation protocol in the dorsal SVZ based on previous results (Figure 3 and Figure S3). NSCs produce specific neurons in the olfactory bulb according to their location in the SVZ. However, postnatal production of astrocytes mainly occurs through local astrocyte proliferation, and the SVZ contribution is very limited at this time point.
Reviewer #3 (Public Review)
Summary:
The authors aimed to study the activation of gliogenesis and the role of newborn astrocytes in a post-ischemic scenario. Combining immunofluorescence, BrdU-tracing, and genetic cellular labelling, they tracked the migration of newborn astrocytes (expressing Thbs4) and found that Thbs4-positive astrocytes modulate the extracellular matrix at the lesion border by synthesis but also degradation of hyaluronan. Their results point to a relevant function of SVZ newborn astrocytes in the modulation of the glial scar after brain ischemia. This work's major strength is the fact that it is tackling the function of SVZ newborn astrocytes, whose role is undisclosed so far.
Strengths:
The article is innovative, of good quality, and clearly written, with properly described Materials and Methods, data analysis, and presentation. In general, the methods are designed properly to answer the main question of the authors, being a major strength. Interpretation of the data is also in general well done, with results supporting the main conclusions of this article.
Weaknesses:
However, there are some points of this article that still need clarification to further improve this work.
(1) As a first general comment, is it possible that the increase in Thbs4-positive astrocytes can also happen locally close to the glia scar, through the proliferation of local astrocytes or even from local astrocytes at the SVZ? As it was shown in published articles most of the newborn astrocytes in the adult brain actually derive from proliferating astrocytes, and a smaller percentage is derived from NSCs. How can the authors rule out a contribution of local astrocytes to the increase of Thbs4-positive astrocytes? The authors also observed that only about one-third of the astrocytes in the glial scar derived from the SVZ.
We thank the reviewer for the interesting comment. We have extended the discussion of this topic in the manuscript (lines 333-342), including the statement that about a third of glial scar astrocytes come from the SVZ, without downplaying the role of local astrocytes. Whether the glial scar is populated by newborn astrocytes derived from the SVZ or from local astrocytes is under debate: some groups found a contribution from local astrocytes (Frisén group, Magnusson et al., 2014), while others observed the opposite (Li et al., 2010; Benner et al., 2013; Faiz et al., 2015; Laug et al., 2019; Pous et al., 2020).
In our study we observed that Thbs4 expression is almost absent in the cortex and striatum of sham mice. To demonstrate that newborn astrocytes are derived from the SVZ we used two techniques: chronic BrdU treatment and cell tracing, which mainly labels SVZ neural stem cells. Fast-proliferating cells lose BrdU quickly, so local astrocytes under ischemic conditions do not retain BrdU. In addition, we injected IdU the day before perfusion to see whether local astrocytes express Thbs4 when they respond to brain ischemia. However, we did not observe proliferating local astrocytes expressing Thbs4 after MCAO (see Author response image 4).
Author response image 4.
As mentioned in the response to Reviewer 2, the cell tracing technique could label early postnatal astrocytes. We characterized the technique, and only a small percentage of tdTom expression was found in the cortex and striatum of sham animals. This tdTom population could explain the percentage of tdTom+ cells in the ischemic regions that do not express Thbs4, although this percentage could also represent other cell types such as OPCs or oligodendrocytes. Taken together, the evidence suggests that the Thbs4+ astrocyte population derives from the SVZ.
We indeed observed a small contribution of Thbs4+ astrocytes to the glial scar. However, Thbs4+ astrocytes arrive at the lesion at a critical temporal window - when local hyper-reactive astrocytes die or lose their function. We hypothesized that Thbs4+ astrocytes could help local astrocytes or replace them in reorganizing the extracellular space and the glial scar, an instrumental process for the recovery of the ischemic area.
(2) It is known that the local, GFAP-reactive astrocytes at the scar can form the required ECM. The authors propose a role of Thbs4-positive astrocytes in the modulation, and perhaps maintenance, of the ECM at the scar, thus participating in scar formation likewise. So, this means that the function of newborn astrocytes is only to help the local astrocytes in the scar formation and thus contribute to tissue regeneration. Why do we need specifically the Thbs4-positive astrocytes migrating from the SVZ to help the local astrocytes? Can you discuss this further?
Unfortunately, we could not demonstrate which molecular machinery is involved in these mechanisms, and we can only speculate about the functional meaning of a second wave of glial activation. We added a lengthy discussion in lines 333-342.
(3) The authors observed that the number of BrdU- and DCX-positive cells decreased 15 dpi in all OB layers (Fig. S5). They further suggest that ischemia-induced a change in the neuroblasts ectopic migratory pathway, depriving the OB layers of the SVZ newborn neurons. Are the authors suggesting that these BrdU/DCX-positive cells now migrate also to the ischemic scar, or do they die? In fact, they see an increase in caspase-3 positive cells in the SVZ after ischemia, but they do not analyse which type of cells are dying. Alternatively, is there a change in the fate of the cells, and astrogliogenesis is increased at the expense of neurogenesis? The authors should understand which cells are Cleaved-caspase-3 positive at the SVZ and clarify if there is a change in cell fate. Also please clarify what happens to the BrdU/DCX-positive cells that are born at the SVZ but do not migrate properly to the OB layers.
Indeed, we cannot determine the fate of the missing BrdU/DCX cells in the OB. We can reasonably speculate that, following the ischemic insult, the neurogenic machinery shifts toward investing more energy in generating glial cells to support the lesion. We did not analyze the fate of the DCX+ cells that would normally migrate to and differentiate in the OB, that is, whether they die or whether there is a shift in the differentiation program in the SVZ, since we consider that question to be outside the scope of the study.
(4) The authors showed decreased Nestin protein levels at 15 dpi by western blot and immunostaining shows a decrease already at 7div (Figure 2). These results mean that there is at least a transient depletion of NSCs due to the promotion of astrogliogenesis. However, the authors show that at 30dpi there is an increase of slow proliferating NSCs (Figure 3). Does this mean, that there is a reestablishment of the SVZ cytogenic process? How does it happen, more specifically, how NSCs number is promoted at 30dpi? Please explain how are the NSCs modulated throughout time after ischemia induction and its impact on the cytogenic process.
Based on the chronic BrdU treatment, the results suggest a restoration of the SVZ cytogenic process (also observed in nestin and DCX protein expression at 30 dpi). However, we did not analyze how it happens (through asymmetric or symmetric divisions). As suggested by the Encinas group, we hypothesize that brain ischemia induces exhaustion of the SVZ neurogenic niche through symmetric divisions of NSCs into reactive astrocytes.
(5) The authors performed a classification of Thbs4-positive cells in the SVZ according to their morphology. This should be confirmed with markers expressed by each of the cell subtypes.
We thank the referee for the comment. Classifying NSCs based on different markers could also be tricky because different NSC cell types share markers. This classification was made considering the specific morphology of each NSC cell type. In addition, Thbs4 expression in B-type cells is also observed in other studies (Llorens-Bobadilla et al., 2015; Cebrian-Silla et al., 2021; Basak et al., 2018).
(6) In Figure S6, the authors quantified HABP spots inside Thbs4-positive astrocytes. Please show a higher magnification picture to show how this quantification was done.
We quantified HABP area and HABP spots inside Thbs4+ astrocytes with a custom FIJI script. The Thbs4 cell mask was generated by automatic thresholding within the GFAP cell mask. The HABP channel was thresholded, and the binary image was processed with a 1-pixel median filter (to eliminate 1 px noise-related spots). The “Analyze Particles” tool was used to count HABP spots within the cell ROI. The HABP spot number per compartment and population was exported to Excel, and the data were normalized by dividing HABP spots per ROI by total HABP spots. See Author response image 5.
Author response image 5.
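The quantification steps described above (threshold, median filter, particle counting within an ROI, normalization) can be illustrated with a minimal Python sketch. This is not the authors' FIJI script: the fixed intensity cutoff, the 3x3 window used for the "1-pixel" median filter, the 8-connected definition of a spot, and all function names are assumptions made for illustration only.

```python
from collections import deque

def threshold(image, cutoff):
    """Binarize: 1 where intensity exceeds the (assumed, fixed) cutoff."""
    return [[1 if v > cutoff else 0 for v in row] for row in image]

def median_filter(binary, radius=1):
    """Median filter on a binary image. For 0/1 pixels the median equals a
    majority vote over the window (ties resolve to background)."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [binary[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = int(sum(window) * 2 > len(window))
    return out

def count_spots(binary, roi):
    """Count 8-connected foreground components restricted to the ROI mask."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    spots = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] and roi[y][x] and not seen[y][x]:
                spots += 1
                seen[y][x] = True
                queue = deque([(y, x)])
                while queue:  # flood-fill the whole spot
                    cy, cx = queue.popleft()
                    for ny in range(max(0, cy - 1), min(h, cy + 2)):
                        for nx in range(max(0, cx - 1), min(w, cx + 2)):
                            if binary[ny][nx] and roi[ny][nx] and not seen[ny][nx]:
                                seen[ny][nx] = True
                                queue.append((ny, nx))
    return spots

def normalized_spot_fraction(image, roi, cutoff):
    """Spots inside the ROI divided by total spots in the image."""
    binary = median_filter(threshold(image, cutoff))
    everywhere = [[1] * len(row) for row in image]
    total = count_spots(binary, everywhere)
    return count_spots(binary, roi) / total if total else 0.0

# Toy 10x10 image: two 3x3 bright spots plus one isolated noise pixel.
image = [[10] * 10 for _ in range(10)]
for y0, x0 in [(1, 1), (6, 6)]:
    for dy in range(3):
        for dx in range(3):
            image[y0 + dy][x0 + dx] = 200
image[1][7] = 200  # single-pixel noise, removed by the median filter
roi = [[1 if y < 5 else 0 for _ in range(10)] for y in range(10)]  # hypothetical cell ROI
fraction = normalized_spot_fraction(image, roi, cutoff=50)
```

In this toy image the isolated pixel is discarded by the filter, so only the two larger spots are counted, and one of the two lies inside the ROI.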
Author response:
Public Reviews:
Reviewer #1 (Public review):
When you search for something, you need to maintain some representation (a "template") of that target in your mind/brain. Otherwise, how would you know what you were looking for? If your phone is in a shocking pink case, you can guide your attention to pink things based on a target template that includes the attribute 'pink'. That guidance should get you to the phone pretty effectively if it is in view. Most real-world searches are more complicated. If you are looking for the toaster, you will make use of your knowledge of where toasters can be. Thus, if you are asked to find a toaster, you might first activate a template of a kitchen or a kitchen counter. You might worry about pulling up the toaster template only after you are reasonably sure you have restricted your attention to a sensible part of the scene.
Zhou and Geng are looking for evidence of this early stage of guidance by information about the surrounding scene in a search task. They train Os to associate four faces with four places. Then, with Os in the scanner, they show one face - the target for a subsequent search. After an 8 sec delay, they show a search display where the face is placed on the associated scene 75% of the time. Thus, attending to the associated scene is a good idea. The questions of interest are "When can the experimenters decode which face Os saw from fMRI recording?" "When can the experimenters decode the associated scene?" and "Where in the brain can the experimenters see evidence of this decoding?" The answer is that the face but not the scene can be read out during the face's initial presentation. The key finding is that the scene can be read out (imperfectly but above chance) during the subsequent delay when Os are looking at just a fixation point. Apparently, seeing the face conjures up the scene in the mind's eye.
This is a solid and believable result. The only issue, for me, is whether it is telling us anything specifically about search. Suppose you trained Os on the face-scene pairing but never did anything connected to the search. If you presented the face, would you not see evidence of recall of the associated scene? Maybe you would see the activation of the scene in different areas and you could identify some areas as search specific. I don't think anything like that was discussed here.
You might also expect this result to be asymmetric. The idea is that the big scene gives the search information about the little face. The face should activate the larger useful scene more than the scene should activate the more incidental face, if the task was reversed. That might be true if the finding is related to a search where the scene context is presumed to be the useful attention guiding stimulus. You might not expect an asymmetry if Os were just learning an association.
It is clear in this study that the face and the scene have been associated and that this can be seen in the fMRI data. It is also clear that a valid scene background speeds the behavioral response in the search task. The linkage between these two results is not entirely clear but perhaps future research will shed more light.
It is also possible that I missed the clear evidence of the search-specific nature of the activation by the scene during the delay period. If so, I apologize and suggest that the point be underlined for readers like me.
We will respond to this question by acknowledging that the reviewer is right in that the delay period activation of the scene is not necessarily search-specific. We will then discuss how this possibility affects the interpretation of our results and what kind of studies would need to be conducted in order to fully establish a causal link between delay period activity and visual search performance. We will also discuss the literature on cued attention and situate our work within the context of these other studies that have used similar task paradigms to infer attentional processes. Finally, we will discuss the interpretation of delay period activity in PPA and IFJ.
Reviewer #2 (Public review):
Summary:
This work is one of the best instances of a well-controlled experiment and theoretically impactful findings within the literature on templates guiding attentional selection. I am a fan of the work that comes out of this lab and this particular manuscript is an excellent example as to why that is the case. Here, the authors use fMRI (employing MVPA) to test whether during the preparatory search period, a search template is invoked within the corresponding sensory regions, in the absence of physical stimulation. By associating faces with scenes, a strong association was created between two types of stimuli that recruit very specific neural processing regions - FFA for faces and PPA for scenes. The critical results showed that scene information that was associated with a particular cue could be decoded from PPA during the delay period. This result strongly supports the invoking of a very specific attentional template.
Strengths:
There is so much to be impressed with in this report. The writing of the manuscript is incredibly clear. The experimental design is clever and innovative. The analysis is sophisticated and also innovative. The results are solid and convincing.
Weaknesses:
I only have a few weaknesses to point out.
This point is not so much of a weakness, but a further test of the hypothesis put forward by the authors. The delay period was long - 8 seconds. It would be interesting to split the delay period into the first 4 seconds and the last 4 seconds and run the same decoding analyses. The hypothesis here is that semantic associations take time to evolve, and it would be great to show that decoding gets stronger in the second delay period as opposed to the period right after the cue. I don't think this is necessary for publication, but I think it would be a stronger test of the template hypothesis.
We will conduct the suggested analysis. Depending on the outcome, we will include it in supplemental materials or the main text.
Typo in the abstract: "curing" vs "during."
We will fix this.
It is hard to know what to do with significant results in ROIs that are not motivated by specific hypotheses. However, for Figure 3, what are the explanations for ROIs that show significant differences above and beyond the direct hypotheses set out by the authors?
We will address how each of the ROIs was selected based on the use of a priori networks as masks, with ROIs as sub-parcels. We will explain why specific ROIs were associated with the strongest hypotheses, and how the entire networks are nonetheless relevant and related to the existing literature on attentional control and working memory. This content will be included in the introduction and discussion sections.
Reviewer #3 (Public review):
The manuscript contains a carefully designed fMRI study, using MVPA pattern analysis to investigate which high-level association cortices contain target-related information to guide visual search. A special focus is hereby on so-called 'target-associated' information, that has previously been shown to help in guiding attention during visual search. For this purpose the authors trained their participants and made them learn specific target-associations, in order to then test which brain regions may contain neural representations of those learnt associations. They found that at least some of the associations tested were encoded in prefrontal cortex during the cue and delay period.
The manuscript is very carefully prepared. As far as I can see, the statistical analyses are all sound and the results integrate well with previous findings.
I have no strong objections against the presented results and their interpretation.
Thank you.
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1:
Comment#1: Ren et al developed a novel computational method to investigate cell evolutionary trajectory for scRNA-seq samples. This method, MGPfact, estimates pseudotime and potential branches in the evolutionary path by explicitly modeling the bifurcations in a Gaussian process. They benchmarked this method using synthetic as well as real-world samples and showed superior performance for some of the tasks in cell trajectory analysis. They further demonstrated the utilities of MGPfact using single-cell RNA-seq samples derived from microglia or T cells and showed that it can accurately identify the differentiation timepoint and uncover biologically relevant gene signatures. Overall I think this is a useful new tool that could deliver novel insights for the large body of scRNA-seq data generated in the public domain. The manuscript is written in a logical way and most parts of the method are well described.
Thank you for reviewing our manuscript and for your positive feedback on MGPfact. We are pleased that you find it useful for identifying differentiation timepoints and uncovering gene signatures. We will continue to refine MGPfact and explore its applications across diverse datasets. Your insights are invaluable, and we appreciate your support.
Comment#2: Some parts of the methods are not clear. It should be outlined in detail how pseudo time T is updated in Methods. It is currently unclear either in the description or Algorithm 1.
We thank the reviewer for the comment. We have added a description of how pseudotime T is obtained in lines 138 to 147 of the article. In brief, the pseudotime of MGPfact is inferred through Gaussian process regression on the downsampled single-cell transcriptomic data. Specifically, T is treated as a continuous variable representing the progression of cells through the differentiation process. We describe the relationship between pseudotime and expression data using the formula:
$$Y^{*} = f(T) + \varepsilon$$
where $f(T)$ is a Gaussian Process (GP) with covariance matrix $S$, and $\varepsilon$ represents the error term. The Gaussian process is defined as:
[equation not recoverable from the source]
Here the variance is set to 1e-6.
During inference, we update the pseudotime by maximizing the posterior likelihood. Specifically, the posterior distribution of pseudotime T can be represented as:
$$p(T \mid Y^{*}) \propto p(Y^{*} \mid T)\, p(T)$$
where $p(Y^{*} \mid T)$ is the likelihood function of the observed data $Y^{*}$, and $p(T)$ is the prior distribution of the Gaussian process. This posterior distribution integrates the observed data with model priors, enabling inference of pseudotime and trajectory simultaneously. Due to the high autocorrelation of T in the posterior distribution, we use Adaptive Metropolis within Gibbs (AMWG) sampling (Roberts and Rosenthal, 2009; Tierney, 1994). Other parameters are estimated using the more efficient SLICE sampling technique (Neal, 2003).
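To make the sampling step more concrete, the following is a minimal Python sketch of adaptive Metropolis-within-Gibbs in the style of Roberts and Rosenthal (2009): each coordinate is updated in turn with a random-walk proposal, and the proposal scale is tuned after every batch toward an acceptance rate of about 0.44. It is not the MGPfact implementation. The target here is a toy posterior with an independent Gaussian likelihood and prior standing in for the GP model, and the names `amwg_sample`, `toy_log_posterior`, and all parameter values are assumptions made for illustration.

```python
import math
import random

def amwg_sample(log_post, init, n_iter=3000, batch=50, target=0.44,
                init_sd=0.1, seed=0):
    """Adaptive Metropolis-within-Gibbs: one random-walk update per coordinate,
    with per-coordinate proposal scales adapted after each batch."""
    rng = random.Random(seed)
    x = list(init)
    d = len(x)
    log_sd = [math.log(init_sd)] * d   # log proposal standard deviations
    accepted = [0] * d
    current = log_post(x)
    samples = []
    for it in range(1, n_iter + 1):
        for k in range(d):
            old = x[k]
            x[k] = old + rng.gauss(0.0, math.exp(log_sd[k]))
            proposed = log_post(x)
            # Metropolis acceptance with probability min(1, posterior ratio)
            if rng.random() < math.exp(min(0.0, proposed - current)):
                current = proposed
                accepted[k] += 1
            else:
                x[k] = old
        samples.append(list(x))
        if it % batch == 0:
            # Diminishing adaptation: the step size shrinks with the batch index
            delta = min(0.01, (it // batch) ** -0.5)
            for k in range(d):
                log_sd[k] += delta if accepted[k] / batch > target else -delta
                accepted[k] = 0
    return samples

def toy_log_posterior(t, observed, noise_sd=0.1, prior_mean=0.5, prior_sd=1.0):
    """Toy stand-in for p(Y*|T) p(T): Gaussian likelihood times Gaussian prior,
    independent per cell (an assumption; the source uses a GP prior)."""
    log_lik = sum(-(y - ti) ** 2 / (2 * noise_sd ** 2) for y, ti in zip(observed, t))
    log_prior = sum(-(ti - prior_mean) ** 2 / (2 * prior_sd ** 2) for ti in t)
    return log_lik + log_prior

observed = [0.2, 0.5, 0.8]  # hypothetical noisy pseudotime readouts for three cells
samples = amwg_sample(lambda t: toy_log_posterior(t, observed), [0.5, 0.5, 0.5], seed=1)
kept = samples[500:]        # discard burn-in
posterior_means = [sum(s[k] for s in kept) / len(kept) for k in range(3)]
```

Updating one coordinate at a time keeps each proposal small in dimension, which is one common way to cope with strongly correlated pseudotime values. A full implementation would replace the toy prior with the GP covariance.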
Comment#3: There should be a brief description in the main text of how synthetic data were generated, under what hypothesis, and specifically how bifurcation is embedded in the simulation.
We thank the reviewer for the comment. We have added a description of the synthetic datasets to the methods section. The revised content is in lines 487 to 493:
“The synthetic datasets were generated using four simulators: dyngen (Saelens et al., 2019), dyntoy (Saelens et al., 2019), PROSSTT (Papadopoulos et al., 2019), and Splatter (Zappia et al., 2017), each modeling different trajectory topologies such as linear, branching, and cyclic. Splatter simulates branching events by setting expression states and transition probabilities, dyntoy generates random expression gradients to reflect dynamic changes, and dyngen focuses on complex branching structures within gene regulatory networks.”
Comment#4: Please explain what the abbreviations mean at their first occurrence.
We appreciate the reviewers' feedback. We have thoroughly reviewed the entire manuscript and made sure that all abbreviations have had their full forms provided upon their first occurrence.
Comment#5: In the benchmark analysis (Figures 2/3), it would be helpful to include a few trajectory plots of the real-world data to visualize the results and to evaluate the accuracy.
We appreciate the reviewer's feedback. To more clearly demonstrate the performance of MGPfact, we selected three representative cases from the dataset for visual comparison. These cases represent different types of trajectory structures: linear, bifurcation, and multifurcation. The revised content is between line 220 and 226.
As shown in Supplementary Fig. 5, it is evident that MGPfact excels in capturing main developmental paths and identifying key bifurcation points. In the linear trajectory structure, MGPfact accurately predicted the linear structure without bifurcation events, showing high consistency with the ground truth (overall=0.871). In the bifurcation trajectory structure, MGPfact accurately captured the main bifurcation event (overall=0.636). In the multifurcation trajectory structure, although MGPfact predicted only one bifurcation point, its overall structure remains close to the ground truth, as evidenced by its high overall score (overall=0.566). Overall, MGPfact demonstrates adaptability and accuracy in reconstructing various types of trajectory structures.
Comment#6: It is not clear how this method selects important genes/features at bifurcation. This should be elaborated on in the main text.
We thank the reviewers for the comments. To enhance understanding, we have added detailed descriptions of gene selection in the main text and appendix, specifically from lines 150 to 161. In brief, MGPfact employs a Gaussian process mixture model to infer cell fate trajectories and identify independent branching events. We calculate load matrices using formulas 1 and 14 to assess each gene's contribution to the trajectories. Genes with an absolute weight greater than 0.05 are considered predominant in specific branching processes. Subsequently, SCENIC (Aibar et al., 2017; Bravo González-Blas et al., 2023) analysis was conducted to further infer the underlying regulons and annotate the biological processes of these genes.
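To make the thresholding step concrete, the following minimal Python sketch filters genes by absolute weight. The weight values and the helper name `select_high_weight_genes` are illustrative assumptions and not part of MGPfact; only the 0.05 cutoff comes from the text above.

```python
# Hypothetical gene weights for one bifurcation; the numbers are invented for illustration.
weights = {"Mki67": 0.21, "Top2a": -0.12, "Actb": 0.01, "Gapdh": -0.03}


def select_high_weight_genes(weights, threshold=0.05):
    """Return genes whose absolute weight exceeds the threshold, largest first."""
    selected = [gene for gene, w in weights.items() if abs(w) > threshold]
    return sorted(selected, key=lambda gene: abs(weights[gene]), reverse=True)


print(select_high_weight_genes(weights))
```

Using the absolute value keeps genes that load strongly in either direction of the bifurcation.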
Comment#7: It is not clear how survival analysis was performed in Figure 5. Specifically, were critical confounders, such as age, clinical stage, and tumor purity controlled?
To evaluate the predictive and prognostic impacts of the selected genes, we utilized the Cox multivariate regression model, where the effects of relevant covariates, including age, clinical stage, and tumor purity, were adjusted. We then conducted the Kaplan-Meier survival analysis again to ensure the reliability of the results. The revisions mainly include the following sections:
(1) We modified the description of adjusting for confounding factors in the survival analysis, from line 637 to 640:
“To adjust for possible confounding effects, the relevant clinical features including age, sex and tumor stage were used as covariates. The Cox regression model was implemented using R-4.2 package “survival”. And we generated Kaplan-Meier survival curves based on different classifiers to illustrate differences in survival time and report the statistical significance based on Log-rank test.”
(2) We updated the images in the main text regarding the survival analysis, including Fig. 5a-b, Fig. 6c, and Supplementary Fig. 8e.
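For readers less familiar with the survival curves mentioned above, a minimal Kaplan-Meier estimator can be sketched in plain Python. This is an illustration only: the analysis itself was run with the R package “survival”, the function name and toy cohort below are assumptions, and covariate adjustment by Cox regression is not shown.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival probability) pairs, one per observed event time.
    """
    curve = []
    survival = 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
    return curve


# Toy cohort (invented values): five patients, two of them censored.
print(kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 0, 1]))
```

Censored patients contribute to the at-risk count until their last follow-up but never trigger a step in the curve.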
Comment#8: I recommend that the authors perform some sort of 'robustness' analysis for the consensus tree built from the bifurcation Gaussian process. For example, subsample 80% of the cells to see if the bifurcations are similar between each bootstrap.
We appreciate the reviewers' feedback. We performed a robustness analysis of the consensus tree using 100 training datasets. This involved sampling the original data at different proportions, and then calculating the topological similarity between the consensus trajectory predictions of MGPfact and those without sampling, using the Hamming-Ipsen-Mikhailov (HIM) metric. A higher score indicates greater robustness. The relevant figure is in Supplementary Fig. 4, and the description is in the main text from line 177 to 182.
The results indicate that the consensus trajectory predictions based on various sampling proportions of the original data maintain a high topological similarity with the unsampled results (HIM<sub>mean</sub>=0.686). This demonstrates MGPfact’s robustness and generalizability under different data conditions, hence the capability of capturing bifurcative processes in the cells’ trajectory.
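The HIM metric combines a Hamming component with an Ipsen-Mikhailov (spectral) component. As a rough illustration of the Hamming part only, the sketch below scores two trajectory graphs by the fraction of node pairs on which their unweighted edge sets agree. The function names, the unweighted treatment of edges, and the toy graphs are assumptions; this is not the full HIM computation used in the benchmark.

```python
def edge_set(edges):
    """Normalize undirected edges so that (a, b) and (b, a) compare equal."""
    return {frozenset(edge) for edge in edges}


def hamming_similarity(edges_a, edges_b, n_nodes):
    """1 minus the normalized Hamming distance between two undirected graphs
    defined on the same n_nodes labelled nodes."""
    differing = len(edge_set(edges_a) ^ edge_set(edges_b))  # edges found in only one graph
    possible = n_nodes * (n_nodes - 1) // 2                 # number of node pairs
    return 1.0 - differing / possible


# Toy example: consensus tree from all cells vs. from a subsample (invented topologies).
full = [("root", "A"), ("A", "B"), ("A", "C")]
subsampled = [("root", "A"), ("A", "B"), ("B", "C")]
print(hamming_similarity(full, subsampled, n_nodes=4))
```

Repeating such a comparison over many subsamples and averaging the scores gives a robustness summary of the kind reported above.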
Reviewer #2:
Comment#1: The authors present MGPfact<sup>XMBD</sup>, a novel model-based manifold-learning framework designed to address the challenges of interpreting complex cellular state spaces from single-cell RNA sequences. To overcome current limitations, MGPfact<sup>XMBD</sup> factorizes complex development trajectories into independent bifurcation processes of gene sets, enabling trajectory inference based on relevant features. As a result, it is expected that the method provides a deeper understanding of the biological processes underlying cellular trajectories and their potential determinants. MGPfact<sup>XMBD</sup> was tested across 239 datasets, and the method demonstrated similar to slightly superior performance in key quality-control metrics to state-of-the-art methods. When applied to case studies, MGPfact<sup>XMBD</sup> successfully identified critical pathways and cell types in microglia development, validating experimentally identified regulons and markers. Additionally, it uncovered evolutionary trajectories of tumor-associated CD8+ T cells, revealing new subtypes with gene expression signatures that predict responses to immune checkpoint inhibitors in independent cohorts. Overall, MGPfact<sup>XMBD</sup> represents a relevant tool in manifold learning for scRNA-seq data, enabling feature selection for specific biological processes and enhancing our understanding of the biological determinants of cell fate.
Thank you for your thoughtful review of our manuscript. We are thrilled to hear that you find MGPfact<sup>XMBD</sup> beneficial for exploring cellular evolutionary paths in scRNA-seq data. Your insights are invaluable, and we look forward to incorporating them to further enrich our study. Thank you once again for your support and constructive feedback.
Comment#2: How the methods compare with existing Deep Learning based approaches such as TIGON is a question mark. If a comparison would be possible, it should be conducted; if not, it should be clarified why.
We appreciate the reviewer's comments. We have added a comparison with the scTour (Li, 2023) and TIGON (Sha, 2024) methods.
It is important to note that the encapsulation and comparison of MGPfact are based on traditional differentiation trajectory construction. Saelens et al. established a systematic evaluation framework that categorizes differentiation trajectory structures into topological subtypes such as linear, bifurcation, multifurcation, graph, and tree, focusing on identifying branching structures in the cell differentiation process (Saelens et al., 2019). The scTour and TIGON methods mentioned by the reviewer are primarily used for estimating RNA velocity, focusing on continuous temporal evolution rather than explicit branching structures, and do not explicitly model branches. Therefore, we considered the predictions of these two methods as linear trajectories and compared them with MGPfact. While scTour explicitly estimates pseudotime, TIGON uses the concept of "growth," which is analogous to pseudotime, so we made the necessary adaptations.
Author response image 1 shows that within this framework, compared to scTour (overall<sub>mean</sub>=0.448) and TIGON (overall<sub>mean</sub>=0.263), MGPfact still maintains a relatively high standard (overall<sub>mean</sub>=0.534). This indicates that MGPfact has a significant advantage in accurately capturing branching structures in cell differentiation, especially in applications where explicit modeling of branches is required.
Author response image 1.
Comparison of MGPfact with scTour and TIGON in trajectory inference performance across 239 test datasets. a. Overall scores; b. F1<sub>branches</sub>; c. HIM; d. cor<sub>dist</sub>; e. wcor<sub>features</sub>. All results are color-coded based on the trajectory types, with the black line representing the mean value. The “Overall” assessment is calculated as the geometric mean of all four metrics.
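The “Overall” score in the caption is a geometric mean of four metrics. A minimal sketch of that calculation follows; the function name and the example values are assumptions chosen for illustration, and each metric is assumed to lie in [0, 1].

```python
import math


def overall_score(him, f1_branches, cor_dist, wcor_features):
    """Geometric mean of the four benchmark metrics, each assumed to lie in [0, 1]."""
    metrics = [him, f1_branches, cor_dist, wcor_features]
    return math.prod(metrics) ** (1.0 / len(metrics))


# Invented metric values for a single dataset.
print(round(overall_score(0.8, 0.5, 0.6, 0.7), 3))
```

Unlike an arithmetic mean, the geometric mean drops to zero whenever any single metric is zero, so a method cannot compensate for failing one criterion by excelling at another.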
Comment#3: Missing Methods:
- The paper lacks a discussion of Deep Learning approaches for bifurcation analysis, e.g., scTour and TIGON.
- I am missing comments on methods such as CellRank, and alternative approaches to delineate a trajectory.
We thank the reviewer for these comments.
(1) As mentioned in our response to Comment#2, the scTour and TIGON methods are primarily used for estimating RNA velocity, focusing on continuous temporal evolution rather than explicit branching structures, and they do not explicitly model branches. We consider the predictions of these two methods as linear trajectories and compare them with MGPfact. The relevant description and discussion have been addressed in that response.
(2) We have added a description of RNA velocity estimation methods (scTour, TIGON, CellRank) in the introduction section. The revised content is from line 66 to 71:
“Moreover, recent studies based on RNA velocity have provided insights into cell state transitions. These methods measure RNA synthesis and degradation rates based on the abundance of spliced and unspliced mRNA, such as CellRank (Lange et al., 2022). Nevertheless, current RNA velocity analyses are still unable to resolve cell fates with complex branching trajectories. Deep learning methods such as scTour (Li, 2023) and TIGON (Sha, 2024) circumvent some of these limitations, offering continuous state assumptions or requiring prior cell sampling information.”
Comment#4: Impact of MURP:
The rationale for using MURP is well-founded, especially for trajectory definition. However, its impact on the final results needs evaluation.
How does the algorithm compare with a random subselection of cells or the entire cell set?
Thank you for the comments. We fully agree that MURP is crucial in trajectory prediction. As a downsampling method, MURP is specifically designed to address noise issues in single-cell data by dividing the data into several subsets, thereby maximizing noise reduction while preserving the main structure of biological variation (Ren et al., 2022). In MGPfact, MURP typically reduces the data to fewer than 100 downsampled points, preserving the core biological structure while lowering computational complexity. To assess MURP's impact, we conducted experiments by randomly selecting 20, 40, 60, 80, and 100 cells for trajectory inference. These results were mapped back to the original data using the KNN graph structure for final predictions, which were then compared with the MURP downsampling results. Supplementary results can be found in Supplementary Fig. 3, with additional descriptions in the main text from line 170 to 176.
The results indicate that trajectory inference using randomly sampled cells has significantly lower prediction accuracy compared to that using MURP. This is particularly evident in branch assignment (F1<sub>branches</sub>) and correlation cor<sub>dist</sub>, where the average levels decrease by 20.5%-64.9%. In contrast, trajectory predictions using MURP for downsampling show an overall score improvement of 5.31%-185%, further highlighting MURP's role in enhancing trajectory inference within MGPfact.
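The step of mapping results from sampled cells back to all cells can be illustrated with a small nearest-neighbour sketch. The response does not state the exact mapping rule, so the averaging over the k nearest sampled cells, the function name, and the toy coordinates are all assumptions.

```python
import math


def map_back_pseudotime(sampled_coords, sampled_time, all_coords, k=3):
    """Assign each cell the mean pseudotime of its k nearest sampled cells."""
    mapped = []
    for cell in all_coords:
        nearest = sorted(
            range(len(sampled_coords)),
            key=lambda i: math.dist(cell, sampled_coords[i]),
        )[:k]
        mapped.append(sum(sampled_time[i] for i in nearest) / len(nearest))
    return mapped


# Toy 1-D embedding: three sampled cells with known pseudotime, two cells to annotate.
sampled_coords = [(0.0,), (1.0,), (2.0,)]
sampled_time = [0.0, 0.5, 1.0]
print(map_back_pseudotime(sampled_coords, sampled_time, [(0.1,), (1.9,)], k=2))
```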
Comment#5: What is the impact of the number of components selected?
Thank you for the comments. In essence, MGPfact consists of two main steps: 1) trajectory inference; 2) calculation of factorized scores and identification of high-weight genes. After step 1, MGPfact estimates parameters such as pseudotime T and bifurcation points B. In step 2, we introduce a rotation matrix R to obtain factor scores W<sub>l</sub> for each trajectory l by rotating Y*.
For all trajectories,
where e<sub>l</sub> is the error term for the l-th trajectory. The number of features in Y* must match the dimensions of the rotation matrix R to ensure the factorized score matrix W contains factor scores for trajectories, achieving effective feature representation and interpretation in the model.
Additionally, to further illustrate the impact of the number of principal components (PCs) on model performance in step 1, we conducted additional experiments. We used 3 PCs as the default and adjusted the number to evaluate changes from this baseline. As shown in Author response image 2, setting the number of PCs to 1 significantly decreases the overall performance score (overall<sub>mean</sub>=0.363), as well as the wcor<sub>features</sub> and cor<sub>dist</sub> metrics. In contrast, increasing the number of PCs does not significantly affect the metrics. It should be noted that the number of components used should be determined by the intrinsic biological characteristics of the cell fate determination. Our experiment, based on a limited number of datasets, may not represent more complex scenarios in other cell types.
Author response image 2.
Robustness testing of the number of MURP PCA components on 100 training datasets. With the number of principal components (PCs) set to 3 by default, we tested the impact of different numbers of components (1-10) on the prediction results. In all box plots, the asterisk represents the mean value, while the whiskers extend to the farthest data points within 1.5 times the interquartile range. Significance is denoted as follows: not annotated indicates non-significant; * P < 0.05; ** P < 0.01; *** P < 0.001; two-sided paired Student’s t-tests.
Comment#6: Please comment on the selection of the kernel functions (rbf and polynomial) and explain why other options were discarded.
Thank you for the comments. We have added a description regarding the selection of radial basis functions and polynomial kernels in lines 126-130. As the reviewers mentioned, the choice of kernel functions is crucial in the MGPfact analysis pipeline for constructing the covariance matrix of the Gaussian process. We selected the radial basis function (RBF) kernel and the polynomial kernel to balance capturing data complexity and computational efficiency. The RBF kernel is chosen for its ability to effectively model smooth functions and capture local variations in the data, making it well-suited to the continuous and smooth characteristics of biological processes; its hyperparameters offer modeling flexibility. The polynomial kernel is used to capture more complex nonlinear relationships between input features, with its hyperparameters also allowing further customization of the model. In contrast, other complex kernels, such as Matérn or spectral kernels, were omitted due to their interpretability challenges and the risk of overfitting with limited data. However, as suggested by the reviewers, we will consider and test the impact of other kernel functions on the covariance matrix of the Gaussian process and their role in trajectory inference in our subsequent phases of algorithm design.
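As a reminder of what the two kernels compute, the sketch below gives textbook forms of the RBF and polynomial kernels for scalar pseudotimes and assembles a covariance matrix from them. The parameter names, default values, and the small diagonal term are assumptions for illustration and do not reproduce the MGPfact parameterization.

```python
import math


def rbf_kernel(t1, t2, length_scale=1.0, variance=1.0):
    """Squared-exponential (RBF) kernel between two scalar pseudotimes."""
    return variance * math.exp(-((t1 - t2) ** 2) / (2.0 * length_scale ** 2))


def polynomial_kernel(t1, t2, degree=2, coef0=1.0):
    """Polynomial kernel (t1 * t2 + coef0) ** degree for scalar inputs."""
    return (t1 * t2 + coef0) ** degree


def covariance_matrix(times, kernel, jitter=1e-6):
    """Gram matrix over pseudotimes, with a small diagonal term for numerical stability."""
    n = len(times)
    return [
        [kernel(times[i], times[j]) + (jitter if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


S = covariance_matrix([0.0, 0.5, 1.0], rbf_kernel)
print(S[0])
```

The RBF kernel depends only on the distance between two pseudotimes, which yields smooth functions, whereas the polynomial kernel depends on their product and can express global trends.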
Comment#7: What is the impact of the Pseudotime method used initially? This section should be expanded with clear details on the techniques and parameters used in each analysis.
We are sorry for the confusion. We have added a description of how pseudotime T is obtained between lines 138 and 147 in the main text. The specific hyperparameters involved in the model and their prior settings are detailed in the supplementary information.
In brief, the pseudotime and related topological parameters of the bifurcative trajectories in MGPfact are inferred by Gaussian process regression from downsampled single-cell transcriptomic data (MURP). Specifically, T is treated as a continuous variable representing the progression of cells through the differentiation process. We describe the relationship between pseudotime and expression data as:
$$Y^* = f(T) + \varepsilon$$
where f(T) is a Gaussian Process (GP) with covariance matrix S, and ε represents the error term. The Gaussian process is defined as:
$$f(T) \sim \mathcal{GP}(0, S), \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2)$$
where $\sigma^2$ is the variance, set to 1e-6. During inference, we update the pseudotime by maximizing the posterior likelihood. Specifically, the posterior distribution of pseudotime is obtained by combining the observed data Y* with the prior distribution of the Gaussian process model.
We use the Markov Chain Monte Carlo method for parameter estimation, particularly employing the adaptive Metropolis-within-Gibbs (AMWG) sampling to handle the high autocorrelation of pseudotime.
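To illustrate the sampling scheme named above, here is a minimal adaptive Metropolis-within-Gibbs sketch. It updates one coordinate at a time and tunes each proposal scale in batches toward an acceptance rate of 0.44, following our reading of Roberts and Rosenthal (2009). The target is a toy two-dimensional standard normal, not the MGPfact posterior, and all function and parameter names are assumptions.

```python
import math
import random


def amwg(log_density, x0, n_iter=5000, batch=50, target_rate=0.44, seed=0):
    """Adaptive Metropolis-within-Gibbs with one Gaussian proposal scale per coordinate."""
    rng = random.Random(seed)
    x = list(x0)
    log_sd = [0.0] * len(x)   # log proposal standard deviation per coordinate
    accepted = [0] * len(x)   # acceptances within the current batch
    current = log_density(x)
    samples = []
    for it in range(1, n_iter + 1):
        for j in range(len(x)):  # update one coordinate at a time
            proposal = list(x)
            proposal[j] += rng.gauss(0.0, math.exp(log_sd[j]))
            candidate = log_density(proposal)
            if rng.random() < math.exp(min(0.0, candidate - current)):
                x, current = proposal, candidate
                accepted[j] += 1
        samples.append(list(x))
        if it % batch == 0:      # adapt the proposal scales after each batch
            delta = min(0.01, (it // batch) ** -0.5)
            for j in range(len(x)):
                log_sd[j] += delta if accepted[j] / batch > target_rate else -delta
                accepted[j] = 0
    return samples


def log_std_normal(x):
    """Log density (up to a constant) of independent standard normals."""
    return -0.5 * sum(v * v for v in x)


samples = amwg(log_std_normal, [3.0, -3.0])
kept = samples[1000:]  # discard burn-in
print(sum(s[0] for s in kept) / len(kept))
```

Per-coordinate adaptation is useful when parameters are strongly autocorrelated, because each proposal scale can settle at a value suited to that coordinate.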
Comment#8: Enhancing Readability: For clarity, provide intuitive descriptions of each evaluation function used in simulated and real data. The novel methodology performs well for some metrics but less so for others. A clear understanding of these measurements is essential.
To address the concern of readability, we have added descriptions of 5 evaluation metrics in the methodology section (Benchmarking MGPfact to state-of-the-art methods) in lines 494 to 515. Additionally, we have included a summary and discussion of these metrics in the conclusion section in lines 214-240 to help the readers better understand the significance and impact of these measurements.
(1) In brief, the Hamming-Ipsen-Mikhailov (HIM) distance measures the similarity between topological structures, combining the normalized Hamming distance and the Ipsen-Mikhailov distance, which focus on edge length differences and degree distribution similarity, respectively. The F1<sub>branches</sub> is used to assess the accuracy of a model's branch assignment via Jaccard similarity between branch pairs. In trajectory inference, cor<sub>dist</sub> quantifies the similarity of inter-cell distances between predicted and true trajectories, evaluating the accuracy of cell ordering. The wcor<sub>features</sub> assesses the similarity of key features through weighted Pearson correlation, capturing biological variation. The Overall score is calculated as the geometric mean of these metrics, providing an assessment of overall performance.
(2) For MGPfact and the other seven methods included in the comparison, each has its own focus. MGPfact specializes in factorizing complex cell trajectories using Gaussian process mixture models, making it particularly capable of identifying bifurcation events. Therefore, it excels in the accuracy of branch partitioning and similarity of trajectory topology. Among other methods, scShaper (Smolander et al., 2022) and TSCAN (Ji and Ji, 2016) are more suited for generating linear trajectories and excel in linear datasets, accurately predicting pseudotime. The Monocle series, as typical representatives of tree methods, effectively capture complex topologies and are suitable for analyzing cell data with diversified differentiation paths.
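The branch-assignment metric described in (1) can be sketched as follows. This follows our reading of the benchmark: each true branch is matched to its best-overlapping predicted branch by Jaccard similarity (recovery), the same is done in reverse (relevance), and the two are combined by a harmonic mean. The function names and toy cell labels are assumptions, and details of the published implementation may differ.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of cell identifiers."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def f1_branches(true_branches, predicted_branches):
    """Harmonic mean of recovery and relevance based on best-match Jaccard scores."""
    recovery = sum(
        max(jaccard(t, p) for p in predicted_branches) for t in true_branches
    ) / len(true_branches)
    relevance = sum(
        max(jaccard(p, t) for t in true_branches) for p in predicted_branches
    ) / len(predicted_branches)
    if recovery + relevance == 0:
        return 0.0
    return 2 * recovery * relevance / (recovery + relevance)


# Toy example: the prediction merges two true branches into one.
true_branches = [{"c1", "c2"}, {"c3", "c4"}]
merged = [{"c1", "c2", "c3", "c4"}]
print(f1_branches(true_branches, merged))
```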
Comment#9: Microglia Analysis: In Figures 3A-C, the genes mentioned in the text for each bifurcation do not always match those shown in the panels. Please confirm this.
Thank you for pointing this out. We have carefully reviewed the article and corrected the error where the genes shown in the figures did not correspond to the descriptions in the article. The specific corrections have been made between line 257 and 264:
“The first bifurcation determines the differentiated cell fates of PAM and HM, which involves a set of notable marker genes of both cell types, such as Apoe, Selplg (HM), and Gpnmb (PAM). The second bifurcation determines the proliferative status, which is crucial for the development and function of PAM and HM (Guzmán, n.d.; Li et al., 2019). The genes affected by the second bifurcation are associated with cell cycle and proliferation, such as Mki67, Tubb5, Top2a. The third bifurcation influences the development and maturity of microglia, of which the highly weighted genes, such as Tmem119, P2ry12, and Sepp1 are all previously annotated markers for establishment of the fates of microglia (Anderson et al., 2022; Li et al., 2019) (Supplementary Table 4).”
Comment#10: Regulons:
- The conclusions rely heavily on regulons. The Methods section describes using SCENIC, GENIE3, RCisTarget, and AUCell, but their relation to bifurcation analysis is unclear.
- Do you perform trajectory analysis on all MURP-derived cells or within each identified trajectory based on bifurcation? This point needs clarification to make the outcomes comprehensible. The legend of Figure 4 provides some ideas, but further clarity is required.
Thank you for the comments.
(1) To clarify, we used tools such as SCENIC to annotate the highly weighted genes (HWG) resulting from the bifurcation analysis for transcription factor regulation activity and possible impacts on biological processes. We have added descriptions to the analysis of our microglial data. The revised content is between lines 265 and 266:
“Moreover, we retrieved highly active regulons from the HWG by MGPfact, of which the significance is quantified by the overall weights of the member genes.”
(2) We apologize for any confusion caused by our description. It is important to clarify that we performed an overall trajectory analysis on all MURP results, rather than analyzing within each identified trajectory. Specifically, we first used MURP to downsample all preprocessed cells, where each MURP subset represents a group of cells. We then conducted trajectory inference on all MURP subsets and identified bifurcation points. This process generated multiple independent differentiation trajectories, encompassing all MURP subsets. To clearly convey this point, we have added descriptions in the legend of Figure 4. The revised content is between line 276 and 283:
“Fig. 4. MGPfact reconstructed the developmental trajectory of microglia, recovering known determinants of microglia fate. a-c. The inferred independent bifurcation processes with respect to the unique cell types (color-coded) of microglia development, where phase 0 corresponds to the state before bifurcation; and phases 1 and 2 correspond to the states post-bifurcation. Each colored dot represents a metacell of unique cell type defined by MURP. The most highly weighted regulons in each trajectory were labeled by the corresponding transcription factors (left panels). The HWG of each bifurcation process include a set of highly weighted genes (HWG), of which the expression levels differ significantly among phases 1, 2, and 3 (right panels).”
Comment#11: CD8+ T Cells: The comparison is made against Monocle2, the method used in the publication, but it would be beneficial to compare it with more recent methods. Otherwise, the added value of MGPfact is unclear.
Per your request, we have expanded our comparative analysis to include not only Monocle2 but also more recent methods such as Monocle3 (Cao et al., 2019) and scFates Tree (Faure et al., 2023). We used adjusted R-squared values to evaluate each method's ability to explain trajectory variation. The results have been added to Table 2 and Supplementary Table 6. The revised content is between line 318 and 326:
We assessed the goodness-of-fit (adjusted R-square) of the consensus trajectory derived by MGPfact and three methods (Monocle 2, Monocle 3 and scFates Tree) for the CD8+ T cell subtypes described in the original studies (Guo et al., 2018; Zhang et al., 2018). The data showed that MGPfact significantly improved the explanatory power for most CD8+ T cell subtypes over Monocle 2, which was used in the original studies (P < 0.05, see Table 2 and Supplementary Table 6), except for the CD8-GZMK cells in the CRC dataset. Additionally, MGPfact demonstrated better explanatory power in specific cell types when compared to Monocle 3 and scFates Tree. For instance, in the NSCLC dataset, MGPfact exhibited higher explanatory power for CD8-LEF1 cells (Table 2, R-squared = 0.935), while Monocle 3 and scFates Tree perform better in other cell types.
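For reference, the standard adjusted R-squared correction can be written as a short function. How the predictors were defined for each trajectory method is not stated here, so the argument names and the example numbers are assumptions.

```python
def adjusted_r_squared(r_squared, n_samples, n_predictors):
    """Adjusted R-squared: penalizes R-squared for the number of predictors used."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need n_samples > n_predictors + 1")
    return 1.0 - (1.0 - r_squared) * (n_samples - 1) / dof


# Invented example: the same raw fit looks weaker once more predictors are charged for.
print(adjusted_r_squared(0.90, n_samples=50, n_predictors=2))
print(adjusted_r_squared(0.90, n_samples=50, n_predictors=10))
```

Because the penalty grows with the number of predictors, the adjusted value allows a fairer comparison between trajectories of different complexity than raw R-squared.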
Comment#12: Consensus Trajectory: A panel explaining how the consensus trajectory is generated would be helpful. Include both visual and textual explanations tailored to the journal's audience.
Thank you for the comments. Regarding how the consensus trajectory is constructed, we have illustrated and described this in Figure 1 and the supplementary methods. Taking the reviewers' suggestions into account, we have added more details about the generation process of the consensus trajectory in the methods section to enhance the completeness of the manuscript. The revised content is from line 599 to 606:
“Following MGPfact decomposition, we obtained multiple independent bifurcative trajectories, each corresponds to a binary tree within the temporal domain. These trajectories were then merged to construct a coherent diffusion tree, representing the consensus trajectory of cells’ fate. The merging process involves initially sorting all trajectories by their bifurcation time. The first (earliest) bifurcative trajectory is chosen as the initial framework, and subsequent trajectories are integrated to the initial framework iteratively by adding the corresponding branches at the bifurcation timepoints. As a result, the trajectories are ultimately merged into a comprehensive binary tree, serving as the consensus trajectory.”
Comment#13: Discussion:
- Check for typos, e.g., line 382 "pseudtime.".
- Avoid considering HVG as the entire feature space.
- The first three paragraphs are too similar to the Introduction. Consider shortening them to succinctly state the scenario and the implications of your contribution.
Thank you for pointing out the typos.
(1) We conducted a comprehensive review of the document to ensure there are no typographical errors.
(2) We restructured the first three paragraphs of the discussion section to clarify the limitations in the use of current manifold-learning methods and removed any absolute language regarding treating HVGs as the entire feature space. The revised content is from line 419 to 430:
“Single-cell RNA sequencing (scRNA-seq) provides a direct, quantitative snapshot of a population of cells in certain biological conditions, thereby revealing the actual cell states and functions. Although existing clustering and embedding algorithms can effectively reveal discrete biological states of cells, these methods become less efficient when depicting continuous evolving of cells over the temporal domain. The introduction of manifold learning offers a new dimension for discovery of relevant biological knowledge in cell fate determination, allowing for a better representation of continuous changes in cells, especially in time-dependent processes such as development, differentiation, and clonal evolution. However, current manifold learning methods face major limitations, such as the need for prior information on pseudotime and cell clustering, and lack of explainability, which restricts their applicability. Additionally, many existing trajectory inference methods do not support gene selection, making it difficult to annotate the results to known biological entities, thereby hindering the interpretation of results and subsequent functional studies.”
Comment#14: Minor Comments:
(1) Review the paragraph regarding the "current manifold-learning methods are faced with two major challenges." The message needs clarification.
(2) Increase the quality of the figures.
(3) Update the numbering of equations from #(.x) to (x).
We thank the reviewer for these detailed suggestions.
(1) We have thoroughly revised the discussion section, addressing overly absolute statements. The revised content is from line 426 to 428:
“However, current manifold learning methods face major limitations, such as the need for prior information on pseudotime and cell clustering, and lack of explainability, which restricts their applicability.”
(2) We conducted a comprehensive review of the figures in the article to more clearly present our results.
(3) We have meticulously reviewed the equations in the article to ensure there are no display issues with the indices.
References
Aibar S, González-Blas CB, Moerman T, Huynh-Thu VA, Imrichova H, Hulselmans G, Rambow F, Marine J-C, Geurts P, Aerts J, van den Oord J, Atak ZK, Wouters J, Aerts S. 2017. SCENIC: single-cell regulatory network inference and clustering. Nat Methods 14:1083–1086. doi:10.1038/nmeth.4463
Anderson SR, Roberts JM, Ghena N, Irvin EA, Schwakopf J, Cooperstein IB, Bosco A, Vetter ML. 2022. Neuronal apoptosis drives remodeling states of microglia and shifts in survival pathway dependence. eLife 11:e76564.
Bravo González-Blas C, De Winter S, Hulselmans G, Hecker N, Matetovici I, Christiaens V, Poovathingal S, Wouters J, Aibar S, Aerts S. 2023. SCENIC+: single-cell multiomic inference of enhancers and gene regulatory networks. Nat Methods. doi:10.1038/s41592-023-01938-4
Cao J, Spielmann M, Qiu X, Huang X, Ibrahim DM, Hill AJ, Zhang F, Mundlos S, Christiansen L, Steemers FJ, Trapnell C, Shendure J. 2019. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566:496–502. doi:10.1038/s41586-019-0969-x
Faure L, Soldatov R, Kharchenko PV, Adameyko I. 2023. scFates: a scalable python package for advanced pseudotime and bifurcation analysis from single-cell data. Bioinformatics 39:btac746. doi:10.1093/bioinformatics/btac746
Guo X, Zhang Y, Zheng L, Zheng C, Song J, Zhang Q, Kang B, Liu Z, Jin L, Xing R, Gao R, Zhang L, Dong M, Hu X, Ren X, Kirchhoff D, Roider HG, Yan T, Zhang Z. 2018. Global characterization of T cells in non-small-cell lung cancer by single-cell sequencing. Nat Med 24:978–985. doi:10.1038/s41591-018-0045-3
Guzmán AU. n.d. Single-cell RNA sequencing of spinal cord microglia in a mouse model of neuropathic pain.
Ji Z, Ji H. 2016. TSCAN: Pseudo-time reconstruction and evaluation in single-cell RNA-seq analysis. Nucleic Acids Res 44:e117–e117. doi:10.1093/nar/gkw430
Lange M, Bergen V, Klein M, Setty M, Reuter B, Bakhti M, Lickert H, Ansari M, Schniering J, Schiller HB, Pe’er D, Theis FJ. 2022. CellRank for directed single-cell fate mapping. Nat Methods 19:159–170. doi:10.1038/s41592-021-01346-6
Li Q. 2023. scTour: a deep learning architecture for robust inference and accurate prediction of cellular dynamics. Genome Biology.
Li Q, Cheng Z, Zhou L, Darmanis S, Neff NF, Okamoto J, Gulati G, Bennett ML, Sun LO, Clarke LE, Marschallinger J, Yu G, Quake SR, Wyss-Coray T, Barres BA. 2019. Developmental Heterogeneity of Microglia and Brain Myeloid Cells Revealed by Deep Single-Cell RNA Sequencing. Neuron 101:207-223.e10. doi:10.1016/j.neuron.2018.12.006
Neal RM. 2003. Slice sampling. The Annals of Statistics 31:705–767.
Papadopoulos N, Gonzalo PR, Söding J. 2019. PROSSTT: probabilistic simulation of single-cell RNA-seq data for complex differentiation processes. Bioinformatics 35:3517–3519. doi:10.1093/bioinformatics/btz078
Ren J, Zhang Q, Zhou Y, Hu Y, Lyu X, Fang H, Yang J, Yu R, Shi X, Li Q. 2022. A downsampling method enables robust clustering and integration of single-cell transcriptome data. Journal of Biomedical Informatics 130:104093. doi:10.1016/j.jbi.2022.104093
Roberts GO, Rosenthal JS. 2009. Examples of adaptive MCMC. Journal of Computational and Graphical Statistics 18:349–367.
Saelens W, Cannoodt R, Todorov H, Saeys Y. 2019. A comparison of single-cell trajectory inference methods. Nat Biotechnol 37:547–554. doi:10.1038/s41587-019-0071-9
Sha Y. 2024. Reconstructing growth and dynamic trajectories from single-cell transcriptomics data 6.
Smolander J, Junttila S, Venäläinen MS, Elo LL. 2022. scShaper: an ensemble method for fast and accurate linear trajectory inference from single-cell RNA-seq data. Bioinformatics 38:1328–1335. doi:10.1093/bioinformatics/btab831
Tierney L. 1994. Markov chains for exploring posterior distributions. the Annals of Statistics 1701–1728.
Zappia L, Phipson B, Oshlack A. 2017. Splatter: simulation of single-cell RNA sequencing data. Genome Biol 18:174. doi:10.1186/s13059-017-1305-0
Zhang L, Yu X, Zheng L, Zhang Y, Li Y, Fang Q, Gao R, Kang B, Zhang Q, Huang JY, Konno H, Guo X, Ye Y, Gao S, Wang S, Hu X, Ren X, Shen Z, Ouyang W, Zhang Z. 2018. Lineage tracking reveals dynamic relationships of T cells in colorectal cancer. Nature 564:268–272. doi:10.1038/s41586-018-0694-x
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We would like to thank the editors and the reviewers for constructive feedback on our first version of the manuscript. Before submitting a fully revised version with detailed response to each point, we would like to provide a brief clarification on some of the key issues.
Reviewer 2 raised a concern about the precision and specificity of holographic stimulation, regarding its potential effect on out-of-focus stimulation points and planes. We further verified whether the laser power at the targeted z-plane influences cells’ activity at nearby z-planes. As the Reviewer pointed out, the previous x- and y-axis shifts were tested by single-cell stimulation. This time, we stimulated five cells simultaneously, to match the actual experimental setup and assess potential artifacts in other planes. We observed no stimulation-driven activity increase in cells at a z-plane shifted by 20 µm (Author response image 1). This confirms that the holographic stimulation accurately manipulates the pre-selected target cells and that the effects we observe are unlikely to be due to out-of-focus stimulation artifacts. It is true that not all of the pre-selected cells that showed significant response changes prior to the main experiment were effectively activated at every trial during the experiments. While further analyses will be included in the revised manuscript, we varied the target cell distances across FOVs, from nearby cells to those farther apart within the FOV. We have not observed a significant relationship between the target cell distances and the stimulation effect. Lastly, cells within 15 µm of the target were excluded to prevent potential excitation due to the holographic stimulation power. Given the spontaneous movements of the FOV during imaging sessions due to the animal’s movement, despite our efforts to minimize them, we believe that any excitation from these neighboring neurons would be directly from the stimulation rather than the light pattern artifact itself.
Author response image 1.
Stimulation effect on five pre-selected cells at the target z-plane (left) and 20 µm off-target z-plane (right). No stimulation-driven effect was observed on the off-target cells.
Reviewers also raised concerns regarding the interpretation of homeostatic balance. While we are working on further analyses to strengthen our findings based on the reviewers’ suggestions, the observed response changes in co-tuned neuronal ensembles, specifically during the processing of their preferred frequency information, highlight an interaction between sensory processing and network dynamics. We believe this specificity indicates a functional mechanism beyond broad suppression or simple inhibitory effects, possibly aligning with homeostatic principles in cortical circuits. Regarding the post-stimulation effect, it is true that neither the stimulation nor the control condition showed further response changes during the post-stimulation session. For the control condition, this is likely because the repetitive tone presentation could have already driven neural adaptation to a plateau within the first two imaging sessions (baseline and stimulation sessions), preventing further changes in the last session. However, as the stimulation condition induced a greater amplitude decrease during the stimulation session compared to the control condition, if this extra suppression had not persisted during the post-stimulation session, we would have expected response amplitudes to rebound, increasing between the stimulation and post-stimulation sessions, which was not the case. Therefore, we propose that the persistence of this rebalanced network state is more indicative of a potential homeostatic mechanism in response to the activity manipulation within the network.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
In summary, the changes made in the revision process include:
An addition of a paragraph in the result section that discusses the absolute values of measured Young’s moduli in the light of probing frequencies, accompanied by a new supplementary figure and a supplementary table that support that discussion
- Fig. S10. Absolute Young’s modulus values across the frequencies characteristic for the three measurement methods.
- Table S9. Operation parameters of the three methods used for characterizing the mechanical properties of cells.
Three new supplementary figures that display the expression matrices for the genes from the identified modules in carcinoma datasets used for validation:
- Fig. S4. Expression of identified target genes in the CCLE microarray dataset used for validation.
- Fig. S5. Expression of identified target genes in the CCLE RNA-Seq dataset used for validation.
- Fig. S6. Expression of identified target genes in the Genentech dataset used for validation.
An addition of a paragraph in the discussion section that discusses the intracellular origins of resistance to deformation and the dominance of actin cortex at low deformations.
Refinement of the manuscript text and figures based on the specific feedback from the Reviewers.
Please see below for detailed responses to the Reviewers’ comments.
Reviewer #1 (Public Review)
In this work, Urbanska and colleagues use a machine-learning based crossing of mechanical characterisations of various cells in different states and their transcriptional profiles. Using this approach, they identify a core set of five genes that systematically vary together with the mechanical state of the cells, although not always in the same direction depending on the conditions. They show that the combined transcriptional changes in this gene set are strongly predictive of a change in the cell mechanical properties, in systems that were not used to identify the genes (a validation set). Finally, they experimentally alter the expression level of one of these genes, CAV1, that codes for the caveolin 1 protein, and show that, in a variety of cellular systems and contexts, perturbations in the expression level of CAV1 also induce changes in cell mechanics, cells with lower CAV1 expression being generally softer.
Overall the approach seems accessible, sound and is well described. My personal expertise is not suited to judge its validity, novelty or relevance, so I do not make comments on that. The results it provides seem to have been thoroughly tested by the authors (using different types of mechanical characterisations of the cells) and to be robust in their predictive value. The authors also show convincingly that one of the genes they identified, CAV1, is not only correlated with the mechanical properties of cells, but also that changing its expression level affects cell mechanics. At this stage, the study appears mostly focused on the description and validation of the methodological approach, and it is hard to really understand what the results obtained really mean, the importance of the biological finding - what is this set of 5 genes doing in the context of cell mechanics? Is it really central, or is it just one of the set of knobs on which the cell plays - and it is identified by this method because it is systematically modulated but maybe, for any given context, it is not the dominant player - all these fundamental questions remain unanswered at this stage. On one hand, it means that the study might have identified an important novel module of genes in cell mechanics, but on the other hand, it also reveals that it is not yet easy to interpret the results provided by this type of novel approach.
We thank Reviewer #1 for the thoughtful evaluation of our manuscript. The primary goal of the manuscript was to present a demonstration of an unbiased approach for the identification of genes involved in the regulation of cell mechanics. The manuscript further provides a comprehensive computational validation of all genes from the identified network, and experimental validation of a selected gene, CAV1.
We agree that at the current stage, far-reaching conclusions about the biological meaning of the identified network cannot be made. We are, however, convinced that the identification of an apparently central player such as CAV1 across various cellular systems is per se meaningful, in particular since CAV1 modulation shows clear effects on the cell mechanical state in several cell types.
We anticipate that our findings will encourage more mechanistic studies in the future, investigating how these identified genes regulate mechanical properties and interact with each other. Notwithstanding, the identified genes (after testing in the specific system of interest) can be readily used as genetic targets for modulating mechanical properties of cells. Access to such modifications is of huge relevance not only for performing further research on the functional consequences of cell mechanics changes (in particular in in-vivo systems, where using chemical perturbations is not always possible), but also for potential future use in modulating mechanical properties of cells to prevent disease (for example, to inhibit cancer metastasis or increase the efficacy of cancer cell killing by cytotoxic T cells).
We have now added the following sentence to the first paragraph of the discussion to acknowledge the open ends of our study:
“(...). Here we leveraged this opportunity by performing discriminative network analysis on transcriptomes associated with mechanical phenotype changes to elucidate a conserved module of five genes potentially involved in cell mechanical phenotype regulation. We provided evidence that the inferred conserved functional network module contains an ensemble of five genes that, in particular when combined in a unique combinatorial marker, are universal, specific and trustworthy markers of mechanical phenotype across the studied mouse and human systems. We further demonstrated on the example of a selected marker gene, CAV1, that its experimental up- and downregulation impacts the stiffness of the measured cells. This demonstrates that the level of CAV1 not only correlates with, but also is causative of mechanical phenotype change. The mechanistic insights into how precisely the identified genes are involved in regulating mechanical properties, how they interact with each other, and whether they are universal and dominant in various contexts all remain to be established in future studies.”
Reviewer #2 (Public Review)
A key strength is the quantitative approaches all add rigor to what is being attempted. The approach with very different cell culture lines will in principle help identify constitutive genes that vary in a particular and predictable way. To my knowledge, one other study that should be cited posed a similar pan-tissue question using mass spectrometry proteomics instead of gene expression, and also identified a caveolae component (cavin-1, PTRF) that exhibited a trend with stiffness across all sampled tissues. The study focused instead on a nuclear lamina protein that was also perturbed in vitro and shown to follow the expected mechanical trend (Swift et al 2013).
We thank Reviewer #2 for the positive evaluation of the breadth of the results and for pointing us to the relevant reference for the proteomic analysis related to tissue stiffness (Swift et al., 2013). This study, which focused primarily on tissue-level mechanical properties, identified PTRF, a caveolar component, which links to our observation of another caveolar component, CAV1, at the single-cell level.
We have now included the citation in the following paragraph of the discussion:
“To our knowledge, there are no prior studies that aim at identifying gene signatures associated with single-cell mechanical phenotype changes, in particular across different cell types. There are, however, several studies that investigated changes in expression upon exposure of specific cell types to mechanical stimuli such as compression (87, 88) or mechanical stretch (22, 80, 89), and one study that investigated difference in expression profiles between stiffer and softer cells sorted from the same population (90). Even though the studies concerned with response to mechanical stimuli answer a fundamentally different question (how gene expression changes upon exposure to external forces vs which genes are expressed in cells of different mechanical phenotype), we did observe some similarities in the identified genes. For example, in the differentially expressed genes identified in the lung epithelia exposed to compression (87), three genes from our module overlapped with the immediate response (CAV1, FHL2, TAGLN) and four with the long-term one (CAV1, FHL2, TAGLN, THBS1). We speculate that this substantial overlap is caused by the cells undergoing change in their stiffness during the response to compression (and concomitant unjamming transition). Another previous study explored the association between the stiffness of various tissues and their proteomes. Despite the focus on the tissue-scale rather than single-cell elasticity, the authors identified polymerase I and transcript release factor (PTRF, also known as cavin 1 and encoding for a structural component of the caveolae) as one of the proteins that scaled with tissue stiffness across samples (91).”
Reviewer #3 (Public Review)
In this work, Urbanska et al. link the mechanical phenotypes of human glioblastoma cell lines and murine iPSCs to their transcriptome, and using machine learning-based network analysis identify genes with putative roles in cell mechanics regulation. The authors identify 5 target genes whose transcription creates a combinatorial marker which can predict cell stiffness in human carcinoma and breast epithelium cell lines as well as in developing mouse neurons. For one of the target genes, caveolin1 (CAV1), the authors perform knockout, knockdown, overexpression and rescue experiments in human carcinoma and breast epithelium cell lines. They determine the cell stiffness via RT-DC, AFM indentation and AFM rheology and confirm that high CAV1 expression levels correlate with increased stiffness in those model systems. This work brings forward an interesting approach to identify novel genes in an unbiased manner, but surprisingly the authors validate caveolin 1, a target gene with known roles in cell mechanics regulation.
I have two main concerns with the current version of this work:
(1) The authors identify a network of 5 genes that can predict mechanics. What is the relationship between the 5 genes? If the authors aim to highlight the power of their approach by knockdown, knockout or over-expression of a single gene why choose CAV1 (which has an individual p-value of 0.16 in Fig S4)? To justify their choice, the authors claim that there is limited data supporting the direct impact of CAV1 on mechanical properties of cells but several studies have previously shown its role in for example zebrafish heart stiffness, where a knockout leads to higher stiffness (Grivas et al., Scientific Reports 2020), in cancer cells, where a knockdown leads to cell softening (Lin et al., Oncotarget 2015), or in endothelial cells, where a knockout leads to cell softening (Le Master et al., Scientific Reports 2022).
We thank the reviewer for their comments. First, we do acknowledge that studying the relationship between the five identified genes is an intriguing question and would be a natural extension of the currently presented work. It is, however, beyond the scope of the presented manuscript, in which our primary goal was to introduce a general pipeline for de novo identification of genes related to cell mechanics. We did add the following statement to the discussion (yellow highlight) to acknowledge the open ends of our study:
“The mechanical phenotype of cells is recognized as a hallmark of many physiological and pathological processes. Understanding how to control it is a necessary next step that will facilitate exploring the impact of cell mechanics perturbations on cell and tissue function (76).
The increasing availability of transcriptional profiles accompanying cell state changes has recently been complemented by the ease of screening for mechanical phenotypes of cells thanks to the advent of high-throughput microfluidic methods (77). This provides an opportunity for data-driven identification of genes associated with the mechanical cell phenotype change in a hypothesis-free manner. Here we leveraged this opportunity by performing discriminative network analysis on transcriptomes associated with mechanical phenotype changes to elucidate a conserved module of five genes potentially involved in cell mechanical phenotype regulation. We provided evidence that the inferred conserved functional network module contains an ensemble of five genes that, in particular when combined in a unique combinatorial marker, are universal, specific and trustworthy markers of mechanical phenotype across the studied mouse and human systems. We further demonstrated on the example of a selected marker gene, CAV1, that its experimental up- and downregulation impacts the stiffness of the measured cells. This demonstrates that the level of CAV1 not only correlates with, but also is causative of mechanical phenotype change. The mechanistic insights into how precisely the identified genes are involved in regulating mechanical properties, how they interact with each other, and whether they are universal and dominant in various contexts all remain to be established in future studies.”
Regarding the selection of CAV1 as the gene that we used for validation experiment; as mentioned in the introductory paragraph of the result section “Perturbing expression levels of CAV1 changes cells stiffness” (copied below), we were encouraged by the previous data already linking CAV1 with cell mechanics when selecting it as our first target. The relationship between CAV1 and cell mechanics regulation, however, is not very well established (of note, two of the latest manuscripts came out after the initial findings of our study).
Regarding the citations suggested by the reviewer: two were already included in the original manuscript (Lin et al., Oncotarget 2015 – Ref (63); Le Master et al., 2022 – Ref (67)), along with an additional one (Hsu et al., 2018 (66)), and the third one (Grivas et al., 2020 (68)) has now been added to the manuscript. We would like to highlight, however, that even though Grivas et al. state that the CAV1 KO cells are stiffer, the AFM indentation measurements were performed on cardiac tissue with a spherical tip of 30 μm radius and likely reflect primarily supracellular, tissue-scale properties, as opposed to the cell-scale measurements performed in our study (we used cultured cells, which mostly lack the extracellular tissue structures; deformability cytometry was performed on dissociated cells and picks up on cell properties exclusively; and in the case of AFM measurements a spherical tip with 5 μm radius was used).
“We decided to focus our attention on CAV1 as a potential target for modulating mechanical properties of cells, as it has previously been linked to processes intertwined with cell mechanics. In the context of mechanosensing, CAV1 is known to facilitate buffering of the membrane tension (45), play a role in β1-integrin-dependent mechanotransduction (58) and modulate the mechanotransduction in response to substrate stiffness (59). CAV1 is also intimately linked with actin cytoskeleton: it was shown to be involved in cross-talk with Rho-signaling and actin cytoskeleton regulation (46, 60–62), filamin A-mediated interactions with actin filaments (63), and co-localization with peripheral actin (64). The evidence directly relating CAV1 levels with the mechanical properties of cells (47, 62, 65, 66) and tissues (66, 67) is only beginning to emerge.”
Regarding the cited p-value of 0.16, we would like to clarify that it is the p-value associated with the coefficient of the crude linear regression model fitted to the data for illustrative purposes in Fig S4. This value only says that from the linear fit we cannot conclude much about the correlation of the level of CAV1 with the Young’s modulus change. Much more relevant parameters to look at are the AUC-ROC values and associated p-values reported in Table 4 in the main text (see below), which show good performance of CAV1 in separating soft and stiff cell states.
The positive hypothesis I assumes that markers are discriminative of samples with stiff/soft mechanical phenotype regardless of the studied biological system, and CAV1 has a clear trend with the minimum AUC-ROC on 3 datasets of 0.78, even though the p-value is below the significance level. The positive hypothesis II assumes that markers are discriminative of samples with stiff/soft mechanical phenotype in carcinoma regardless of data source, and CAV1 has a clear significance because the minimum AUC-ROC on 3 datasets is 0.89 and the p-value is 0.02.
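As a reminder of what the AUC-ROC values quoted above measure, the following minimal Python sketch computes the AUC-ROC as the fraction of (stiff, soft) sample pairs in which the stiff sample has the higher marker value, with ties counted as one half. The function name and the expression values are hypothetical and serve only as an illustration; they are not data from the manuscript.

```python
def auc_roc(stiff_scores, soft_scores):
    """AUC-ROC as the fraction of (stiff, soft) pairs ranked correctly."""
    wins = 0.0
    for s in stiff_scores:
        for t in soft_scores:
            if s > t:
                wins += 1.0   # stiff sample scores higher: correct ranking
            elif s == t:
                wins += 0.5   # ties count as half
    return wins / (len(stiff_scores) * len(soft_scores))


# Hypothetical marker expression values (arbitrary units).
stiff = [5.1, 6.3, 4.8, 7.0]
soft = [3.9, 5.0, 4.2]
print(auc_roc(stiff, soft))  # 11 of 12 pairs are ranked correctly
```

An AUC-ROC of 0.5 corresponds to no separation between the two states, and 1.0 to perfect separation.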
(2) The authors do not show how much PC-Corr outperforms classical co-expression network analysis or an alternative gold standard. It is worth noting that PC-Corr was previously published by the same authors to infer phenotype-associated functional network modules from omics datasets (Ciucci et al., Scientific Reports 2017).
As pointed out by the Reviewer, PC-corr has been introduced and characterized in detail in a previous publication (Ciucci et al, 2017, Sci. Rep.), where it was compared against standard co-expression analysis (below reported as: p-value network) on molecules selected using univariate statistical analysis.
See the following fragment of Discussion in Ciucci et al, 2017:
“The PC-corr networks were always compared to P-value networks. The first strategical difference lies in the way features are selected: while the PC-corr adopts a multivariate approach, i.e. it uses a combination of features that are responsible for the sample discrimination, in the P-value network the discriminating features are singly selected (one by one) with each Mann-Whitney test (followed by Benjamini-Hochberg procedure). The second strategical difference lies in the generation of the correlation weights in the network. PC-corr combines in parallel and at the same time in a unique formula the discrimination power of the PC-loadings and the association power of the Pearson correlation, directly providing in output discriminative omic associations. These are generated using a robust (because we use as merging factor the minimum operator, which is a very penalizing operator) mathematical trade-off between two important factors: multivariate discriminative significance and correlation association. In addition, as mentioned above, the minimum operator works as an AND logical gate in a digital circuit, therefore in order to have a high link weight in the PC-corr network, both the discrimination (the PC-loadings) and the association (the Pearson correlations) of the nodes adjacent to the link should be simultaneously high. Instead, the P-value procedure begins with the pre-selection of the significant omic features and, only in a second separated step, computes the associations between these features. Therefore, in P-value networks, the interaction weights are the result neither of multivariate discriminative significance, nor of a discrimination/association interplay.”
Here we implement PC-corr for a particular application and do not see it as central to the message of the present manuscript to compare it with other available methods. We considered it much more relevant to focus on an in-silico validation on datasets not used during the PC-corr analysis (see Table 3 and 4 for details).
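To make the role of the minimum operator in the quoted fragment concrete, the following is a minimal Python sketch of a single PC-corr link weight, assuming the form sgn(c_ij) · min(|V_i|, |V_j|, |c_ij|), where V_i and V_j are the (already normalized) PC loadings of two genes and c_ij is their Pearson correlation. The function names and the numerical values are hypothetical, and the normalization of the loadings is assumed to have been done beforehand; this is an illustration, not the published implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def pc_corr_weight(v_i, v_j, c_ij):
    """Signed link weight: sgn(c_ij) * min(|v_i|, |v_j|, |c_ij|).

    v_i, v_j: normalized PC loadings of the two genes (assumed given).
    c_ij: Pearson correlation between the two genes.
    """
    magnitude = min(abs(v_i), abs(v_j), abs(c_ij))
    return math.copysign(magnitude, c_ij)


# Hypothetical expression of two genes across four samples.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.1, 3.9, 6.2, 7.8]
c = pearson(gene_a, gene_b)

# Hypothetical normalized loadings: the weaker loading limits the weight.
print(pc_corr_weight(0.9, 0.7, c))
```

Because the minimum acts like a logical AND, a link receives a high weight only if both loadings and the correlation are high at the same time.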
Altogether, the authors provide an interesting approach to identify novel genes associated with cell mechanics changes, but the current version does not fulfill such potential by focusing on a single gene with known roles in cell mechanics.
Our manuscript presents a demonstration of an overall approach for the identification of genes involved in the regulation of cell mechanics, and the perturbations performed on CAV1 have a demonstrative role (please also refer to the explanations of why we decided to perform the verification focused on CAV1 above). The fact that we identify CAV1, which has been implicated in regulating cell mechanics in a handful of studies, de novo and in an unbiased way speaks to the power of our approach. We do agree that investigation into the effect of manipulating the expression of the remaining genes from the identified network module, as well as into the mutual relationships between those genes and their covariance in perturbation experiments, constitutes a desirable follow-up on the presented results. It is, however, beyond the scope of the current manuscript. Regardless, the other genes identified can be readily tested in systems of interest and used as potential knobs for tuning mechanical properties on demand.
Reviewer #1 (Recommendations For Authors)
I am not a specialist of the bio-informatics methods used in this study, so I will not make any specific technical comments on them.
In terms of mechanical characterisation of cells, the authors use well established methods and the fact that they systematically validate their findings with at least two independent methods (RT-DC and AFM for example) makes them very robust. So I have no concerns with this part. The experiments of perturbations of CAV 1 are also performed to the best standards and the results are clear, no concern on that.
My main concerns are rather questions I was asking myself and could not answer when reading the article. Maybe the authors could find ways to clarify them - the discussion of their article is already very long and maybe it should not be lengthened too much. In my opinion, some of the points discussed are not really essential and rather redundant with other parts of the paper. This could be improved to give some space to clarify some of the points below:
We thank Reviewer #1 for an overall positive evaluation of the manuscript as well as for the points of criticism, which we address point by point below.
(1) This might be a misunderstanding of the method on my side, but I was wondering whether it is possible to proceed through the same steps but choose other pairs of training datasets amongst the 5 systems available (there are 10 such pairs if I am not mistaken) and ask whether they always give the same set of 5 genes. And if not, are the other sets also then predictive, robust, etc. Or is it that there are 'better' pairs than others in this respect. Or the set of 5 genes is the only one that could be found amongst these 5 datasets - and then could it imply that it is the only 'universal' group of predictive genes for cell mechanics (when applied to any other dataset comprising similar mechanical measures and expression profiles, for other cells, other conditions)?
I apologize in case this question is just the result of a basic misunderstanding of the method on my side. But I could not answer the question myself based on what is in the article and it seems to be important to understand the significance of the finding and the robustness of the method.
We thank the Reviewer for this question. To clarify: while in general it is possible to proceed through the same analysis steps choosing a different pair of datasets (see below for examples), we purposefully chose those two and not any other datasets because they encompassed the highest number of samples per condition in the RNAseq data (see Fig 4 and Table R1 below), originated from two different species, and concerned the least related tissues (the other option for mouse would be neural progenitors, which in combination with the glioblastoma would likely result in focusing on genes expressed in neural tissues). This is briefly explained in the following fragment of the manuscript on Page 10:
“For the network construction, we chose two datasets that originate from different species, concern unrelated biological processes, and have a high number of samples included in the transcriptional analysis: human glioblastoma and murine iPSCs (Table 1).”
To further address the comment of the reviewer: there is indeed a total of 10 possible two-set combinations of datasets; 6 of those pairs are human-mouse combinations (highlighted in orange in Author response Table 1), 3 are human-human combinations (highlighted in blue), and 1 is mouse-mouse (marked in green).
Author response table 1.
Possible two-set combinations of datasets. For each combination, the number of common genes is indicated. The number on the diagonal represents the total number of transcripts in the individual datasets; n corresponds to the number of samples in the respective datasets. * includes non-coding genes.
To reiterate, we chose the combination of set A (glioblastoma) and set D (iPSCs) in order to use datasets from different species and with the highest sample numbers.
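The pair counts stated above can be checked with a short enumeration. The sketch below assumes that sets A, B and C are the human datasets (glioblastoma, carcinoma, MCF10A) and that sets D and E are the mouse datasets (iPSCs, developing neurons); only the labels of sets A and D are given explicitly in the text, so the remaining assignments are an assumption made for illustration.

```python
from collections import Counter
from itertools import combinations

# Assumed species of each dataset; only A and D are labelled explicitly above.
species = {"A": "human", "B": "human", "C": "human", "D": "mouse", "E": "mouse"}

# Enumerate all unordered pairs of datasets and classify them by species.
pairs = list(combinations(sorted(species), 2))
counts = Counter("-".join(sorted((species[a], species[b]))) for a, b in pairs)

print(len(pairs))    # total number of two-set combinations
print(dict(counts))  # pairs per species combination
```

With three human and two mouse datasets this gives 3 human-human, 6 human-mouse and 1 mouse-mouse pair, 10 in total.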
As for the other combinations of human-mouse datasets:
• set A & E lead to derivation of a conserved module, however as expected this module includes genes specific for neuronal tissues (such as brain & testis specific immunoglobulin IGSF11, or genes involved in neuronal development such as RFX4, SOX8)
Author response image 1.
• the remaining combinations (set B&D, B&E, C&D and C&E) do not lead to a derivation of a highly interconnected module
Author response image 2.
Author response image 3.
Author response image 4.
Author response image 5.
Finally, it would have also been possible to perform the combined PC-corr procedure on all 5 datasets. However, this would prevent us from doing validation using unknown datasets.
Hence, we decided to proceed with the 2 discovery and 4 validation datasets.
For the sake of completeness, we present below some of the networks obtained from the analysis performed on all 5 datasets (which intersect at 8059 genes).
Author response image 6.
The above network was created by calculating mean/minimum PC-corr among all five datasets and applying the threshold. The thresholding can be additionally restricted in that we:
a. constrain the directionality of the correlation between the genes (sgn(c)) to be the same among all or at least n datasets
b. constrain the directionality of the correlation between the cell stiffness and gene expression level (sgn(V)) for individual genes.
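A minimal Python sketch of restriction (a) combined with the mean/minimum aggregation is given below. It assumes that each gene pair has one signed PC-corr weight per dataset; the function name, its parameters and the example values are hypothetical, and restriction (b), which acts on individual genes rather than on links, is not covered.

```python
def aggregate_edge(weights, cutoff, min_same_sign=None, use_mean=False):
    """Aggregate one link across datasets; return None if it is filtered out.

    weights: one signed PC-corr weight per dataset for the same gene pair.
    cutoff: threshold applied to the aggregated absolute weight.
    min_same_sign: if set, the minimum number of datasets that must agree
        on the sign of the correlation (restriction a).
    use_mean: aggregate by the mean instead of the minimum.
    """
    n_pos = sum(1 for w in weights if w > 0)
    n_neg = sum(1 for w in weights if w < 0)
    if min_same_sign is not None and max(n_pos, n_neg) < min_same_sign:
        return None  # too few datasets agree on the direction
    magnitudes = [abs(w) for w in weights]
    score = sum(magnitudes) / len(magnitudes) if use_mean else min(magnitudes)
    return score if score >= cutoff else None


# Hypothetical weights of one gene pair in five datasets.
consistent = [0.8, 0.7, 0.9, 0.75, 0.85]
one_flipped = [0.8, -0.7, 0.9, 0.75, 0.85]

print(aggregate_edge(consistent, cutoff=0.7, min_same_sign=5))
print(aggregate_edge(one_flipped, cutoff=0.7, min_same_sign=5))
print(aggregate_edge(one_flipped, cutoff=0.7, min_same_sign=4))
```

Requiring agreement in all datasets is the strictest setting; lowering the required number to n keeps links whose direction is conserved in most, but not all, systems.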
Some of the resulting networks for such restrictions are presented below.
Author response image 7.
Author response image 8.
Of note, some of the nodes from the original network presented in the paper (CAV1, FHL2, and IGFBP7) are preserved in the 5-set network (and highlighted with blue rims).
(2) The authors already use several types of mechanical characterisation of the cells, but there are even more of them, in particular, some that might not directly correspond to global cell stiffness but to other aspects, like traction forces, or cell cortex rheology, or cell volume or passage time through constrictions (active or passive) - they might all be in one way or another related, but they are a priori independent measures. Would the authors anticipate finding very different 'universal modules' for these other mechanical properties, or again the same one? Is there a way to get at least a hint based on some published characterisations for the cells used in the study? Basically, the question is whether the gene set identified is specific for a precise type of mechanical property of the cell, or is more generally related to cell mechanics modulation - maybe, as suggested by the authors because it is a set of molecular knobs acting upstream of general mechanics effectors like YAP/TAZ or acto-myosin?
We thank the Reviewer for this comment. We would like to first note that in our study, we focused on single-cell mechanical phenotype understood as a response of the cells to deformation at a global (RT-DC) or semi-local (AFM indentation with 5-μm bead) level and comparatively low deformations (1-3 μm, see Table S9). There is of course a variety of other methods for measuring cell mechanics and mechanics-related features, such as traction force microscopy mentioned by the reviewer. Though, traction force microscopy probes how the cells apply forces and interact with their environment rather than the inherent mechanical properties of the cells themselves which were the main interest of our study.
Nevertheless, as mentioned in the discussion, we found some overlap with the genes identified in other mechanical contexts, for example in the context of mechanical stretching of cells:
“Furthermore, CAV1 is known to modulate the activation of transcriptional cofactor yes-associated protein, YAP, in response to changes in stiffness of cell substrate (60) and in the mechanical stretch-induced mesothelial to mesenchymal transition (74).”
This suggests that the genes identified here may be more broadly related to mechanical aspects of cells.
Of note, we do have some insights connected to the changes of cell volume — one of the biophysical properties mentioned by the reviewer — from our experiments. For all measurements performed with RT-DC, we can also calculate cell volumes from 2D cell contours (see Author response images 9, 10, and 11). For most of the cases (all apart from MEF CAV1KO), the stiffer phenotype of the cells, associated with higher levels of CAV1, shows a higher volume.
Author response image 9.
Cell volumes for the divergent cell states in the five characterized biological systems. (A) Glioblastoma. (B) Carcinoma. (C) MCF10A. (D) iPSCs. (E) Developing neurons. Data correspond to Figure 2. Cell volumes were estimated using Shape-Out 1.0.10 by rotation of the cell contours.
Author response image 10.
Cell volumes for CAV1 perturbation experiments. (A) CAV1 knock down performed in TGBC cells. (B) CAV1 overexpression in ECC4 and TGBC cells. Data corresponds to Figure 5. Cell volumes were estimated using Shape-Out 1.0.10 by rotation of the cell contours.
Author response image 11.
Cell volumes for WT and CAV1KO MEFs. Data corresponds to Figure S9. Cell volumes were estimated using Shape-Out 1.0.10 by rotation of the cell contours.
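To make the phrase "by rotation of the cell contours" concrete, the following Python sketch estimates the volume of a solid of revolution from a closed 2D contour. It is only an illustration of the general idea and is not Shape-Out's implementation: the function name `volume_by_rotation`, the choice of the rotation axis (a horizontal line through the mean vertex height), and the averaging of the upper and lower contour halves are assumptions.

```python
import math


def volume_by_rotation(xs, ys):
    """Volume of the body obtained by rotating a closed contour about a
    horizontal axis, averaged over the upper and lower contour halves.

    xs, ys: vertex coordinates of the closed contour, in order.
    """
    n = len(xs)
    # Assumed rotation axis: horizontal line through the mean vertex height.
    y0 = sum(ys) / n
    # y*|y| keeps track of which side of the axis a vertex lies on, so the
    # upper and lower halves add up instead of cancelling.
    f = [(y - y0) * abs(y - y0) for y in ys]
    integral = 0.0
    for i in range(n):
        j = (i + 1) % n
        # Trapezoidal rule for the line integral of y*|y| dx along the edge.
        integral += (xs[j] - xs[i]) * (f[i] + f[j]) / 2.0
    return math.pi / 2.0 * abs(integral)


# Sanity check on a circular contour: the result should approach the
# volume of a sphere with the same radius (radius chosen arbitrarily).
radius = 7.5
angles = [2.0 * math.pi * k / 2000 for k in range(2000)]
xs = [radius * math.cos(a) for a in angles]
ys = [radius * math.sin(a) for a in angles]
print(volume_by_rotation(xs, ys), 4.0 / 3.0 * math.pi * radius**3)
```

For a circular contour the estimate converges to the sphere volume as the number of contour points grows, which gives a simple check of the approach.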
(3) The authors have already tested a large number of conditions in which perturbations of the level of expression of CAV1 correlates with changes in cell mechanics, but I was wondering whether it also has some direct explanatory value for the initial datasets used - for example for the glioblastoma cells from Figure 2, in the different media, would a knock-down of CAV1 prevent the increase in stiffness observed upon addition of serum, or for the carcinoma cells from different tissues treated with different compounds - if I understand well, the authors have tested a subset of these (ECC4 versus TGBC in figure 5) - how did they choose these and how general is it that the mechanical phenotype changes reported in Figure 2 are all mostly dependent on CAV1 expression level? I must say that the way the text is written and the results shown, it is hard to tell whether CAV1 is really having a dominant effect on cell mechanics in most of these contexts or only a partial effect. I hope I am being clear in my question - I am not questioning the conclusions of Figures 5 and 6, but asking whether the level of expression of CAV1, in the datasets reported in Figure 2, is the dominant explanatory feature for the differences in cell mechanics.
We thank the reviewer for this comment and appreciate the value of the question about the generality and dominance of CAV1 in influencing cell mechanics.
On the computational side, we have addressed these issues by looking at the performance of CAV1 (among other identified genes) in classifying soft and stiff phenotypes across biological systems (positive hypothesis I), as well as across data of different type (sequencing vs microarray data) and origin (different research institutions) (positive hypothesis II). CAV1 showed strong classification performance (Table 4), suggesting it is a general marker of stiffness changes.
On the experimental side, we conducted the perturbation experiments in two systems of choice: two intestinal carcinoma cell lines (ECC4 and TGBC) and the MCF10A breast epithelial cell line. These choices were driven by ease of handling, accessibility, as well as (for MCF10A) connection with a former study (Tavares et al, 2017). While we observed correlations between CAV1 expression and cell mechanics in a wide range of datasets, the precise role of CAV1 in each system may vary, and further perturbation experiments in specific systems could be performed to solidify the direct/dominant role of CAV1 in cell mechanics. We hypothesize that the suggested knockdown of CAV1 upon serum addition in glioblastoma cells could reduce or prevent the increase in stiffness observed, though this experiment has not been performed.
In conclusion, while the computational analysis gives us confidence that CAV1 is a good indicator of cell stiffness, we predict that it acts in concert with other genes and in specific contexts could be replaced by other changes. We suggest that the suitability of CAV1 for manipulation of the mechanical properties should be tested in each system of interest before use.
To highlight the fact that the relevance of CAV1 for modulating cell mechanics in specific systems of interest should be tested and the mechanistic insights into how CAV1 regulates cell mechanics are still missing, we have added the following sentence in the discussion:
“The mechanical phenotype of cells is recognized as a hallmark of many physiological and pathological processes. Understanding how to control it is a necessary next step that will facilitate exploring the impact of cell mechanics perturbations on cell and tissue function (76). The increasing availability of transcriptional profiles accompanying cell state changes has recently been complemented by the ease of screening for mechanical phenotypes of cells thanks to the advent of high-throughput microfluidic methods (77). This provides an opportunity for data-driven identification of genes associated with the mechanical cell phenotype change in a hypothesis-free manner. Here we leveraged this opportunity by performing discriminative network analysis on transcriptomes associated with mechanical phenotype changes to elucidate a conserved module of five genes potentially involved in cell mechanical phenotype regulation. We provided evidence that the inferred conserved functional network module contains an ensemble of five genes that, in particular when combined in a unique combinatorial marker, are universal, specific and trustworthy markers of mechanical phenotype across the studied mouse and human systems. We further demonstrated on the example of a selected marker gene, CAV1, that its experimental up- and downregulation impacts the stiffness of the measured cells. This demonstrates that the level of CAV1 not only correlates with, but also is causative of mechanical phenotype change. The mechanistic insights into how precisely the identified genes are involved in regulating mechanical properties, how they interact with each other, and whether they are universal and dominant in various contexts all remain to be established in future studies.”
(4) It would be nice that the authors try to more directly address, in their discussion, what is the biological meaning of the set of 5 genes that they found - is it really mostly a product of the methodology used, useful but with little specific relevance to any biology, or does it have a deeper meaning? Either at a system level, or at an evolutionary level.
We would like to highlight that our manuscript is focused on the method that we introduce to identify sets of genes involved in the regulation of cell mechanics. The first implementation included here is only the beginning of this line of work which, in the future, will include looking in detail at the biological meaning and the interconnectivity of the genes identified. Most likely, there is a deeper meaning of the identified module which could be revealed with a lot of dedicated future work. As this is mere speculation at this point, we would like to refrain from going into more detail about it in the current manuscript. We provide below a few words of extended explanation and additional analysis that can shed light on the current limited knowledge of the connections between the genes and evolutionary preservation of the genes.
While it is difficult to prove at present, we do believe that the identified module of genes may have an actual biological meaning and is not a mere product of the methodology used. The PC-corr score used for applying the threshold and obtaining the gene network is high only if the Pearson’s correlation between the two genes is high, meaning that the genes in the highly connected module show correlated expression and are likely co-regulated. Additionally, we performed the GO Term analysis using DAVID to assess the connections between the genes (Figure S3). We have now performed an additional analysis using two orthogonal tools: the functional protein association tool STRING and KEGG Mapper.
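The statement that the PC-corr score is high only if the pairwise correlation is high can be illustrated with a short Python sketch. It assumes that the edge weight takes the form sgn(c) · min(|V_i|, |V_j|, |c|), where V_i and V_j are normalized loadings and c is the Pearson correlation. This form is our reading of the cited PC-corr publication and is not stated in this response; the function name and the numbers below are hypothetical.

```python
def pc_corr(v_i, v_j, c_ij):
    """Assumed PC-corr edge weight: sgn(c_ij) * min(|V_i|, |V_j|, |c_ij|)."""
    sign = (c_ij > 0) - (c_ij < 0)
    return sign * min(abs(v_i), abs(v_j), abs(c_ij))


# Hypothetical loadings and correlations (not taken from the study).
weak = pc_corr(0.9, 0.8, 0.3)      # high loadings, low correlation
strong = pc_corr(0.9, 0.8, -0.95)  # high loadings, strong anti-correlation
print(weak, strong)
```

Because the minimum is taken over all three magnitudes, two genes with high loadings but weakly correlated expression cannot receive a high score under this assumed form.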
With STRING, we found a moderate connectivity using the five network nodes identified in our study, and many of the obtained connections were based on text mining and co-expression, rather than direct experimental evidence (Author response image 12A). A more connected network can be obtained by allowing STRING to introduce further nodes (Author response image 12B). Interestingly, some of the nodes included by STRING in the extended network are nodes identified with milder PC-corr thresholds in our study (such as CNN2 or IGFBP3, see Table S3).
With KEGG Mapper, we did not find an obvious pathway-based clustering of the genes from the module either. A maximum of two genes were assigned to one pathway and those included:
• focal adhesions (pathway hsa04510): CAV1 and THBS1
• cytoskeleton in muscle cells (pathway hsa04820): FHL2 and THBS1
• proteoglycans in cancer (pathway hsa05205): CAV1 and THBS1.
As for the BRITE hierarchy, the following classification was found:
• membrane trafficking (hsa04131): CAV1, IGFBP7, TAGLN, THBS1, with the following subcategories:
- endocytosis / lipid raft mediated endocytosis / caveolin-mediated endocytosis: CAV1
- endocytosis / phagocytosis / opsonins: THBS1
- endocytosis / others / insulin-like growth factor-binding proteins: IGFBP7
- others / actin-binding proteins / others: TAGLN.
Taken together, all these analyses (DAVID, STRING, KEGG) show that at present no direct relationship or single pathway can be found that integrates all the genes from the identified module. Future experiments, including investigations of how other module nodes are affected when one of the genes is manipulated, will help to establish actual physical or regulatory interactions between the genes from our module.
To touch upon the evolutionary perspective, we provide an overview of occurrence of the genes from the identified module across the evolutionary tree. This overview shows that the five identified genes are preserved in phylum Chordata with quite high sequence similarity, and even more so within mammals (Author response image 13).
Author response image 12.
Visualisation of interactions between the nodes in the identified module using functional protein association networks tool STRING. (A) Connections obtained using multiple proteins search and entering the five network nodes. (B) Extended network that includes further genes to increase indirect connectivity. The genes are added automatically by STRING. Online version of STRING v12.0 was used with Homo sapiens as species of interest.
Author response image 13.
Co-occurrence of genes from the network module across the evolutionary tree. Mammals are indicated with the green frame; Glires (which include mouse) and primates (which include human) are indicated with yellow frames. The view was generated using the online version of STRING 12.0.
Reviewer #2 (Recommendations For Authors)
(1) The authors need to discuss the level of sensitivity of their mechanical measurements with RT-DC for changes to the membrane compared to changes in microtubules, nucleus, etc. The limited AFM measurements also seem membrane/cortex focused. For these and further reasons below, "universal" doesn't seem appropriate in the title or abstract, and should be deleted.
We thank the reviewer for this comment. Indeed, RT-DC is a technique that deforms the entire cell to a relatively low degree (inducing ca 17% mean strain, i.e. a deformation of approximately 2.5 µm on a cell with a 15 µm diameter, see Table S9 and Urbanska et al., Nat Methods 2020). Similarly, the AFM indentation experiments performed in this study (using a 5-µm diameter colloidal probe and 1 µm indentation) induce low strains, at which, according to current knowledge, the actin cortex dominates the measured deformations. However, other cellular components, including the membrane, microtubules, intermediate filaments, nucleus, other organelles, and cytoplasmic packing, can also contribute. We have reviewed these contributions in detail in a recent publication (Urbanska and Guck, 2024, Ann Rev Biophys., PMID 38382116). For a particular system, it is hard to speculate without further investigation which parts of the cell have a dominant effect on the measured deformability. We have now added the following paragraph to the discussion to include this information:
“The mechanical phenotype of single cells is a global readout of a cell’s resistance to deformation that integrates contributions from all cellular components. The two techniques implemented for measuring cell mechanics in this study, RT-DC and AFM indentation using a spherical indenter with 5 µm radius, exert comparatively low strain on cells (< 3 µm, see Table S9), at which the actin cortex is believed to dominate the measured response. However, other cellular components, including the membrane, microtubules, intermediate filaments, nucleus, other organelles, and cytoplasmic packing, also contribute to the measured deformations (reviewed in detail in (79)) and, for a particular system, it is hard to speculate without further investigation which parts of the cell have a dominant effect on the measured deformability.”
The key strength of measuring the global mechanics is that such measurements are agnostic of the specific origin of the resistance to shape change. As such, the term “universal” could be seen as rather appropriate, as we are not testing specific contributions to cell mechanics, and we see the two methods used (RT-DC and AFM indentation) as representative when it comes to measuring global cell mechanics. We have also highlighted throughout the text that we are measuring the global single-cell mechanical phenotype.
Most importantly, however, we have used the term “universal” to capture that the genes are preserved across different systems and species, not in relation to the type of mechanical measurements performed and as such we would like to retain the term in the title.
(2) Fig.2 cartoons of tissues is a good idea to quickly illustrate the range of cell culture lines studied. However, it obligates the authors to examine the relevant primary cell types in single-cell RNAseq of human and/or mouse tissues (e.g. Tabula Muris). They need to show CAV1 is expressed in glioblastoma, iPSCs, etc and not a cell culture artifact. CAV1 and the other genes also need to be plotted with literature values of tissue stiffness.
We thank the reviewer for this comment; however, we do believe that the cartoons in Figure 2 should assist the reader to readily understand whether cultured cells derived from the respective tissues were used (see cartoons representing dishes), or the cells directly isolated from the tissue were measured (this is the case for the developing neurons dataset).
We did, however, follow the suggestion of the reviewer to use available resources and checked the expression of genes from the identified network module across various tissues in mouse and human. We first used the Mouse Genome Informatics (MGI; https://www.informatics.jax.org/) to visualize the expression of the genes across organs and organ systems (Author response image 14) as well as across more specific tissue structures (Author response image 15). These two figures show that the five identified genes are expressed quite broadly in mouse. We next looked at the expression of the five genes in the scRNASeq dataset from Tabula Muris (Author response image 16). Here, the expression of respective genes seemed more restricted to specific cell clusters. Finally, we also collected the cross-tissue expression of the genes from our module in human tissues from Human Protein Atlas v23 at both mRNA (Author response image 17) and protein (Author response image 18) levels. CAV1, IGFBP7, and THBS1 showed low tissue specificity at mRNA level, FHL2 was enriched in heart muscle and ovary (the heart enrichment is also visible in Author response image 15 for mouse) and TAGLN in endometrium and intestine. Interestingly, the expression at the protein level (Author response image 18) did not seem to follow faithfully the mRNA levels (Author response image 17). Overall, we conclude that the identified genes are expressed quite broadly across mouse and human tissues.
Author response image 14.
Expression of genes from the identified module across various organs and organ systems in mouse. The expression matrices for organs (A) and organ systems (B) were generated using Tissue x Gene Matrix tool of Gene eXpression Database (https://www.informatics.jax.org/gxd/, accessed on 22nd September 2024). No pre-selection of stage (age) and assay type (includes RNA and protein-based assays) was applied. The colors in the grid (blues for expression detected and reds for expression not detected) get progressively darker when there are more supporting annotations. The darker colors do not denote higher or lower levels of expression, just more evidence.
Author response image 15.
Expression of genes from the identified module across various mouse tissue structures. The expression matrices for age-selected mice marked as adult (A) or young individuals (collected ages labelled P42-84 / P w6-w12 / P m1.5-3.0) (B) are presented and were generated using RNASeq Heatmap tool of Gene eXpression Database (https://www.informatics.jax.org/gxd/, accessed on 2nd October 2024).
Author response image 16.
Expression of genes from the identified module across various cell types and organs in t-SNE embedding of Tabula Muris dataset. (A) t-SNE clustering color-coded by organ. (B-F) t-SNE clustering color-coded for expression of CAV1 (B), IGFBP7 (C), FHL2 (D), TAGLN (E), and THBS1 (F). The plots were generated using FACS-collected cells data through the visualisation tool available at https://tabulamuris.sf.czbiohub.org/ (accessed on 22nd September 2024).
Author response image 17.
Expression of genes from the identified module at the mRNA level across various human tissues. (A-E) Expression levels of CAV1 (A), IGFBP7 (B), FHL2 (C), TAGLN (D), and THBS1 (E). The plots were generated using consensus dataset from Human Protein Atlas v23 https://www.proteinatlas.org/ (accessed on 22nd September 2024).
Author response image 18.
Protein levels of genes from the identified module across various human tissues. (A-E) Protein levels of CAV1 (A), IGFBP7 (B), FHL2 (C), TAGLN (D), and THBS1 (E). The plots were generated using Human Protein Atlas v23 https://www.proteinatlas.org/ (accessed on 22nd September 2024).
Regarding literature values and tissue stiffness, we would like to argue that cell stiffness is not equivalent to tissue stiffness, and we are interested in the former. Tissue stiffness is governed by a combination of cell mechanical properties, cell adhesions, packing and the extracellular matrix. There can be, in fact, mechanically distinct cell types (for example characterized by different metabolic state, malignancy level etc) within one tissue of given stiffness. Hence, we consider that testing for the correlation between tissue stiffness and expression of identified genes is not immediately relevant.
(3) Fig.5D,H show important time-dependent mechanics that need to be used to provide explanations of the differences in RT-DC (5B,F) and in standard AFM indentation expts (5C,G). In particular, it looks to me that RT-DC is a high-f/short-time measurement compared to the AFM indentation, and an additional Main or Supp Fig needs to somehow combine all of this data to clarify this issue.
We thank the reviewer for this comment. It is indeed the case that cells typically display higher stiffness when probed at higher rates. We have now expanded on this aspect of the results and added a supplementary figure (Fig. S10) that illustrates the frequencies used in different methods and summarizes the apparent Young’s moduli values into one plot in a frequency-ordered manner. Of note, we typically acquire RT-DC measurements at up to three flow rates, and the increase in measurement frequency accompanying an increase in flow rate also results in higher extracted apparent Young’s moduli (see Fig. S10 B,D). We have further added Table S9, which summarizes the operating parameters of all three methods used for probing cell mechanics in this manuscript:
“The three techniques for characterizing mechanical properties of cells (RT-DC, AFM indentation and AFM microrheology) differ in several aspects (summarized in Table S9), most notably in the frequency at which the force is applied to cells during the measurements, with RT-DC operating at the highest frequency (~600 Hz), AFM microrheology at a range of frequencies in-between (3–200 Hz), and AFM indentation operating at the lowest frequency (5 Hz) (see Table S9 and Figure S10A). Even though the apparent Young’s moduli obtained for TGBC cells were consistently higher than those for ECC4 cells across all three methods, the absolute values measured for a given cell line varied depending on the method: RT-DC measurements yielded higher apparent Young’s moduli compared to AFM indentation, while the apparent Young’s moduli derived from AFM microrheology measurements were frequency-dependent and fell between the other two methods (Fig. 5B–D, Fig. S10B). The observed increase in apparent Young’s modulus with probing frequency aligns with previous findings on cell stiffening with increased probing rates observed for both AFM indentation (68, 69) and microrheology assays (70–72).”
(4) The plots in Fig.S4 are important as main Figs, particularly given the cartoons of different tissues in Fig.1,2. However, positive correlations for a few genes (CAV1, IGFBP7, TAGLN) are most clear for the multiple lineages that are the same (stomach) or similar (gli, neural & pluri). The authors need to add green lines and pink lines in all plots to indicate the 'lineage-specific' correlations, and provide measures where possible. Some genes clearly don't show the same trends and should be discussed.
We thank the reviewer for this comment. It is indeed an interesting observation (and worth highlighting by adding the fits to lineage-restricted data) that the relationship between relative change in Young’s modulus and the selected gene expression becomes steeper for samples from similar tissue contexts.
For the sake of keeping the main manuscript compact, we decided to keep Fig. S7 (formerly Fig. S4) in the supplement; however, we did add the linear fit to the glioblastoma dataset (pink line) and a fit to the related neural/embryonic datasets (gli, neural & pluri; purple line) as advised (see below).
We did not pool the stomach data since it is represented by a single point in the figure, aligning with how the data is presented in the main text—stomach adenocarcinoma cell lines (MKN1 and MKN45) are pooled in Fig. 1B (see below).
We have also amended the respective results section to emphasize that, in certain instances, the correlation between changes in mechanical phenotype and alterations in the expression of analysed genes may be less pronounced:
“The relation between normalized apparent Young’s modulus change and fold-change in the expression of the target genes is presented in Fig. S7. The direction of changes in the expression levels between the soft and stiff cell states in the validation datasets was not always following the same direction (Fig. 4, C to F, Fig. S7). This suggests that the genes associated with cell mechanics may not have a monotonic relationship with cell stiffness, but rather are characterized by different expression regimes in which the expression change in opposite directions can have the same effect on cell stiffness. Additionally, in specific cases a relatively high change in Young’s modulus did not correspond to marked expression changes of a given gene — see for example low CAV1 changes observed in MCF10A PIK3CA mutant (Fig. S7A), or low IGFBP7 changes in intestine and lung carcinoma samples (Fig. S7C). This indicates that the importance of specific targets for the mechanical phenotype change may vary depending on the origin of the sample.”
(5) Table-1 neuro: Perhaps I missed the use of the AFM measurements, but these need to be included more clearly in the Results somewhere.
To clarify: there were no AFM measurements performed for the developing neurons (neuro) dataset, and it is not marked as such in Table 1. There are previously published AFM measurements for the iPSCs dataset (maybe that caused the confusion?), and we referred to them as such in the table by citing the source (Urbanska et al (30)) as opposed to the statement “this paper” (see the last column of Table 1). We did not consider it necessary to include these previously published data. We have added horizontal lines to the table, which will hopefully improve its readability.
Reviewer #3 (For Authors)
Major
- I strongly encourage the authors to validate their approach with a gene for which mechanical data does not exist yet, or explore how the combination of the 5 identified genes is the novel regulator of cell mechanics.
We appreciate the reviewer’s insightful comment and agree that it would be highly interesting to validate further targets and perform combinatorial perturbations. However, it is not feasible at this point to expand the experimental data beyond the one already provided. We hope that in the future, the collective effort of the cell mechanics community will establish more genes that can be used for tuning of mechanical properties of cells.
- If this paper aims at highlighting the power of PC-Corr as a novel inference approach, the authors should compare its predictive power to that of classical co-expression network analysis or an alternative gold standard.
We thank the reviewer for the suggestion to compare the predictive power of PC-Corr with classical co-expression network analysis or an alternative gold standard. PC-corr has been introduced and characterized in detail in a previous publication (Ciucci et al, 2017, Sci. Rep.), where it was compared against standard co-expression analysis methods. Here we implement PC-corr for a particular application. Thus, we do not see it as central to the message of the present manuscript to compare it with other available methods again.
- The authors call their 5 identified genes "universal, trustworthy and specific". While they provide a great amount of data all is derived from human and mouse cell lines. I suggest toning this down.
We thank the reviewers for this comment. To clarify, the terms universal, trustworthy and specific are based on the specific hypotheses tested in the validation part of the manuscript, but we understand that they may cause confusion. We have now toned down the statement by adding “universal, trustworthy and specific across the studied mouse and human systems” in the following text fragments:
(1) Abstract
“(…) We validate in silico that the identified gene markers are universal, trustworthy and specific to the mechanical phenotype across the studied mouse and human systems, and demonstrate experimentally that a selected target, CAV1, changes the mechanical phenotype of cells accordingly when silenced or overexpressed. (...)”
(2) Last paragraph of the introduction
“(…) We then test the ability of each gene to classify cell states according to cell stiffness in silico on six further transcriptomic datasets and show that the individual genes, as well as their compression into a combinatorial marker, are universally, specifically and trustworthily associated with the mechanical phenotype across the studied mouse and human systems. (…)”
(3) First paragraph of the discussion
“We provided strong evidence that the inferred conserved functional network module contains an ensemble of five genes that, in particular when combined in a unique combinatorial marker, are universal, specific and trustworthy markers of mechanical phenotype across the studied mouse and human systems.”
Minor suggestions
- The authors point out how genes that regulate mechanics often display non-monotonic relations with their mechanical outcome. Indeed, in Fig.4 developing neurons have lower CAV1 in the stiff group. Perturbing CAV1 expression in that model could show the non-monotonic relation and strengthen their claim.
We thank the reviewer for highlighting this important point. It would indeed be interesting to explore the changes in cell stiffness upon perturbation of CAV1 in a system that has the potential to show an opposing behavior. Unfortunately, we are unable to expand the experimental part of the manuscript at this time. We do hope that this point can be addressed in future research, either by our team or other researchers in the field.
- In their gene ontology enrichment assay, the authors claim that their results point towards reduced transcriptional activity and reduced growth/proliferation in stiff compared to soft cells. Proving this with a simple proliferation assay would be a nice addition to the paper.
This is a valuable suggestion that should be followed up on in detail in the future. To give a preliminary insight into this line of investigation, we looked at the cell count data for the CAV1 knock down experiments in TGBC cells. Since CAV1 is associated with the GO Term “negative regulation of proliferation/transcription” (high CAV1 – low proliferation), we would expect that lowering the levels of CAV1 results in increased proliferation and higher cell counts at the end of the experiment (3 days post transfection). As illustrated in Author response image 19 below, the cell counts were higher for the samples treated with CAV1 siRNAs, though not in a statistically significant way. Interestingly, the magnitude of the effect partially mirrored the trends observed for the cell stiffness (Figure 5F).
Author response image 19.
The impact of CAV1 knock down on cell counts in TGBC cells. (A) Absolute cell counts per condition in a 6-well format. Cell counts were performed when harvesting for RT-DC measurements using an automated cell counter (Countess II, Thermo Fisher Scientific). (B) The event rates observed during the RT-DC measurements. The harvested cells are resuspended in a specific volume of measuring buffer standardized per experiment (50-100 μl); thus, the event rates reflect the absolute cell numbers in the respective samples. Horizontal lines delineate medians with mean absolute deviation (MAD) as error; datapoints represent individual measurement replicates, with symbols corresponding to matching measurement days. Statistical analysis was performed using a two-sample two-sided Wilcoxon rank sum test.
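The summary statistics named in this legend can be sketched in plain Python as follows. The sketch is illustrative only: the legend does not specify the software or the variant of the test used, so the normal approximation without tie correction, the function names, and the example counts are all assumptions rather than the procedure or data from the study.

```python
import math
from statistics import mean, median


def median_with_mad(values):
    """Median and mean absolute deviation around the median."""
    med = median(values)
    mad = mean([abs(v - med) for v in values])
    return med, mad


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test using the normal approximation
    (no tie or continuity correction). Returns (z, p_value)."""
    pooled = sorted((v, group) for group, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    expected = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_x - expected) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical cell counts (arbitrary units), not data from the study.
control = [410.0, 455.0, 390.0]
knock_down = [480.0, 520.0, 470.0]
print(median_with_mad(control), median_with_mad(knock_down))
print(rank_sum_test(control, knock_down))
```

With only a handful of replicates per condition, the normal approximation is coarse, and an exact version of the test may be preferable in practice.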
Methods
- The AFM indentation experiments are performed with a very soft cantilever at very high speeds. Why? Also, please mention whether the complete AFM curve was fitted with the Hertz/Sneddon model or only a certain area around the contact point.
We thank the reviewer for this comment. However, we believe that the spring constants and indentation speeds used in our study are typical for measurements of cells and not a cause of concern.
For the indentation experiments, we used Arrow-TL1 cantilevers (nominal spring constant k = 0.035-0.045 N m<sup>−1</sup>, Nanoworld, Switzerland) which are used routinely for cell indentation (with over 200 search results on Google Scholar using the term: "Arrow-TL1"+"cell", and several former publications from our group, including Munder et al 2016, Tavares et al 2017, Urbanska et al 2017, Taubenberger et al 2019, Abuhattum et al 2022, among others). Additionally, cantilevers with spring constants as low as 0.01 N m<sup>−1</sup> can be used for cell measurements (Radmacher 2002, Thomas et al, 2013).
The indentation speed of 5 µm s<sup>−1</sup> is not unusually high and does not result in significant hydrodynamic drag.
For the microrheology experiments, we used slightly stiffer and shorter (100/200 µm compared to 500 µm for Arrow-TL1) cantilevers: PNP-TR-TL (nominal spring constant k = 0.08 N m<sup>−1</sup>, Nanoworld, Switzerland). The measurement frequencies of 3-200 Hz correspond to movements slightly faster than 5 µm s<sup>−1</sup>, but cells were indented only to 100 nm, and the data were corrected for the hydrodynamic drag (see equation (8) in Methods section).
Author response image 20.
Exemplary indentation curve obtained using an Arrow-TL1 cantilever decorated with a 5-µm sphere on an ECC4 cell. The shown plot is exported directly from JPK Data Processing software. The area shaded in grey is the area used for fitting the Sneddon model.
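To illustrate what fitting a contact model over a restricted indentation range involves, the sketch below uses the simpler Hertz model for a spherical indenter, F = (4/3) · E/(1 − ν²) · √R · δ^(3/2), as a stand-in. This is not the Sneddon formulation applied by the JPK software and is only a good approximation when the indentation is small compared with the sphere radius. The Poisson ratio of 0.5, the function names, and the synthetic data are assumptions made for illustration.

```python
import math


def hertz_force(delta, youngs_modulus, radius, poisson=0.5):
    """Hertz force (N) for a sphere of given radius (m) at indentation delta (m)."""
    return (4.0 / 3.0) * youngs_modulus / (1.0 - poisson ** 2) \
        * math.sqrt(radius) * delta ** 1.5


def fit_youngs_modulus(deltas, forces, radius, poisson=0.5,
                       max_indentation=1.5e-6):
    """Least-squares estimate of E (Pa) using only points up to max_indentation.

    The model is linear in delta**1.5, so a line through the origin is fitted.
    """
    points = [(d ** 1.5, f) for d, f in zip(deltas, forces)
              if 0.0 < d <= max_indentation]
    slope = sum(x * f for x, f in points) / sum(x * x for x, _ in points)
    return slope * 3.0 * (1.0 - poisson ** 2) / (4.0 * math.sqrt(radius))


# Synthetic, noise-free curve: 5-µm sphere (radius 2.5 µm), E = 1 kPa.
radius = 2.5e-6
deltas = [i * 1e-7 for i in range(1, 21)]  # 0.1 to 2.0 µm
forces = [hertz_force(d, 1000.0, radius) for d in deltas]
print(fit_youngs_modulus(deltas, forces, radius))
```

Restricting the fit with `max_indentation` mirrors the idea of fitting only the shaded part of the curve; on real data the contact point also has to be determined, which this sketch leaves out.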
In the indentation experiments, the curves were fitted to a maximal indentation of 1.5 μm (rarely exceeded, see Author response image 20). We have now added this information to the methods section:
- Could the authors include the dataset wt #1 in Fig 4D? Does it display the same trend?
We thank the reviewer for this comment. To clarify: in the MCF10A dataset (GEO: GSE69822) there are exactly three replicates of each wt (wild type) and ki (knock-in, referring to the H1047R mutation in PIK3CA) sample. The numbering wt#2, wt#3, wt#4 originated from the short names that were used in the working files containing non-averaged RPKM (possibly referring to three different measurement replicates that may not have been exactly paired with the ki samples). We have now renamed the samples as wt#1, wt#2 and wt#3 to avoid confusion. This naming also better reflects the sample description as deposited in the GSE69822 dataset (see Author response table 2).
Author response table 2.
- Reference (3) is an opinion article with the last author as the sole author. It is used twice as a self-standing reference, which is confusing, as it suggests there is previous experimental evidence.
We thank the reviewer for pointing this out and agree that it may not be appropriate to cite the article (Guck 2019 Biophysical Reviews, formerly Reference (3), currently Reference (76)) in all instances. The references to this opinion article have now been removed from the introduction:
“The extent to which cells can be deformed by external loads is determined by their mechanical properties, such as cell stiffness. Since the mechanical phenotype of cells has been shown to reflect functional cell changes, it is now well established as a sensitive label-free biophysical marker of cell state in health and disease (1-2).”
“Alternatively, the problem can be reverse-engineered, in that omics datasets for systems with known mechanical phenotype changes are used for prediction of genes involved in the regulation of mechanical phenotype in a mechanomics approach.”
But has been kept in the discussion:
“The mechanical phenotype of cells is recognized as a hallmark of many physiological and pathological processes. Understanding how to control it is a necessary next step that will facilitate exploring the impact of cell mechanics perturbations on cell and tissue function (76).”
This reference seems appropriate to us as it expands on the point that our ability to control cell mechanics will enable the exploration of its impact on cell and tissue function, which is central to the discussion of the current manuscript.
-The authors should mention what PC-corr means. Principle component correlation? Pearson's coefficient correlation?
PC-corr is a combination of loadings from the principal component (PC) analysis and Pearson’s correlation for each gene pair. We have aimed at conveying this in the “Discriminative network analysis on prediction datasets” result section. We have now added an extra sentence at the first appearance of PC-corr to clarify this for the readers from the start:
“After characterizing the mechanical phenotype of the cell states, we set out to use the accompanying transcriptomic data to elucidate genes associated with the mechanical phenotype changes across the different model systems. To this end, we utilized a method for inferring phenotype-associated functional network modules from omics datasets termed PC-Corr (28), that relies on combining loadings obtained from the principal component (PC) analysis and Pearson’s correlation (Corr) for every pair of genes. PC-Corr was performed individually on two prediction datasets, and the obtained results were overlaid to derive a conserved network module. Owing to the combination of the Pearson’s correlation coefficient and the discriminative information included in the PC loadings, the PC-corr analysis does not only consider gene co-expression, as is the case for classical co-expression network analysis, but also incorporates the relative relevance of each feature for discriminating between two or more conditions; in our case, the conditions representing soft and stiff phenotypes. The overlaying of the results from two different datasets allows for a multi-view analysis (utilizing multiple sets of features) and effectively merges the information from two different biological systems.”
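The way loadings and correlations can be combined into a single edge weight may be sketched as follows. The sketch assumes one formulation of the PC-corr weight, sgn(c_ij) · min(|V_i|, |V_j|, |c_ij|), where c_ij is the Pearson correlation of genes i and j and V_i is the normalized PC loading of gene i; the exact definition and the loading normalization should be taken from ref (28). The loadings are treated as given inputs already scaled to [−1, 1], and all gene names and values are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equally long sequences (non-zero variance assumed)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (sd_a * sd_b)


def pc_corr_edges(expression, loadings, cutoff):
    """Return {(gene_i, gene_j): weight} for pairs with |weight| >= cutoff.

    expression: gene -> list of expression values across samples
    loadings:   gene -> normalized PC loading in [-1, 1] (assumed precomputed)
    """
    genes = sorted(expression)
    edges = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            c = pearson(expression[g], expression[h])
            # The weakest of the three quantities limits the edge weight;
            # the sign follows the correlation.
            magnitude = min(abs(loadings[g]), abs(loadings[h]), abs(c))
            weight = math.copysign(magnitude, c)
            if abs(weight) >= cutoff:
                edges[(g, h)] = weight
    return edges


# Hypothetical toy data: three genes measured in four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
loadings = {"geneA": 0.9, "geneB": 0.8, "geneC": -0.7}
print(pc_corr_edges(expression, loadings, cutoff=0.75))
```

Because the weight takes the minimum of the three magnitudes, a gene pair is retained only if both genes discriminate strongly between conditions and are strongly co-expressed; raising the cut-off therefore prunes the network from its least supported edges.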
- The formatting of Table 1 is confusing. Horizontal lines should be added to make it clear to the reader which datasets are human and which mouse as well as which accession numbers belong to the carcinomas.
Horizontal lines have now been added to improve the readability of Table 1. We hope that this makes the table easier to follow and satisfies the request. We assume that further modifications to the table appearance may occur during the publishing process in accordance with the publisher’s guidelines.
- In many figures, data points are shown in different shapes without an explanation of what the shapes represent.
We thank the reviewer for this comment and apologize for not adding this information earlier. We have added explanations of the symbols to captions of Figures 2, 3, 5, and 6 in the main text:
“Fig. 2. Mechanical properties of divergent cell states in five biological systems. Schematic overviews of the systems used in our study, alongside with the cell stiffness of individual cell states parametrized by Young’s moduli E. (…) Statistical analysis was performed using generalized linear mixed effects model. The symbol shapes represent measurements of cell lines derived from three different patients (A), matched experimental replicates (C), two different reprogramming series (D), and four different cell isolations (E). Data presented in (A) and (D) were previously published in ref (29) and (30), respectively.”
“Fig. 3. Identification of putative targets involved in cell mechanics regulation. (A) Glioblastoma and iPSC transcriptomes used for the target prediction intersect at 9,452 genes. (B, C) PCA separation along two first principal components of the mechanically distinct cell states in the glioblastoma (B) and iPSC (C) datasets. The analysis was performed using the gene expression data from the intersection presented in (A). The symbol shapes in (B) represent cell lines derived from three different patients. (…)”
“Fig. 5. Perturbing levels of CAV1 affects the mechanical phenotype of intestine carcinoma cells. (…) In (E), (F), (I), and (J), the symbol shapes represent experiment replicates.”
“Fig. 6. Perturbations of CAV1 levels in MCF10A-ER-Src cells result in cell stiffness changes. (…) Statistical analysis was performed using a two-sided Wilcoxon rank sum test. In (B), (D), and (E), the symbol shapes represent experiment replicates.”
As well as to Figures S2, S9, and S11 in the supplementary material (in Figure S2, the symbol explanation was added to the legends in the figure panels as well):
“Fig. S2. Plots of area vs deformation for different cell states in the characterized systems. Panels correspond to the following systems: (A) glioblastoma, (B) carcinoma, (C) non-tumorigenic breast epithelia MCF10A, (D) induced pluripotent stem cells (iPSCs), and (E) developing neurons. 95% and 50% density contours of data pooled from all measurements of a given cell state are indicated by shaded areas and continuous lines, respectively. Datapoints indicate medians of individual measurements. The symbol shapes represent cell lines derived from three different patients (A), two different reprogramming series (D), and four different cell isolations (E), as indicated in the respective panels. (…)”
“Fig. S9. CAV1 knock-out mouse embryonic fibroblasts (CAV1KO) have lower stiffness compared to the wild type cells (WT). (…) (C) Apparent Young’s modulus values estimated for WT and CAV1KO cells using area-deformation data in (B). The symbol shapes represent experimental replicates. (…)”
“Fig. S11. Plots of area vs deformation from RT-DC measurements of cells with perturbed CAV1 levels. Panels correspond to the following experiments: (A and B) CAV1 knock-down in TGBC cells using esiRNA (A) and ONTarget siRNA (B), (C and D) transient CAV1 overexpression in ECC4 cells (C) and TGBC cells (D). Datapoints indicate medians of individual measurement replicates. The isoelasticity lines in the background (gray) indicate regions of the same apparent Young’s moduli. The symbol shapes represent experimental replicates.”
- In Figure 2, the difference in stiffness appears bigger than it actually is because the y-axes are not starting at 0.
While we acknowledge that starting the y-axes at a value other than 0 is generally not ideal, we chose this approach to better display data variability and minimize empty space in the plots.
A similar effect can be achieved with logarithmic scaling, which is a common practice (see Author response image 21 for visualization). We believe our choice of axes cut-off enhances the interpretability of the data without misleading the viewer.
Author response image 21.
Visualization of different axis scaling strategies applied to the five datasets presented in Figure 2 of the manuscript.
Of note, apparent Young’s moduli obtained from RT-DC measurements typically span 0.5-3.0 kPa (see Figure 2.3 from Urbanska et al 2021, PhD thesis). Differences between treatments rarely exceed a few hundred pascals. For example, in an siRNA screen of mitotic cell mechanics regulators in Drosophila cells (Kc167), the strongest hits (e.g., Rho1, Rok, dia) showed changes in stiffness of 100-150 Pa (see Supplementary Figure 11 from Rosendahl, Plak et al 2018, Nature Methods 15(5): 355-358).
- In Figure 3, I don't personally see the benefit of showing different cut-offs for PC-corr. In the end, the paper focuses on the 5 genes in the pentagram. I think only showing one of the cutoffs and better explaining why those target genes were picked would be sufficient and make it clearer for the reader.
We believe it is beneficial to show the extended networks for a few reasons. First, it demonstrates how the selected targets connect to the broader panel of genes, and that the selected module is indeed much more interconnected than other nodes. Second, the chosen PC-corr cut-off is somewhat arbitrary, and it may be interesting to look through the genes from the extended network as well, as they are likely also important for regulating cell mechanics. This broader view may help readers identify familiar genes and recognize the connections to relevant signaling networks and processes of interest.
- In Figure 4C, I suggest explaining why the FANTOM5 and not another dataset was used for the visualization here and mentioning whether the other datasets were similar.
In Figure 4C, we have chosen to present data corresponding to FANTOM5, because that was the only carcinoma dataset in which all the cell lines tested mechanically are represented. We have now added this information to the caption of Figure 4. Additionally, the clustergrams corresponding to the remaining carcinoma datasets (CCLE RNASeq, Genentech) are presented in supplementary figures S4-S6.
“The target genes show clear differences in expression levels between the soft and stiff cell states and provide for clustering of the samples corresponding to different cell stiffnesses in both prediction and validation datasets (Fig. 4, Figs. S4-S6).”
Typos
We would like to thank Reviewer #3 for their detailed comments on the typos and details listed below. This is much appreciated as it improved the quality of our manuscript.
- In the first paragraph of the results section the 'and' should be removed from this sentence: Each dataset encompasses two or more cell states characterized by a distinct mechanical phenotype, and for which transcriptomic data is available.
The sentence has been corrected and now reads:
“Each dataset encompasses two or more cell states characterized by a distinct mechanical phenotype, for which transcriptomic data is available.”
- In the methods in the MCF10A PIK3CA cell lines part, it says cell liens instead of cell lines.
The sentence has been corrected and now reads:
“The wt cells were additionally supplemented with 10 ng ml<sup>−1</sup> EGF (E9644, Sigma-Aldrich), while mutant cell lines were maintained without EGF.”
- In the legend of Figure 6 "accession number: GSE17941, data previously published in ())" the reference is missing.
The reference has been added.
- In the legend of Figure 5 "(E) Verification of CAV1 knock-down in TGBC cells using two knock-down system" 'a' between using and two is missing.
The legend has been corrected (no ‘a’ is missing, but it should say systems (plural)):
- In Figure 5B one horizontal line is missing.
The Figure 5B has been corrected accordingly.
- Terms such as de novo or in silico should be written in cursive.
We thank the Reviewer for this comment; however, we believe that in the style used by eLife, common Latin expressions such as de novo or in vitro are used in regular font.
- In the heading of Table 4 "The results presented in this table can be reproducible using the code and data available under the GitHub link reported in the methods section." It should say reproduced instead of reproducible.
Yes, indeed. It has been corrected.
- The citation of reference 20 contains several author names multiple times.
Indeed, it has been fixed now:
- In Figure S2 there is a vertical line in the zeros of the y axis labels.
We are not sure whether there was a rendering issue, but we did not see a vertical line in the zeros of the y-axis labels in Figure S2.
- The Text in Figure S4 is too small.
We thank the reviewer for pointing this out. We have now revised Figure S7 (formerly Figure S4) to increase the text size, ensuring better readability. (It has also been updated to include additional fits as requested by Reviewer #2).
- In Table 3 "positive hypothesis II markers are discriminative of samples with stiff/soft independent of data source" the words 'mechanical phenotype' are missing.
The column headings in Table 3 have now been updated accordingly.
- In Table S3 explain in the table headline what vi1, vi2 and v are. I assume the loading for PC1, the loading for PC2 and the average of the previous two values. But it should be mentioned somewhere.
The caption of Table S3 has been updated to explain the meaning of vi1, vi2 and v.
www.biorxiv.org
Author response:
eLife Assessment
This study addresses a novel and interesting question about how the rise of the Qinghai-Tibet Plateau influenced patterns of bird migration, employing a multi-faceted approach that combines species distribution data with environmental modeling. The findings are valuable for understanding avian migration within a subfield, but the strength of evidence is incomplete due to critical methodological assumptions about historical species-environment correlations, limited tracking data, and insufficient clarity in species selection criteria. Addressing these weaknesses would significantly enhance the reliability and interpretability of the results.
We would like to thank you and two anonymous reviewers for your careful, thoughtful, and constructive feedback on our manuscript. These reviews made us revisit a lot of our assumptions and we believe the paper will be much improved as a result. In addition to minor points, we will make three main changes to our manuscript in response to the reviews. First, we will address the concerns on the assumptions of historical species-environment correlations from perspectives of both theoretical and empirical evidence. Second, we will discuss the benefits and limitations of using tracking data in our study and demonstrate how the findings of our study are consolidated with results of previous studies. Third, we will clarify our criteria for selecting species in terms of both eBird and tracking data.
Below, we respond to each comment in turn. Once again, we thank you all for your feedback.
Reviewer #1 (Public review):
Strengths:
This is an interesting topic and a novel theme. The visualisations and presentation are to a very high standard. The Introduction is very well-written and introduces the main concepts well, with a clear logical structure and good use of the literature. The methods are detailed and well described and written in such a fashion that they are transparent and repeatable.
We appreciate the reviewer’s careful reading of our manuscript, encouraging comments and constructive suggestions.
Weaknesses:
I only have one major issue, which is possibly a product of the structure requirements of the paper/journal. This relates to the Results and Discussion, line 91 onwards. I understand the structure of the paper necessitates delving immediately into the results, but it is quite hard to follow due to a lack of background information. In comparison to the Methods, which are incredibly detailed, the Results in the main section reads as quite superficial. They provide broad overviews of broad findings but I found it very hard to actually get a picture of the main results in its current form. For example, how the different species factor in, etc.
Yes, the journal requires this format (Methods following the Results and Discussion) for the short report article type. As suggested, in the revision we will elaborate on details of our findings, especially the species-specific responses, in terms of (i) shifts of distribution of avian breeding and wintering areas under the influence of the uplift of the Qinghai-Tibetan Plateau, and (ii) major factors that shape current migration patterns of birds in the Plateau. We will also better reference the approaches we used in the study.
Reviewer #2 (Public review):
Summary:
The study tries to assess how the rise of the Qinghai-Tibet Plateau affected patterns of bird migration between their breeding and wintering sites. They do so by correlating the present distribution of the species with a set of environmental variables. The data on species distributions come from eBird. The main issue lies in the problematic assumption that species correlations between their current distribution and environment were about the same before the rise of the Plateau. There is no ground truthing and the study relies on Movebank data of only 7 species which are not even listed in the study. Similarly, the study does not outline the boundaries of breeding sites NE of the Plateau. Thus it is absolutely unclear potentially which breeding populations it covers.
We are very grateful for the careful review and helpful suggestions. We will revise the manuscript carefully in response to the reviewer’s comments and believe that it will be much improved as a result. Below are our point-by-point replies to the comments.
Strengths:
I like the approach for how you combined various environmental datasets for the modelling part.
We appreciate the reviewer’s encouragement.
Weaknesses:
The major weakness of the study lies in the assumption that species correlations between their current distribution and environments found today are back-projected to the far past before the rise of the Q-T Plateau. This would mean that species responses to the environmental cues do not evolve which is clearly not true. Thus, your study is a very nice intellectual exercise of too many ifs.
This is a valid concern. We will address this from both the perspectives of the theoretical design of our study and empirical evidence.
First, we agree with the reviewer that species responses to environmental cues might vary over time. Nonetheless, the simulated environments before the uplift of the plateau serve as a counterfactual state in our study. A counterfactual is an important concept for supporting causal claims by comparing what happened to what would have happened in a hypothetical situation: “If event X had not occurred, event Y would not have occurred” (Lewis 1973). Recent years have seen an increasing application of the counterfactual approach to detect biodiversity change, i.e., comparing diversity between the counterfactual state and real estimates to attribute the factors causing such changes (e.g., Gonzalez et al. 2023). Whilst we do not aim to provide causal inferences for avian distributional change, the counterfactual approach allows us to estimate the influence of the plateau uplift by detecting changes in avian distributions, i.e., by comparing where the birds would have been distributed without the plateau to where they are currently distributed. We regard the counterfactual environments as a powerful tool for eliminating vagueness to the extent possible, as opposed to simply describing the current distributions of birds. Therefore, we assume that species’ responses to environments are conserved and that their evolution should not discount our findings. We will clarify this in both the Introduction and Methods.
Second, we used species distribution modelling to contrast the distributions of birds before and after the uplift of the plateau under the assumption that species tend to keep their ancestral ecological traits over time (i.e., niche conservatism). This indicates a high probability for species to distribute in similar environments wherever suitable. Particularly, considering birds are more likely to be influenced by food resources (Martins et al. 2024), and the distribution of available food before the uplift (Jia et al. 2020), we believe the findings can provide valuable insights into the influence of the plateau on avian migratory patterns. Having said that, we acknowledge other factors, e.g., carbon dioxide concentrations (Zhang et al. 2022), can influence the simulations of environments and our prediction of avian distribution. We will clarify the assumptions and evidence we have for the modelling in Methods. We will further point out the direction for future studies in the Discussion.
The second major drawback lies in the way you estimate the migratory routes of particular birds. No matter how good the data eBird provides is, you do not know population-specific connections between wintering and breeding sites. Some might overwinter in India, some populations in Africa and you will never know the teleconnections between breeding and wintering sites of particular species. The few available tracking studies (seven!) are too coarse and with limited aspects of migratory connectivity to give answer on the target questions of your study.
We agree with the reviewer that establishing interconnections for birds is important for estimating the migration patterns of birds. We employed a dynamic model to assess their weekly distributions. Thus, we can track the movement of species every week, and capture the breeding and wintering areas for specific populations. That being said, we acknowledge that our approach can be subject to the patchy sampling of eBird data. We will better demonstrate this in the main text.
Tracking data can provide valuable insights into the movement patterns of species but are limited to small numbers of species due to the considerable costs and time needed. We aimed to use tracking data to examine the influence of focal factors on avian migration patterns, but we were able to acquire data for only seven species. Moreover, similar results were found in studies that used tracking data to estimate the distribution of breeding and wintering areas of birds in the plateau (e.g., Prosser et al. 2011, Zhang et al. 2011, Zhang et al. 2014, Liu et al. 2018, Kumar et al. 2020, Wang et al. 2020, Pu and Guo 2023, Yu et al. 2024, Zhao et al. 2024). We believe the conclusions based on seven species are rigorous, but their implications could be restricted by the number of tracked species we obtained. We will demonstrate how our findings on breeding and wintering areas of birds are reinforced by other studies reporting the locations of those areas. We will also add a separate caveat section to discuss the limitations stated above.
Your set of species is unclear, selection criteria for the 50 species are unknown and variability in their migratory strategies is likely to affect the direction of the effects.
We will clarify the selection criteria for the 50 species. We first obtained a full list of birds in the plateau from Prins and Namgail (2017). We then extracted species identified as full migrants in Birdlife International (https://datazone.birdlife.org/species/spcdistPOS) from the full list.
In addition, the position of the breeding sites relative to the Q-T plate will affect the azimuths and resulting migratory flyways. So in fact, we have no idea what your estimates mean in Figure 2.
We calculated the azimuths not only by the angles between breeding sites and wintering sites but also based on the angles between the stopovers of birds. Therefore, the azimuths are influenced by the relative positions of breeding, wintering and stopover sites. We will better explain this both in the Methods and legend of Figure 2.
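The idea of deriving azimuths from consecutive sites along a route can be sketched with the standard great-circle initial bearing formula. This is an illustration under assumptions, not the procedure used in the study: the function names and the waypoint coordinates are hypothetical, and the study may define or aggregate azimuths differently.

```python
import math


def initial_bearing(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2.

    Inputs in decimal degrees; result in degrees clockwise from north, in [0, 360).
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    east = math.sin(dlon) * math.cos(phi2)
    north = (math.cos(phi1) * math.sin(phi2)
             - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(east, north)) % 360.0


def route_azimuths(waypoints):
    """Bearings for each consecutive leg of a route given as (lat, lon) pairs."""
    return [initial_bearing(*start, *end)
            for start, end in zip(waypoints, waypoints[1:])]


# Hypothetical route: breeding site -> stopover -> wintering site.
route = [(40.0, 95.0), (32.0, 92.0), (25.0, 88.0)]
print(route_azimuths(route))
```

Computing one bearing per leg, rather than a single bearing from breeding to wintering site, makes the result sensitive to detours via stopovers, which is the distinction drawn in the response above.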
There is no way one can assess the performance of your statistical exercises, e.g. performances of the models.
As suggested, we will add the AUC values to assess the performances of the models.
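For readers unfamiliar with the metric, the AUC of a distribution model can be computed as the probability that a randomly chosen presence site receives a higher predicted suitability than a randomly chosen absence (or background) site. The sketch below implements this pairwise definition directly; the function name and the scores are hypothetical and do not come from the study.

```python
def auc(presence_scores, absence_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (presence, absence) pairs in which the presence
    site scores higher; ties count as half.
    """
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))


# Hypothetical predicted suitabilities at presence and background sites.
presence = [0.9, 0.8, 0.4]
background = [0.5, 0.3, 0.2]
print(auc(presence, background))
```

A value of 0.5 corresponds to a model that ranks sites no better than chance, and 1.0 to perfect separation of presence from background sites.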
References
Gonzalez, A., J. M. Chase, and M. I. O'Connor. 2023. A framework for the detection and attribution of biodiversity change. Philosophical Transactions of the Royal Society B: Biological Sciences 378: 20220182.
Jia, Y., H. Wu, S. Zhu, Q. Li, C. Zhang, Y. Yu, and A. Sun. 2020. Cenozoic aridification in Northwest China evidenced by paleovegetation evolution. Palaeogeography, Palaeoclimatology, Palaeoecology 557:109907.
Kumar, N., U. Gupta, Y. V. Jhala, Q. Qureshi, A. G. Gosler, and F. Sergio. 2020. GPS-telemetry unveils the regular high-elevation crossing of the Himalayas by a migratory raptor: implications for definition of a “Central Asian Flyway”. Scientific Reports 10:15988.
Lewis, D. 1973. Counterfactuals. Oxford: Blackwell.
Liu, D., G. Zhang, H. Jiang, and J. Lu. 2018. Detours in long-distance migration across the Qinghai-Tibetan Plateau: individual consistency and habitat associations. PeerJ 6:e4304.
Martins, L. P., D. B. Stouffer, P. G. Blendinger, K. Böhning-Gaese, J. M. Costa, D. M. Dehling, C. I. Donatti, C. Emer, M. Galetti, R. Heleno, Í. Menezes, J. C. Morante-Filho, M. C. Muñoz, E. L. Neuschulz, M. A. Pizo, M. Quitián, R. A. Ruggera, F. Saavedra, V. Santillán, M. Schleuning, L. P. da Silva, F. Ribeiro da Silva, J. A. Tobias, A. Traveset, M. G. R. Vollstädt, and J. M. Tylianakis. 2024. Birds optimize fruit size consumed near their geographic range limits. Science 385:331-336.
Prins, H. H. T., and T. Namgail. 2017. Bird migration across the Himalayas: wetland functioning amidst mountains and glaciers. Cambridge University Press, Cambridge.
Prosser, D. J., P. Cui, J. Y. Takekawa, M. Tang, Y. Hou, B. M. Collins, B. Yan, N. J. Hill, T. Li, Y. Li, F. Lei, S. Guo, Z. Xing, Y. He, Y. Zhou, D. C. Douglas, W. M. Perry, and S. H. Newman. 2011. Wild bird migration across the Qinghai-Tibetan Plateau: a transmission route for highly pathogenic H5N1. PloS One 6:e17622.
Pu, Z., and Y. Guo. 2023. Autumn migration of black-necked crane (Grus nigricollis) on the Qinghai-Tibetan and Yunnan-Guizhou plateaus. Ecology and Evolution 13:e10492.
Wang, Y., C. Mi, and Y. Guo. 2020. Satellite tracking reveals a new migration route of black-necked cranes (Grus nigricollis) in Qinghai-Tibet Plateau. PeerJ 8:e9715.
Yu, X., G. Song, H. Wang, Q. Wei, C. Jia, and F. Lei. 2024. Migratory flyways and connectivity of brown headed gulls (Chroicocephalus brunnicephalus) revealed by GPS tracking. Global Ecology and Conservation 56:e03340.
Zhang, G.G., D.P. Liu, Y.Q. Hou, H.X. Jiang, M. Dai, F.W. Qian, J. Lu, T. Ma, L.X. Chen, and Z. Xing. 2014. Migration routes and stopover sites of Pallas’s gulls Larus ichthyaetus breeding at Qinghai Lake, China, determined by satellite tracking. Forktail 30:104-108.
Zhang, G.G., D.P. Liu, Y.Q. Hou, H.X. Jiang, M. Dai, F.W. Qian, J. Lu, Z. Xing, and F.S. Li. 2011. Migration routes and stop-over sites determined with satellite tracking of bar-headed geese (Anser indicus) breeding at Qinghai Lake, China. Waterbirds 34:112-116, 115.
Zhang, R., D. Jiang, C. Zhang, and Z. Zhang. 2022. Distinct effects of Tibetan Plateau growth and global cooling on the eastern and central Asian climates during the Cenozoic. Global and Planetary Change 218:103969.
Zhao, T., W. Heim, R. Nussbaumer, M. van Toor, G. Zhang, A. Andersson, J. Bäckman, Z. Liu, G. Song, M. Hellström, J. Roved, Y. Liu, S. Bensch, B. Wertheim, F. Lei, and B. Helm. 2024. Seasonal migration patterns of Siberian Rubythroat (Calliope calliope) facing the Qinghai–Tibet Plateau. Movement Ecology 12:54.
www.biorxiv.org
Author response:
Reviewer #2 (Public review):
(1) Given their results the authors conclude that upregulation of Frizzled on the plasma membrane is not sufficient to explain the stabilization of beta-catenin seen in the ZNRF3/RNF43 mutant cells. This interpretation is sound, and they suggest in the discussion that ZNRF3/RNF43-mediated ubiquitination could serve as a sorting signal to sort endocytosed FZD to lysosomes for degradation and that absence or inhibition of this process would promote FZD recycling. This should be relatively easy to test using surface biotinylation experiments and would considerably strengthen the manuscript.
Thank you for your valuable suggestions and comments. We will perform cell surface biotinylation experiments.
(2) The authors show that the FZD5 CRD domain is required for endocytosis since a mutant FZD5 protein in which the CRD is removed does not undergo endocytosis. This is perhaps not surprising since this is the site of Wnt binding, but the authors show that a chimeric FZD5CRD-FZD4 receptor can confer Wnt-dependent endocytosis to an otherwise endocytosis incompetent FZD4 protein. Since the linker region between the CRD and the first TM differs between FZD5 and FZD4 it would be interesting to understand whether the CRD specifically or the overall arrangement (such as the spacing) is the most important determinant.
Our results in Fig. 1F-G clearly show that the CRD of FZD5 specifically is both necessary and sufficient for Wnt3a/5a-induced FZD5 endocytosis, as replacing the CRD alone in FZD5 with the CRD from either FZD4 or FZD7 completely abolished Wnt-induced endocytosis, whereas replacing the CRD alone in FZD4 or FZD7 with the FZD5 CRD alone could confer Wnt-induced endocytosis.
(3) I find it surprising that only FZD5 and FZD8 appear to undergo endocytosis or be stabilized at the cell surface upon ZNRF3/RNF43 knockout. Is this consistent with previous literature? Is that a cell-specific feature? These findings should be tested in a different cell line, with possibly different relative levels of ZNRF3 and RNF43 expression.
Thank you for your comments and suggestions. Our finding that ZNRF3/RNF43 specifically regulates FZD5/8 degradation is consistent with recently published studies in which FZD5 is required for the survival of RNF43-mutant PDAC or colorectal cancer cells (Nature Medicine, 2017, PMID: 27869803) and FZD5 is required for the maintenance of intestinal stem cells (Developmental Cell, 2024, PMID: 39579768 and 39579769). In both cases, FZDs other than FZD5/8 are also expressed but are not sufficient to compensate for the function of FZD5. The mechanism by which Wnt3a/5a specifically induces FZD5/8 endocytosis and degradation is currently unknown and needs to be explored in the future. We speculate that Wnt binding to FZD5/8 may recruit another protein on the cell surface to specifically facilitate FZD5/8 endocytosis. On the other hand, we cannot exclude the possibility that Wnts other than Wnt3a/5a may induce the endocytosis and degradation of FZDs other than FZD5/8, since there are 19 Wnts and 10 FZDs in humans. We will perform flow cytometry experiments using FZD5/8-specific antibodies to examine whether Wnt3a/5a induces FZD5/8 endocytosis in more cell lines.
(4) If FZD7 is not a substrate of ZNRF3/RNF43 and therefore is not ubiquitinated and degraded, how do the authors reconcile that its overexpression does not lead to elevated cytosolic beta-catenin levels in Figure 5B?
We are currently not sure of the mechanism underlying this result. Considering that most FZDs are expressed in 293A cells, we do not know how much of the mature form of overexpressed FZD7 was presented to the plasma membrane.
(5) For Figure 5B, it would be interesting if the authors could evaluate whether overexpression of FZD5 in the ZNRF3/RNF43 double knockout lines would synergize and lead to a further increase in cytosolic beta-catenin levels. As a control, if the substrate selectivity is clear, FZD7 overexpression in that line should not do anything.
We will perform these experiments as you suggested.
(6) In Figure 6G, the authors need to show cytosolic levels of beta-catenin in the absence of Wnt in all cases.
We did not add Wnt CM in this experiment. RSPO1 activity, which relies on endogenous Wnt, has been well documented in previous studies.
(7) Since the authors show that DVL is not involved in the Wnt and ZNRF3-dependent endocytosis they should repeat the proximity biotinylation experiment in Figure 7 in the DVL triple KO cells. This is an important experiment since previous studies showed that DVL was required for the ZNRF3/RNF43-mediated ubiquitination of FZD.
Thank you for your valuable suggestions. We will perform the proximity biotinylation experiment in DVL TKO cells.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary
This manuscript aimed to study the role of Rudhira (also known as Breast Carcinoma Amplified Sequence 3), an endothelium-restricted microtubule-associated protein, in regulating TGFβ signaling. The authors demonstrate that Rudhira is a critical modulator of TGFβ signaling that acts by releasing Smad2/3 from cytoskeletal microtubules, and that Rudhira is itself a Smad2/3 target gene. Taken together, the authors provide a model of how Rudhira contributes to TGFβ signaling activity to stabilize the microtubules, which is essential for vascular development.
Strengths
The study used different methods and techniques to achieve aims and support conclusions, such as Gene Ontology analysis, functional analysis in culture, immunostaining analysis, and proximity ligation assay. This study provides an unappreciated additional layer of TGFβ signaling activity regulation after ligand receptor interaction.
We thank the reviewer for acknowledging the importance of our study and providing a clear summary of our findings.
Weaknesses
(1) It is unclear how current findings provide a better understanding of Rudhira KO mice, which the authors published some years ago.
Our previous study demonstrated that Rudhira KO mice have a predominantly developmental cardiovascular phenotype that phenocopies TGFβ loss of function (Shetty, Joshi et al., 2018). Additionally, we found that at the molecular level, Rudhira regulates cytoskeletal organization (Jain et al., 2012; Joshi and Inamdar, 2019). Our current study builds upon these previous findings, showing an essential role of Rudhira in maintaining TGFβ signaling and controlling the microtubule cytoskeleton during vascular development. On one hand Rudhira regulates TGFβ signaling by promoting the release of Smads from microtubules, while on the other, Rudhira is a TGFβ target essential for stabilizing microtubules. Thus, our current study provides a molecular basis for Rudhira function in cardiovascular development.
(2) Why do they use HEK cells instead of SVEC cells in Figure 2 and 4 experiments?
Our earlier studies have characterized the role of Rudhira in detail using both loss and gain of function methods in multiple cell types (Jain et al., 2012; Shetty, Joshi et al., 2018; Joshi and Inamdar, 2019). As endothelial cells are particularly difficult to transfect, and because the function of Rudhira in promoting cell migration is conserved in HEK cells, it was practical and relevant to perform these experiments in HEK cells (Figures 2 and 4E).
(3) A model shown in Figure 5E needs improvement to grasp their findings easily.
We have modified Figure 5E for clarity.
Reviewer #2 (Public Review):
Summary
It was first reported in 2000 that Smad2/3/4 are sequestered to microtubules in resting cells and TGF-β stimulation releases Smad2/3/4 from microtubules, allowing activation of the Smad signaling pathway. Although the finding was subsequently confirmed in a few papers, the underlying mechanism has not been explored. In the present study, the authors found that Rudhira/breast carcinoma amplified sequence 3 is involved in the release of Smad2/3 from microtubules in response to TGF-β stimulation. Rudhira is also induced by TGF-β and is probably involved in the stabilization of microtubules in the delayed phase after TGF-β stimulation. Therefore, Rudhira has two important functions downstream of TGF-β in the early as well as delayed phase.
Strengths:
This work aimed to address an unsolved question on one of the earliest events after TGF-β stimulation. Based on loss-of-function experiments, the authors identified a novel and potentially important player, Rudhira, in the signal transmission of TGF-β.
We thank the reviewer for the critical evaluation and appreciation of our findings.
Weaknesses:
The authors have identified a key player that triggers Smad2/3 released from microtubules after TGF-β stimulation probably via its association with microtubules. This is an important first step for understanding the regulation of Smad signaling, but underlying mechanisms as well as upstream and downstream events largely remain to be elucidated.
We acknowledge that the mechanisms regulating cytoskeletal control of Smad signaling are far from clear, but these are out of scope of this manuscript. This manuscript rather focuses on Rudhira/Bcas3 as a pivot to understand vascular TGFβ signaling and microtubule connections.
(1) The process of how Rudhira causes the release of Smad proteins from microtubules remains unclear. The statement that "Rudhira-MT association is essential for the activation and release of Smad2/3 from MTs" (lines 33-34) is not directly supported by experimental data.
We agree with the reviewer’s comment. Although we provide evidence that the loss of Rudhira (and thereby deduced loss of Rudhira-MT association) prevents release of Smad2/3 from MTs (Fig 3C), it does not confirm the requirement of Rudhira-MT association for this. In light of this, we have modified the statement to ‘Rudhira associates with MTs and is essential for the activation and release of Smad2/3 from MTs’.
(2) The process of how Rudhira is mobilized to microtubules in response to TGF-β remains unclear.
Our previous study showed that Rudhira associates with microtubules, and preferentially binds to stable microtubules (Jain et al., 2012; Joshi and Inamdar, 2019). Since TGFβ stimulation is known to stabilize microtubules, we hypothesize that TGFβ stimulation increases Rudhira binding to stable microtubules. We have mentioned this in our revised manuscript.
(3) After Rudhira releases Smad proteins from microtubules, Rudhira stabilizes microtubules. The process of how cells return to a resting state and recover their responsiveness to TGF-β remains unclear.
We show that dissociation of Smads from microtubules is an early response and stabilization of microtubules is a late TGFβ response. However, we agree that the sequence of these molecular events has not been characterized in depth in this or any other study, making it difficult to assign causal roles (e.g., whether release of Smads from MTs is a prerequisite for MT stabilization by Rudhira) or reversibility. Nevertheless, the TGFβ pathway is autoregulatory, leading to increased turnover of receptors and Smads and increased expression of inhibitory Smads, which may recover responsiveness to TGFβ. Additionally, the relatively short turnover time of stable microtubules (several minutes to hours) may also promote a quick return to the resting state. We have discussed this in our revised manuscript.
Recommendations for the authors:
Reviewer #2 (Recommendations for The Authors):
(1) Overall: Duration of TGF-β stimulation in cell-based assays should be described in the legends for readers' convenience. Avoid simple bar graphs because sample numbers are only 3. A scatter plot should be superimposed.
Details added, as suggested. Duration of treatment is mentioned in Materials and methods section for figures 1C-D; 2A-B; 3; 4A-C; 5A-C; S2D; S3A-C; S4C, D. Bar graphs have been replaced with a bar + scatter plot. Note that, as the Excel file for data related to fig 4A was corrupted, we repeated the experiments to generate fresh data. Hence the graph had to be replaced. However, the result holds true as before.
(2) Figure 1A: This panel is too small. Gene names are almost invisible.
Modified for clarity.
(3) Figure 1B: Show TGFβRI expression by immunoblotting (re-probing) to verify that it is expressed in the rightmost lane.
TGFβRI overexpression was confirmed by qPCR in a replicate in the same experiment (Fig S2C).
(4) Figure 1C: Show expression of Rudhira. In addition, confirm the positions of molecular weight markers. Smad2 migrated slower than pSmad2.
Rudhira expression is shown in Fig S1B. Molecular weight markers have been corrected.
(5) Figure 3A: This panel shows a negative result that Smad2/3 fails to interact with Rudhira. A positive control, for example, Smad4, would make the data convincing.
Although it would be nice to have a positive control for interaction, we do not agree that a positive control of Smad4 is essential for our conclusion from this experiment, which is that ‘we were unable to detect an interaction between Rudhira and Smad2/3’.
(6) Fig. 3B: Show Rudhira blot. If possible, show that the Rudhira-MT association precedes Smad phosphorylation by a time course experiment. This is an important point but not experimentally demonstrated.
The interaction between Rudhira and microtubules with or without TGFβ is demonstrated by PLA (Fig 3E). Although important, the suggested time course experiments to assess the sequence of events are beyond the scope of this manuscript.
(7) Figure 3E: Does the process require the type I receptor kinase activity or non-Smad signaling pathways?
Since the TGFβ pathway is complex and is regulated at multiple steps, this possibility has not been tested and is beyond the scope of the current study.
(8) Figure 4A: The authors did not examine if these elements are functional. Therefore, this panel can be presented as a supplementary figure.
As suggested, the panel has been moved to supplementary information.
(9) Figure 4E: The figure legend does not say that cells were TGF-β-stimulated. It remains unclear if Smad2 and Smad3 are involved in Rudhira expression as phosphorylated or non-phosphorylated forms. Therefore, the authors should show a pSmad2 blot. In the absence of TGF-β stimulation, Smad2 and Smad3 are expected to be sequestrated to microtubules and therefore not phosphorylated. In the case that cells were stimulated with TGF-β, show if Rudhira is induced by TGF-β in HEK293T cells. This is not shown in this manuscript.
This experiment was not performed under regulated conditions with or without TGFβ, hence the sensitivity to TGFβ could not be assessed. Cells were not stimulated with exogenous TGFβ, but cultured in regular medium with serum, which can have up to ~40 ng/ml of TGFβ (latent and active). Additionally, owing to severe depletion of Smad2 or Smad3 by shRNAs we expect sufficient loss of phospho-Smads2/3.
(10) Figure S1A: Rudhira migrated at the position corresponding to 91 kD only in this panel.
Corrected the position of molecular weight marker.
(11) Line 205-206, "Since in vivo studies indicate that rudhira depletion severely affects the TGFβ pathway [11]": Refer to Reference 11. The paper does not say anything about TGFβ.
Reference corrected to Ref #14.
(12) Smad4 was previously reported to be sequestered to microtubules [Ref. 7]. Does Rudhira release Smad4 also?
This is an interesting point which could be followed up in our future studies.
(13) It would be nice if the authors examined how Rudhira causes the release of Smad2/3 from microtubules. Currently, it remains unclear whether the association of Rudhira to microtubules is required for the release of Smad2/3. Does a Rudhira mutant lacking microtubule binding fail to induce the release of Smad2/3 after TGF-β stimulation? If so, do Rudhira and Smad2/3 share the same binding site on microtubules? In that case, the mechanism can be regarded as "competitive".
This is a thoughtful experiment that is well beyond the scope of the current manuscript. In our previous study we were able to localize the Tubulin binding sites of Rudhira primarily to its Bcas3 domain (Joshi and Inamdar, 2019); however, the equivalent sites in Tubulin were not assessed. While MH2 domains of Smad2/3 bind β-tubulin, amino acids 114-243 of β-tubulin bind to Smad2/3 (Dai et al., 2007). A systematic study of these tripartite interactions including Rudhira would be an interesting follow-up for our future study.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Reviewer #1 (Public review):
Summary:
The authors show that SVZ-derived astrocytes respond to a middle cerebral artery occlusion (MCAO) hypoxia lesion by secreting and modulating hyaluronan at the edge of the lesion (penumbra) and that hyaluronan is a chemoattractant to SVZ astrocytes. They use lineage tracing of SVZ cells to determine their origin. They also find that SVZ-derived astrocytes express Thbs-4 but astrocytes at the MCAO-induced scar do not. Also, they demonstrate that decreased HA in the SVZ is correlated with gliogenesis. While much of the paper is descriptive/correlative they do overexpress Hyaluronan synthase 2 via viral vectors and show this is sufficient to recruit astrocytes to the injury. Interestingly, astrocytes preferred to migrate to the MCAO than to the region of overexpressed HAS2.
Strengths:
The field has largely ignored the gliogenic response of the SVZ, especially with regards to astrocytic function. These cells and especially newborn cells may provide support for regeneration. Emigrated cells from the SVZ have been shown to be neuroprotective via creating pro-survival environments, but their expression and deposition of beneficial extracellular matrix molecules is poorly understood. Therefore, this study is timely and important. The paper is very well written and the flow of results logical.
Comments on revised version:
The authors have addressed my points and the paper is much improved. Here are the salient remaining issues that I suggest be addressed.
We appreciate the feedback by the reviewer, and we are glad that the paper is considered to be much improved. We have done our best to address the remaining issues in this 2nd revision.
The authors have still not shown, using loss of function studies, that Hyaluronan is necessary for SVZ astrogenesis and or migration to MCAO lesions.
This is true. Unfortunately, complete removal of hyaluronan (via Hyase) triggers epilepsy, already described in 1963 by James Young (Exp Neurol paper). Degradation by Hyase also provokes neuroinflammation (Soria et al., 2020 Nat Commun). Two alternatives could be 1) partial depletion with Has inhibitor 4MU (but it is also associated with increased inflammation) or 2) a Has-KO mouse, such as Has3-/- (Arranz et al., 2014), although, to our knowledge, this mouse line is not openly available. We have added a sentence in line 332 addressing this shortcoming: “Loss-of-function studies, using HA-depletion models or HA synthase (Has)-deficient mice are still needed to corroborate this finding, although the inflammation associated with HA deficiency might confound interpretation.”
(1) The co-expression of EGFr with Thbs4 and the literature examination is useful.
We thank the reviewer for the kind comment.
(2) Too bad they cannot explain the lack of effect of the MCAO on type C cells. The comparison with kainate-induced epilepsy in the hippocampus may or may not be relevant.
As stated in the previous response, we also found this interesting, and it does warrant further exploration by looking into possible direct NSC-astrocyte differentiation. But we believe that both this possible direct differentiation and the reactive status for these astrocytes are out of the scope of the study. We will not speculate about this in the discussion, either.
(3) Thanks for including the orthogonal confocal views in Fig S6D.
(4) The statement that "BrdU+/Thbs4+ cells mostly in the dorsal area" and therefore they mostly focused on that region is strange. Figure 8 clearly shows Thbs4 staining all along the striatal SVZ. Do they mean the dorsal segment of the striatal SVZ or the subcallosal SVZ? Fig. 4b and Fig 4f clearly show the "subcallosal" area as the one analysed but other figures show the dorsal striatal region (Fig. 2a). This is important because of the well-known embryological and neurogenic differences between the regions.
While it is true that Thbs4 is also expressed in the other subregions of the SVZ (lateral, ventral and medial), as observed in Fig 8, we chose the dorsal area because it is the subregion where we observed the largest increase in slow proliferative NSCs (Thbs4/GFAP/BrdU-positive cells) after MCAO (Fig S3). As observed in the quantifications in Fig S3, Thbs4/GFAP/BrdU-positive cells also increase in the lateral, medial and ventral SVZ, but the increase is not significant. Therefore, from Fig 4 onwards, we focused on the dorsal SVZ, which the reviewer mentions as the “subcallosal” area. We chose the term “dorsal” as stated in single-cell studies (Cebrian-Silla et al, 2021, eLife; Marcy et al., 2023, Sci Adv) and reviews (Sequerra 2014 Front Cell Neurosci) that investigate or mention this subregion, respectively. In the abstract, we clearly state that newborn astrocytes migrate from both dorsal and medial areas.
In Fig 2a, the immunofluorescence image shows medial and lateral SVZ, but at this point in the paper, we have not yet made specific subregional quantifications, and the Nestin, DCX and Thbs4 quantifications refer to the SVZ as a whole, both in the IF and in the WB (Fig 2e-g). We apologize for the confusion. We have clarified this in the text (line 119).
(5) It is good to know that the harsh MCAO's had already been excluded.
(6) Sorry for the lack of clarity - in addition to Thbs4, I was referring to mouse versus rat Hyaluronan degradation genes (Hyal1, Hyal2 and Hyal3) and hyaluronan synthase genes (HAS1 and HAS2) in order to address the overall species differences in hyaluronan biology thus justifying the "shift" from mouse to rat. You examine these in the (weirdly positioned) Fig. 8h,i. Please add a few sentences on mouse vs rat Thbs4 and Hyaluronan relevant genes.
We thank the reviewer for these remarks. We have now added a sentence pointing to the similar internalization and degradation in rat and mouse (reviewed by Sherman et al., 2015). This correction is in line 233. Hyaluronan is, in evolutionary terms, a very “old” molecule, part of the “ancient” glycan-based matrix, before the evolution of proteoglycans and fibrous proteins such as collagen, laminin etc. Hence, its machinery is highly conserved across species.
We have also reorganized the panels in Fig 8, where 8h and 8i were indeed weirdly positioned. We hope that the new version of this figure is more easily readable.
(7) Thank you for the better justification of using the naked mole rat HA synthase.
Reviewer #3 (Public review):
Summary:
The authors aimed to study the activation of gliogenesis and the role of newborn astrocytes in a post-ischemic scenario. Combining immunofluorescence, BrdU-tracing and genetic cellular labelling, they tracked the migration of newborn astrocytes (expressing Thbs4) and found that Thbs4-positive astrocytes modulate the extracellular matrix at the lesion border by synthesis but also degradation of hyaluronan. Their results point to a relevant function of SVZ newborn astrocytes in the modulation of the glial scar after brain ischemia. This work's major strength is the fact that it is tackling the function of SVZ newborn astrocytes, whose role is undisclosed so far.
Strengths:
The article is innovative, of good quality, and clearly written, with properly described Materials and Methods, data analysis and presentation. In general, the methods are designed properly to answer the main question of the authors, being a major strength. Interpretation of the data is also in general well done, with results supporting the main conclusions of this article.
In this revised version, the points raised/weaknesses were clarified and discussed in the article.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Minor points:
(1) Thanks for the clarification.
(2) Thanks for the clarification.
(3) The magnification is not apparent in Fig. 5.
We had removed two brain slices (from 4 to 2) in order to increase the size of the image 2-fold. We have now further increased the TTC panel, 25% from the revised version, 125% from the original.
(4) Thanks for the clarification.
(5) Thanks for the clarification.
(6) Thanks for the clarification.
(7) Thanks for the clarification.
(8) Thanks for the clarification.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public Review):
(1) As VRMate (a component of behaviorMate) is written using Unity, what is the main advantage of using behaviorMate/VRMate compared to using Unity alone paired with Arduinos (e.g. Campbell et al. 2018), or compared to using an existing toolbox to interface with Unity (e.g. Alsbury-Nealy et al. 2022, DOI: 10.3758/s13428-021-01664-9)? For instance, one disadvantage of using Unity alone is that it requires programming in C# to code the task logic. It was not entirely clear whether VRMate circumvents this disadvantage somehow -- does it allow customization of task logic and scenery in the GUI? Does VRMate add other features and/or usability compared to Unity alone? It would be helpful if the authors could expand on this topic briefly.
We have updated the manuscript (lines 412-422) to clarify the benefits of separating the VR system as an isolated program and a UI that can be run independently. We argue that “…the recommended behaviorMate architecture has several important advantages. Firstly, by rendering each viewing angle of a scene on a dedicated device, performance is improved by splitting the computational costs across several inexpensive devices rather than requiring specialized or expensive graphics cards in order to run…, the overall system becomes more modular and easier to debug [and] implementing task logic in Unity would require understanding Object-Oriented Programming and C# … which is not always accessible to researchers that are typically more familiar with scripting in Python and Matlab.”
VRMate receives detailed configuration information from behaviorMate at runtime specifying which VR objects to display, and it receives position updates during experiments. Any other necessary information about triggering rewards or presenting non-VR cues is still handled by the UI, so no editing of Unity is necessary. Scene configuration information is in the same JSON format as the settings files for behaviorMate. Additionally, Unity Editor scripts provided in the VRMate repository permit customizing scenes through a “drag and drop” interface and then writing the scene configuration files programmatically. Users interested in these features should see our github page to find example scene.vr files and download the VRMate repository (including the editor scripts). We provide four VR contexts, as well as a settings file that uses one of them, on the behaviorMate github page (https://github.com/losonczylab/behaviorMate) in the “vr_contexts” and “example_settigs_files” directories. These examples are provided to assist VRMate users in getting set up and give a more detailed picture of how VRMate and behaviorMate interact.
(2) The section on "context lists", lines 163-186, seemed to describe an important component of the system, but this section was challenging to follow and readers may find the terminology confusing. Perhaps this section could benefit from an accompanying figure or flow chart, if these terms are important to understand.
We maintain the use of the term context and context list in order to maintain a degree of parity with the java code. However, we have updated lines 173-175 to define the term context for the behaviorMate system: “... a context is grouping of one or more stimuli that get activated concurrently. For many experiments it is desirable to have multiple contexts that are triggered at various locations and times in order to construct distinct or novel environments.”
a. Relatedly, "context" is used to refer to both when the animal enters a particular state in the task like a reward zone ("reward context", line 447) and also to describe a set of characteristics of an environment (Figure 3G), akin to how "context" is often used in the navigation literature. To avoid confusion, one possibility would be to use "environment" instead of "context" in Figure 3G, and/or consider using a word like "state" instead of "context" when referring to the activation of different stimuli.
Thank you for the suggestion. We have updated Figure 3G to say “Environment” in order to avoid confusion.
(3) Given the authors' goal of providing a system that is easily synchronizable with neural data acquisition, especially with 2-photon imaging, I wonder if they could expand on the following features:
a. The authors mention that behaviorMate can send a TTL to trigger scanning on the 2P scope (line 202), which is a very useful feature. Can it also easily generate a TTL for each frame of the VR display and/or each sample of the animal's movement? Such TTLs can be critical for synchronizing the imaging with behavior and accounting for variability in the VR frame rate or sampling rate.
Different experimental demands require varying levels of precision in these kinds of synchronization signals. For this reason, we have opted against a “one-size fits all” approach to synchronization with physiology data in behaviorMate. Importantly, this keeps the individual rig costs low, which can be useful when constructing setups specifically for training animals. behaviorMate will log TTL pulses sent to GPIO pins set up as sensors, and can be configured to generate TTL pulses at regular intervals. Additionally, all UDP packets received by the UI are time stamped and logged. We also include the output of the arduino millis() function in all UDP packets, which can be used for further investigation of clock drift between system components. Importantly, since the system is event driven, there cannot be accumulating drift across running experiments between the behaviorMate UI and networked components such as the VR system.
For these reasons, we have not needed to implement a VR frame synchronization TTL for any of our experiments, however, one could extend VRMate to send "sync" packets back to behaviorMate to log when each frame was displayed precisely or TTL pulses (if using the same ODROID hardware we recommend in the standard setup for rendering scenes). This would be useful if it is important to account for slight changes in the frame rate at which the scenes are displayed. However, splitting rendering of large scenes between several devices results in fast update times and our testing and benchmarks indicate that display updates are smooth and continuous enough to appear coupled to movement updates from the behavioral apparatus and sufficient for engaging navigational circuits in the brain.
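As an illustration of how the logged timestamps described above could be used to check for clock drift, the following is a minimal Python sketch. It assumes that, for each logged UDP packet, one has already extracted the device-side millis() value and the UI-side receive time, both in milliseconds; the function name and the example numbers are hypothetical and are not part of behaviorMate. A least-squares line is fitted through the timestamp pairs, and the deviation of its slope from 1 gives the relative drift between the two clocks.

```python
def estimate_clock_drift(device_ms, host_ms):
    """Fit host_ms ~ slope * device_ms + offset by ordinary least squares.

    device_ms: millis() values reported by the microcontroller (hypothetical input).
    host_ms:   receive timestamps logged by the UI for the same packets.
    Returns (slope, offset_ms, drift_ppm), where drift_ppm is the relative
    rate difference between the two clocks in parts per million.
    """
    n = len(device_ms)
    if n < 2 or n != len(host_ms):
        raise ValueError("need at least two paired timestamps")

    mean_d = sum(device_ms) / n
    mean_h = sum(host_ms) / n
    var_d = sum((d - mean_d) ** 2 for d in device_ms)
    if var_d == 0:
        raise ValueError("device timestamps must not all be identical")

    cov = sum((d - mean_d) * (h - mean_h) for d, h in zip(device_ms, host_ms))
    slope = cov / var_d
    offset = mean_h - slope * mean_d
    drift_ppm = (slope - 1.0) * 1e6
    return slope, offset, drift_ppm


# Made-up example values: the host clock gains 0.1 ms per device second.
example_device = [0, 1000, 2000, 3000]
example_host = [500.0, 1500.1, 2500.2, 3500.3]
print(estimate_clock_drift(example_device, example_host))
```

Network jitter adds noise to each receive time, so a fit over many packets is more informative than the difference between any single pair of timestamps.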
b. Is there a limit to the number of I/O ports on the system? This might be worth explicitly mentioning.
We have updated lines 219-220 in the manuscript to provide this information: Sensors and actuators can be connected to the controller using one of the 13 digital or 5 analog input/output connectors.
c. In the VR version, if each display is run by a separate Android computer, is there any risk of clock drift between displays? Or is this circumvented by centralized control of the rendering onset via the "real-time computer"?
This risk is mitigated by the real-time computer/UI sending position updates to the VR displays. The maximum amount scenes can be out of sync is limited because they will all recalibrate on every position update – which occurs multiple times per second as the animal is moving. Moreover, because position updates are constantly being sent by behaviorMate to VRMate and VRMate is immediately updating the scene according to this position, the most the scene can become out of sync with the mouse's position is proportional to the maximum latency multiplied by the running speed of the mouse. For experiments focusing on eliciting an experience of navigation, such a degree of asynchrony is almost always negligible. For other experimental demands it could be possible to incorporate more precise frame timing information but this was not necessary for our use case and likely for most other use cases. Additionally, refer to the response to comment 3a.
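The bound stated above (maximum latency multiplied by running speed) can be made concrete with a short Python sketch. The latency and speed used in the example are illustrative assumptions only, not measurements from behaviorMate, and the function name is hypothetical.

```python
def max_position_error_cm(max_latency_s, running_speed_cm_per_s):
    """Upper bound on how far the displayed scene can lag the animal.

    The scene is redrawn on every position update, so the displayed position
    can trail the true position by at most (latency x speed).
    """
    if max_latency_s < 0 or running_speed_cm_per_s < 0:
        raise ValueError("latency and speed must be non-negative")
    return max_latency_s * running_speed_cm_per_s


# Assumed values for illustration: 20 ms worst-case latency, 50 cm/s running speed.
print(max_position_error_cm(0.020, 50.0))
```

Under these assumed values the bound is about 1 cm, which is small relative to the length of a typical linear track.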
Reviewer #2 (Public review):
(1) The central controlling logic is coupled with GUI and an event loop, without a documented plugin system. It's not clear whether arbitrary code can be executed together with the GUI, hence it's not clear how much the functionality of the GUI can be easily extended without substantial change to the source code of the GUI. For example, if the user wants to perform custom real-time analysis on the behavior data (potentially for closed-loop stimulation), it's not clear how to easily incorporate the analysis into the main GUI/control program.
Without any edits to the existing source code behaviorMate is highly customizable through the settings files, which allow users to combine the existing contexts and decorators in arbitrary combinations. Therefore, users have been able to perform a wide variety of 1D navigation tasks, well beyond our anticipated use cases by generating novel settings files. The typical method for providing closed-loop stimulation would be to set up a context which is triggered by animal behavior using decorators (e.g. based on position, lap number and time) and then trigger the stimulation with a TTL pulse. Rarely, if users require a behavioral condition not currently implemented or composable out of existing decorators, it would require generating custom code in Java to extend the UI. Performing such edits requires only knowledge of basic object-oriented programming in Java and generating a single subclass of either the BasicContextList or ContextListDecorator classes. In addition, the JavaFX (under development) version of behaviorMate incorporates a plugin which doesn't require recompiling the code in order to make these changes. However, since the JavaFX software is currently under development, documentation does not yet exist. All software is open-sourced and available on github.com for users interested in generating plugins or altering the source code.
We have added the additional caveat to the manuscript in order to clarify this point (Lines 197-202): “However, if the available set of decorators is not enough to implement the required task logic, some modifications to the source code may be necessary. These modifications, in most cases, would be very simple and only a basic understanding of object-oriented programming is required. A case where this might be needed would be performing novel customized real-time analysis on behavior data and activating a stimulus based on the result.”
(2) The JSON messaging protocol lacks API documentation. It's not clear what the exact syntax is, supported key/value pairs, and expected response/behavior of the JSON messages. Hence, it's not clear how to develop new hardware that can communicate with the behaviorMate system.
The most common approach for adding novel hardware is to use TTL pulses (or accept an emitted TTL pulse to read sensor states). This type of hardware addition is possible through the existing GPIO without the need to interact with the software or JSON API. Users looking to take advantage of the ability to set up and configure novel behavioral paradigms without the need to write any software would be limited to adding hardware which could be triggered with and report to the UI with a TTL pulse (however fairly complex actions could be triggered this way).
For users looking to develop more customized hardware solutions that interact closely with the UI or GPIO board, additional documentation on the JSON messaging protocol has been added to the behaviormate-utils repository (https://github.com/losonczylab/behaviormate_utils). Additionally, we have added a link to this repository in the Supplemental Materials section (line 971) and referenced this in the manuscript (line 217) to make it easier for readers to find this information.
Furthermore, developers looking to add completely novel components to the UI can implement the interface described by Context.java in order to exchange custom messages with hardware (described in the JavaDoc: https://www.losonczylab.org/behaviorMate-1.0.0/). These messages would be defined within the custom context and interact with the custom hardware (meaning the interested developer would make a novel addition to the messaging API). Additionally, it should be noted that without editing any software, any UDP packets sent to behaviorMate from an IP address specified in the settings will get time stamped and logged in the stored behavioral data file, meaning that a large variety of hardware solutions, using either standard UDP messaging or TTL pulses, can work with behaviorMate with minimal effort. Finally, see response to R2.1 for a discussion of the JavaFX version of the behaviorMate UI including plugin support.
(3) It seems the existing control hardware and the JSON messaging only support GPIO/TTL types of input/output, which limits the applicability of the system to more complicated sensor/controller hardware. The authors mentioned that hardware like Arduino natively supports serial protocols like I2C or SPI, but it's not clear how they are handled and translated to JSON messages.
We provide an implementation for an I2C-based capacitance lick detector which interested developers may wish to copy if support for novel I2C or SPI hardware is needed. Users with less development experience wishing to expand the hardware capabilities of behaviorMate could also develop adapters which can be triggered on a TTL input/output. Additionally, more information about the JSON API and how messages are transmitted to the PC by the Arduino is described in point (2) and the expanded online documentation.
a. Additionally, because it's unclear how easy to incorporate arbitrary hardware with behaviorMate, the "Intranet of things" approach seems to lose attraction. Since currently, the manuscript focuses mainly on a specific set of hardware designed for a specific type of experiment, it's not clear what are the advantages of implementing communication over a local network as opposed to the typical connections using USB.
Unlike the serial communication protocols typical of USB, networking protocols function seamlessly with asynchronous message passing. Messages may be routed internally (e.g. to a PC's localhost address, i.e. 127.0.0.1) or to a variety of external hardware (e.g. using IP addresses such as those in the range 192.168.1.2 - 192.168.1.254). Furthermore, network-based communication allows modules, such as VR, to be added easily. behaviorMate systems can be easily expanded using low-cost Ethernet switches and consume only a single network adapter on the PC (e.g. not limited by the number of physical USB ports). Furthermore, UDP message passing is implemented in almost all modern programming languages in a platform-independent manner (meaning that the same software can run on OSX, Windows, and Linux). Lastly, as we have pointed out (Line 117), a variety of tools exist for inspecting network packets and debugging, meaning that it is possible to run behaviorMate with simulated hardware for testing and debugging.
The IOT nature of behaviorMate means there is no requirement for novel hardware to be implemented using an Arduino, since any system capable of UDP communication can be configured. For example, VRMate is usually run on Odroid C4s; however, one could easily create a system using Raspberry Pis or even additional PCs. behaviorMate is agnostic to the format of the UDP messages, but packaging any data in the JSON format for consistency would be encouraged. If the new hardware is a sensor whose input needs to be time stamped and logged, then all that is needed is to add the IP address and port information to the ‘controllers’ list in a behaviorMate settings file. If more complex interactions are needed with novel hardware, then a custom implementation of ContextList.java may be required (see response to R2.2). However, the provided UdpComms.java class could be used to easily send/receive messages from custom Context.java subclasses.
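To make the idea of a custom UDP device concrete, the following is a minimal sketch of a sensor sending a JSON-formatted datagram and a stand-in logger time stamping it on receipt. The message keys, function names, and the loopback receiver are assumptions made for illustration; they are not the behaviorMate JSON API or its logging code.

```python
import json
import socket
import time


def send_json(sock: socket.socket, address: tuple, payload: dict) -> None:
    """Serialize a dict as JSON and send it as a single UDP datagram."""
    sock.sendto(json.dumps(payload).encode("utf-8"), address)


def receive_and_stamp(sock: socket.socket, bufsize: int = 4096) -> dict:
    """Receive one datagram and attach a local receive time, as a logger might."""
    data, sender = sock.recvfrom(bufsize)
    return {
        "received_at": time.time(),
        "sender": sender[0],
        "message": json.loads(data.decode("utf-8")),
    }


# Stand-in for the logging side, bound to an OS-assigned port on loopback.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)

# Stand-in for the custom sensor; the payload structure is hypothetical.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_json(sender, receiver.getsockname(), {"sensor": {"id": "example", "value": 1}})

entry = receive_and_stamp(receiver)
sender.close()
receiver.close()
```

In a real setup the sensor would target the IP address and port configured for the UI rather than a local stand-in; consult the behaviormate_utils documentation for the actual message format.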
Solutions for highly customized hardware do require basic familiarity with object-oriented programming using the Java programming language. However, in our experience most behavioral experiments do not require these kinds of modifications. The majority of 1D navigation tasks, which behaviorMate is currently best suited to control, require touch/motion sensors, LEDs, speakers, or solenoid valves, which are easily controlled by the existing GPIO implementation. It is unlikely that custom subclasses would even be needed.
Reviewer #3 (Public review):
(1) While using UDP for data transmission can enhance speed, it is thought that it lacks reliability. Are there error-checking mechanisms in place to ensure reliable communication, given its criticality alongside speed?
The provided GPIO/behavior controller implementation sends acknowledgement packets in response to all incoming messages as well as start and stop messages for contexts and “valves”. In this way the UI can update to reflect both requested state changes as well as when they actually happen (although there is rarely a perceptible gap between these two states unless something is unplugged or not functioning). See Line 85 in the revised manuscript “acknowledgement packets are used to ensure reliable message delivery to and from connected hardware”.
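The general pattern of layering acknowledgements on top of UDP can be sketched as follows. This is an illustration only: the resend-on-timeout policy, the `id`/`ack` keys, and the function names are assumptions, not a description of how the behaviorMate controller or UI actually handles acknowledgements.

```python
import json
import socket
import threading


def send_with_ack(sock, address, payload, retries=3, timeout_s=0.5):
    """Send a JSON datagram and wait for an acknowledgement, resending on timeout.

    Returns the number of attempts used; raises TimeoutError if none succeeded.
    """
    sock.settimeout(timeout_s)
    data = json.dumps(payload).encode("utf-8")
    for attempt in range(1, retries + 1):
        sock.sendto(data, address)
        try:
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            continue  # no acknowledgement in time; resend
        if json.loads(reply.decode("utf-8")).get("ack") == payload.get("id"):
            return attempt
    raise TimeoutError(f"no acknowledgement after {retries} attempts")


def ack_once(sock):
    """Stand-in for a controller: acknowledge a single incoming message by id."""
    data, sender = sock.recvfrom(4096)
    message = json.loads(data.decode("utf-8"))
    sock.sendto(json.dumps({"ack": message.get("id")}).encode("utf-8"), sender)


# Hypothetical controller on loopback, answering from a background thread.
controller = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
controller.bind(("127.0.0.1", 0))
controller.settimeout(2.0)
responder = threading.Thread(target=ack_once, args=(controller,))
responder.start()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
attempts = send_with_ack(client, controller.getsockname(), {"id": 7, "valve": "open"})
responder.join()
client.close()
controller.close()
```

Matching the acknowledgement to a message identifier lets the sender distinguish a requested state change from a confirmed one, which is the distinction the UI is described as displaying.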
(2) Considering this year's price policy changes in Unity, could this impact the system's operations?
VRMate is not affected by the recent changes in pricing structure of the Unity project.
The existing compiled VRMate software does not need to be regenerated to update VR scenes, or implement new task logic (since this is handled by the behaviorMate GUI). Therefore, the VRMate program is robust to any future pricing changes or other restructuring of the Unity program and does not rely on continued support of Unity. Additionally, while the solution presented in VRMate has many benefits, a developer could easily adapt any open-source VR Maze project to receive the UDP-based position updates from behaviorMate or develop their own novel VR solutions.
(3) Also, does the Arduino offer sufficient precision for ephys recording, particularly with a 10ms check?
Electrophysiology recording hardware typically has additional I/O channels which can assist with tracking behavior and synchronization at a high resolution. While behaviorMate could still be used to trigger reward valves, either the ephys hardware or some additional high-speed DAQ would be recommended to maintain accurate alignment with high-speed physiology data. behaviorMate could still be set up as normal to provide closed and open-loop task control at behaviorally relevant timescales alongside a DAQ circuit recording events at a consistent temporal resolution. While this would increase the relative cost of the individual recording setup, identical rigs for training animals could still be configured without the DAQ circuit, avoiding unnecessary cost and complexity.
(4) Could you clarify the purpose of the Sync Pulse? In line 291, it suggests additional cues (potentially represented by the Sync Pulse) are needed to align the treadmill screens, which appear to be directed towards the Real-Time computer. Given that event alignment occurs in the GPIO, the connection of the Sync Pulse to the Real-Time Controller in Figure 1 seems confusing.
A number of methods exist for synchronizing recording devices like microscopes or electrophysiology recordings with behaviorMate’s time-stamped logs of actuators and sensors. For example, the GPIO circuit can be configured to send sync triggers, or receive timing signals as input. Alternatively a dedicated circuit could record frame start signals and relay them to the PC to be logged independently of the GPIO (enabling a high-resolution post-hoc alignment of the time stamps). The optimal method to use varies based on the needs of the experiment. Our setups have a dedicated BNC output and specification in the settings file that sends a TTL pulse at the start of an experiment in order to trigger 2p imaging setups (see line 224, specifically that this is a detail of “our” 2p imaging setup). We provide this information as it might be useful suggesting how to have both behavior and physiology data start recording at the same time. We do not intend this to be the only solution for alignment. Figure 1 indicates an “optional” circuit for capturing a high speed sync pulse and providing time stamps back to the real time PC. This is another option that might be useful for certain setups (or especially for establishing benchmarks between behavior and physiology recordings). In our setup event alignment does not exclusively occur on the GPIO.
a. Additionally, why is there a separate circuit for the treadmill that connects to the UI computer instead of the GPIO? It might be beneficial to elaborate on the rationale behind this decision in line 260.
Event alignment does not occur on the GPIO. We separate concerns between position tracking and more general input/output features, which improves performance and simplifies debugging. In this sense we maintain a single event loop on the Arduino, avoiding the need to either run multithreaded operations or rely extensively on interrupts, which can cause unpredictable code execution (e.g. when multiple interrupts occur at the same time). Our position tracking circuit is therefore coupled to a separate, low-cost Arduino Mini which has the sole responsibility of position tracking.
b. Moreover, should scenarios involving pupil and body camera recordings connect to the Analog input in the PCB or the real-time computer for optimal data handling and processing?
Pupil and body camera recordings would be independent data streams which can be recorded separately from behaviorMate. Aligning these forms of full-motion video could require frame triggers, which could be configured on the GPIO board using single TTL-like outputs or by configuring a valve to be “pulsed”, which is a provided type of customization.
We also note that a more advanced developer could easily leverage camera signals to provide closed-loop control by writing an independent module that sends UDP packets to behaviorMate. For example, a separate computer-vision-based position tracking module could be written in any preferred language and use UDP messaging to send body tracking updates to the UI without editing any of the behaviorMate source code (and could even be used for updating 1D location).
(5) Given that all references, as far as I can see, come from the same lab, are there other labs capable of implementing this system at a similar optimal level?
To date two additional labs have published using behaviorMate, the Soltez and Henn labs (see revised lines 341-342). Since behaviorMate has only recently been published and made available open source, only external collaborators of the Losonczy lab have had access to the software and design files needed to do this. These collaborators did, however, set up their own behavioral setups in separate locations with minimal direct support from the authors, similar to what anyone seeking to set up a behaviorMate system would find online on our GitHub page or by posting to the message board.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
(4) To provide additional context for the significance of this work, additional citations would be helpful to demonstrate a ubiquitous need for a system like behaviorMate. This was most needed in the paragraph from lines 46-65, specifically for each sentence after line 55, where the authors discuss existing variants on head-fixed behavioral paradigms. For instance, for the clause "but olfactory and auditory stimuli have also been utilized at regular virtual distance intervals to enrich the experience with more salient cues", suggested citations include Radvansky & Dombeck 2018 (DOI: 10.1038/s41467-018-03262-4), Fischler-Ruiz et al. 2021 (DOI: 10.1016/j.neuron.2021.09.055).
We thank the reviewer for the suggested missing citations and have updated the manuscript accordingly (see line 58).
(5) In addition, it would also be helpful to clarify behaviorMate's implementation in other laboratories. On line 304 the authors mention "other labs" but the following list of citations is almost exclusively from the Losonczy lab. Perhaps the citations just need to be split across the sentence for clarity? E.g. "has been validated by our experimental paradigms" (citation set 1) "and successfully implemented in other labs as well" (citation set 2).
We have split the citation set as suggested (see lines 338-342).
Minor Comments:
(6) In the paragraph starting line 153 and in Fig. 2, please clarify what is meant by "trial" vs. "experiment". In many navigational tasks, "trial" refers to an individual lap in the environment, but here "trial" seems to refer to the whole behavioral session (i.e. synonymous with "experiment"?).
In our software implementation we had originally used “trial” to refer to an imaging session rather than “experiment” (and have made updates to start moving to the more conventional lexicon). To avoid confusion we have removed this use of “trial” throughout the manuscript and replaced it with “experiment” whenever possible.
(7) This is very minor, but in Figure 3 and 4, I don't believe the gavage needle is actually shown in the image. This is likely to avoid clutter but might be confusing to some readers, so it may be helpful to have a small inset diagram showing how the needle would be mounted.
We assessed the image both with and without the gavage needle and found the version in the original (without) to be easier to read and less cluttered and therefore maintained that version in the manuscript.
(8) In Figure 5 legend, please list n for mice and cells.
We have updated the Figure 5 legend to indicate that for panels C-G, n=6 mice (all mice were recorded in both VR and TM systems), with 3253 cells in VR classified as significantly tuned place cells and 6101 tuned cells in TM.
(9) Line 414: It is not necessary to tilt the entire animal and running wheel as long as the head-bar clamp and objective can rotate to align the imaging window with the objective's plane of focus. Perhaps the authors can just clarify the availability of this option if users have a microscope with a rotatable objective/scan head.
We have added the suggested caveat to the manuscript in order to clarify when the goniometers might be useful (see lines 281-288).
(10) Figure S1 and S2 could be referenced explicitly in the main text with their related main figures.
We have added explicit references to Figures S1 and S2 in the relevant sections (see lines 443, 460 and 570).
(11) On line 532-533, is there a citation for "proximal visual cues and tactile cues (which are speculated to be more salient than visual cues)"?
We have added citations to both Knierim & Rao 2003 and Renaudineau et al. 2007 which discuss the differential impact of proximal vs distal cues during navigation as well as Sofroniew et al. 2014 which describe how mice navigate more naturally in a tactile VR setup as opposed to purely visual ones.
(12) There is a typo at the end of the Figure 2 legend, where it should say "Arduino Mini."
This typo has been fixed.
Reviewer #2 (Recommendations For The Authors):
(4) As mentioned in the public review: what is the major advantage of taking the IoT approaches as opposed to USB connections to the host computer, especially when behaviorMate relies on a central master computer regardless? The authors mentioned the readability of the JSON messages, making the system easier to debug. However, the flip side of that is the efficiency of data transmission. Although the bandwidth/latency is usually more than enough for transmitting data and commands for behavior devices, the efficiency may become a problem when neural recording devices (imaging or electrophysiology) need to be included in the system.
behaviorMate is not intended to do everything, and is limited mainly to controlling behavior and providing some synchronizing TTL-style triggers. In this way the system can easily and inexpensively be replicated across multiple recording setups; this is particularly useful for constructing additional animal training setups. The system is sufficient for capturing behavioral inputs at relevant timescales (see the benchmarks in Figures 3 and 4 as well as the position-correlated neural activity in Figures 5 and 6 for demonstration of this). Additional hardware might be needed to align the behaviorMate output with neural data; for example, a high-speed DAQ or input channels on electrophysiology recording setups could be utilized (if provided). As all recording setups are different, the ideal solution would depend on details which are hard to anticipate. We do not mean to convey that the full neural data would be transmitted to the behaviorMate system (especially using the JSON/UDP communications that behaviorMate relies on).
(5) The author mentioned labView. A popular open-source alternative is bonsai (https://github.com/bonsai-rx/bonsai). Both include a graphical-based programming interface that allows the users to easily reconfigure the hardware system, which behaviorMate seems to lack. Additionally, autopilot (https://github.com/auto-pi-lot/autopilot) is a very relevant project that utilizes a local network for multiple behavior devices but focuses more on P2P communication and rigorously defines the API/schema/communication protocols for devices to be compatible. I think it's important to include a discussion on how behaviorMate compares to previous works like these, especially what new features behaviorMate introduces.
We believe that behaviorMate provides a more opinionated and complete solution than the projects mentioned. A wide variety of 1D navigational paradigms can be constructed in behaviorMate without the need to write any novel software. For example, bonsai is a “visual programming language” and would require experimenters to construct a custom implementation of each of their experiments. We have opted to use Java for the UI with distributed computations across modules in various languages. Given the IOT methodology it would be possible to use any number of programming languages or APIs; a large number of design decisions were made when building the project and we have opted to not include this level of detail in the manuscript in order to maintain readability. We strongly believe in using non-proprietary and open source projects, when possible, which is why the comparison with LabView-based solutions was included in the introduction. Also, we have added a reference to the autopilot project in the section of the introduction where this is discussed.
(6) One of the reasons labView/bonsai are popular is they are inherently parallel and can simultaneously respond to events from different hardware sources. While the JSON events in behaviorMate are asynchronous in nature, the handling of those events seems to happen only in a main event loop coupled with GUI, which is sequential by nature. Is there any multi-threading/multi-processing capability of behaviorMate? If so it's an important feature to highlight. If not I think it's important to discuss the potential limitation of the current implementation.
IOT solutions are inherently concurrent since the computation is distributed. Additional parallelism could be added by further distributing concerns between additional independent modules running on independent hardware. The UI has an event loop which aggregates inputs and then sequentially updates contexts based on the current state of those inputs. This sort of “snapshot” of the current state is necessary to reason about when to start certain contexts based on their settings and applied decorators. While the behaviorMate UI uses multithreading libraries in Java to be more performant in certain cases, the degree to which this represents true vs “virtual” concurrency would depend on the individual PC architecture it is run on and how the operating system allocates resources. For this reason, we have argued in the manuscript that behaviorMate is sufficient for controlling experiments at behaviorally relevant timescales, and have presented benchmarks and discussed different synchronization approaches so that users can determine if this is sufficient for their needs.
(7) The context list is an interesting and innovative approach to abstract behavior contingencies into a data structure, but it's not currently discussed in depth. I think it's worth highlighting how the context list can be used to cover a wide range of common behavior experimental contingencies with detailed examples (line 185 might be a good example to give). It's also important to discuss the limitation, as currently the context lists seem to only support contingencies based purely on space and time, without support for more complicated behavior metrics (e.g. deliver reward only after X% correct).
To access more complex behavior metrics during runtime, custom context list decorators would need to be implemented. While this is less common in the sort of 1D navigational behaviors the project was originally designed to control, adding novel decorators is a simple process that only requires basic object oriented programming knowledge. As discussed we are also implementing a plugin-architecture in the JavaFX update to streamline these types of additions.
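The kind of performance-gated contingency raised by the reviewer can be sketched with the decorator pattern as follows. This is a language-neutral illustration written in Python; the class names, method signature, and thresholds are hypothetical and do not correspond to the BasicContextList or ContextListDecorator classes in the behaviorMate Java source.

```python
class RewardContext:
    """Hypothetical base context: active whenever the animal is inside a zone."""

    def __init__(self, start_cm: float, end_cm: float):
        self.start_cm = start_cm
        self.end_cm = end_cm

    def is_active(self, position_cm: float, outcomes: list) -> bool:
        return self.start_cm <= position_cm <= self.end_cm


class MinPerformanceDecorator:
    """Hypothetical decorator: gate a wrapped context on recent fraction correct."""

    def __init__(self, wrapped, min_fraction: float, window: int):
        self.wrapped = wrapped
        self.min_fraction = min_fraction
        self.window = window

    def is_active(self, position_cm: float, outcomes: list) -> bool:
        recent = outcomes[-self.window:]
        if len(recent) < self.window:
            return False  # not enough history yet
        fraction = sum(recent) / len(recent)
        return fraction >= self.min_fraction and self.wrapped.is_active(
            position_cm, outcomes
        )


# Reward zone at 90-110 cm, available only after >= 80% correct over 5 outcomes.
gated = MinPerformanceDecorator(RewardContext(90.0, 110.0), min_fraction=0.8, window=5)
```

Because the decorator exposes the same method as the context it wraps, several such conditions could in principle be stacked without changing the wrapped context.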
Minor Comments:
(8) In line 202, the author suggests that a single TTL pulse is sent to mark the start of a recording session, and this is used to synchronize behavior data with imaging data later. In other words, there are no synchronization signals for every single sample/frame. This approach either assumes the behavior recording and imaging are running on the same clock or assumes evenly distributed recording samples over the whole recording period. Is this the case? If so, please include a discussion on limitations and alternative approaches supported by behaviorMate. If not, please clarify how exactly synchronization is done with one TTL pulse.
While the TTL pulse triggers the start of neural data acquisition in our setups, various options exist for controlling for the described clock drift across experiments, and the appropriate one depends on the type of recordings made, frame rate, duration of recording, etc. Therefore behaviorMate leaves open many options for synchronization at different time scales (e.g. adding a frame-sync circuit as shown in Figure 1 or sending TTL pulses to the same DAQ recording electrophysiology data). Expanded consideration of different synchronization methods has been included in the manuscript (see lines 224-238).
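One common post-hoc way to correct for clock drift, assuming periodic sync pulses are recorded on both the behavior clock and the acquisition clock, is to fit a linear map between the two sets of time stamps. The sketch below is an assumption-laden illustration of that general approach, not a method described in the manuscript; the function names and the sync times are made up.

```python
def fit_clock_map(behavior_times, daq_times):
    """Least-squares line mapping behavior-clock times onto the DAQ clock."""
    n = len(behavior_times)
    if n < 2 or n != len(daq_times):
        raise ValueError("need at least two paired sync times")
    mean_b = sum(behavior_times) / n
    mean_d = sum(daq_times) / n
    var_b = sum((b - mean_b) ** 2 for b in behavior_times)
    if var_b == 0:
        raise ValueError("sync times must not all be identical")
    cov = sum((b - mean_b) * (d - mean_d) for b, d in zip(behavior_times, daq_times))
    slope = cov / var_b
    offset = mean_d - slope * mean_b
    return slope, offset


def to_daq_time(t_behavior, slope, offset):
    """Convert a behavior-clock time stamp to the DAQ clock."""
    return slope * t_behavior + offset


# Made-up sync pulse times: the DAQ clock starts 2 s later and runs 0.1% fast.
behavior_sync = [0.0, 100.0, 200.0, 300.0]
daq_sync = [2.0, 102.1, 202.2, 302.3]
slope, offset = fit_clock_map(behavior_sync, daq_sync)
```

A single start pulse fixes only the offset; recovering the slope requires at least two shared events, which is why per-frame or periodic sync signals are useful for long recordings.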
(9) Is the computer vision-based calibration included as part of the GUI functionality? Please clarify. If it is part of the GUI, it's worth highlighting as a very useful feature.
The computer vision-based benchmarking is not included in the GUI. It is in the form of a script made specifically for this paper. However for treadmill-based experiments behaviorMate has other calibration tools built into it (see line 301-303).
(10) I went through the source code of the Arduino firmware, and it seems most "open X for Y duration" functions are implemented using the delay function. If this is indeed the case, it's generally a bad idea since delay completely pauses the execution and any events happening during the delay period may be missed. As an alternative, please consider approaches comparing timestamps or using interrupts.
We have avoided the use of interrupts on the GPIO due to the potential for unpredictable code execution. A blocking delay is executed only if the duration is 10 ms or less, as we cannot guarantee that the Arduino event loop cycles faster than this. Durations longer than 10 ms are time stamped and non-blocking. We have adjusted this MAX_WAIT to be specified as a macro so it can be more easily adjusted (or set to 0).
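The hybrid policy described above (a brief blocking wait for short pulses, a time-stamped deadline for longer ones) can be sketched as follows. This is a Python illustration of the logic only, with an injectable clock so it runs deterministically; the class and method names are hypothetical and the actual firmware is Arduino code.

```python
MAX_WAIT_MS = 10  # mirrors the role of the MAX_WAIT macro described above


class TimedOutput:
    """Illustrative output that opens for a duration without stalling the loop."""

    def __init__(self, sleep_ms, now_ms):
        self._sleep_ms = sleep_ms  # blocking wait, used only for short pulses
        self._now_ms = now_ms      # clock source, injectable for testing
        self.is_open = False
        self._close_at = None

    def open_for(self, duration_ms):
        self.is_open = True
        if duration_ms <= MAX_WAIT_MS:
            # Short pulse: block briefly, then close immediately.
            self._sleep_ms(duration_ms)
            self.is_open = False
        else:
            # Long pulse: record a deadline and return to the event loop.
            self._close_at = self._now_ms() + duration_ms

    def update(self):
        """Call once per loop iteration; closes the output once the deadline passes."""
        if self.is_open and self._close_at is not None and self._now_ms() >= self._close_at:
            self.is_open = False
            self._close_at = None


# Simulated clock so the example runs instantly and deterministically.
clock = {"t": 0}


def fake_sleep(ms):
    clock["t"] += ms


valve = TimedOutput(sleep_ms=fake_sleep, now_ms=lambda: clock["t"])
valve.open_for(5)            # blocking path
short_pulse_closed = not valve.is_open
valve.open_for(100)          # non-blocking path
clock["t"] += 50
valve.update()
still_open_midway = valve.is_open
clock["t"] += 50
valve.update()
closed_at_deadline = not valve.is_open
```

Setting the threshold to 0 in this sketch would route every pulse through the non-blocking path, at the cost of timing precision limited by how often `update` is called.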
(11) Figure 3 B, C, D, and Figure 4 D, E suffer from noticeable low resolution.
We have converted Figure 3B, C, D and 4C, D, E to vector graphics in order to improve the resolution.
(12) Figure 4C is missing, which is an important figure.
This figure appeared when we rendered and submitted the manuscript. We apologize if the figure was generated such that it did not load properly in all pdf viewers. The panel appears correctly in the online eLife version of the manuscript. Additionally, we have checked the revision in Preview on Mac OS as well as Adobe Acrobat and the built-in viewer in Chrome and all figure panels appear in each so we hope this issue has been resolved.
(13) There are thin white grid lines on all heatmaps. I don't think they are necessary.
The grid lines have been removed from the heatmaps as suggested.
(14) Line 562 "sometimes devices directly communicate with each other for performance reasons", I didn't find any elaboration on the P2P communication in the main text. This is potentially worth highlighting as it's one of the advantages of taking the IoT approaches.
In our implementation it was not necessary to rely on P2P communication beyond what is indicated in Figure 1. The direct communication referred to in line 562 is meant only to refer to the examples expanded on in the rest of the paragraph i.e. the behavior controller may signal the microscope directly using a TTL signal without looping back to the UI. As necessary users could implement UDP message passing between devices, but this is outside the scope of what we present in the manuscript.
(15) Line 147 "Notably, due to the systems modular architecture, different UIs could be implemented in any programming language and swapped in without impacting the rest of the system.", this claim feels unsupported without a detailed discussion of how new code can be incorporated in the GUI (plugin system).
This comment refers to the idea of implementing “different UIs”. This would entail users desiring to take advantage of the JSON messaging API and the proposed electronics while fully implementing their own interface. In order to facilitate this option we have improved documentation of the messaging API posted in the README file accompanying the arduino source code. We have added reference to the supplemental materials where readers can find a link to the JSON API implementation to clarify this point.
Additionally, while a plugin system is available in the JavaFX version of behaviorMate, this project is currently under development and we will update the online documentation as it matures; it is unrelated to the intended claim about completely swapping out the UI.
Reviewer #3 (Recommendations For The Authors):
(6) Figure 1 - the terminology for each item is slightly different in the text and the figure. I think making the exact match can make it easier for the reader.
- Real-time computer (figure) vs real-time controller (ln88).
The manuscript was adjusted to match figure terminology.
- The position controller (ln565) - position tracking (Figure).
We have updated Figure 1 to highlight that the position controller does the position tracking.
- Maybe add a Behavior Controller next to the GPIO box in Figure 1.
We updated Figure 1 to highlight that the Behavior Controller performs the GPIO responsibility such that "Behavior Controller" and "GPIO circuit" may be used interchangeably.
- Position tracking (fig) and position controller (subtitle - ln209).
We updated Figure 1 to highlight that the position controller does the position tracking.
- Sync Pulse is not explained in the text.
The caption for Figure 1 has been updated to better explain the Sync Pulse and additional systems boxes.
(7) For Figure 3B/C: What is the number of data points? It would be nice to see the real population, possibly using a swarm plot instead of box plots. How likely are these outliers to occur?
In order to better characterize the distributions presented in our benchmarking data we have added mean and standard deviation information to the plots in Figures 3 and 4. For Figure 3B: 0.0025 +/- 0.1128, Figure 3C: 12.9749 +/- 7.6581, Figure 4C: 66.0500 +/- 15.6994, Figure 4E: 4.1258 +/- 3.2558.
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Time periods in which experience regulates early plasticity in sensory circuits are well established, but the mechanisms that control these critical periods are poorly understood. In this manuscript, Leier and Foden and colleagues examine early-life critical periods that regulate the Drosophila antennal lobe, a model sensory circuit for understanding synaptic organization. Using early-life (0-2 days old) exposure to distinct odorants, they show that constant odor exposure markedly reduces the volume, synapse number, and function of the VM7 glomerulus. The authors offer evidence that these changes are mediated by invasion of ensheathing glia into the glomerulus where they phagocytose connections via a mechanism involving the engulfment receptor Draper.
This manuscript is a striking example of a study where the questions are interesting, the authors spent a considerable amount of time to clearly think out the best experiments to ask their questions in the most straightforward way, and expressed the results in a careful, cogent, and well-written fashion. It was a genuine delight to read this paper. I have two experimental suggestions that would really round out existing work to better support the existing conclusions and some instances where additional data or tempered language in describing results would better support their conclusions. Overall, though, this is an incredibly important finding, a careful analysis, and an excellent mechanistic advance in understanding sensory critical period biology.
We thank the reviewer for their thoughtful and constructive comments on our manuscript. In response to their critiques, we conducted several new experiments, performed additional analysis, and made changes to the text. As requested, we carried out an electrophysiological analysis of VM7 PN firing in draper knockdown animals with and without odor exposure. To our surprise, loss of glial Draper fully suppresses the dramatic reduction in spontaneous PN activity observed following critical period ethyl butyrate exposure, arguing that the functional response is restored alongside OSN morphology. It also suggests that the OR42a OSN terminals are intact and functional until they are phagocytosed by ensheathing glia. In other words, glia are not merely clearing axon terminals that have already degenerated. This evidence provides additional support to the claim that the VM7 glomerulus will be an outstanding model for defining mechanisms of experience-dependent glial pruning. Detailed responses to the reviewers’ comments follow below.
Regarding the apparent disconnect between the near-complete silencing of PNs and the 50% reduction in OR42a OSN infiltration volume, we agree with the reviewer that this tracks with previous data in the field. While our Imaris pipeline is relatively sensitive, it may not pick up modest changes to terminal arbor architecture. Indeed, as described in Jindal et al. (2023) and in the Methods in this manuscript, we chose conservative software settings that, if anything, would undercount the percent change in infiltration volume. We also note that increased inhibitory LN inputs onto PNs could contribute to the dramatic PN silencing we observe. While fascinating, we view LN plasticity as beyond the scope of the current manuscript. We removed any mention of ‘silent synapses’ and now speculate about increased inhibition.
Reviewer #1 (Recommendations For The Authors):
Major Elements:
(1) The authors demonstrate that loss of draper in glia can suppress many of the pruning related phenotypes associated with EB exposure. However, they do not assess electrophysiological output in these experiments, only morphology. It would be great to see recordings from those animals to see if the functional response is also restored.
We performed the experiment the reviewer requested (see Figure 4F-J). We are pleased to report that our recordings from VM7 PNs match our morphology measurements: in repo-GAL4>UAS-draper RNAi flies, there was no difference in the innervation of VM7 PNs between animals exposed to mineral oil or 15% EB from 0-2 DPE. This result is in sharp contrast to the near-total loss of OSN-PN innervation in flies with intact glial Draper signaling, and strongly validates the role we propose for Draper in the Or42a OSN critical period.
(2) There is a disconnect between physiology and morphology with a near complete loss of activity from VM7 PNs but a less severe loss of ORN synapses. While not completely incongruent (previous work in the AL showed a complete loss of attractive behavior though synapse number was only reduced 40% - Mosca et al. 2017, eLife), it is curious. Can the authors comment further? Ideally, some of these synapses could be visualized by EM to determine if the remaining synapses are indeed of correct morphology. If not, this could support their assertion of silent inputs from page 7. Further, what happens to the remaining synapses? VM7 PNs should be receiving some activity from other local interneurons as well as neighboring PNs.
We agree that on the surface, our electrophysiology results are more striking than one might expect solely from our measurements of VM7 morphology and presynaptic content. As the reviewer points out, previous studies of fly olfaction have consistently found that relatively modest shifts in glomerular volume in response to prolonged early-life odorant exposure can be accompanied by drastic changes in physiology and behavior (in addition, we would add Devaud et al., 2003; Devaud et al., 2001; Acebes et al., 2012; and Chodankar et al., 2020, as foundational examples of this phenomenon).
A major driver of these changes appears to be remodeling of antennal lobe inhibitory LNs (see Das et al., 2011; Wilson and Laurent, 2005; Chodankar et al., 2020), especially GABAergic inhibitory interneurons. Perhaps increased LN inhibition of chronically activated PNs, on top of the reduced excitatory inputs resulting from ensheathing glial pruning of the Or42a OSN terminal arbor, would explain the near-total loss of VM7 PN activity we observe after critical period EB exposure. However, given that the scope of our study is limited to critical-period glial biology and does not address the complex topics of LN rewiring or synapse morphology, we have removed the sentence in which we raise the possibility of “silent synapses” in order to avoid confusion. The reviewer is also correct that VM7 PNs have inputs from non-ORN presynaptic partners, including LNs and PNs. So again, perhaps increased inhibitory inputs contribute to the near-complete silencing of the PNs. Given the heterogeneity of LN populations, we view this area as fertile ground for future research.
Language / Data Considerations:
(1) Or42a OSNs have other inputs, namely, from LNs. What are they doing here? Are they also affected?
As discussed above, the question of how LN innervation of Or42a OSNs is altered by critical-period EB exposure is an intriguing one that fully deserves its own follow-up study, and we have tried to avoid speculation about the role of LNs when discussing our pruning phenotype. We note at multiple points throughout the text the importance of LNs and refer to previous studies of LN plasticity in response to chronic odorant exposure.
(2) In all of the measurements, what happens to synaptic density? Is it maintained? Does it scale precisely? This would be helpful to know.
We have performed the analysis as requested, which is now included in a supplement to Figure 5. We found that synaptic density shows no trend in variation across conditions and glial driver genotypes.
(3) In Figure 5, the controls for the alrm-GAL4 experiments show a much more drastic phenotype than controls in previous figures? Does this background influence how we can interpret the results? Could the response have instead hit a floor effect and it's just not possible to recover?
The reviewer is correct that following EB exposure, astrocyte vs. ensheathing glial driver backgrounds displayed modest differences in the extent of pruning by volume (0.27 for astrocytes, 0.36 for EG). We note that the two drpr RNAi lines that we used had non-significant (but opposite) effects on the estimated Or42a OSN volume in combination with the astrocyte driver, arguing against a floor effect. In addition, a recent publication by Nelson et al. (2024) replicated our findings with a different astrocyte GAL4 driver and draper RNAi line. Thus, we are confident that this result is biologically meaningful and not an artifact of genetic background.
(4) The estimation of infiltration measurement in Figure 6 is tricky to interpret. It implies that the projections occupy the same space, which cannot be possible. I'd advocate a tempering of some of this language and consider an intensity measurement in addition to their current volume measurements (or perhaps an "occupied space" measurement) to more accurately assess the level of resolution that can be obtained via these methods.
We completely agree that our language in describing EG infiltration could have been more precise, and we modified our language as suggested. The Or42a-mCD8::GFP label we and others use, our use of confocal microscopy, and our Surface pipeline in Imaris together create a glomerular mask that traces the outline of the OSN terminal arbor, but is nonetheless not 100% “filled” by neuronal membrane and/or glial processes.
(5) Do the authors have the kind of resolution needed to tell whether there is indeed Or42a-positive axon fragmentation (as asserted on p16 and from their data in figures 4, 5, 7). If the authors want to say this, I would advocate for a measurement of fragmentation / total volume to prove it - if not, I would advocate tempering of the current language.
The reviewer brings up a fair criticism: while our assertion about axon fragmentation was based on our visual observations of hundreds of EB-exposed brains, the resolution limits of confocal microscopy do not allow us to rigorously rule out fragmentation within a bundle of OSN axons. Instead, our most compelling evidence for the lack of EB-induced Or42a OSN fragmentation in the absence of glial Draper comes from our new electrophysiology data (Figure 4F-J) in repo-GAL4>UAS-draper RNAi animals. We found no difference in spontaneous release from Or42a terminals in flies exposed to mineral oil or 15% EB from 0-2 DPE, which would not be the case if there were Draper-independent fragmentation along the axons or terminal arbors upon EB exposure. We have updated our discussion of fragmentation so that our statements are based on this new evidence, and not confocal microscopy.
(6) There is an interesting Discussion opportunity missed here. Some experiments would, ostensibly, require pupae to detect odorants within the casing via structures consistently in place for olfaction during pupation. It would be useful for the authors to discuss a little more deeply when this critical period may arise and why the experiment where pupae are exposed to EB two days before eclosion and there is no response, occurs as it does. I agree that it's clearly a time when they are not sensitive to the odorant, but that could just be because there's no ability to detect odorants at that time. Is it a question of non-sensitivity to EB or just non-sensitivity to everything?
We share the reviewer’s interest in the plasticity of the olfactory circuit during pupariation, although, as they correctly point out, it is difficult to conceive of an odorant-exposure experiment that could disentangle the barrier effects of the puparium from the sensitivity of the circuit itself, and our pre-eclosion data in Figure 3A, D, G do not distinguish between the two. While an investigation into the mechanism by which the critical period for ethyl butyrate exposure opens and closes is outside the scope of the present study, we would consider the physical barrier of the puparium to be a satisfactory explanation for why eclosion marks the functional opening of experience-dependent plasticity. As the reviewer suggests, we have added this important nuance to our discussion of the opening of the critical period in the corresponding paragraph of the Results, as well as to the Discussion section “Glomeruli exhibit dichotomous responses to critical period odor exposure.”
Minor Elements:
(1) Page 6 bottom: "Or4a-mCD8::GFP" should be "Or42a-mCD8::GFP"
(2) Page 15, end of last full paragraph. Remove the "e"
Thank you for pointing out these typos. They have been corrected.
Reviewer #2 (Public Review):
Sensory experiences during developmental critical periods have long-lasting impacts on neural circuit function and behavior. However, the underlying molecular and cellular mechanisms that drive these enduring changes are not fully understood. In Drosophila, the antennal lobe is composed of synapses between olfactory sensory neurons (OSNs) and projection neurons (PNs), arranged into distinct glomeruli. Many of these glomeruli show structural plasticity in response to early-life odor exposure, reflecting the sensitivity of the olfactory circuitry to early sensory experiences.
In their study, the authors explored the role of glia in the development of the antennal lobe in young adult flies, proposing that glial cells might also play a role in experience-dependent plasticity. They identified a critical period during which both structural and functional plasticity of OSN-PN synapses occur within the ethyl butyrate (EB)-responsive VM7 glomerulus. When flies were exposed to EB within the first two days post-eclosion, significant reductions in glomerular volume, presynaptic terminal numbers, and postsynaptic activity were observed. The study further highlights the importance of the highly conserved engulfment receptor Draper in facilitating this critical period plasticity. The authors demonstrated that, in response to EB exposure during this developmental window, ensheathing glia increase Draper expression, infiltrate the VM7 glomerulus, and actively phagocytose OSN presynaptic terminals. This synapse pruning has lasting effects on circuit function, leading to persistent decreases in both OSN-PN synapse numbers and spontaneous PN activity as analyzed by perforated patch-clamp electrophysiology to record spontaneous activity from PNs postsynaptic to Or42a OSNs.
In my view, this is an intriguing and potentially valuable set of data. However, since I am not an expert in critical periods or habituation, I do not feel entirely qualified to assess the full significance or the novelty of their findings, particularly in relation to existing research.
We thank the reviewer for their insightful critique of our work. In response to their comments, we added additional physiological analysis and tempered our language around possible explanations for the apparent disconnect between the physiological and morphological effects of critical period odor exposure. These changes are explained in more detail in the response to the public review by Reviewer 1 and also in our responses outlined below.
Reviewer #2 (Recommendations For The Authors):
I do, though, have specific comments and questions concerning the presynaptic phenotype they deduce from confocal BRP stainings and electrophysiology.
Concerning the number of active zones: this can hardly be deduced from standard-resolution confocal images that, maybe more importantly, lack postsynaptic markers. This is particularly relevant in light of their speculation about "silent synapses". Tools now exist for labeled, cell-type-specific expression of acetylcholine receptors and of cholinergic postsynaptic density markers (importantly Drep2). Such markers should be included in their analysis. They should also refer to previous work on "brp-short" concerning its original invention and prior usage.
We thank the reviewer for their thoughtful approach to our methodology and claims. While the use of confocal microscopy of Bruchpilot puncta to estimate numbers of presynapses is standard practice (see Furusawa et al., 2023; Aimino et al., 2022; Urwyler et al., 2019; Ackerman et al., 2021), the reviewer is correct that a punctum does not an active zone make. Bruchpilot staining and quantification is a well-validated tool for approximating the number of presynaptic active zones, not a substitute for super-resolution microscopy. We made changes to our language about active zones to make this distinction clearer. We have also removed the sentence where we discuss the possibility of “silent synapses,” which both reviewers felt was too speculative for our existing data. Finally, we are highly interested in characterizing the response of PNs and higher-order processing centers to critical-period odorant exposure as a future direction for our research. However, given the complexity of the subject, we chose to limit the scope of this study to the interactions between OSNs and glia.
Regarding their electrophysiological analysis and the plausibility of their findings: I am uncertain whether the moderate reduction in BRP puncta at the relevant OSN::PN synapse can fully account for the significantly reduced spontaneous PN activity they report. This seems particularly doubtful in the absence of any direct evidence for postsynaptically silent synapses. Perhaps this is my own naivety, but I wonder why they did not use antennal nerve stimulation in their experiments?
We refer to previous studies of the AL indicating that moderate changes in glomerular volume and presynaptic content can translate to far more striking alterations in electrophysiology and behavior (Devaud et al., 2003; Devaud et al., 2001; Acebes et al., 2012; Chodankar et al., 2020; and Mosca et al., 2017). This literature has demonstrated that chronic odorant exposure can result in remodeling of inhibitory local interneurons to suppress over-active inputs from OSNs. While we do not address the complex subject of interneuron remodeling in the present study, we find it highly likely that there would be significant changes in interneuron innervation of PNs, independent of glial phagocytosis of OSN excitatory inputs, resulting in additional inhibition. Moving forward, we are very interested in expanding these studies to include odor-evoked changes in PN activity.
Additional minor point: The phrase "Soon after its molecular biology was described (et al., 1999), the Drosophila melanogaster" seems somewhat misleading. Isn't the field still actively describing the molecular biology of the fly olfactory system?
We completely agree and have removed this sentence entirely.
Reviewing Editor's Note: to enhance the evidence from mostly compelling in most facets to solid would be to add physiology to the Draper analysis.
These experiments have been completed and are presented in Figure 4F-J.
References
Acebes A, Devaud J-M, Arnés M, Ferrús A. 2012. Central Adaptation to Odorants Depends on PI3K Levels in Local Interneurons of the Antennal Lobe. J Neurosci 32:417–422. doi:10.1523/jneurosci.2921-11.2012
Ackerman SD, Perez-Catalan NA, Freeman MR, Doe CQ. 2021. Astrocytes close a motor circuit critical period. Nature 592:414–420. doi:10.1038/s41586-021-03441-2
Aimino MA, DePew AT, Restrepo L, Mosca TJ. 2022. Synaptic Development in Diverse Olfactory Neuron Classes Uses Distinct Temporal and Activity-Related Programs. J Neurosci 43:28–55. doi:10.1523/jneurosci.0884-22.2022
Chodankar A, Sadanandappa MK, VijayRaghavan K, Ramaswami M. 2020. Glomerulus-Selective Regulation of a Critical Period for Interneuron Plasticity in the Drosophila Antennal Lobe. J Neurosci 40:5549–5560. doi:10.1523/jneurosci.2192-19.2020
Das S, Sadanandappa MK, Dervan A, Larkin A, Lee JA, Sudhakaran IP, Priya R, Heidari R, Holohan EE, Pimentel A, Gandhi A, Ito K, Sanyal S, Wang JW, Rodrigues V, Ramaswami M. 2011. Plasticity of local GABAergic interneurons drives olfactory habituation. Proc Natl Acad Sci 108:E646–E654. doi:10.1073/pnas.1106411108
Devaud J, Acebes A, Ramaswami M, Ferrús A. 2003. Structural and functional changes in the olfactory pathway of adult Drosophila take place at a critical age. J Neurobiol 56:13–23. doi:10.1002/neu.10215
Devaud J-M, Acebes A, Ferrús A. 2001. Odor Exposure Causes Central Adaptation and Morphological Changes in Selected Olfactory Glomeruli in Drosophila. J Neurosci 21:6274–6282. doi:10.1523/jneurosci.21-16-06274.2001
Furusawa K, Ishii K, Tsuji M, Tokumitsu N, Hasegawa E, Emoto K. 2023. Presynaptic Ube3a E3 ligase promotes synapse elimination through down-regulation of BMP signaling. Science 381:1197–1205. doi:10.1126/science.ade8978
Mosca TJ, Luginbuhl DJ, Wang IE, Luo L. 2017. Presynaptic LRP4 promotes synapse number and function of excitatory CNS neurons. eLife 6:e27347. doi:10.7554/elife.27347
Nelson N, Vita DJ, Broadie K. 2024. Experience-dependent glial pruning of synaptic glomeruli during the critical period. Sci Rep 14:9110. doi:10.1038/s41598-024-59942-3
Urwyler O, Izadifar A, Vandenbogaerde S, Sachse S, Misbaer A, Schmucker D. 2019. Branch-restricted localization of phosphatase Prl-1 specifies axonal synaptogenesis domains. Science 364. doi:10.1126/science.aau9952
Wilson RI, Laurent G. 2005. Role of GABAergic Inhibition in Shaping Odor-Evoked Spatiotemporal Patterns in the Drosophila Antennal Lobe. J Neurosci 25:9069–9079. doi:10.1523/jneurosci.2070-05.2005
-
-
-
Author response:
We thank the reviewers and the editor for the detailed and constructive feedback provided. We look forward to submitting a revised version of the manuscript that addresses their comments. We acknowledge that further clarification is needed about the novelty brought by our experimental setup and model in comparison to previous studies using different methodologies. We also acknowledge that more details can be included about the calibration steps and sensitivity of the model parameters. Below we detail the planned changes for the revised version regarding the points raised by the reviewers.
Reviewer #1 (Public review):
- The authors then claim that the fragmentation of aggregates due to fluid flows occurs through erosion of small pieces. Because their experimental setup does not allow them to explicitly observe this process (for example, by watching one aggregate break into pieces), they implement an idealized model to show that the nature of the changes to the size histogram agrees with an erosion process. However, in Figure 2C there is a noticeable gap between their experiment and the prediction of their model. Additionally, in a similar experiment shown in Figure S6, the experiment cannot distinguish between an idealized erosion model and an alternative, an idealized binary fission model where aggregates split into equal halves. For these reasons, this claim is weakened.
The two idealized models of fragment distribution, namely erosion and binary fission, lead to distinguishable final size distributions. We believe that our experiments support the hypothesis of the erosion mechanism. Please note that Figure 2 is concerned with the fragmentation of large colonies, whereas Figure 3 and associated Figure S6 are concerned with very small colonies of a few cells formed by aggregation of single-cell suspension. Indeed, for very small colonies of a few cells, our experimental results cannot distinguish between a binary fission model and an erosion model (Figure S6).
The situation is very different for large colonies. To address the reviewer’s concern, we will add a new figure in the Supplementary Information (SI), similar to our Figure 2C, where we will compare the erosion model with a binary fission model for large colonies fragmented under ε = 5.8 m²/s³. We already did this exercise. The results in this new supplementary figure will show that the idealized binary fission model (i.e., where every fracture event produces exactly two fragments) does not capture the experimental fragmentation behaviour of large colonies. In contrast, the idealized erosion model provides a much better prediction of the experimental results, within the experimental uncertainty and variability in colony strength, and has the notable advantage of a straightforward computational implementation.
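The contrast between the two idealized fragmentation models can be illustrated with a toy calculation. The sketch below is not the model used in the manuscript: it assumes a single hypothetical stability threshold (`threshold`, in cells per colony) above which a colony keeps breaking, and the function names `binary_fission` and `erosion` are illustrative only. Under these assumptions, fission yields a few mid-sized fragments, whereas erosion yields one colony at the threshold size plus many very small pieces, which is why the final size histograms of large colonies are distinguishable.

```python
def binary_fission(sizes, threshold):
    """Split every colony above `threshold` into two (near-)equal halves
    until no colony exceeds the threshold. Sizes are cell counts."""
    fragments = []
    pending = list(sizes)
    while pending:
        n = pending.pop()
        if n > threshold and n >= 2:
            half = n // 2
            pending.extend([half, n - half])  # exactly two fragments per event
        else:
            fragments.append(n)
    return sorted(fragments)


def erosion(sizes, threshold, piece=1):
    """Peel small pieces of `piece` cells off each colony above `threshold`
    until the remaining core no longer exceeds the threshold."""
    fragments = []
    for n in sizes:
        while n > threshold and n > piece:
            fragments.append(piece)  # eroded small fragment
            n -= piece
        fragments.append(n)          # surviving core
    return sorted(fragments)


if __name__ == "__main__":
    # Hypothetical input: one colony of 100 cells, stability threshold of 30 cells.
    print("fission:", binary_fission([100], threshold=30))
    print("erosion:", erosion([100], threshold=30))
```

Both rules conserve the total number of cells, so any difference between the two outputs reflects only how the cells are redistributed across fragment sizes.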
- The fourth major result of the manuscript is displayed in Equation 8 and Figure 5, where the authors derive an expression for the ratio between the rate of increase of a colony due to aggregation vs. the rate due to cell division. They then plot this line on a phase map, altering two physical parameters (concentration and fluid turbulence) to show under what conditions aggregation vs. cell division are more important for group formation. Because these results are derived from relatively simple biophysical considerations, they have the potential to be quite powerful and useful and represent a significant conceptual advance. However, there is a region of this phase map that the authors have left untested experimentally. The lowest energy dissipation rate that the authors tested in their experiment seemed to be ε ≈ 10⁻² m²/s³, and the highest particle concentration they tested was 5 × 10⁻⁴, which means that the authors never tested Zone II of their phase map. Since this seems to be an important zone for toxic blooms (i.e. the "scum formation" zone), it seems the authors have missed an important opportunity to investigate this regime of high particle concentrations and relatively weak turbulent mixing.
We agree with the reviewer that Zone (II) of Figure 5 is of great importance to dense bloom formation under wind mixing and that this parameter range was not covered by our experiments using a cone-and-plate shear flow. The measuring range of our device was motivated by engineering applications such as artificial mixing of eutrophic lakes using bubble plumes, as well as preliminary experiments which demonstrated that high levels of dissipation rate were required to achieve fragmentation. The dissipation rates of our cone-and-plate experiments capture Zones (III) and (IV) and the higher end of Zone (I). However, the cone-and-plate experiments are less suitable for the lower dissipation rates of Zone (II), as indicated by the red bars in Figure 5, due to the accumulation of colonies in stagnation points.
Instead, in our revision we will more extensively discuss recent results published in the literature for evidence of aggregation dominance in Zone (II). The experimental studies of Wu et al. (2019) and Wu et al. (2024) (full citations below) investigated the formation of Microcystis surface scum layers at high colony concentrations (high biovolume fraction) in wind-mixed mesocosms. These studies identified aggregation of colonies at rates faster than cell division, while the stable colony size decreased with mixing rate. The parameter range of these studies falls within Zone (II), and their experimental results agree with our model predictions. We will include in the revised version these references and a detailed discussion elucidating the parameter range covered in our experiments and the findings of other studies.
Wu, X., Noss, C., Liu, L., & Lorke, A. (2019). Effects of small-scale turbulence at the air-water interface on Microcystis surface scum formation. Water Research, 167, 115091.
Wu, H., Wu, X., Rovelli, L., & Lorke, A. (2024). Dynamics of Microcystis surface scum formation under different wind conditions: the role of hydrodynamic processes at the air-water interface. Frontiers in Plant Science, 15, 1370874.
Other items that could use more clarity:
- The authors rely heavily on size distributions to make the claims of their paper. Yet, how they generated those size distributions is not clearly shown in the text. Of primary concern, the authors used a correction function (Equation S1) to estimate the counts of different size classes in their image analysis pipeline. Yet, it is unclear how well this correction function actually performs, what kinds of errors it might produce, and how well it mapped to the calibration dataset the authors used to find the fit parameters.
We agree with the reviewer that more details of the calibration processes should be included. We will include in the revised version of the SI more details of the calibration steps and direct comparison of raw and corrected histograms of the size distribution and its associated uncertainty.
- Second, in their models they use a fractal dimension to estimate the number of cells in the group from the group radius, but the agreement between this fractal dimension fit and the data is not shown, so it is not clear how good an approximation this fractal dimension provides. This is especially important for their later derivation of the "aggregation-to-cell division" ratio (Equation 8)
We agree with the reviewer that more details on the estimation of fractal dimension are needed. The revised version of the SI will include the estimation procedure, the number of colonies analysed, and the associated uncertainty.
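As an illustration of what such an estimation procedure might look like, the sketch below fits a fractal dimension by ordinary least squares on log-transformed data. It assumes the common power-law scaling N = k · R^D between the number of cells N and the colony radius R; this form, the prefactor k, and the function names are assumptions made here for illustration and are not taken from the manuscript. The data in the usage example are synthetic.

```python
import math


def fit_fractal_dimension(radii, cell_counts):
    """Least-squares fit of log(N) = log(k) + D * log(R).

    Returns (D, k): the fitted fractal dimension and prefactor.
    """
    xs = [math.log(r) for r in radii]
    ys = [math.log(n) for n in cell_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    d_f = sxy / sxx                       # slope of the log-log line
    k = math.exp(mean_y - d_f * mean_x)   # intercept, back-transformed
    return d_f, k


def cells_from_radius(radius, d_f, k):
    """Estimate the number of cells in a colony of the given radius."""
    return k * radius ** d_f


if __name__ == "__main__":
    # Synthetic colonies generated with D = 2.3 and k = 0.5 (hypothetical values).
    radii = [5.0, 10.0, 20.0, 40.0]
    counts = [0.5 * r ** 2.3 for r in radii]
    d_f, k = fit_fractal_dimension(radii, counts)
    print(f"fitted D = {d_f:.3f}, k = {k:.3f}")
```

Plotting the measured points against the fitted line on log-log axes, together with the residual scatter, is one simple way to show how well a single fractal dimension describes the data.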
Reviewer #2 (Public review)
- Especially the introduction seems to imply that shear force is a very important parameter controlling colony formation. However, if one looks at the results this effect is overall rather modest, especially considering the shear forces that these bacterial colonies may experience in lakes. The main conclusion seems that not shear but bacterial adhesion is the most important factor in determining colony size. As the importance of adhesion had been described elsewhere, it is not clear what this study reveals about cyanobacterial colonies that was not known before.
As we explain in the Introduction, it is a major open question whether cyanobacterial colonies are formed mainly by cell division (after which the dividing cells remain attached to each other by the EPS layer) or mainly by the aggregation of independent cells and colonies. See for example the highly cited review of Xiao & Reynolds 2018 (our ref 17), and references therein. This question has not been resolved and is investigated in our study. We would like to emphasize several key findings that our study reveals about the mechanical behaviour of cyanobacterial colonies under flow:
(i) Quantification of mechanical strength in cyanobacterial colonies: Our results demonstrate the high mechanical strength of cyanobacterial colonies (much higher than previously thought in references 32 and 39 of the manuscript), as evidenced by the requirement of very high shear rates to achieve fragmentation. To this end, our study highlights their resilience against naturally occurring flows and bridges the gap between theoretical assumptions about colony strength and experimentally measured mechanical properties.
(ii) Validation of a hypothesis regarding colony formation: Using a fluid-mechanical approach, we confirm the findings of recent genetic studies (references 25 and 64 of the manuscript) which indicated that colony formation of cyanobacteria under natural conditions occurs predominantly via cell division rather than via the aggregation of individual cells. Only in very dense blooms and surface scums, colony formation by the aggregation of smaller colonies likely plays a role.
(iii) Practical guidelines for cyanobacterial bloom control: Our findings provide valuable insights into the design of artificial mixing systems that are used to suppress surface blooms of buoyant cyanobacteria in lakes. In these lake applications, in which we have been involved, the aim of the mixing is to disperse the colonies over the water column so that they cannot form a surface layer (i.e., the mixing intensity should overcome the flotation velocity of the colonies), which takes away the competitive advantage of buoyant cyanobacteria over nonbuoyant phytoplankton species. However, it has always been an open question whether the high shear of artificial mixing would cause colony fragmentation. An understanding of changes in colony size is relevant for the design of artificial mixing, because smaller colonies have a lower flotation velocity. Our results show that the dissipation rates that are generated by artificial mixing are sufficient to prevent aggregation of large colonies, but not high enough to induce fragmentation of division-formed colonies.
In the revised version of the manuscript, we will improve the writing to better clarify these three novel insights obtained from our study.
- The agreement between model and experiments is impressive, but the role of the fit parameters in achieving this agreement needs to be further clarified.
The influence of the fit parameters (namely the stickiness α1 and the pairs of colony strength parameters S1, q1 and S2, q2) is discussed in the sections “DYNAMICAL CHANGES IN COLONY SIZE MODELED BY A TWO-CATEGORY DISTRIBUTION” and “MATERIALS AND METHODS.” We kept the discussion concise to maintain readability. However, we agree with the reviewer that additional details about the importance of the fit parameters and the sensitivity of the results to these parameters could be beneficial. In the revised version of the SI, we will include a more detailed discussion of the fit parameters.
- The article may not be very accessible for readers with a biology background. Overall, the presentation of the material can be improved by better describing their new method.
We apologize for the limited readability of the description of the experimental setup and model used. In the revised version of the manuscript, we aim to expand the description of the new methods presented here for a broader biology audience.
www.biorxiv.org
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Summary:
The goal of this project is to test the hypothesis that individual differences in experience with multiple languages relate to differences in brain structure, specifically in the transverse temporal gyrus. The approach used here is to focus specifically on the phonological inventories of these languages, looking at the overall size of the phonological inventory as well as the acoustic and articulatory diversity of the cumulative phonological inventory in people who speak one or more languages. The authors find that the thickness of the transverse temporal gyrus (either the primary TTG, in those with one TTG, or in the second TTG, in people with multiple gyri) was related to language experience, and that accounting for the phonological diversity of those languages improved the model fit. Taken together, the evidence suggests that learning more phonemes (which is more likely if one speaks more than one language) leads to experience-related plasticity in brain regions implicated in early auditory processing.
Strengths:
This project is rigorous in its approach--not only using a large sample, but replicating the primary finding in a smaller, independent sample. Language diversity is difficult to quantify, and likely to be qualitatively and quantitatively distinct across different populations, and the authors use a custom measure of multilingualism (accounting for both number of languages as well as age of acquisition) and three measures of phonological diversity. The team has been careful in discussion of these findings, and while it is possible that pre-existing differences in brain structure could lead to an aptitude difference which could drive one to learn more than one language, the fine-grained relationships with phonological diversity seem less likely to emerge from aptitude rather than experience.
Weaknesses:
It is a bit unclear how the measures of phonological diversity relate to one another--they are partially separable, but rest on the same underlying data (the phonemes in each language). It would be helpful for the reader to understand how these measures are distributed (perhaps in a new figure), and the degree to which they are correlated with one another.
Thank you for the comment. Indeed, our description missed this important detail, which we have now included in the manuscript. Unsurprisingly, the distances all correlated with one another, which we present in Table 2 in Section 2.3 of the revised manuscript. We have also added a figure with distributions of the three distance measures (see Figure S3).
Further, as the authors acknowledge, it is always possible that an unseen factor instead drives these findings--if typological lexical distance measures are available, it would be helpful to enter these into the model to confirm that phonological factors are the specific driver of TTG differences and not language diversity in a more general sense. That said, the relationship between phonological diversity and TTG structure is intuitive.
Thank you for the suggestion. To further establish that our results reflected the relationship between TTG structure and phonological diversity specifically (as opposed to language diversity in a more general sense), we derived a fourth measure of language experience, where the AoA index of different languages was weighted by lexical distances between the languages. Here, we followed the methodology described in Kepinska, Caballero, et al. (2023): We used Levenshtein Distance Normalized Divided (LDND) (Wichmann et al., 2010) which was computed using the ASJP.R program by Wichmann (https://github.com/Sokiwi/InteractiveASJP01). Information on lexical distances was combined with language experience information per participant using Rao's quadratic entropy equation in the same way as for the phonological measures.
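As an illustration of how pairwise distances and per-language experience can be combined with Rao's quadratic entropy, the following is a minimal sketch rather than the authors' implementation. It assumes the double-sum form Q = Σ_i Σ_j d_ij · p_i · p_j, where `shares` stands for a participant's (hypothetical) relative experience with each language and `distances` for a symmetric matrix of pairwise distances; the function name and the example values are invented for illustration.

```python
from typing import Sequence


def rao_quadratic_entropy(shares: Sequence[float],
                          distances: Sequence[Sequence[float]]) -> float:
    """Rao's Q = sum_i sum_j d_ij * p_i * p_j over all ordered pairs (i, j)."""
    n = len(shares)
    if len(distances) != n or any(len(row) != n for row in distances):
        raise ValueError("distance matrix must be n x n")
    total = sum(shares)
    if total <= 0:
        raise ValueError("shares must sum to a positive value")
    p = [s / total for s in shares]  # normalise so the shares sum to 1
    return sum(distances[i][j] * p[i] * p[j]
               for i in range(n) for j in range(n))


# Hypothetical participant: two languages with equal experience shares and
# an illustrative pairwise distance of 0.8 (not a real LDND value).
q_bilingual = rao_quadratic_entropy([0.5, 0.5], [[0.0, 0.8], [0.8, 0.0]])

# Hypothetical monolingual participant: a single language, so Q is zero.
q_monolingual = rao_quadratic_entropy([1.0], [[0.0]])
```

With this form, a participant with one language always scores zero, and the score grows both with how evenly experience is spread and with how distant the languages are.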
We then entered this language experience measure accounting for lexical distances between the languages into linear models predicting the thickness of the second left and right TTG (controlling for participants’ age, sex and mean hemispheric thickness) in the main sample, and compared these models with the corresponding models including the original three phonological distance measures (models 2–4 in Author response table 1), and the measure with no typological information (1).
Below, we list adjusted R² values of all models, from which it is clear that the index of multilingual language experience accounting for lexical distances between languages (5) explained less variance than the index incorporating phoneme-level distances between languages (2), both in the left and the right hemisphere. This further strengthens our conclusion that our results reflected the relationship between TTG structure and phonological diversity specifically, as opposed to language diversity in a more general sense.
Author response table 1.
We have added a description of this analysis to the manuscript, Section 3.3, lines 357-370.
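For readers less familiar with the comparison metric, adjusted R² penalises the ordinary R² for the number of predictors, which is what makes models with the same covariates but different language-experience indices comparable. The sketch below uses the standard formula; the R² inputs are hypothetical placeholders and not values from Author response table 1, while n = 130 and four predictors follow the left-hemisphere model described in this response.

```python
def adjusted_r2(r2: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof


# Hypothetical raw R^2 values for two competing models (illustration only).
# Both models have 4 predictors: age, sex, mean hemispheric thickness and
# one language-experience index.
adj_phoneme_model = adjusted_r2(0.10, n_obs=130, n_predictors=4)
adj_lexical_model = adjusted_r2(0.08, n_obs=130, n_predictors=4)

# The model with the higher adjusted R^2 explains more variance after
# accounting for model size.
better_model = "phoneme" if adj_phoneme_model > adj_lexical_model else "lexical"
```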
One curious aspect of this paper relates to the much higher prevalence of split or duplicate TTG in the sample. The authors do a good job speculating on how features of the TASH package might lead to this, but it is unclear where the ground truth lies--some discussion of validation of TASH against a gold standard would be useful.
The validation of the TASH toolbox in comparison to gold standard manual measurement involved assessing how well the measurements of left and right Heschl's gyrus (HG) volumes obtained using the TASH method correlated with those obtained through manual labeling (see Dalboni da Rocha et al., 2020 for details). This validation process was conducted across three independent datasets. Additionally, for comparison, the manually labeled HG volumes were also compared with those obtained using FreeSurfer's Destrieux parcellation of the transverse temporal gyrus in the same datasets. The validation process, therefore, involved rigorous comparisons of HG volumes obtained through manual labeling, FreeSurfer, and TASH across different datasets, along with an assessment of inter-rater reliability for the manual labeling procedure. This comprehensive approach ensures that the results are robust and reliable. TASH_complete, the version used in the present work, is an extension of the extensively validated TASH, which apart from the first gyrus, also identifies additional transverse temporal gyri (i.e. Heschl’s gyrus duplications and multiplications) situated in the PT, when present. Since work on the correspondence between manually and automatically identified TTG multiplications is still ongoing, as outlined in the Methods section, we complemented the automatic segmentation by extensive visual assessment of the identified posterior gyri. This process involved removing from the analysis those gyri that lay along the portion of the superior temporal plane that curved vertically (i.e., within the parietal extension, Honeycutt et al., 2000), when present.
Given that TASH_complete and TASH operate on the same principles and are both based on FreeSurfer’s surface reconstruction and cortical parcellation (which have been extensively validated against manual tracing and other imaging modalities, showing good accuracy), and since we have visually inspected all segmentations, we are confident as to the accuracy of the reported TTG variability. It has to be further noted that the prevalence of TTG multiplications beyond 2nd full posterior duplications was not systematically assessed in previous descriptive reports (Marie, 2015). However, we acknowledge that more work is needed to further ascertain anatomical accuracy of the segmentations, and we elaborate on this point in the Discussion of the revised manuscript (lines 621-623).
Reviewer #2 (Public Review):
This work investigates the possible association between language experience and morphology of the superior temporal cortex, a part of the brain responsible for the processing of auditory stimuli. Previous studies have found associations between language and music proficiency as well as language learning aptitude and cortical morphometric measures in regions in the primary and associated auditory cortex. These studies have most often, however, focused on finding neuroanatomical effects of difference between features in a few (often two) languages or from learning single phonetic/phonological features and have often been limited in terms of N. On this background, the authors use more sophisticated measures of language experience that take into account the age of onset and the differences in phonology between languages the subjects have been exposed to as well as a larger number of subjects (N = 146 + 69) to relate language experience to the shape and structure of the superior temporal cortex, measured from T1-weighted MRI data. It shows solid evidence for there being a negative relationship between language experience and the right 2nd transverse temporal gyrus as well as some evidence for the relationship representing phoneme-level cross-linguistic information.
Strengths
The use of entropy measures to quantify language experience and include typological distance measures allows for a more general interpretation of the results and is an important step toward respecting and making use of linguistic diversity in neurolinguistic experiments.
A relatively large group of subjects with a range of linguistic backgrounds.
The full analysis of the structure of the superior temporal cortex including cortical volume, area, as well as the shape of the transverse gyrus/gyri. There is a growing literature on the meaning of the shape and number of the transverse gyri in relation to language proficiency and the authors explore all measures given the available data.
The authors chose to use a replication data set to verify their data, which is applaudable. However, see the relevant point under "Weaknesses".
Weaknesses
The authors fail to explain how a thinner cortex could reflect the specialization of the auditory cortex in the processing of diverse speech input. The Dynamic Restructuring Model (Pliatsikas, 2020) which is referred to does not offer clear guidance to interpretation. A more detailed discussion of how a phonologically diverse environment could lead to a thinner cortex would be very helpful.
Thank you for bringing our attention to this point. We have now extended the explanation we had previously included in the Discussion by including the following passage on p. 20 (lines 557-566) of the revised manuscript:
“Experience-induced pruning is essential for maintaining an efficient and adaptive neural network. It reinforces relevant neural circuits for faster more efficient information processing, while diminishing those that are less active, or less beneficial. The cortical specialization may need to arise because phonologically more diverse language experience requires that the mapping of acoustic signal to sound categories is denser, more detailed and more intricate. As a result, the brain may need to engage in more intensive processing to discriminate between and accurately perceive the sound categories of each language. This increased cognitive demand may, in turn, require the auditory and language processing regions of the brain to adapt and become more efficient. Over time, this heightened effort for successful speech perception and sound discrimination may lead to neural plasticity, resulting in cortical specialization. This means that cortical areas become more finely tuned and specialized for processing the unique phonological features of language(s) spoken by individuals.”
We have also added a passage to the Introduction regarding the possible microscopic or physiological underpinnings of the brain structural differences that we observe macroscopically using structural MRI (lines 68-73):
“Such environmental effect on cortical thickness might in turn be tied to microstructural changes to the underlying brain tissue, such as modifications in dendritic length and branching, synaptogenesis or synaptic pruning, growth of capillaries and glia, all previously tied to some kind of environmental enrichment and/or skill learning (see Lövdén et al., 2013; Zatorre et al., 2012 for overviews). Increased cortical thickness may reflect synaptogenesis and dendritic growth, while cortical thinning observed with MRI may be a result of increased myelination (Natu et al., 2019) or synaptic pruning.”
It is difficult to understand what measure of language experience is used when. Clearer and more explicit nomenclature would assist in the interpretation of the results.
We have added a more explicit list of the indices used in the Introduction (lines 104-107 of the revised manuscript) and in Section 2.4, and used them consistently throughout the text:
(1) language experience index not accounting for typological features: ‘Language experience - no typology’
(2) measures combining language experience with typological distances at different levels:
a. ‘Language experience – features’,
b. ‘Language experience – phonemes’,
c. ‘Language experience – phonological classes’.
There is a lack of description of the language backgrounds of the included subjects. How many came from each of the possible linguistic backgrounds? How did they differ in language exposure? This would be informative to evaluate the generalizability of the conclusions.
Thank you for raising this point. Given the complexity of participants’ language experience, ranging from monolingual to speaking 7 different languages, we opted for a fully parametric approach in quantifying it. We used Shannon’s entropy and Rao’s quadratic entropy equations to create continuous measures of language experience, without the constraints of a minimum sample size per language and the need to exclude participants with underrepresented languages. To add further details to our description of the language background, we summarize the language background of both samples in the newly added Table 1, which presents a breakdown of participants by the number of languages they spoke, and in Supplementary Table S1, which lists all languages spoken by each participant.
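To make the idea of a continuous, entropy-based index concrete, here is a minimal sketch of Shannon's entropy applied to per-language experience shares. It is not the authors' code: how the shares are derived from age of acquisition is not specified in this response, so the input values below are hypothetical, as is the function name.

```python
import math
from typing import Sequence


def shannon_entropy(shares: Sequence[float]) -> float:
    """Shannon entropy H = sum_i p_i * ln(1 / p_i) of normalised shares."""
    total = sum(shares)
    if total <= 0:
        raise ValueError("shares must sum to a positive value")
    # Zero shares contribute nothing to the entropy and are skipped.
    p = [s / total for s in shares if s > 0]
    return sum(pi * math.log(1.0 / pi) for pi in p)


# Hypothetical participants (shares are illustrative, not study data).
h_monolingual = shannon_entropy([1.0])            # one language only
h_balanced = shannon_entropy([1.0, 1.0])          # two languages, equal shares
h_unbalanced = shannon_entropy([0.9, 0.1])        # one dominant language
```

The index is zero for a monolingual participant and increases as experience is spread more evenly over more languages, so no participant needs to be excluded because their particular language is rare in the sample.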
Only the result from the multiple transverse temporal gyri (2nd TTG) is analyzed in the replicated dataset. Only the association in the right hemisphere 2nd TTG is replicated but this is not reflected in the discussion or the conclusions. The positive correlation in the right TTG is thus not attempted to be replicated.
Thank you for bringing this point to our attention. Since only a few participants presented single gyri in the left (n = 7) and the right hemisphere (n = 14), the replication analysis focused on the second TTG results only. We have now commented on this fact in Section 3.5 (lines 413-415), as well as in the Discussion (lines 594-596).
The replication dataset differed in more ways than the more frequent combination of English and German experience, as mentioned in the discussion. Specifically, the fraction of monolinguals was higher in the replication dataset and the samples came from different scanners. It would be better if the primary and replication datasets were more equally matched.
Indeed, the replication sample did not fully mimic the characteristics of the main sample, and a better match between the two samples would have been preferable. As elaborated in the Introduction, however, the data were split into two groups according to the date of data acquisition, which also coincided with the field strength of the scanners used for data acquisition: the main sample’s data were acquired on a 1.5T scanner, the replication sample’s on a 3T scanner. We opted for keeping this split and not introducing additional noise in the analysis by mixing data from different field strengths, at the cost of not fully matching the two datasets. Observing the established effects (even partially) in this somewhat different replication sample, however, in our view further strengthens our results.
Even if the language experience and typological distance measures are a step in the right direction for correctly associating language exposure with cortical plasticity, it still is a measure that is insensitive to the intensity of the exposure. The consequences of this are not discussed.
Indeed, we agree with the reviewer that there is still a lot of ground to cover to fully understand the relationship between language experience and cortical plasticity. We have added a paragraph to the Discussion (lines 587-592 of the revised manuscript) to bring attention to this issue:
“Future research should also further increase the degree of detail in describing the multilingual language experience, as both AoA and proficiency (used here) are not sensitive to other aspects of multilingualism, such as intensity of the exposure to the different languages, or quantity and quality of language input. Since these aspects have been convincingly shown to be associated with neural changes (e.g., Romeo, 2019), incorporating further, more detailed measures describing individuals’ language experience could further enhance our understanding of cortical plasticity in general, and how the brain accommodates variable language experience in particular.”
Reviewer #3 (Public Review):
Summary:
The study uses structural MRI to identify how the number, degree of experience, and phonemic diversity of language(s) that a speaker knows can influence the thickness of different sub-segments of the auditory cortex. In both a primary and replication sample of adult speakers, the authors find key differences in cortical thickness within specific subregions of the cortex due to either the age at which languages are acquired (degree of experience), or the diversity of the phoneme inventories carried by that/those language(s) (breadth of experience).
Strengths:
The results are first and foremost quite fascinating and I do think they make a compelling case for the different ways in which linguistic experience shapes the auditory cortex.
The study uses a number of different measures to quantify linguistic experience, related to how many languages a person knows (taking into account the age at which each was learned) as well as the diversity of the phoneme inventories contained within those languages. The primary sample is moderately large for a study that focuses on brain-behaviour relationships; a somewhat smaller replication sample is also deployed in order to test the generality of the effects.
Analytic approaches benefit from the careful use of brain segmentation techniques that nicely capture key landmarks and account for vagaries in the structure of STG that can vary across individuals (e.g., the number of transverse temporal gyri varies from 1-4 across individuals).
Weaknesses:
The specificity of these effects is interesting; some effects really do appear to be localized to the left hemisphere and specific subregions of the auditory cortex e.g., TTG. However because analyses only focus on auditory regions along the STG and MTG, one could be led to the conclusion that these are the only brain regions for which such effects will occur. The hypothesis is that these are specifically auditory effects, but that does make a clear prediction that nonauditory regions should not show the same sort of variability. I recognize that expanding the search space will inflate type-1 errors to a point where maybe it's impossible to know what effects are genuine. And the fine-grained nature of the effects suggests a coarse analysis of other cortical regions is likely to fail. So I don't know the right answer here. Only that I tend to wonder if some control region(s) might have been useful for understanding whether such effects truly are limited to the auditory cortex. Otherwise one might argue these are epiphenomenal or some hidden factor unrelated to auditory experience predicting that we'd also see them in the non-auditory cortex as well, either within or outside the brain's speech network(s).
Thank you for raising this important issue. Our primary analyses indeed focused on the auditory regions, given their involvement in speech and language processing at different levels of processing hierarchy (from low – HG, to high – STG and STS). Here, we included a fairly broad range of ROIs (8 per hemisphere, 16 in total) and it has to be noted that it was only the bilateral planum temporale which showed an association with multilingualism. In the original submission we had indeed attempted to confirm the specificity of this result by performing a whole-brain vertex-wise analysis in FreeSurfer (see Table 3, Section 3.2, Figure S5), which again showed that the only cluster of vertices related to participants’ language experience at p < .0001 (uncorrected) was located in the superior aspect of the left STG, corresponding to the location of planum temporale and the second TTG. Lowering the threshold of statistical significance to p < .001 (uncorrected) results in further clusters of vertices whose thickness was positively associated with the degree of multilingual language experience, localized in:
• Left hemisphere: central sulcus (S_central), long insular gyrus and central sulcus of the insula (G_Ins_lg_and_S_cent_ins), lingual gyrus (G_oc-temp_med-Lingual), planum temporale of the superior temporal gyrus (G_temp_sup-Plan_tempo), short insular gyri (G_insular_short), middle temporal gyrus (G_temporal_middle), and planum polare of the superior temporal gyrus (G_temp_sup-Plan_polar)
• Right hemisphere: angular gyrus (G_pariet_inf-Angular), superior temporal sulcus (S_temporal_sup), middle-posterior part of the cingulate gyrus and sulcus (G_and_S_cingul-Mid-Post), marginal branch of the cingulate sulcus (S_cingul-Marginalis), parieto-occipital sulcus (S_parieto_occipital), parahippocampal gyrus (G_oc-temp_med-Parahip), and inferior temporal gyrus (G_temporal_inf)
We present the result of this analysis in Author response image 1, where clusters are labelled according to the Destrieux anatomical atlas implemented in FreeSurfer:
Author response image 1.
As the reviewer points out, relationships between our dependent and independent variables established at a lower threshold of statistical significance might not reflect a true effect, and it is statistically more probable that multilingualism-related cortical thickness effects are specific to the auditory regions. We do not exclude that an analysis of other pre-defined ROIs, performed at a similar level of detail as our present investigation, would uncover further significant associations between multilingual language experience and brain anatomy, but such an investigation is beyond the scope of the present work.
The reason(s) why we might find a link between cortical thickness and experience is not fully discussed. The introduction doesn't really mention why we'd expect cortical thickness to be correlated (positively or negatively) with speech experience. There is some discussion of it in the Discussion section as it relates to the Pliatsikas' Dynamic Restructuring Model, though I think that model only directly predicts thinning as a function of experience (here, negative correlations). It might have less to say about observed positive correlations e.g., HG in the right hemisphere. In any case, I do think that it's interesting to find some relationship between brain morphology and experience but clearer explanations for why these occur could help, and especially some mention of it in the intro so readers are clearer on why cortical thickness is a useful measure.
We have expanded the section of the Introduction introducing cortical thickness pointing to different microstructural changes previously associated with environmental enrichment and skill learning (lines 68-73), and hope the link between cortical thickness and multilingual language experience is clearer now:
“Such environmental effect on cortical thickness might in turn be tied to microstructural changes to the underlying brain tissue, such as modifications in dendritic length and branching, synaptogenesis or synaptic pruning, growth of capillaries and glia, all previously tied to some kind of environmental enrichment and/or skill learning (see Lövdén et al., 2013; Zatorre et al., 2012 for overviews). Increased cortical thickness may reflect synaptogenesis and dendritic growth, while cortical thinning observed with MRI may be a result of increased myelination (Natu et al., 2019) or synaptic pruning.”
In addition, we have also expanded the Discussion section providing more reasoning for the links between cortical thickness and multilingual language experience (lines 557-566):
“Experience-induced pruning is essential for maintaining an efficient and adaptive neural network. It reinforces relevant neural circuits for faster more efficient information processing, while diminishing those that are less active, or less beneficial. The cortical specialization may need to arise because phonologically more diverse language experience requires that the mapping of acoustic signal to sound categories is denser, more detailed and more intricate. As a result, the brain may need to engage in more intensive processing to discriminate between and accurately perceive the sound categories of each language. This increased cognitive demand may, in turn, require the auditory and language processing regions of the brain to adapt and become more efficient. Over time, this heightened effort for successful speech perception and sound discrimination may lead to neural plasticity, resulting in cortical specialization. This means that cortical areas become more finely tuned and specialized for processing the unique phonological features of language(s) spoken by individuals.”
One pitfall of quantifying phoneme overlap across languages is that what we might call a single 'phoneme', shared across languages, will, in reality, be realized differently across them. For instance, English and French may be argued to both use the vowel /u/ although it's realized differently in English vs. French (it's often fronted and diphthongized in many English speaker groups). Maybe the phonetic dictionaries used in this study capture this using a close phonetic transcription, but it's hard to tell; I suspect they don't, and in that case, the diversity measures would be an underestimate of the actual number of unique phonemes that a listener needs to maintain.
The PHOIBLE database uses transcription that reflects phonological descriptive data as closely as possible, according to the available descriptive sources. Different realizations of sounds are (as much as possible) marked in the database. For example, the open front unrounded vowel /a/ is listed as e.g., [a] or [a̟], with the “+” sign denoting a fronted realization. This is done in PHOIBLE by the use of diacritics (see https://phoible.org/conventions) which further specify variations on the language-specific realizations of the phonemes listed in the database. Further details are available in Moran (2012) (https://digital.lib.washington.edu/researchworks/items/0d26e54d-950a-4d0b-b72c-3afb4b1aa9eb). In our calculation of phoneme-based distances, symbols with and without a diacritic were treated as different phonemes, and therefore the different realizations were accounted for.
That said, we fully agree with the reviewer that in fact any diversity measure will be an underestimation of the actual variation, as between-speaker micro-variation can never be fully reflected in large-scale typological databases such as the one used in the present study. To the best of our knowledge, however, PHOIBLE offers the most comprehensive way of quantifying cross-linguistic variation to date, and we look forward to the field offering further tools that capture linguistic variability at an ever-finer level of detail.
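The effect of treating diacritic-marked symbols as separate phonemes can be illustrated with a small sketch. This is an illustration only and not the authors' distance computation; the two mini-inventories are hypothetical and are not extracts from PHOIBLE.

```python
import unicodedata

plain_a = "a"
# "a" followed by U+031F COMBINING PLUS SIGN BELOW, the diacritic used here
# to mark a fronted realisation.
fronted_a = "a\u031f"

# Hypothetical three-vowel inventories for two languages.
inventory_lang1 = {plain_a, "i", "u"}
inventory_lang2 = {fronted_a, "i", "u"}


def normalise(symbol: str) -> str:
    """Apply Unicode NFC so that identical symbols always compare equal."""
    return unicodedata.normalize("NFC", symbol)


lang1 = {normalise(s) for s in inventory_lang1}
lang2 = {normalise(s) for s in inventory_lang2}

combined = lang1 | lang2   # cumulative inventory across both languages
shared = lang1 & lang2     # symbols that count as the same phoneme
```

Because the plain and the fronted vowel are different strings, the cumulative inventory contains four symbols rather than three, so the two realisations are counted separately.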
Discussion of potential genetic differences underlying the findings is interesting. One additional data point here is a study finding a relationship between the number of repeats of the READ1 (a factor of the DCDC2 gene) in populations of speakers, and the phoneme inventory of language(s) predominant in that population (DeMille, M. M., Tang, K., Mehta, C. M., Geissler, C., Malins, J. G., Powers, N. R., ... & Gruen, J. R. (2018). Worldwide distribution of the DCDC2 READ1 regulatory element and its relationship with phoneme variation across languages. Proceedings of the National Academy of Sciences, 115(19), 4951-4956.) Admittedly, that paper makes no claim about the cortical expression of that regulatory factor under study, and so more work needs to be done on whether this has any bearing at all on the auditory cortex. But it does represent one alternative account that does not have to do with plasticity/experience.
We thank the reviewer for bringing this important line of research to our attention, which we have now included in the Discussion (lines 494-498 of the revised manuscript).
The replication sample is useful and a great idea. It does however feature roughly half the number of participants meaning statistical power is weaker. Using information from the first sample, the authors might wish to do a post-hoc power analysis that shows the minimum sample size needed to replicate their effect; given small effects in some cases, we might not be surprised that the replication was only partial. I don't think this is a deal breaker as much as it's a way to better understand whether the failure to replicate is an issue of power versus fragile effects.
Thank you for the suggestion. Indeed, the effect sizes established in the analyses using the main sample were small (e.g., f² = 0.07). According to a power analysis performed with G*Power 3.1 (Faul et al., 2009), detecting an effect of this magnitude for the predictor of interest at alpha = .05 (two-tailed), in a linear multiple regression model with 4 predictors (i.e., 3 covariates of no interest: sex, age, and hemispheric thickness; and 1 predictor of interest), requires a sample of N = 114 to achieve 80% power. Our partial failure to replicate the effect might therefore indeed be related to the lower power of the replication sample, rather than to the effect itself being fragile.
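The order of magnitude of this sample-size estimate can be checked with a rough calculation. The sketch below is not the G*Power computation: G*Power uses the exact noncentral F distribution, whereas this version assumes a normal approximation with noncentrality λ = f² · N, so it ignores the degrees of freedom used by the covariates and can differ from the reported N by a participant or two. Function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def approx_power(f2: float, n_obs: int, alpha: float = 0.05) -> float:
    """Normal approximation to two-tailed power for one regression predictor."""
    delta = math.sqrt(f2 * n_obs)                 # sqrt of noncentrality f2 * N
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    # Probability that the test statistic falls in either rejection region.
    return _STD_NORMAL.cdf(delta - z_crit) + _STD_NORMAL.cdf(-delta - z_crit)


def approx_min_n(f2: float, target_power: float = 0.80,
                 alpha: float = 0.05, start: int = 10) -> int:
    """Smallest N (searching upward from `start`) reaching the target power."""
    n = start
    while approx_power(f2, n, alpha) < target_power:
        n += 1
    return n


power_at_reported_n = approx_power(0.07, 114)   # N reported in the response
min_n_approx = approx_min_n(0.07)               # approximate required N
```

Calling `approx_power` with a smaller N, such as the size of a replication sample, gives a quick impression of how much power is lost relative to the main sample.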
Recommendations for the authors:
Reviewer #1 (Recommendations for the Authors):
A few remaining details that I think you can handle:
(1) Was there any correction for multiple comparisons, especially when multiple anatomical measures were investigated in separate models? (e.g. ln 130).
Since three different anatomical measures were investigated in Analysis 1 and Analysis 2 (see Table 1), the alpha level of the two linear mixed models was lowered to α = .0166. Note that the p-values of the predictors of interest were p = .012 (mixed model with all auditory regions) and p = .005 (mixed model with all identified TTGs).
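The corrected threshold can be reproduced with a short calculation, assuming it was obtained by dividing α = .05 by the three anatomical measures (a Bonferroni-style correction; the response does not name the procedure). The p-values are those reported above; variable names are hypothetical.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test alpha after dividing the family-wise alpha by the test count."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Three anatomical measures, family-wise alpha of .05 (assumed).
alpha_corrected = bonferroni_alpha(0.05, 3)

# p-values of the predictors of interest as reported in the response.
reported_p = {
    "mixed model, all auditory regions": 0.012,
    "mixed model, all identified TTGs": 0.005,
}

# True where the reported p-value is below the corrected threshold.
passes_threshold = {name: p < alpha_corrected for name, p in reported_p.items()}
```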
(2) In Table 2, since your sample skews heavily female, it would be more useful to present the counts of Male/Female totals for 1, 2, 3, 4, etc TTGs as proportions of the total for that sex rather than counts, so that the distribution across sex is more obvious.
Thank you for bringing this issue to our attention. We have now included an additional row in Table 4, with the proportions of males and females presenting different total numbers of identified gyri in the left and the right hemisphere.
(3) (ln 161) It wasn't clear to me how you dealt statistically with the fact that some participants had only one TTG - did you simply enter "0" as a value for cortical thickness for 2, 3, etc. for those participants? If so, it's possible that this result could reflect the number of split/duplicated gyri rather than the thickness of those gyri.
Indeed, if non-existing gyri were coded with a value of “0” (it being the lowest possible thickness value), the results would reflect the configuration of TTGs (single vs multiple gyri) rather than a relationship between thickness and language experience.
The model was, however, fit to all available thickness values, and the gyri labels (1st, 2nd, 3rd) were modeled as a fixed factor with 3 levels. This procedure allowed us to localize the effect of language experience to a specific gyrus. The following formula was used with the lmer function (lme4 package) in R:
thickness ~ age + sex + whole_brain_thickness + language_experience * gyrus * hemisphere + (1 | participant_id)
We observed a significant interaction between language experience and the 2nd gyrus (NB. no significant 3-way interaction between language experience, the 2nd gyrus and hemisphere pointed to the effect being bilateral). This result was then followed up with two linear models: one for the thickness values of the 2nd left and one for the 2nd right gyrus, each fit to the available data only (n = 130 for the left hemisphere; n = 96 for the right), see Table 5. This procedure ensured that only the available cortical thickness data were considered when establishing their relationship with our independent variable (language experience).
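The distinction drawn in this response, fitting only to available thickness values rather than coding absent gyri as 0, can be sketched as a data-preparation step. The records, field names, and the helper `to_long` below are hypothetical, and hemisphere is omitted for brevity. The point is only that a missing gyrus contributes no row to the long-format table.

```python
# Hypothetical wide-format records: None marks a gyrus that does not exist
participants = [
    {"id": "p01", "thickness": {"1st": 2.9, "2nd": 2.7, "3rd": None}},
    {"id": "p02", "thickness": {"1st": 3.1, "2nd": None, "3rd": None}},
]


def to_long(records):
    """Convert to one row per existing gyrus, skipping absent gyri."""
    rows = []
    for rec in records:
        for gyrus, value in rec["thickness"].items():
            if value is None:
                continue  # absent gyrus: contribute no row, not a zero
            rows.append(
                {"participant_id": rec["id"], "gyrus": gyrus, "thickness": value}
            )
    return rows


long_rows = to_long(participants)
for row in long_rows:
    print(row)
```

With this layout, a participant with a single gyrus simply has fewer rows, so the gyrus factor is estimated from real measurements only.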
(4) I think more could be done in the results section to distinguish your three phonological measures--these details are evident in the Methods section, but if readers consume this paper front to back they may find it difficult to figure out what each measure really means.
Thank you. We have added a more explicit list of the indices used in the Introduction (lines 104-107) and in Section 2.4. As per Reviewer #2's comments, the Methods section was also moved before the Results section, hopefully further enhancing the readability of the paper.
Typos:
ln 270: "weighed"--could you have meant "weighted"?
Corrected, thank you!
ln 377: "Apart from phoneme-based typological distance measure explaining" --> "Apart from *the* phoneme-based..."
Corrected, thank you!
Reviewer #2 (Recommendations for the Authors):
The interpretation of the results would be much helped by the methods section being moved to precede it. Now, much of the results section is methods summaries that would not have been needed if the reader had been presented with the methods beforehand. This is especially true for the measures of language experience and typological distances used.
Thank you. We have moved the Materials and Methods section before the Results section.
The equation in section "4.2 Language experience" should be H = - sum(p_i log2 (p_i)) and not H = - sum(p_i log2(i)).
Corrected, thank you!
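For reference, the corrected entropy formula, H = - sum(p_i log2(p_i)), can be written as a few lines of code. This sketch assumes that p_i is the proportion of exposure to language i and that the proportions sum to 1. The function name is hypothetical.

```python
from math import log2


def shannon_entropy(proportions):
    """H = -sum(p_i * log2(p_i)); zero proportions contribute nothing."""
    return -sum(p * log2(p) for p in proportions if p > 0)


# Equal exposure to two languages gives one bit of entropy
balanced_two = shannon_entropy([0.5, 0.5])

# Uneven exposure gives less than one bit
uneven_two = shannon_entropy([0.9, 0.1])

# Equal exposure to four languages gives two bits
balanced_four = shannon_entropy([0.25, 0.25, 0.25, 0.25])
print(balanced_two, uneven_two, balanced_four)
```

The uncorrected form, with log2(i) in place of log2(p_i), would depend on the arbitrary index of each language rather than on its proportion, which is why the correction matters.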
It is unclear what "S" represents in the equation in the section "4.4 Combining typology and language experience (indexed by AoA)".
The explanation has been added, thank you!
www.biorxiv.org
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Summary:
The main conclusion of this manuscript is that the mediator kinases supporting the IFN response in Down syndrome cell lines represent an important addition to understanding the pathology of this affliction.
Strengths:
Mediator kinase stimulates cytokine production. Both RNAseq and metabolomics clearly demonstrate a stimulatory role for CDK8/CDK19 in the IFN response. The nature of this role, direct vs. indirect, is inferred by previous studies demonstrating that inflammatory transcription factors are Cdk8/19 substrates. The cytokine and metabolic changes are clear-cut and provide a potential avenue to mitigate these associated pathologies.
Weaknesses:
This study revealed a previously undescribed role for the CKM in splicing. The previous identification of splicing factors as substrates of CDK8/CDK19 is also intriguing. However, additional studies seem to be necessary in order to attach this new function to the CKM. As the authors point out, the changes in splicing patterns are relatively modest compared to other regulators. In addition, some indication that the proteins encoded by these genes exhibit reduced levels or activities would support their RNAseq findings.
We have added new splicing data for the version of record. Specifically, we have added splicing data analysis for the "non-sibling" T21 cell line (±cortistatin A, t=4.5h) and for the sibling T21 line (±cortistatin A) at t=24h. The results are summarized in new Figure 5 – figure supplement 2. The data are in agreement with our prior data from the sibling T21 line ±CA at t=4.5h. In particular, i) similar numbers of genes were impacted by splicing changes (alternative exon inclusion or alternative exon skipping) in CA-treated cells in the "non-sibling" T21 line compared with the sibling T21 line; ii) upon completion of a pathway analysis of these alternatively spliced genes, similar pathways were affected by CA in each case (non-sibling T21 vs. sibling T21), in particular those related to IFN signaling; iii) regarding the new t=24h timepoint for the sibling T21 line, similar numbers of genes were alternatively spliced (alternative exon inclusion or alternative exon skipping) in CA-treated cells compared with the 4.5h timepoint, and iv) the IPA results with the alternatively spliced genes identified inflammatory signaling, mRNA processing, and lipid metabolism among other pathways, which broadly reflect the cytokine screen and metabolomics data in CA-treated cells (t=24h).
Additional evidence for CDK8/CDK19 regulation of splicing comes from our t=24h RNA-seq data in T21 cells ±CA. GSEA results revealed down-regulation of many pathways related to RNA processing and splicing, suggesting that the splicing changes caused by Mediator kinase inhibition result from reduced expression of splicing regulators, at least at this longer timeframe. These results are summarized in new Figure 2 – figure supplement 2E. Collectively, the data shown in this article reveal a previously unidentified role for Mediator kinases as splicing regulators. We emphasize in the article, however, that the splicing effects of Mediator kinase inhibition appear modest, at least within the cell lines and timeframes of our experiments, especially when compared with CDK7 inhibition [Rimel et al. Genes Dev 2020 1452].
Seahorse analysis is normally calculated with specific units for oxygen consumption, ATP production, etc. It would be of interest to see the actual values of OCR between the D21 and T21 cell lines rather than standardizing the results. This will address the specific question about relative mitochondrial function between these cells. Reduced mitochondrial function has been associated with DS patients. Therefore, it would be important to know whether mitochondrial function is reduced in the T21 cells vs. the D21 control. Importantly for the authors' goal of investigating the use of CDK8/19 inhibitors in DS patients, does CA treatment reduce mitochondrial function to pathological levels?
These are good points. We have addressed as follows.
(1) We have added a comparative analysis of Seahorse data for the sibling-matched T21 and D21 lines. As shown in new Figure 2 – figure supplement 4A-C, the T21 line shows higher basal levels of OCR and ECAR compared with D21. Although reviewer 1 states that "reduced mitochondrial function has been associated with DS patients" we are unaware of the study from which this conclusion was made. Our results are consistent with a Down syndrome mouse model study published last year [Sarver et al. eLife 2023 e86023]. We acknowledge that in this study, T21/D21 OCR levels varied in different tissues, but the majority of tissue types showed elevated OCR in T21, similar to our results in the human B-cells used here.
(2) Interestingly, CA treatment reduced OCR and ECAR in T21 cells (and D21), suggesting that Mediator kinase inhibition might normalize mitochondrial function (and ECAR) toward D21 levels. We show this comparison in new Figure 2 – figure supplement 4D-F. Indeed, CA treatment appears to normalize T21 mitochondrial function and ECAR toward D21 levels. Although this may suggest a therapeutic benefit, we emphasize that more experiments would be needed to make such claims with confidence.
(3) We include a breakdown of mitochondrial parameters from Seahorse data in the bar plots shown in Figure 2–figure supplement 3. This includes ATP production, which shows reduced ATP levels in CA-treated T21 cells specifically.
(4) We have added Seahorse data for ECAR (extracellular acidification rate) in the sibling-matched D21 and T21 cells, ±CA. These results are shown in new Figure 2 – figure supplement 3D, and indicate that CA treatment reduces ECAR in both D21 and T21 cells. This result is consistent with a prior report that analyzed ECAR in CDK8 analog-sensitive HCT116 cells [Galbraith et al. Cell Rep 2017 1495].
Reviewer #2 (Public Review):
Summary:
In this manuscript, Cozzolino et al. demonstrate that inhibition of the Mediator kinase CDK8 and its paralog CDK19 suppresses hyperactive interferon (IFN) signaling in Down syndrome (DS), which results from trisomy of chromosome 21 (T21). Numerous pathologies associated with DS are considered direct consequences of chronic IFN pathway activation, and thus hyperactive IFN signaling lies at the heart of pathophysiology. The collective interrogation of transcriptomics, metabolomics, and cytokine screens in sibling-matched cell lines (T21 vs D21) allows the authors to conclude that Mediator kinase inhibition could mitigate chronic, hyperactive IFN signaling in T21. To probe the functional outcomes of Mediator kinase inhibition, the authors performed cytokine screens, transcriptomics, and untargeted metabolomics. This collective approach revealed that Mediator kinases establish IFN-dependent cytokine responses at least in part through transcriptional regulation of cytokine genes and receptors. Mediator kinase inhibition suppresses cell responses during hyperactive IFN signaling through inhibition of pro-inflammatory transcription factor activity (anti-inflammatory effect) and alteration of core metabolic pathways, including upregulation of anti-inflammatory lipid mediators, which served as ligands for specific nuclear receptors and downstream phenotypic outcomes (e.g., oxygen consumption). These data provided a mechanistic link between Mediator kinase activity and nuclear receptor function. Finally, the authors also disclosed that Mediator kinase inhibition alters splicing outcomes.
Overall, this study reveals a mechanism by which Mediator kinases regulate gene expression and establishes that their inhibition antagonizes chronic IFN signaling through collective transcriptional, metabolic, and cytokine responses. The data have implications for DS and other chronic inflammatory conditions, as Mediator kinase inhibition could potentially mitigate pathological immune system hyperactivation.
Strengths:
(1) One major strength of this study is the mechanistic evidence linking Mediator kinases to hyperactive IFN signaling through transcriptional changes impacting cell signaling and metabolism.
(2) Another major strength of this study is the use of sibling-matched cell lines (T21 vs D21) from various donors (not just one sibling pair), and further cross-referencing with data from large cohorts, suggesting that part of the data and conclusions are generalizable.
(3) Another major strength of this study is the combined experimental approach including transcriptomics, untargeted metabolomics, and cytokine screens to define the mechanisms underlying suppression of hyperactive interferon signaling in DS upon Mediator kinase inhibition.
(4) Another major strength of this study is the significance of the work to DS and its potential impact on other chronic inflammatory conditions.
Weakness:
(1) Genetic evidence linking the mentioned nuclear receptors to activation of an anti-inflammatory program upon Mediator kinase inhibition could improve the definition of the mechanism and overall impact of the work.
Existing data from other studies, some of which are cited in the article, have linked PPAR and LXR to lipid biosynthesis and anti-inflammatory signaling cascades. We assume that reviewer 2 is suggesting knockdown and/or degron depletion of specific nuclear receptors, to compare/contrast the effect of CA on IFN responses in T21 and D21 cells. Such experiments would help de-couple the NR-specific contributions from other CA-dependent effects. We consider these experiments important next steps for this project, but beyond the scope of this study. That said, we anticipate that data from such experiments might be challenging to interpret, given the complex and inter-connected cascade of transcriptional and metabolic changes that would result from PPAR or LXR depletion.
(2) Page 5 states that "Mediator kinases broadly regulate cholesterol and fatty acid biosynthesis and this was further confirmed by the metabolomics data", but a clear mechanistic explanation was lacking. Likewise, the data suggest but do not prove, that altered lipid metabolites influence the function of nuclear receptors to regulate an anti-inflammatory program in response to Mediator kinase inhibition (p. 6), despite the fact the gene expression changes elicited by Mediator kinase inhibition tracked with downstream metabolic changes.
We have clarified the text on page 5 to address this comment. Specifically, we note that CA treatment increases expression of FA metabolism and cholesterol metabolism genes in T21 cells under basal conditions, and the genes affected are shown in Figure 2–figure supplement 1E. Thus, the mechanistic explanation is that Mediator kinases cause elevated levels of FA and cholesterol metabolites via changes in expression of FA and cholesterol biosynthesis genes (at least in part). We further address the mechanism with the PRO-seq data and TFEA results in Figure 6; in particular, p53 activity is rapidly suppressed in CA-treated T21 cells (t=75min), and this alone is sufficient to activate SREBP [Moon et al. Cell 2019 564]. CA-dependent activation of SREBP target genes is a dominant feature in the T21 RNA-seq data (t=4.5h).
We agree with the second point raised by reviewer 2, that our data suggest but do not prove nuclear receptor function is altered by CA treatment. We do cite papers that have provided good evidence that the metabolites elevated in CA-treated cells are NR ligands and activate their target genes. Additional experiments to address this question might involve targeted depletion of select metabolites via inhibition of key biosynthetic enzymes. We consider these experiments beyond the scope of this already expansive article. That said, it will be challenging to conclusively demonstrate clear cause-effect relationships (e.g. to demonstrate whether select metabolites altered by CA treatment directly alter PPARA function), given i) the myriad transcriptional and metabolic changes caused by CA treatment, coupled with the fact that ii) the CA-dependent lipid metabolite changes are spread out across chemically distinct NR agonists (e.g. endocannabinoids, oleamide, or cholesterol metabolites such as desmosterol), and iii) NR activation can occur via multiple different metabolites.
(3) The figures are outstanding but dense.
Thank you. We have done our best to represent the results clearly and within the publication guidelines. There was an enormous amount of data to summarize for this article.
(4) Figure 6 (PRO-Seq). The authors refer to pro-inflammatory TFs (e.g. NF-kB/RelA). It is not clear whether the authors have specifically examined TF binding at enhancers or more broadly at every region occupied by the interrogated TFs?
This is a good point. Our analysis (TFEA) only identified the TFs whose activity was changing in CA-treated cells. It did not distinguish where these TFs were bound (enhancers vs. promoters). We completed a modified TFEA by separating enhancer TFs vs. promoter TFs. The results showed a preference for CA-dependent suppression of enhancer-bound TFs. This result is consistent with the general observation that stimulus-response transcription is controlled by enhancer-bound TFs (e.g. Kim et al. Nature 2010 182; Azofeifa et al. Genome Res 2018 334; Jones et al. bioRxiv 2024 585303). However, our TFEA enhancer/promoter analysis is preliminary and more work would be needed to address this comment in a rigorous way. Therefore, we did not include this analysis in the revision.
Reviewing Editor Comments:
Main suggestions for improvement:
(1) Provide additional information about the mechanistic basis for the changes in lipid levels observed on kinase inhibition.
We have changed the text to better emphasize that the mechanistic basis involves i) gene expression changes resulting from Mediator kinase inhibition (e.g. Fig 2 – figure supplement 1D, E, Fig 2 – figure supplement 2B, Fig 2 – figure supplement 4B-D); ii) activation of SREBP and PPAR and LXR, based upon IPA results with RNA-seq data (e.g. Fig 2B, Fig 2 – figure supplement 1F, Fig 2 – figure supplement 2D, Fig 2 – figure supplement 4E; Fig 3E), and iii) rapid CA-dependent suppression of p53 function (Fig 6A), which will activate SREBP (Moon et al. Cell 2019 564).
(2) Provide direct genetic evidence that the nuclear receptors are activated by the lipid changes to mediate an anti-inflammatory program in response to Mediator kinase inhibition.
This is an excellent question but we consider it beyond the scope of this already expansive study. That said, we cite several papers in the article that demonstrate that the lipids we observe elevated in CA-treated cells i) directly bind PPAR or LXR and ii) activate their TF function. We also note that the anti-inflammatory impacts of Mediator kinase inhibition are broad, affecting distinct gene sets through transcriptional changes, metabolites, and cytokines. Any NR-specific contributions could be challenging to de-couple from CA-dependent effects using knockdown or depletion methods, given the compensatory responses that would result.
(3) Improve/expand the evidence that Mediator kinase inhibition confers reduced mitochondrial function.
We have added new Seahorse data for sibling-matched D21 and T21 cells (±CA) for the version of record. Our prior results showed reduced mitochondrial function and OCR in CA-treated T21 cells. We have added data that compares D21 and T21 mitochondrial function. As shown in new Figure 2 – figure supplement 4A-C, the T21 line shows higher basal levels of OCR and ECAR compared with D21. These results are consistent with a Down syndrome mouse model study published last year [Sarver et al. eLife 2023 e86023]. When we compare CA-treated T21 with D21 cells, mitochondrial respiration and OCR are similar, suggesting that Mediator kinase inhibition might normalize mitochondrial function (and ECAR) toward D21 levels. We show this comparison in new Figure 2 – figure supplement 4D-F. Although this may suggest a therapeutic benefit, we emphasize that more experiments would be needed to make such claims with confidence.
(4) Determine whether mitochondrial function is reduced in the T21 cells vs. the D21 controls and whether kinase inhibition with the inhibitor reduces mitochondrial function to pathological levels.
For the version of record, we have added a direct comparison of mitochondrial parameters and OCR in the sibling-matched D21/T21 lines. The data show that T21 cells have higher OCR compared with D21. These results are consistent with a Down syndrome mouse model study published last year [Sarver et al. eLife 2023 e86023]. Our results also indicate that CA treatment brings OCR and other "mitochondrial parameters" in T21 cells toward D21 levels, as noted above.
(5) Consider whether the CDK8/19 inhibitor has off-target effects that would lessen its therapeutic value.
We chose cortistatin A (CA) for this project because it is the most potent and selective inhibitor available for targeting CDK8/CDK19. Initial published reports suggested off-target effects (Cee et al. Angew Chem IEE 2009), but these experiments used binding assays against the kinase protein alone, and did not measure binding or inhibition with biologically relevant, active kinase complexes. Kinome-wide screens involving native, active kinase complexes showed no evidence of off-target effects for cortistatin A, even at concentrations 5000-times the measured KD (Pelish et al. Nature 2015). See Author response image 1.
Related to CA therapeutic value, that is an important issue but beyond the scope of this study. We consider CA a valuable chemical probe, to use as a means to define CDK8/CDK19-dependent functions in cell line models. As a chemical probe, we consider CA the "best-in-class" Mediator kinase inhibitor, based upon all available data (Clopper & Taatjes Curr Opin Chem Biol 2022 102186).
That said, we understand the concern about off-target effects, which can never be ruled out with a chemical inhibitor. We include quantitative western data (Fig 1 – figure supplement 1A) that compares CA with a structurally distinct CDK8/CDK19 inhibitor, CCT251545. The data show that, as expected, CA (100nM) and CCT251545 (250nM) similarly inhibit STAT1 S727 phosphorylation in IFN-stimulated cells. The samples were pre-treated with inhibitor for 30 minutes prior to IFNg and collected 45 minutes after IFNg treatment.
We did not complete any experiments with knockouts or kinase-dead alleles, primarily because they are not reliable comparisons for chemical inhibition, given the different time frames involved. For example, there will be genetic compensation in edited cell lines (Rossi/Stanier Nature 2015 230) and we and others have shown that there are major differences between kinase protein loss through knockdown or knockout methods vs. rapid inhibition with small molecules (e.g. Poss et al. Cell Rep 2016 436; Sooraj et al. Mol Cell 2022 123).
Author response image 1.
Information about cortistatin A. A) KiNativ kinome screen from HEK293 lysates. CA blocked capture of only CDK8/CDK19 in this MS-based assay, among over 200 kinases detected. B) Equilibrium binding constants and kinetics for CA. C) CA structure; note the dimethylamine is protonated at physiological pH, and forms a pi-cation interaction with W105 (crystal structure, panel D). Only CDK8 and CDK19 have an aromatic residue (W) at this position, providing a structural basis for high selectivity.
(6) Improve the presentation of the splicing data and better discuss how the splicing alterations may be contributing to the disease phenotype.
We have added new splicing data for the version of record. Specifically, we have added splicing data analysis for the "non-sibling" T21 cell line (±cortistatin A, t=4.5h) and for the sibling T21 line (±cortistatin A) at t=24h. The results are summarized in new Figure 5 – figure supplement 2. The data are in agreement with our prior results from the sibling T21 line ±CA at t=4.5h. In particular, i) similar numbers of genes were impacted by splicing changes (alternative exon inclusion or alternative exon skipping) in CA-treated cells in the "non-sibling" T21 line compared with the sibling T21 line; ii) upon completion of a pathway analysis of these alternatively spliced genes, similar pathways, including IFN signaling pathways, were affected by CA in each case (non-sibling T21 vs. sibling T21); iii) regarding the new t=24h timepoint for the sibling T21 line, similar numbers of genes were alternatively spliced (alternative exon inclusion or alternative exon skipping) in CA-treated cells compared with the 4.5h timepoint, and iv) the IPA results with the alternatively spliced genes identified inflammatory signaling, mRNA processing, nucleotide and lipid metabolism among other pathways, which broadly reflect the cytokine screen and metabolomics data in CA-treated cells (t=24h).
Additional evidence for CDK8/CDK19 regulation of splicing comes from our t=24h RNA-seq data in T21 cells ±CA. GSEA results from sibling T21 cells ±CA revealed down-regulation of many pathways related to RNA processing and splicing (RNA-seq data, t=24h), suggesting that the splicing changes caused by Mediator kinase inhibition result from reduced expression of splicing regulators, at least at longer timeframes. These results are summarized in new Figure 2 – figure supplement 2E.
Related to how splicing alterations may be contributing to the CA-dependent effects and their potential therapeutic implications, this is an interesting question but open-ended. It will not be straightforward to link specific splicing changes to possible therapeutic outcomes, especially given that there are hundreds of genes affected and because the effects are modest (i.e. not all-or-nothing).
Reviewer #1 (Recommendations For The Authors):
The finding that CA treatment leads to upregulation of as many genes as are downregulated is consistent with previous studies of a 50:50 role for the CKM. However, most previous studies utilized knockout alleles or knockdown approaches. As the authors demonstrated in a previous study, CA inhibits kinase activity without changing CDK8 levels. Does this indicate that the kinase activity of Cdk8/19 is required for transcriptional repression? Previous in vitro studies suggested that Cdk8/19-dependent repression was independent of their kinase activity. The authors should comment on this.
This is a challenging question to address, because the answer will depend on the timing of the experiment and the experimental context. The short answer is that the kinase activity of CDK8/19 will activate some genes and reduce expression of others, at least in part because CDK8/19 phosphorylate TFs, which drive global gene expression programs. TF phosphorylation by CDK8/19 appears to activate some genes and repress others (e.g. STAT1 S727A example from Steinparzer et al. Mol Cell 2019 485), at least based upon RNA-seq data, but this doesn't measure the immediate effects on the transcriptome. It is true that kinase activity isn't required to block pol II incorporation into the PIC (Knuesel et al. Genes Dev 2009 439). This is a kinase-independent function of the module; MKM-Mediator binding will block Mediator-pol II interaction and therefore block PIC assembly and pol II initiation (Knuesel 2009; Ebmeier & Taatjes PNAS 2010 11283). The kinase-independent functions of CDK8/19 were not a focus of the work described here. We only focus on Mediator kinase activity. We also do not focus on potential effects on RNAPII initiation or PIC assembly, although these are important peripheral topics.
Descriptors are less useful as the reader must go back to reconstruct the experiment: "Although metabolites were measured 24h after CA treatment, these data suggest that altered lipid metabolites influence LXR and PPAR function". Does "altered" mean the lipid concentrations were up or down? Similarly, lipids that "influenced" LXR function - were they stimulatory or inhibitory?
Good point. Where possible, we used more accurate language when describing CA-dependent changes.
I found many sections in the text confusing. For example: Figure 3. Mediator kinase inhibition antagonizes IFNγ transcriptional responses in T21 and D21. It takes a while to unpack this figure title. Instead of the double negative, the authors could simply state that "Mediator kinase is required for IFN-dependent transcriptional activation". Describing the protein activity, versus the drug-induced phenotype, can often clarify complicated scenarios.
Good idea. We have edited the text to eliminate some but not all of these double negatives. In some cases we prefer to describe the consequence of kinase inhibition.
Reviewer #2 (Recommendations For The Authors):
(1) The splicing data analysis is compelling, but not well integrated into the overall story and it cuts the storytelling logic in the Abstract. The authors could consider better integrating the large amount of data generated and better explaining how it relates to the various aspects of the proposed model (transcriptional, metabolism) to help improve potential cause-and-effect outcomes.
We agree. The large amount of data, combined with the different experimental approaches, makes it a challenge to summarize the data in a concise way. We have done our best to organize the results in a logical and clear manner. To address this comment, we have gone through the text and re-organized where possible, and we have edited the abstract. We have added new splicing data and the splicing results are now better integrated (in our opinion) in part because of the pathway results from the t=24h ±CA RNA-seq data, which show major reductions in gene sets related to splicing and RNA processing.
(2) The manuscript could improve its readability by providing specific details throughout. Examples include i) explaining why and what 29 cytokines were chosen for the screen (p. 3, p. 4) ii) providing major data analysis conclusions to the cytokine screen part (p. 3) iii) expanding the conclusions to the metabolic pathway analysis (p. 4) iv) being more precise when referring to T21-specific changes (up or down?) (p 4), and "significantly altered" by CA treatment in T21 cells (up or down?) (p. 5).
Good points. We have edited the text to address these comments. Please note that the 29 cytokines refers to a different study (Malle et al. Nature 2023) and we had no role in selecting the cytokines. Our screen involved 105 cytokines that were arrayed as part of a commercially available panel.
(3) The figures are outstanding but dense (e.g., Figure 1b, can any simplification and/or highlighting be done to underscore important features?). Some panels are illegible (e.g. Figure 1- supplement Figure 2a and b). The authors could improve data presentation. For example, the Venn diagrams (e.g., Figure 2f) are hard to quickly digest. Can the authors find a better way to highlight important data (e.g., hard to distinguish the meaning of font bolding from italics)?
Thank you for these suggestions. Regarding Figure 1B, we simplified the metabolic pathways to emphasize the biochemicals that specifically relate to this study. We decided against highlighting specific metabolites beyond this simplification, because in our opinion it causes as many problems as it solves. Where possible, we have enlarged the panels with hard-to-read text; thank you for the suggestion. For the Venn diagrams, they convey a large amount of information in a single panel: increased or decreased gene expression in T21 or D21, cytokine genes or cytokine receptors, and gene expression convergence or divergence compared with protein levels from cytokine screens. There is a different way to display the results, but it would involve generating more data panels to parse out the results. This could be considered better, but we opted for something that is more information-rich that requires only a single data panel. Given the large amount of data already shown, we hope the reviewer can understand this choice.
www.biorxiv.org
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
In this manuscript, Liu et al. present CROWN-seq, a technique that simultaneously identifies transcription-start nucleotides and quantifies N6,2'-O-dimethyladenosine (m6Am) stoichiometry. This method is derived from ReCappable-seq and GLORI, a chemical deamination approach that differentiates A and N6-methylated A. Using ReCappable-seq and CROWN-seq, the authors found that genes frequently utilize multiple transcription start sites, and isoforms beginning with an Am are almost always N6-methylated. These findings are consistently observed across nine cell lines. Unlike prior reports that associated m6Am with mRNA stability and expression, the authors suggest here that m6Am may increase transcription when combined with specific promoter sequences and initiation mechanisms. Additionally, they report intriguing insights on m6Am in snRNA and snoRNA and its regulation by FTO. Overall, the manuscript presents a strong body of work that will significantly advance m6Am research.
Strengths:
The technology development part of the work is exceptionally strong, with thoughtful controls and well-supported conclusions.
We appreciate the reviewer for the very positive assessment of the study. We have addressed the concerns below.
Weaknesses:
Given the high stoichiometry of m6Am, further association with upstream and downstream sequences (or promoter sequences) does not appear to yield strong signals. As such, transcription initiation regulation by m6Am, suggested by the current work, warrants further investigation.
We thank the reviewer for the insightful comments. We have softened the language related to m<sup>6</sup>Am and transcription regulation. We totally agree with the reviewer that future investigation is required to determine the molecular mechanism behind m<sup>6</sup>Am and transcription regulation.
Reviewer #2 (Public review):
Summary:
In the manuscript "Decoding m6Am by simultaneous transcription-start mapping and methylation quantification" Liu and co-workers describe the development and application of CROWN-Seq, a new specialized library preparation and sequencing technique designed to detect the presence of cap-adjacent N6,2'-O-dimethyladenosine (m6Am) with single nucleotide resolution. Such a technique was a key need in the field since prior attempts to get accurate positional or quantitative measurements of m6Am positioning yielded starkly different results and failed to generate a consistent set of targets. As noted in the strengths section below the authors have developed a robust assay that moves the field forward.
Furthermore, their results show that most mRNAs whose transcription start nucleotide (TSN) is an 'A' are in fact m6Am (85%+ for most cell lines). They also show that snRNAs and snoRNAs have a substantially lower prevalence of m6Am TSNs.
Strengths:
Critically, the authors spent substantial time and effort to validate and benchmark the new technique with spike-in standards during development, cross-comparison with prior techniques, and validation of the technique's performance using a genetic PCIF1 knockout. Finally, they assayed nine different cell lines to cross-validate their results. The outcome of their work (a reliable and accurate method to catalog cap-adjacent m6Am) is a particularly notable achievement and is a needed advance for the field.
Weaknesses:
No major concerns were identified by this reviewer.
We thank the reviewer for the positive assessment of the method and dataset. We have addressed the concerns below.
Mid-level Concerns:
(1) In Lines 625 and 626, the authors state that “our data suggest that mRNAs initate (mis-spelled by authors) with either Gm, Cm, Um, or m6Am.” This reviewer took those words to mean that for A-initiated mRNAs, m6Am was the ‘default’ TSN. This contradicts their later premise that promoter sequences play a role in whether m6Am is deposited.
We thank the reviewer for the comment. We have changed this sentence into “Instead, our data suggest that mRNAs initiate with either Gm, Cm, Um, or Am, where Am are mostly m<sup>6</sup>Am modified.” The revised sentence separates the processes of transcription initiation and m<sup>6</sup>Am deposition, which will not confuse the reader.
(2) Further, the following paragraph (lines 633-641) uses fairly definitive language that is unsupported by their data. For example in lines 637 and 638 they state “We found that these differences are often due to the specific TSS motif.” Simply, using ‘due to’ implies a causative relationship between the promoter sequences and m6Am has been demonstrated. The authors do not show causation, rather they demonstrate a correlation between the promoter sequences and an m6Am TSN. Finally, despite claiming a causal relationship, the authors do not put forth any conceptual framework or possible mechanism to explain the link between the promoter sequences and transcripts initiating with an m6Am.
(3) The authors need to soften the language concerning these data and their interpretation to reflect the correlative nature of the data presented to link m6Am and transcription initiation.
For (2) and (3): We have softened the language in the revised manuscript. Specifically, for lines 633-641 in the original manuscript, we have changed “are often due to” to “are often related to” in the revised manuscript, which claims a correlation rather than causation.
Reviewer #3 (Public review):
Summary:
m6Am is an abundant mRNA modification present on the TSN. Unlike the structurally similar and abundant internal mRNA modification m6A, m6Am’s function has been controversial. One way to resolve controversies surrounding mRNA modification functions has been to develop new ways to better profile said mRNA modification. Here, Liu et al. developed a new method (based on GLORI-seq for m6A-sequencing), for antibody-independent sequencing of m6Am (CROWN-seq). Using appropriate spike-in controls and knockout cell lines, Liu et al. clearly demonstrated CROWN-seq’s precision and quantitative accuracy for profiling transcriptome-wide m6Am. Subsequently, the authors used CROWN-seq to greatly expand the number of known m6Am sites in various cell lines and also determine m6Am stoichiometry to generally be high for most genes. CROWN-seq identified gene promoter motifs that correlate best with high stoichiometry m6Am sites, thereby identifying new determinants of m6Am stoichiometry. CROWN-seq also helped reveal that m6Am does not regulate mRNA stability or translation (as opposed to past reported functions). Rather, m6Am stoichiometry correlates well with transcription levels. Finally, Liu et al. reaffirmed that FTO mainly demethylates m6Am, not of mRNA but of snRNAs and snoRNAs.
Strengths:
This is a well-written manuscript that describes and validates a new m6Am-sequencing method: CROWN-seq as the first m6Am-sequencing method that can both quantify m6Am stoichiometry and profile m6Am at single-base resolution. These advantages facilitated Liu et al. to uncover new potential findings related to m6Am regulation and function. I am confident that CROWN-seq will likely be the gold standard for m6Am-sequencing henceforth.
Weaknesses:
Though the authors have uncovered a potentially new function for m6Am, they need to be clear that without identifying a mechanism, their data might only be demonstrating a correlation between the presence of m6Am and transcriptional regulation rather than causality.
We thank the reviewer for the very positive assessment of the CROWN-seq method. We have softened the language which is related to the correlation between m<sup>6</sup>Am and transcription regulation.
Reviewer recommendations:
We thank the reviewers for their constructive suggestions. In the revised manuscript, we have corrected the errors and updated the requested discussions and figures.
Reviewer #1 (Recommendations for the authors):
(1) The prior work from the research group, "Reversible methylation of m6Am in the 5′ cap controls mRNA stability" (PMID: 28002401), should be cited, even if the current findings differ from earlier conclusions, particularly in line 58 and the section titled "m6Am does not substantially influence mRNA stability or translation".
We thank the reviewer for this comment. We have added the citation.
(2) I wonder why the authors chose to convert A to I before capping and recapping, as RNA fragmentation caused by chemical treatment may introduce noise into these processes.
We thank the reviewer for this comment. This is a very good point. We have indeed considered this alternative protocol. There are two concerns in performing decapping-and-recapping before A-to-I conversion: (1) it is unclear whether the 3’-desthiobiotin, which is essential for the 5’ end enrichment, is stable or not during the harsh A-to-I conversion; (2) performing decapping-and-recapping first requires more enzyme and 3’-desthiobiotin-GTP, which are the major cost of the library preparation. This is because the input of CROWN-seq (~1 μg mRNA) is much higher than that in ReCappable-seq (~5 μg total RNA or ~250 ng mRNA). In the current protocol, many 5’ ends are highly fragmented and therefore are lost during the A-to-I conversion. As a result, less enzyme and 3’-desthiobiotin-GTP are needed.
(3) During CROWN-seq benchmarking, the authors found that 93% of reads mapped to transcription start sites, implying a 7% noise level with a spike-in probe. This noise could lead to false positives in TSN assignments in real samples. It appears that additional filters (e.g., a known TSS within 100 nt) were applied to mitigate false positives. If so, I recommend that the authors clarify these filters in the main text.
We thank the reviewer for this comment. We think that the spike-in probes might lead to an underestimation of the accuracy of TSN mapping. The spike-in probes are made by in vitro transcription with m<sup>7</sup>Gpppm<sup>6</sup>AmG or m<sup>7</sup>GpppAmG analogs. We found that the in vitro transcription exhibits a small amount of non-specific initiation, which leads to spike-in probes with 5’ ends that are not precisely aligned with the desired TSS. To better illustrate the mapping accuracy of CROWN-seq, we provided Figure 2H, which compares the non-conversion rates of newly found A-TSNs between wild-type and PCIF1 knockout cells. If the newly found A-TSNs are real, they should show high non-conversion rates in wild-type cells (i.e., high m<sup>6</sup>Am) and almost zero non-conversion rates (i.e., Am) in PCIF1 knockout cells. As expected, most of the newly found A-TSNs are true A-TSNs since they are m<sup>6</sup>Am in wild-type and Am in PCIF1 knockout cells. Thus, we think that CROWN-seq is very precise in TSS mapping. We have clarified this in the Discussion.
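The wild-type versus PCIF1 knockout comparison described above can be sketched in a few lines. This is only an illustration, not the authors' pipeline: it assumes that, after A-to-I conversion, an unconverted start nucleotide is read as A and a converted one as G, and the function names, the rate thresholds (0.8 and 0.1) and the minimum read count are hypothetical choices made for this example.

```python
def non_conversion_rate(a_reads, g_reads):
    """Fraction of reads at an A-TSN that remain A after A-to-I conversion.

    Assumption: converted (unmethylated) A is read as G, unconverted A stays A.
    Returns None when there is no coverage.
    """
    total = a_reads + g_reads
    if total == 0:
        return None
    return a_reads / total


def classify_a_tsn(wt, ko, high=0.8, low=0.1, min_reads=20):
    """Classify a candidate A-TSN from (a_reads, g_reads) pairs in WT and KO.

    The thresholds are hypothetical and only illustrate the logic:
    a genuine m6Am start site should resist conversion in wild-type cells
    and convert almost completely in PCIF1 knockout cells.
    """
    if sum(wt) < min_reads or sum(ko) < min_reads:
        return "low coverage"
    wt_rate = non_conversion_rate(*wt)
    ko_rate = non_conversion_rate(*ko)
    if wt_rate >= high and ko_rate <= low:
        return "consistent with m6Am"
    if wt_rate <= low and ko_rate <= low:
        return "consistent with Am"
    return "ambiguous"


if __name__ == "__main__":
    # Made-up read counts for three candidate sites.
    print(classify_a_tsn(wt=(95, 5), ko=(1, 99)))
    print(classify_a_tsn(wt=(2, 98), ko=(1, 99)))
    print(classify_a_tsn(wt=(90, 10), ko=(60, 40)))
```

Sites that stay unconverted in the knockout fall into the "ambiguous" class here, which is where fragments or mapping artefacts would be expected to show up under these assumptions.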
(4) I wonder if PCIF1 knockout affects TSN choice and abundance. If not, this data should be presented. If so, how are these changes accounted for in Figure 2H and Figure S5?
We thank the reviewer for this comment. PCIF1 KO does not substantially affect TSN choice. Here we calculate the correlation of relative TSN expression within genes between wild-type and PCIF1 KO cells (shown using Pearson’s r). It shows that most of the genes have similar TSN choices (i.e., a high Pearson’s r) in both wild-type and PCIF1 KO cells. Thus, PCIF1 KO does not alter global TSN expression.
Author response image 1.
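As a rough illustration of the per-gene comparison described in this response, the sketch below converts TSN read counts into relative usage within a gene and computes Pearson's r between two conditions. The function names and the example counts are hypothetical, and the authors' exact normalization is not stated here, so this is one plausible reading rather than their method.

```python
from math import sqrt


def relative_usage(counts):
    """Convert per-TSN read counts for one gene into fractions summing to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def pearson_r(x, y):
    """Pearson correlation of two equal-length lists; None if undefined."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # e.g. a gene with a single TSN or uniform usage
    return sxy / sqrt(sxx * syy)


def tsn_choice_correlation(wt_counts, ko_counts):
    """Pearson's r of relative TSN usage for one gene, WT versus KO.

    Both lists must cover the same TSNs in the same order.
    """
    return pearson_r(relative_usage(wt_counts), relative_usage(ko_counts))


if __name__ == "__main__":
    # Made-up counts for one gene with three TSNs.
    print(tsn_choice_correlation([100, 50, 10], [180, 110, 25]))
```

Because usage is normalized within each gene, a change in overall expression between conditions does not by itself lower the correlation; only a shift in which TSNs are preferred does.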
(5) The manuscript refers to Am as a rare modification in mRNA (e.g., introduction lines 101-102; discussion lines 574, 608; and possibly other locations) without specifying this only applies to transcription start sites. As this study does not cover entire mRNA sequences, these statements may be misleading.
We thank the reviewer for this comment. We have clarified it.
Reviewer #2 (Recommendations for the authors):
(1) On line 122, the authors state that: "On average, a gene uses 9.5±9 (mean and s.d., hereafter) TSNs (Figure 1A)." However, they do not discuss the dispersion apparent in the TSNs they observed. Figure panels 1A, B, and S1A, B show a range of 120 bases or less. What is the predominant range of distances between annotated TSNs and the newly identified ones?
1a) For example, what percentage of new TSNs fall within 20? 50? 75? bases of the annotated sites? Additional text describing the distribution of these TSNs would help readers better understand the diversity inherent in these novel 5' RNA ends. Notably, this additional text likely is best placed in the CROWN-Seq section related to Figure 2 or S2.
We thank the reviewer for this comment. We have updated Figure S2 to describe the newly found TSSs. Depending on the coverage in CROWN-seq, the TSSs with higher coverage tend to overlap with or locate proximally to known TSSs. In contrast, the TSSs with low coverage tend to be located further away from annotated TSSs.
1b) The alternate TSNs can have effects on splicing patterns and isoform identity. Providing a few sentences to explain how regularly this occurs would be helpful.
We thank the reviewer for this comment. It is a very interesting point. Different TSNs can indeed have different splicing patterns. Although the discovery of splicing patterns regulated by TSNs is out of the scope of this study, we have discussed this possibility in the revised Discussion section.
(2) On Lines 241 and 242, the authors mentioned that 1284 sites were excluded from the analysis based on low (under 20-explained in the figure legend) read count, distance from TSS, or false negatives (which are not explained). Although I agree that the authors are justified in setting these reads aside, the information could be useful to readers willing to perform follow-up work if their mRNAs of interest were included in these 1284 sites.
2a) An annotation of all of these sites (broken down by category, i.e. the 811, the 343, and the 130) as a supplementary table should be provided.
We thank the reviewer for this comment. We have added the categories to the revised Table S1.
(3) Although I have marked several typos/grammar mistakes in several parts of this review, others exist elsewhere in the text and should be corrected.
We thank the reviewer for this comment. We have corrected them.
(4) In lines 122 and 123 the authors say "Only ~9% of genes contain a single TSN (Figure 1A)." However, their figure shows 81% with a single TSN. Why is there a 10% discrepancy?
We thank the reviewer for this comment. We have corrected the plot in Figure 1A, to match the description.
(5) The first Tab of Table S2 is labeled 'Legend', but is blank. Is this intentional?
We thank the reviewer for this comment. We have updated the table legends.
(6) On lines 70 and 76 of the supplementary figure file pertaining to Figure S2, the legend labels for Figure S2E and S2F are not accurate, they need to be changed to G and H.
(7) In Figure 4A 'percentile' is misspelled.
(8) The color-coding legend for the 4 bases is missing from (and should be added to) Figure S4A.
(9) On Lines 984, 1163, and 1194 the '2s' should be properly sub-scripted where appropriate.
For (6) to (9). We thank the reviewer for finding these issues. We have now corrected them.
Reviewer #3 (Recommendations for the authors):
(1) The authors should discuss if their results can definitively distinguish between the SSCA+1GC motif promoting m6Am that, in turn, promotes transcription, versus the SSCA+1GC motif promoting m6Am but also separately promoting transcription in a m6Am-independent manner. The authors should also discuss this in light of recent findings by An et al. (2024 Mol. Cell), which support the former conclusion.
We thank the reviewer for the suggestion. We now have updated the Discussion to address that our paper and An et al. can support each other.
(2) Given that the authors showed m6Am promotes gene expression (Figure 5) but does not affect mRNA stability (Fig. S5), logic dictates that m6Am must regulate mRNA transcription. However, the authors should explain why this regulation focuses on the initiation aspect of transcription rather than other aspects of transcription, e.g., premature termination, pause release, and elongation.
We thank the reviewer for this comment. In this study, we did not profile the 3’ ends of nascent RNAs and thus we can only make conclusions about the overall transcription process but not a specific aspect. We have updated the revised Discussion section to mention that An et al. discovered that m<sup>6</sup>Am can sequester PCF11 and thus promote transcription, and therefore some of the effects we see could be related to differential premature termination.
(3) Authors should add alternative versions of Figure 1D but with 3 colours corresponding to Am vs. m6Am vs. Cm/Gm/Um for all the cells they performed CROWN-seq on.
We thank the reviewer for this comment. We have updated Figure S5 as the corresponding figure showing the fraction of Am vs. m6Am vs. Cm/Gm/Um.
(4) Figure 2H (left): Please comment on the few outliers that still show high non-conversion even in PCIF1-KO cells.
We thank the reviewer for this comment. We have discussed the outliers in the main text. These outliers can be found in the revised Table S3.
(5) Line 254: "Second, if these sites were RNA fragments they would not contain m6Am." is missing a comma.
(6) S2G and S2H labelling in Figure S2 legends is wrong.
For (5) and (6). We thank the reviewer for these comments. We have corrected them.
(7) Figure 3D: Many gene names are printed multiple times (e.g. ACTB is printed 5 times). Is this correct; is each dot representing 1 cell line?
We thank the reviewer for this comment. These gene names represent different transcription-start nucleotides. We now clarify that each instance refers to a different start site.
(8) S5A-C: Even if there's no substantial difference, authors should still display the Student's T-test P-values as they did for S5D-G.
We thank the reviewer for this comment. We have updated the P-values.
(9) Figure 5C and S5E: Why are the authors not showing the respective analysis for C-TSN and U-TSN genes?
We thank the reviewer for this comment. Most mRNAs start with A or G. We therefore selected G-TSN as the control. Unlike G-TSNs, which occur in diverse sequence and promoter contexts, C-TSNs and U-TSNs are unusual. Genes that mainly use C-TSNs and U-TSNs are the so-called “5’ TOP (Terminal OligoPyrimidine)” genes. The 5’ TOP genes are mostly genes related to translation and metabolism, and thus their expression reflects the homeostasis of cell metabolism. Thus, we were concerned that any differential expression of the C-TSN and U-TSN genes between wild-type and PCIF1 knockout cells might reflect specific effects on TOP transcriptional regulation rather than the general effects of PCIF1 on transcription.
(10) Line 82, 470, 506, 676: The authors should also cite Koh et al (2019 Nat. Comm.) in these lines that describe how snRNAs can also be m6Am-methylated and how FTO targets these same snRNAs for demethylation.
We thank the reviewer for this comment. We have updated the citation.
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
We thank all the reviewers for their insightful comments on this work.
Response to Reviewer #1:
We greatly appreciate your comments on the general reliability and significance of our work. We fully agree that it would have been ideal to have additional evidence related to the role of PEBP1 in HRI activation. Unfortunately, we have not been able to find phospho-HRI antibodies that work reliably. The literature seems to agree with this, as a band shift using total-HRI antibodies is usually used to study HRI activation. However, with the cell lines showing the most robust effect with PEBP1 knockout or knockdown, we have yet to be convinced by the band shifts we see. This could be addressed by optimizing phos-tag gels, although these gels can be a bit tricky with complex samples such as cell lysates, which contain many phosphoproteins.
To address the interaction between PEBP1 and eIF2alpha more rigorously, we were inspired by the insights you and reviewer #2 provided. While we are unable to do further experiments, we now think it would indeed be possible to do this using purified proteins and/or CETSA WB. These experiments could also provide further evidence for the role of PEBP1 phosphorylation. Although phosphorylation of PEBP1 at S153 has been implicated as being important for other functions of PEBP1, we are not sure about its role here. It may indeed have little relevance for ISR signalling.
For the in vitro thermal shift assay, we have performed two independent experiments. While it appears that there is a slight destabilization of PEBP1 by oligomycin, the ultimate conclusion of this experiment remains incomplete: despite the apparent simplicity of the assay, there could be alternative explanations due to the background fluorescence of oligomycin alone. We now provide a lysate-based CETSA analysis which does not display the same PEBP1 stabilization as the intact cell experiment. As for the signal saturation in the ATF4-luciferase reporter assay, this is a valid point.
Response to Reviewer #2:
We strongly agree that CETSA has a lot of potential to inform us about cellular state changes and this was indeed the starting point for this project. We apologize for being (too) brief with the explanations of the TPP/MS-CETSA approach and we have now added a bit more detail. With regard to the cut-offs used for the mass spectrometry analysis, you are absolutely right that we did not establish a stringent cut-off that would show the specificity of each drug treatment. Our take on the data was that using the p values (and ignoring the fold-changes) of individual protein changes as in Fig 1D, we can see that mitochondrial perturbations display a coordinated response. We now realize that the downside of this representation is that it obscures the largest and specific drug effects. As mentioned in the response to Reviewer #1, we now also think that it would be possible to obtain more evidence for the potential interaction between PEBP1 and eIF2alpha using CETSA-based assays.
Response to Reviewer #3:
Thank you for your assessment. We agree that this manuscript would have been made much stronger by clearer mechanistic insights. As mentioned in the responses to other reviewers above, we aim to address this limitation in part by looking at the putative interaction between PEBP1 and eIF2alpha with orthogonal approaches. However, we do realize that analysis of protein-protein interactions can be notoriously challenging due to false negative and false positive findings. As with any scientific endeavor, we will keep in mind alternative explanations to the observations, which could eventually provide that cohesive model explaining how precisely PEBP1, directly or indirectly, influences ISR signalling.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
The data overall are very solid, and I would only recommend the following minor changes:
(1) Line 187 and line 268: there is perhaps a trend towards slightly increased ATF4-luc reporter with PEBP1-S153D, but it is not statistically significant, so I would tone down the wording here.
We now modified this part to "This data is consistent with the modest increase…" .
(2) The recently discovered SIFI complex (Haakonsen 2024, https://doi.org/10.1038/s41586-023-06985-7) regulates both HRI and DELE1 through bifunctional localization/degron motifs. It seems like PEBP1 also contains such a motif, which suggests a potential mechanism for enrichment near mitochondria, perhaps even in response to stress. Maybe the authors could further speculate on this in the discussion.
While working on the manuscript, we considered the possibility that PEBP1 function could be related to the SIFI complex and concluded that there is a critical difference: while SIFI specifically acts to turn off stress response signalling, loss of PEBP1 prevents eIF2alpha phosphorylation. We did not, however, consider that PEBP1 could have a localization/degron motif. Motif analysis by deepmito (busca.biocomp.unibo.it) and similar tools did not identify any conventional mitochondrial targeting signal, although we acknowledge that PEBP1 has a terminal alpha-helix of the kind identified for SIFI complex recognition. We are not sure why you think PEBP1 contains such a motif and therefore are hesitant to speculate on this further in the manuscript.
(3) Line 358: references 50 and 45 are identical.
Thank you for spotting this. Corrected now.
(4) Figure S1D: it looks like Oligomycin has a significant background fluorescence, which makes interpretation of these graphs difficult - do you have measurements of the compound alone that can be used to subtract this background from the data? Based on the Tm I would say it does stabilize recombinant PEBP1, and there is no quantification of the variance across the 3 replicates to say there is no difference.
You are right, this assay is problematic due to the background fluorescence. Taking measurements with oligomycin only and subtracting this background results in slightly negative values and nonsensical thermal shift curves. We now additionally show quantification from two different experiments (unfortunately we ran out of reagents for further experiments), and this quantification shows that, if anything, oligomycin causes mild destabilization of recombinant PEBP1. We also used a lysate CETSA assay, which does not show thermal stabilization of PEBP1 by oligomycin, ruling out a direct effect. We attempted to use ferrostatin1 as a positive control as it may bind the PEBP1-ALOX protein complex, and it appeared to show marginal stabilization of PEBP1.
Reviewer #2 (Recommendations for the authors):
I have a few comments for the authors to address:
(1) The MS-CETSA experiment is quite briefly described and this could be expanded somewhat. Not clear if multiple biological replicates are used. Is there any cutoff in data analysis based on fold change size (which correlated to the significance of cellular effects), etc? As expected from only one early timepoint (see eg PMID: 38328090), there appear to be a limited number of significant shifts over the background (as judged from Figure S1A). In the Excel result file, however (if I read it right) there are large numbers of proteins that are assigned as stabilized or destabilized. This might be to mark the direction of potential shifts, but considering that most of these are likely not hits, this labeling could give a false impression. Could be good to revisit this and have a column for what could be considered significant hits, where a fold change cutoff could help in selecting the most biologically relevant hits. This would allow Figure 1D to be made crisper when it likely dramatically overestimates the overlap between significant CETSA shifts for these drugs.
Fair point, while we focused more on PEBP1, it is important to have sufficient description of the methods. We used duplicate samples for the MS, which is probably the most important point which was absent from the original submission and is now added to the methods. We also added slightly more description on the data analysis. While the AID method does not explicitly use log2 fold changes, it does consider the relative abundance of proteins under different temperature fractions. Since the Tm (melting temperature) for each protein can be at any temperature, we felt that it would be complicated to compare fractions where the protein stability is changed the most and even more so if we consider both significance and log2FC. Therefore, we used this multivariate approach which indicates the proteins with most likely changes across the range of temperatures. To acknowledge that most of the statistically significant changes are not much above the background, as you correctly pointed out, we now add to the main text that “However, most of these changes are relatively small. To focus our analysis on the most significant and biologically relevant changes…” We also agree that it may be confusing that the AID output reports de/stabilization direction for all proteins. In general, we are not big fans of cutoffs as these are always arbitrary, but with multivariate p value of 0.1 it becomes clear that there are only a relatively small number of hits with larger changes. We have now added to the guide in the data sheet that "Primarily, use the adjusted p value of the log10 Multivariate normal pvalue for selecting the overall statistically significant hits (p<0.05 equals -1.30 or smaller; p<0.01 equals -2 or smaller)". We have also added to the guide part of the table that “Note that this prediction does not consider whether the change is significant or not, it only shows the direction of change”.
(2) On page 4 the authors state "We reasoned that thermal stability of proteins might be particularly interesting in the context of mitochondrial metabolism as temperature-sensitive fluorescent probes suggest that mitochondrial temperature in metabolically active cells is close to 50°C". I don't see the relevance of this statement as an argument for using TPP/CETSA. When this is also not further addressed in the work, it could be deleted.
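The thresholds quoted in the data-sheet guide follow directly from taking log10 of the p-value cutoff. A minimal sketch of that arithmetic, with a hypothetical helper for filtering hits (the function name and the example values are illustrative and are not taken from the authors' data sheet):

```python
import math


def log10_threshold(alpha):
    """log10 of a p-value cutoff, e.g. 0.05 -> about -1.30."""
    return math.log10(alpha)


def is_significant(log10_p, alpha=0.05):
    """True when a log10-transformed p-value is below the cutoff."""
    return log10_p < log10_threshold(alpha)


if __name__ == "__main__":
    print(round(log10_threshold(0.05), 2))
    print(round(log10_threshold(0.01), 2))
    # Made-up log10 p-values for three proteins.
    for value in (-0.8, -1.5, -2.4):
        print(value, is_significant(value), is_significant(value, alpha=0.01))
```

Note that this only screens for statistical significance; as the guide text says, the reported direction of change is given for every protein regardless of whether it passes such a cutoff.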
Deleted. We agree, while this is an interesting point, it is not that relevant in this paper.
(3) To exclude direct drug binding to PEBP1, a thermofluor experiment is performed (Fig S1D). However, the experiment gives a high background at the lower temperatures and it could be argued that this is due to the fluoroprobe binding to a hydrophobic pocket of the protein, and that oligomycin at higher concentrations competes with this binding, attenuating fluorescence. These are complex experiments and there could be other explanations, but the authors should address this. An alternative means to provide support for non-binding would be a lysate CETSA experiment, with very short (1-3 minutes) drug exposure before heating. This would typically give a shift when the protein is indicated to be CETSA responsive as in this case.
Agree. However, we don't have good means to perform the thermofluor experiments to rule out alternative explanations. What we can say is (as discussed above for reviewer #1, point 4) that quantification from two different experiments shows that oligomycin does not thermally stabilize recombinant PEBP1. To complement this conclusion, we used a lysate CETSA assay, which does not show thermal stabilization of PEBP1 by oligomycin. In this assay we attempted to use ferrostatin1 as a positive control as it may bind the PEBP1-ALOX protein complex, and it appeared to show marginal stabilization of PEBP1. But since we lack a robust positive control for these assays, some doubt will inevitably remain.
(4) The authors appear to have missed that there is already a MS-CETSA study in the literature on oligomycin, from Sun et al (PMID: 30925293). Although this data is from a different cell line and at a slightly longer drug treatment and is primarily used to assess intracellular effects of decreased ATP levels induced by oligomycin, the authors should refer to this data and maybe address similarities if any.
Apologies for the oversight, the oligomycin data from this paper eluded us as it was mainly presented in the supplementary data. We compared the two datasets and found some overlap despite the differences in the experimental details. Both datasets share translational components (e.g. EIF6 and ribosomal proteins), but most notably our other top hit BANF1, which we mentioned in the main text, was also identified by Sun et al. We have updated the manuscript text as "Other proteins affected by oligomycin included BANF1, which binds DNA in an ATP dependent manner [16], and has also been identified as an oligomycin stabilized protein in a previous MS-CETSA experiment [23]", citing the Sun et al paper.
(5) The confirmation of protein-protein interaction is notoriously prone to false positives. The authors need to use overexpression and a sensitive reporter to get positive data but collect additional data using mutants which provide further support. Typically, this would be enough to confirm an interaction in the literature, although some doubt easily lingers. When the authors already have a stringent in-cell interaction assay for PEBP1 in the CETSA thermal shift, it would be very elegant to also apply the CETSA WB assay to the overexpressed constructs and demonstrate differences in the response of oligomycin, including the mutants. I am not sure this is feasible but it should be straightforward to test.
This is a very good suggestion. Unfortunately, due to the time constraints of the graduate students (who must write up their thesis very soon), we are not able to perform and repeat such experiments to the level of confidence that we would like.
(6) At places the story could be hard to follow, partly due to the frequent introduction of new compounds, with not always well-stated rationale. It could be useful to have a table also in the main manuscript with all the compounds used, with the rationale for their use stated. Although some of the cellular pathways addressed are shown in miniatures in figures, it could be useful to have an introduction figure for the known ISR pathways, at least in the supplement. There are also a number of typos to correct.
We agree that there are many compounds used. We have attempted to clarify their use by adding this information to the table of used compounds in the methods, adding an overall schematic to Fig S1G, and adding a note on line 132: "(see Figure 1-figure supplement 1G for summary of drugs used to target PEBP1 and ISR in this manuscript)". We have also attempted to remove typos as far as possible.
(7) EIF2a phosphorylation in S1E does not appear to be more significant for Sodium Arsenite argued to be a positive control, than CCCP, which is argued to be negative. Maybe enough with one positive control in this figure?
This experiment was used as a justification for our 30 min time point for the proteomics. By showing the 30 min and 4 h time points as Fig 1G and Figure 1-figure supplement 1F, our point was to demonstrate that the kinetics of phosphorylation and dephosphorylation are relevant. As you correctly pointed out, the stress response induced by sodium arsenite, and also by tunicamycin, is already attenuated at the 4 h time point. We prefer to keep all samples to facilitate comparisons.
(8) Page 7 reference to Figure S2H, which doesn't exist. Should be S3H.
Apologies for the mistake, now corrected to Figure 2-figure supplement 1B.
(9) Finally, although the TPP labeling of the method is used widely in the literature this is CETSA with MS detection and MS-CETSA is a better term. This is about thermal shifts of individual proteins which is a very well-established biophysical concept. In contrast, the term Thermal Proteome Profiling does not relate to any biophysical concept, or real cell biology concept, as far as I can see, and is a partly misguided term.
We changed the term TPP into MS-CETSA, but also include the term TPP in the introduction to facilitate finding this paper by people using the TPP term.
Reviewer #3 (Recommendations for the authors):
Major Issues
(1) The one major issue of this work is the lack of a mechanism showing precisely how PEBP1 amplifies the mitochondrial integrated stress response. The work, as it is described, presents data suggesting PEBP1's role in the ISR but fails to present a more conclusive mechanism. The idea of mitochondrial stress causing PEBP1 to bind to eIF2a, amplifying ISR is somewhat vague. Thus, the lack of a more defined model considerably weakens the argument, as the data is largely corollary, showing KO and modulation of PEBP1 definitely has a unique effect on the ISR, however, it is not conclusive proof of what the authors claim. While KO of PEBP1 diminishes the phosphorylation of eIF2a, taken together with the binding to eIF2a, different pathways could be simultaneously activated, and it seems premature to surmise that PEBP1 is specific to mitochondrial stress. Could PEBP1 be reacting to decreased ATP? Release of a protein from the mitochondria in response to stress? Is PEBP1's primary role as a modulator of the ISR, or does it have a role in non-stress-related translation? A cohesive model would tie together these separate indirect findings and constitute a considerable discovery for the ISR field, and the mitochondrial stress field.
Thank you for your assessment. We agree that this manuscript would have been much stronger with clearer mechanistic insights. As with any scientific endeavor, we will keep in mind alternative explanations for the observations, which could eventually provide that cohesive model explaining how precisely PEBP1, directly or indirectly, influences ISR signalling.
(2) The data relies on the initial identification of PEBP1 thermal stabilization concomitant with mitochondrial ISR induction post-treatment of several small molecules. However, the experiment was performed using a single timepoint of 30 minutes. There was no specific rationale for the choice of this time point for the thermal proteome profiling.
The reasoning for this was explicitly stated: "We reasoned that treating intact cells with the drugs for only 30 min would allow us to observe rapid and direct effects related to metabolic flux and/or signaling related to mitochondrial dysfunction in the absence of major changes in protein expression levels."
Minor Issues
(1) In Lines 163-166 the authors state "The cells from Pebp1 KO animals displayed reduced expression of common ISR genes (Figure 2F), despite upregulation of unfolded protein response genes Ern1 (Ire1α) and Atf6 genes. This gene expression data therefore suggests that Pebp1 knockout in vivo suppresses induction of the ISR". This statement should be reassessed. While an arm of the UPR does stimulate ISR, this arm is controlled by PERK, and canonically IRE1 and ATF6 do not typically activate the ISR, thus their upregulation is likely unrelated to ISR activation and does not contribute the evidence necessary for this statement.
Apologies for the confusion, we aimed to highlight that as there is an increase in the two UPR arms, it is more likely that ISR instead of UPR is reduced. We have now changed the statement to the following:
"The cells from Pebp1 KO animals displayed reduced expression of common ISR genes (Figure 2F), while there was mild upregulation of the unfolded protein response genes Ern1 (Ire1α) and Atf6 genes. This gene expression data therefore suggests that the reduced expression of common ISR genes is less likely to be mediated by changes in PERK, the third UPR arm, and more likely due to suppression of ISR by Pebp1 knockout in vivo."
(2) In Lines 169 and 170 the authors state "Western blotting indicated reduced phosphorylation of eIF2α in RPE1 cells lacking PEBP1, suggesting that PEBP1 is involved in regulating ISR signaling between mitochondria and eIF2α". This conclusion is not supported by evidence. A number of pathways could be activated in these knockout cells, and simply observing an increase in p-eIF2α after knocking out PEBP1 does not constitute an interaction, as correlation doesn't mean causation. This KO could indirectly affect the ISR, with PEBP1 having no role in the ISR. While taken together there is enough circumstantial evidence in the manuscript to suggest a role for PEBP1 in the ISR, statements such as these have to be revised so as not to overreach the conclusions that can be achieved from the data, especially with no discernible mechanism.
We have now revised this statement by removing the conclusion and stating only the observation: "Western blotting indicated reduced phosphorylation of eIF2α in RPE1 cells lacking PEBP1 (Fig. 3A)."
Author response:
The following is the authors’ response to the original reviews.
comprehensiveness and rigor of the study are notable. Rarely have I reviewed a manuscript reporting the results of so many orthogonal experiments, all of which support the authors' hypotheses, and of so many excellent controls.” Reviewer 2 commented: “They have elegantly demonstrated how some mutants alter each step of processing. Together with FLIM experiments, this study provides additional evidence to support their 'stalled complex hypotheses'….This is a beautiful biochemical work. The approach is comprehensive.”
Below we respond to the relatively minor concerns of Reviewer 2, which may be included with the first version of the Reviewed Preprint.
Reviewer 2:
(1) It appears that the purified γ-secretase complex generates the same amount of Aβ40 and Aβ42, which is quite different in cellular and biochemical studies. Is there any explanation for this?
Roughly equal production of Aβ40 and Aβ42 is a phenomenon seen with purified enzyme assays, and the reason for this has not been identified. However, we suggest that what is meaningful in our studies is the relative difference between the effects of FAD-mutant vs. WT PSEN1 on each proteolytic processing step. All FAD mutations are deficient in multiple cleavage steps in γ-secretase processing of APP substrate, and these deficiencies correlate with stabilization of E-S complexes.
(2) It has been reported the Aβ production lines from Aβ49 and Aβ48 can be crossed with various combinations (PMID: 23291095 and PMID: 38843321). How does the production line crossing impact the interpretation of this work?
In the cited reports, such crossover was observed when using synthetic Aβ intermediates as substrate. In PMID: 23291095 (Okochi M et al, Cell Rep, 2013), Aβ43 is primarily converted to Aβ40, but also to some extent to Aβ38. In PMID: 38843321 (Guo X et al, Science, 2024), Aβ48 is ultimately converted to Aβ42, but also to a minor degree to Aβ40. We have likewise reported such product line “crossover” with synthetic Aβ intermediates (PMID: 25239621; Fernandez MA et al, JBC, 2014). However, when using APP C99-based substrate, we did not detect any noncanonical tri- and tetrapeptide co-products of Aβ trimming events in the LC-MS/MS analyses (PMID: 33450230; Devkota S et al, JBC, 2021). In the original report on identification of the small peptide coproducts for C99 processing by γ-secretase using LC-MS/MS (PMID: 19828817; Takami M et al, J Neurosci, 2009), only very low levels of noncanonical peptides were observed. In the present study, we did not search for such noncanonical trimming coproducts, so we cannot rule out some degree of product line crossover.
(3) In Figure 5, did the authors look at the protein levels of PS1 mutations and C99-720, as well as secreted Aβ species? Do the different amounts of PS1 full-length and PS1-NTF/CTF influence FLIM results?
FLIM results depend on the degree to which C99 and long Aβ intermediates are bound to γ-secretase compared to unbound C99 and Aβ. The 6E10-Alexa 488 lifetime is significantly decreased by FAD mutations compared to WT PSEN1 (Fig. 5). However, the observed decrease in lifetime with the PSEN1 FAD mutants might also be due to lower levels of C99-720 expression or higher levels of PSEN1 CTF (i.e., mature γ-secretase complexes). We checked the C99-720 fluorescence intensities in the FLIM experiments and found that C99-720 intensities are not significantly different between cells transfected with WT and those with FAD PSEN1. Furthermore, Western blot analysis shows that levels of C99-720 are not significantly low and those of PSEN1 CTF are not high in FAD PSEN1 compared to WT PSEN1 expressing cells. Although PSEN1 CTF levels trend low for PSEN1 F386S, this mutant resulted in decreased FLIM only in Aβ-rich regions. Thus, the reduced FLIM apparently reflects effects of FAD mutation on E-S complex stability. Levels of full-length PSEN1 were also determined and found not to correlate with FLIM effects, although full-length PSEN1 represents protein not incorporated into fully active γ-secretase complexes and therefore does not interact with C99-720.
(4) It is interesting that both Aβ40 and Aβ42 ELISA kits detect Aβ43. Have the authors tested other kits on the market? It might change the interpretation of some published work.
We have not tested other ELISA kits. Considering our findings, it would be a good idea for other investigators to test whatever ELISAs they use for specificity vis-à-vis Aβ43.
Author response:
Reviewer #2 (Public Review):
Comment 1: In terms of the biological significance of this interaction, it would be good to examine (via co-immunoprecipitation) whether the CEP89/NCS-1/C3ORF14 interaction takes place upon serum starvation. Does the complex change?
NCS1 centriolar localization requires CEP89, as no NCS1 localization was observed in CEP89 knockout cells (Figure 2L; Figure 2-figure supplement 2B). Both CEP89 and NCS1 centriolar localization were observed (Figure 2C; Figure 1D of the PMID: 36711481) in cells grown in serum-containing media, although their localization was further enhanced in serum-starved cells. From these results, we predict that CEP89 and NCS1 can interact and colocalize in both serum-fed and serum-depleted conditions. We think it may not be easy to assess the change in interaction with the co-immunoprecipitation assay, as interactions occur in a test tube, which may not reflect the binding condition inside the cells.
Comment 2: Also, for the subdistal appendage localization of NCS-1 and C3ORF14, would this also change upon serum starvation?
We agree that it would be interesting to see whether the subdistal appendage localization changes upon serum starvation, as NCS1 may capture the ciliary vesicle at the subdistal appendages as we discussed. However, the loss of the subdistal appendage protein, CEP128, blocks subdistal appendage localization of CEP89 [PMID: 32242819] without affecting cilium formation [PMID: 27818179]. This suggests that the subdistal appendage localization of NCS1 or C3ORF14 is likely dispensable for cilium formation.
Comment 3: For the ciliation results and the recruitment of IFT88 in CEP89 knockout cell lines, this contradicts previous work from Tanos et al (PMID: 23348840), as well as Hou et al (PMID: 36669498). A parallel comparison using siRNA, a transient knockout system, or a degron system would help understand this. A similar point goes for Figure 4, where the effect on ciliogenesis is minimal in knockout cells, but acute siRNA has been shown to have a stronger phenotype.
Hou et al. [PMID: 36669498] investigated the role of distal appendage proteins, CEP164, CEP89, and FBF1 in the ciliated chordotonal organ of Drosophila melanogaster by generating knockout Drosophila strains. The results were markedly different from what was observed in mammalian cells. Notably, CEP164 is not required for cilium formation, and CEP89 is required for FBF1 localization in the animal. CEP89 was required for cilium formation in the cells in the ciliated chordotonal organ, of which cilium formation is dependent on IFT machinery. They did not show if IFT centriolar recruitment is affected in the CEP89 mutant cells. These differences likely reflect the divergence of the organization of distal appendage during evolution.
The ciliation phenotype of our CEP89 knockout cells is milder than what was shown in Tanos et al [PMID: 23348840], but largely consistent with the results from the Bornens group, which used siRNA to deplete CEP89 [PMID: 23789104]. Besides, NCS1 knockout cells showed a very similar phenotype to the CEP89 knockout cells, and relatively acute deletion of NCS1 (14 days after infection with the lentivirus containing sgNCS1 without single-cell cloning) displayed an almost identical ciliation defect (Figure 4B-C). Thus, we believe CEP89 is only partially required for cilium formation in RPE-hTERT cells and that the differences are more technical than definitive.
Comment 4: An elegant phenotype rescue is shown in Figure 5. An interesting question would be, how does this mutant and/or the myristoylation affect the recruitment of C3ORF14?
NCS1 is not required for the localization of C3ORF14 (Figure 2M; Figure 2-figure supplement 2C), so we can assume that the myristoylation-defective mutant does not affect C3ORF14 recruitment.
Comment 5: For the EF-hand mutants, it would be good to use control mutants, from known Ca2+ binding proteins as a control for the experiment shown.
In Figure 5-figure supplement 1A-C, we generated a series of EF-hand mutants of NCS1 to see if calcium binding affects the CEP89 interaction, NCS1 localization, and cilium formation. NCS1 is the only protein among the calcium-binding NCS family proteins that was found as a positive hit in the mass spec data of CEP89 tandem affinity purification. Therefore, we cannot use other NCS family proteins as a control for CEP89 binding, NCS1 localization, and cilium formation.
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This manuscript reports valuable findings on the role of the Srs2 protein in turning off the DNA damage signaling response initiated by Mec1 (human ATR) kinase. The data provide solid evidence that Srs2 interaction with PCNA and ensuing SUMO modification is required for checkpoint downregulation. However, experimental evidence with regard to the model that Srs2 acts at gaps after camptothecin-induced DNA damage is currently lacking. The work will be of interest to cell biologists studying genome integrity but would be strengthened by considering the possible role of Rad51 and its removal.
We thank editors and reviewers for their constructive comments and address their main criticisms below.
(1) Srs2 action sites. Our data provide support to the model that Srs2 removal of RPA is favored at ssDNA regions with proximal PCNA, but not at ssDNA regions lacking proximal PCNA. A prominent example of the former type of ssDNA regions is an ssDNA gap with a 3’ DNA end permissive for PCNA loading. Examples of the latter type of ssDNA sites include those within R-loops and negatively supercoiled regions, both lacking 3’ DNA end required for PCNA loading. The former type of ssDNA regions can recruit other DNA damage checkpoint proteins, such as 9-1-1, which requires a 5’ DNA end for loading; thus, these ssDNA regions are ideal for Srs2’s action in checkpoint dampening. In contrast, ssDNA within supercoiled and R-loop regions, both of which can be induced by CPT treatment (Pommier et al, 2022), lacks the DNA ends required for checkpoint activation. RPA loaded at these sites plays important roles, such as recruiting R-loop removal factors (Feng and Manley, 2021; Li et al, 2024; Nguyen et al, 2017), and they are not ideal sites for Srs2’s checkpoint dampening functions. Based on the above rationale and our data, we suggest that Srs2 removal of RPA is favored only at a subset of ssDNA regions prone to checkpoint activation and can be avoided at other ssDNA regions where RPA mainly helps DNA protection and repair. We have modified the text and model drawing to better articulate the implications of our work, that is, Srs2 can distinguish between two types of ssDNA regions by using PCNA proximity as a guide for RPA removal. We noted that the precise sites of Srs2 actions in the genome remain to be determined.
(2) Rad51 in the Srs2-RPA antagonism. In our previous report (Dhingra et al, 2021), we provided several lines of evidence to support the conclusion that Rad51 is not relevant to the Srs2-RPA antagonism, despite it being the best-studied protein that is regulated by Srs2. For example, while rad51∆ rescues the hyperrecombination phenotype of srs2∆ cells as shown by others, we found that rad51∆ did not affect the hypercheckpoint phenotype of srs2∆. In contrast, rfa1-zm1/zm2 have the opposite effects. The differential effects of rad51∆ and rfa1-zm1/zm2 were also seen for the srs2-ATPase dead allele (srs2-K41A). For example, rfa1-zm2 rescued the hyper-checkpoint defect and the CPT sensitivity of srs2-K41A, while rad51∆ had neither effect. These and other data described by Dhingra et al (2021) suggest that Srs2’s effects on checkpoint vs. recombination can be separated and that Rad51 removal by Srs2 is distinct from the Srs2-RPA antagonism in checkpoint regulation. Given the functional separation summarized above, in our current work investigating which Srs2 features affect the Srs2-RPA antagonism, we did not focus on the role of Rad51. However, we did examine all known features of Srs2, including its Rad51 binding domain. Consistent with our conclusion summarized above, deleting the Rad51 binding domain in Srs2 (srs2∆Rad51BD) has no effect on the rfa1-zm2 phenotype in CPT (Figure 2D). This data provides yet another piece of evidence that Srs2 regulation of Rad51 is separable from the Srs2-RPA antagonism. Our work provides a foundation for future examination of how Srs2 regulates RPA and Rad51 in different manners and if there is a crosstalk between them in specific contexts. We have added this point to the revised text.
Public Reviews:
Reviewer #1.
Overall, the data presented in this manuscript is of good quality. Understanding how cells control RPA loading on ssDNA is crucial to understanding DNA damage responses and genome maintenance mechanisms. The authors used genetic approaches to show that disrupting PCNA binding and SUMOylation of Srs2 can rescue the CPT sensitivity of rfa1 mutants with reduced affinity for ssDNA. In addition, the authors find that SUMOylation of Srs2 depends on binding to PCNA and the presence of Mec1. Noted weaknesses include the lack of evidence supporting that Srs2 binding to PCNA and its SUMOylation occur at ssDNA gaps, as proposed by the authors. Also, the mutants of Srs2 with impaired binding to PCNA or impaired SUMOylation showed no clear defects in checkpoint dampening, and in some contexts, even resulted in decreased Rad53 activation. Therefore, key parts of the paper would benefit from further experimentation and/or clarification.
We thank the reviewer for the positive comments, and we address her/his remark regarding ssDNA gaps below. In addition, we provide evidence that redundant pathways can mask checkpoint dampening phenotype of the srs2-∆PIM and -3KR alleles.
Major Comments
(1) The central model proposed by the authors relies on the loading of PCNA at the 3' junction of an ssDNA gap, which then mediates Srs2 recruitment and RPA removal. While several aspects of the model are consistent with the data, the evidence that it is occurring at ssDNA gaps is not strong. The experiments mainly used CPT, which generates mostly DSBs. The few experiments using MMS, which mostly generates ssDNA gaps, show that Srs2 mutants lead to weaker rescue in this context (Figure S1). How do the authors explain this discrepancy? In the context of DSBs, are the authors proposing that Srs2 is engaging at later steps of HR-driven DSB repair where PCNA gets loaded to promote fill-in synthesis? If so, is RPA removal at that step important for checkpoint dampening? These issues need to be addressed and the final model adjusted.
Our data provide support to the model that Srs2 removal of RPA is favored at ssDNA regions with proximal PCNA, but not at ssDNA regions lacking proximal PCNA (Figure 7). A prominent example of the former type is an ssDNA gap with a 3’ DNA end permissive for PCNA loading. Examples of the latter type of ssDNA sites are present within R-loops and negatively supercoiled regions, and these ssDNA sites lack the 3’ DNA ends required for PCNA loading. In principle, the former can recruit other DNA damage checkpoint proteins, such as 9-1-1, which requires a 5’ DNA end for loading; thus it is ideal for Srs2’s action in checkpoint dampening. In contrast, ssDNA within supercoiled and R-loop regions, which can be induced by CPT treatment (Pommier et al., 2022), lacks the DNA ends required for checkpoint activation. RPA loaded at these sites plays important roles such as recruiting R-loop removal factors (Feng and Manley, 2021; Li et al., 2024; Nguyen et al., 2017), and these are not ideal sites for Srs2 removal of RPA to achieve checkpoint dampening. Our work suggests that Srs2 removal of RPA is favored only at a subset of ssDNA regions prone to checkpoint activation and can be avoided at other ssDNA regions where RPA mainly helps DNA protection and repair. We have modified the text and the model to clarify our conclusions and emphasized that Srs2 can distinguish between two types of ssDNA regions using PCNA proximity as a guide for RPA removal.
We note that in addition to DSBs, CPT also induces both types of ssDNA mentioned above. For example, CPT can lead to ssDNA gap formation upon excision repair or DNA-protein crosslink repair of trapped Top1 (Sun et al, 2020). The resultant ssDNA regions contain a 3’ DNA end for PCNA loading, thus favoring Srs2 removal of RPA. CPT treatment also depletes the functional pool of Top1, thus causing topological stress and increased levels of DNA supercoiling and R-loops (Petermann et al, 2022; Pommier et al., 2022). As mentioned above, R-loops and supercoiled regions do not favor Srs2 removal of RPA due to a lack of PCNA loading. We have now adjusted the text to clarify that CPT can lead to the generation of two types of ssDNA regions as stated above. We have also adjusted the model drawing to indicate that while ssDNA gaps can be logical Srs2 action sites, other types of ssDNA regions with proximal PCNA (e.g., resected ssDNA tails) could also be targeted by Srs2. Our work paves the way to determine the precise ssDNA regions for Srs2’s action.
Multiple possibilities should be considered in explaining the less potent suppression of rfa1 mutants by srs2 alleles in MMS compared to CPT conditions. For example, MMS and CPT affect checkpoints differently. While CPT only activates the DNA damage checkpoint, MMS additionally induces the DNA replication checkpoint (Menin et al, 2018; Redon et al, 2003; Tercero et al, 2003). It is possible that the Srs2-RPA antagonism is more relevant to the DNA damage checkpoint than to the DNA replication checkpoint. Further investigation of this possibility among other scenarios will shed light on the differential suppression seen here. We have included this discussion in the revised text.
(2) The data in Figure 3 showing that Srs2 mutants reduce Rad53 activation in the rfa1-zm2 mutant are confusing, especially given the claim of an anti-checkpoint function for Srs2 (in which case Srs2 mutants should result in increased Rad53 activation). The authors propose that Rad53 is hyperactivated in rfa1-zm2 mutant because of compromised ssDNA protection and consequential DNA lesions, however, the effects sharply contrast with the central model. Are the authors proposing that in the rfa1-zm2 mutant, the compromised protection of ssDNA supersedes the checkpoint-dampening effect? Perhaps a schematic should be included in Figure 3 to depict these complexities and help the reader. The schematic could also include the compensatory dampening mechanisms like Slx4 (on that note, why not move Figure S2 to a main figure?... and even expand experiments to better characterize the compensatory mechanisms, which seem important to help understand the lack of checkpoint dampening effect in the Srs2 mutants)
Partially defective alleles often do not manifest a null phenotype. In this case, while srs2∆ increases Rad53 activation (Dhingra et al., 2021), srs2-∆PIM and -3KR did not (Figure 3A-3B). However, srs2-∆PIM did increase Rad53 activation when combined with another checkpoint dampening mutant slx4<sup>RIM</sup> (now Figure 4B-4C). This result suggests that defects of partially defective srs2 alleles can be masked by Slx4. Further, srs2-∆PIM and -3KR rescued rfa1-zm2’s checkpoint abnormality (now Figure 3B-3C), suggesting that Srs2 binding to PCNA and its sumoylation contribute to the Srs2-RPA antagonism in the DNA damage checkpoint response.
Partially defective alleles that impair specific features of a protein without producing a null phenotype have been used widely to reveal biological mechanisms. For example, a partially defective allele of the checkpoint protein Rad9 perturbing binding to gamma-H2A (rad9-K1088M) does not cause DNA damage sensitivity on its own, due to the compensation from other checkpoint factors (Hammet et al, 2007). However, rad9-K1088M rescues the DNA damage sensitivity and persistent G2/M checkpoint of slx4 mutants, providing strong evidence for the notion that Slx4 dampens checkpoint via regulating Rad9 (Ohouo et al, 2013).
We have now indicated that our model highlights the checkpoint recovery process and does not depict another consequence of the Srs2-RPA antagonism, that is, rfa1 DNA binding mutants can lead to increased levels of DNA lesions and consequently stronger checkpoint activation, which are rescued by lessening Srs2’s ability to strip RPA from DNA (Dhingra et al., 2021). We have stated these points more clearly in the text and added a schematic (Figure 3A) to outline the genetic relationship and interpretations. We also moved Figure S2 to the main figures (Figure 4), as suggested by the reviewer. Better characterizing the compensatory mechanisms among the multiple checkpoint dampening pathways requires substantial amounts of work that will be pursued in the future.
(3) The authors should demarcate the region used for quantifying the G1 population in Figure 3B and explain the following discrepancy: By inspection of the cell cycle graph, all mutants have lower G1 peak height compared to WT (CPT 2h). However, in the quantification bar graph at the bottom, ΔPIM has higher G1 population than the WT.
We now describe how the G1 region of the FACS histogram was selected to derive the percentage of G1 cells in Figure 3B (now Figure 3C). Briefly, the G1 region from the “G1 sample” was used to demarcate the G1 region of the “CPT 2h” sample. We noticed that a mutant panel was mistakenly put in the place of wild-type, and this error is now corrected. The conclusion remains that srs2-∆PIM and srs2-3KR improved rfa1-zm2 cells’ ability to exit G2/M, while they themselves do not show difference from the wild-type control for the percentage of G1 cells after 2hr CPT treatment. We have added statistics in Figure 3C that support this conclusion.
Reviewer #2:
This is an interesting paper that delves into the post-translational modifications of the yeast Srs2 helicase and proteins with which it interacts in coping with DNA damage. The authors use mutants in some interaction domains with RPA and Srs2 to argue for a model in which there is a balance between RPA binding to ssDNA and Srs2's removal of RPA. The idea that a checkpoint is being regulated is based on observing Rad53 and Rad9 phosphorylation (so there are the attributes of a checkpoint), but evidence of cell cycle arrest is lacking. The only apparent delay in the cell cycle is the re-entry into the second S phase (but it could be an exit from G2/M); but in any case, the wild-type cells enter the next cell cycle most rapidly. No direct measurement of RPA residence is presented.
We thank the reviewer for the helpful comments. Previous studies have shown that CPT does not induce the DNA replication checkpoint, and thus does not slow down or arrest S phase progression; however, CPT does induce the DNA damage checkpoint, which causes a delay (not arrest) in G2/M phase and re-entering into the second G1 (Menin et al., 2018; Redon et al., 2003). Our result is consistent with these findings, showing that CPT induces G2/M delay but not arrest. We have now made this point clearer in the text.
We have previously reported chromatin-bound RPA levels in rfa1-zm2, srs2, and their double mutants, as well as in vitro ssDNA binding by wild-type and mutant RPA complexes (Dhingra et al., 2021). These data showed that Srs2 loss or its ATPase dead mutant led to a 4-6-fold increase of RPA levels on chromatin, which was rescued by rfa1-zm2 (Dhingra et al., 2021). On its own, rfa1-zm2 did not cause defective chromatin association, despite modestly reducing ssDNA binding in vitro (Dhingra et al., 2021). This discrepancy could be due to a lack of sensitivity of the chromatin fractionation assay in revealing moderate changes of RPA residence on DNA in vivo. Our functional assays (Figure 2-3) were more effective in identifying the Srs2 features pertaining to RPA regulation.
Strengths:
Data concern viability assays in the presence of camptothecin and in the post-translational modifications of Srs2 and other proteins.
Weaknesses:
There are a couple of overriding questions about the results, which appear technically excellent. Clearly, there is an Srs2-dependent repair process here, in the presence of camptothecin, but is it a consequence of replication fork stalling or chromosome breakage? Is repair Rad51-dependent, and if so, is Srs2 displacing RPA or removing Rad51 or both? If RPA is removed quickly what takes its place, and will the removal of RPA result in lower DDC1-MEC1 signaling?
Srs2 can affect both the checkpoint response and DNA repair processes in CPT conditions. However, rfa1-zm2 mainly affects the former role of Srs2; this allows us to gain a deeper understanding of this role, which is critical for cell survival in CPT (Dhingra et al., 2021). Building on this understanding, our current study identified two Srs2 features that could afford spatial and temporal regulation of RPA removal from DNA, providing a rationale for how cells can properly utilize an activity that can be beneficial yet also dangerous if it were to lack regulation. Study of Srs2-mediated DNA repair in CPT conditions, either in Rad51-dependent or -independent manner, to deal with replication fork stalling or DNA breaks will require studies in the future.
Moreover, it is worth noting that in single-strand annealing, which is ostensibly Rad51 independent, a defect in completing repair and assuring viability is Srs2-dependent, but this defect is suppressed by deleting Rad51. Does deleting Rad51 have an effect here?
We have previously shown that rad51∆ did not rescue the hyper-checkpoint phenotype of srs2∆ cells in CPT conditions, while rfa1-zm1 and -zm2 did (Dhingra et al., 2021). This differential effect was also seen for the srs2 ATPase-dead allele (Dhingra et al., 2021). These and other data described by Dhingra et al. (2021) suggest that Srs2’s effects on checkpoint vs. recombination are separable, at least in CPT conditions, and that the Srs2-RPA antagonism in checkpoint regulation is not affected by Rad51 removal (unlike in SSA).
Neither this paper nor the preceding one makes clear what really is the consequence of having a weaker-binding Rfa1 mutant. Is DSB repair altered? Neither CPT nor MMS are necessarily good substitutes for some true DSB assay.
We have previously shown that rfa1-zm1/zm2 did not affect the frequencies of rDNA recombination, gene conversion, or direct repeat repair (Dhingra et al., 2021). Further, rfa1-zm1/zm2 did not suppress the hyper-recombination phenotype of srs2∆, while rad51∆ did (Dhingra et al., 2021). In a DSB system, wherein the DNA repeats flanking the break were placed 30 kb away from each other, srs2∆ led to hyper-checkpoint and lethality, both of which were rescued by rfa1-zm mutants (Dhingra et al., 2021). In this assay, rfa1-zm1/zm2 did not show sensitivity, suggesting largely proficient DNA repair. Collectively, these data suggest that moderately weakening DNA binding of Rfa1 does not lead to a detectable effect on the recombinational repair examined thus far; rather, it affects Srs2-mediated checkpoint downregulation. In-depth studies of rfa1-zm mutations in the context of various DSB repair steps will be interesting to pursue in the future.
With camptothecin, in the absence of site-specific damage, it is difficult to test these questions directly. (Perhaps there is a way to assess the total amount of RPA bound, but ongoing replication may obscure such a measurement). It should be possible to assess how CPT treatment in various genetic backgrounds affects the duration of Mec1/Rad53-dependent checkpoint arrest, but more than a FACS profile would be required.
Quantitative measurement of RPA residence time on DNA in a cellular context and of the duration of the Mec1/Rad53-mediated cell cycle delay/arrest will be informative but requires further technology development. Our current work provides a foundation for such quantitative assessment.
It is also notable that MMS treatment does not seem to yield similar results (Fig. S1).
Figure S1 showed that srs2-∆PIM and srs2-3KR had weaker suppression of rfa1-zm2 growth on MMS plates than on CPT plates. Multiple possibilities should be considered in explaining the less potent suppression of rfa1 mutants by srs2 in MMS compared with CPT conditions. For example, MMS and CPT affect checkpoints differently. While CPT only activates the DNA damage checkpoint, MMS additionally induces the DNA replication checkpoint (Menin et al., 2018; Redon et al., 2003; Tercero et al., 2003). It is therefore possible that the Srs2-RPA antagonism is more relevant for the DNA damage checkpoint control compared with the DNA replication checkpoint. Further investigation of this possibility will shed light on the differential suppression seen here. We have included this discussion in the revised text.
Reviewer #3:
The superfamily I 3'-5' DNA helicase Srs2 is well known for its role as an anti-recombinase, stripping Rad51 from ssDNA, as well as an anti-crossover factor, dissociating extended D-loops and favoring non-crossover outcome during recombination. In addition, Srs2 plays a key role in ribonucleotide excision repair. Besides DNA repair defects, srs2 mutants also show a reduced recovery after DNA damage that is related to its role in downregulating the DNA damage signaling or checkpoint response. Recent work from the Zhao laboratory (PMID: 33602817) identified a role of Srs2 in downregulating the DNA damage signaling response by removing RPA from ssDNA. This manuscript reports further mechanistic insights into the signaling downregulation function of Srs2.
Using the genetic interaction with mutations in RPA1, mainly rfa1-zm2, the authors test a panel of mutations in Srs2 that affect CDK sites (srs2-7AV), potential Mec1 sites (srs2-2SA), known sumoylation sites (srs2-3KR), Rad51 binding (delta 875-902), PCNA interaction (delta 1159-1163), and SUMO interaction (srs2-SIMmut). All mutants were generated by genomic replacement and the expression level of the mutant proteins was found to be unchanged. This alleviates some concern about the use of deletion mutants compared to point mutations. The double mutant analysis identified that PCNA interaction and SUMO sites were required for the Srs2 checkpoint dampening function, at least in the context of the rfa1-zm2 mutant. There was no effect of these mutants in a RFA1 wild-type background. This latter result is likely explained by the activity of the parallel pathway of checkpoint dampening mediated by Slx4, and genetic data with an Slx4 point mutation affecting Rtt107 interaction and checkpoint downregulation support this notion. Further analysis of Srs2 sumoylation showed that Srs2 sumoylation depended on PCNA interaction, suggesting sequential events of Srs2 recruitment by PCNA and subsequent sumoylation. Kinetic analysis showed that sumoylation peaks after maximal Mec1 induction by DNA damage (using the Top1 poison camptothecin (CPT)) and depended on Mec1. These data are consistent with a model that Mec1 hyperactivation is ultimately leading to signaling downregulation by Srs2 through Srs2 sumoylation. Mec1-S1964 phosphorylation, a marker for Mec1 hyperactivation and a site found to be needed for checkpoint downregulation after DSB induction did not appear to be involved in checkpoint downregulation after CPT damage. The data are in support of the model that Mec1 hyperactivation when targeted to RPA-covered ssDNA by its Ddc2 (human ATRIP) targeting factor, favors Srs2 sumoylation after Srs2 recruitment to PCNA to disrupt the RPA-Ddc2-Mec1 signaling complex.
Presumably, this allows gap filling and disappearance of long-lived ssDNA as the initiator of checkpoint signaling, although the study does not extend to this step.
Strengths
(1) The manuscript focuses on the novel function of Srs2 to downregulate the DNA damage signaling response and provide new mechanistic insights.
(2) The conclusions that PCNA interaction and ensuing Srs2-sumoylation are involved in checkpoint downregulation are well supported by the data.
We thank the reviewer for carefully reading our work and for his/her positive comments.
Weaknesses
(1) Additional mutants of interest could have been tested, such as the recently reported Pin mutant, srs2-Y775A (PMID: 38065943), and the Rad51 interaction point mutant, srs2-F891A (PMID: 31142613).
Residue Y775 of Srs2 was shown to serve as a separation pin in unwinding D-loops and dsDNA with 3’ overhang in vitro; however, srs2-Y775A lacks cellular phenotype in assays for gene conversion, crossover, and genetic interactions. As such, the biological role of this residue has not been clear. In addressing the reviewer’s comment, we obtained srs2-Y775A and the control strains as described in the recent publication (Meir et al, 2023). While srs2-Y775A on its own did not affect CPT sensitivity, it improved rfa1-zm2 mutant growth on media containing CPT. This result suggests that Y775 can influence RPA regulation during checkpoint dampening. Given that truncated Srs2 (∆Cter 276 a.a.) containing Y775A showed normal RPA stripping activity in vitro, it is possible that the cellular assay using rfa1-zm2 is more sensitive for revealing a defect in this activity, or that the full-length protein is required to manifest the Y775A effect. Future experiments distinguishing these possibilities can provide more clarity. Nevertheless, our result reveals the first phenotype of an Srs2 separation pin mutant. We have added this new result (Figure S4) and our interpretation.
We have already included data showing that a srs2 mutant lacking the Rad51 binding domain (srs2∆Rad51BD, ∆875-902) did not affect rfa1-zm2 growth in CPT nor cause defects in CPT on its own (Figure 2D). These data suggest that Rad51 binding is not relevant to the Srs2-RPA antagonism in CPT, a conclusion fully supported by data in our previous study (Dhingra et al., 2021). Collectively, these findings do not provide a strong rationale to test a point mutation within the Rad51BD region.
(2) The use of deletion mutants for PCNA and RAD51 interaction is inferior to using specific point mutants, as done for the SUMO interaction and the sites for post-translational modifications.
We generally agree with this view. However, it is less of a concern in the context of the Rad51 binding site mutant (srs2-∆Rad51BD) since it behaved as the wild-type allele in our assays. The srs2-∆PIM mutant (lacking 4 amino acids) has been examined for PCNA binding in vitro and in vivo (Kolesar et al, 2016; Kolesar et al, 2012); to our knowledge no detectable defect was reported. Thus, we believe that this allele is suitable for testing whether Srs2’s ability to bind PCNA is relevant to RPA regulation.
(3) Figure 4D and Figure 5A report data with standard deviations, which is unusual for n=2. Maybe the individual data points could be plotted with a color for each independent experiment to allow the reader to evaluate the reproducibility of the results.
We have included individual data points as suggested and corrected the figure legend to indicate that three independent biological samples per genotype were examined in both panels.
References:
Dhingra N, Kuppa S, Wei L, Pokhrel N, Baburyan S, Meng X, Antony E, Zhao X (2021) The Srs2 helicase dampens DNA damage checkpoint by recycling RPA from chromatin. Proc Natl Acad Sci U S A 118: e2020185118.
Feng S, Manley JL (2021) Replication Protein A associates with nucleolar R loops and regulates rRNA transcription and nucleolar morphology. Genes Dev 35: 1579-1594.
Fiorani S, Mimun G, Caleca L, Piccini D, Pellicioli A (2008) Characterization of the activation domain of the Rad53 checkpoint kinase. Cell Cycle 7: 493-499.
Hammet A, Magill C, Heierhorst J, Jackson SP (2007) Rad9 BRCT domain interaction with phosphorylated H2AX regulates the G1 checkpoint in budding yeast. EMBO Rep 8: 851-857.
Kolesar P, Altmannova V, Silva S, Lisby M, Krejci L (2016) Pro-recombination role of Srs2 protein requires SUMO (Small Ubiquitin-like Modifier) but is independent of PCNA (Proliferating Cell Nuclear Antigen) interaction. J Biol Chem 291: 7594-7607.
Kolesar P, Sarangi P, Altmannova V, Zhao X, Krejci L (2012) Dual roles of the SUMO-interacting motif in the regulation of Srs2 sumoylation. Nucleic Acids Res 40: 7831-7843.
Li Y, Liu C, Jia X, Bi L, Ren Z, Zhao Y, Zhang X, Guo L, Bao Y, Liu C et al (2024) RPA transforms RNase H1 to a bidirectional exoribonuclease for processive RNA-DNA hybrid cleavage. Nat Commun 15: 7464.
Meir A, Raina VB, Rivera CE, Marie L, Symington LS, Greene EC (2023) The separation pin distinguishes the pro- and anti-recombinogenic functions of Saccharomyces cerevisiae Srs2. Nat Commun 14: 8144.
Memisoglu G, Lanz MC, Eapen VV, Jordan JM, Lee K, Smolka MB, Haber JE (2019) Mec1(ATR) autophosphorylation and Ddc2(ATRIP) phosphorylation regulates DNA damage checkpoint signaling. Cell Rep 28: 1090-1102 e1093.
Menin L, Ursich S, Trovesi C, Zellweger R, Lopes M, Longhese MP, Clerici M (2018) Tel1/ATM prevents degradation of replication forks that reverse after Topoisomerase poisoning. EMBO Rep 19: e45535.
Nguyen HD, Yadav T, Giri S, Saez B, Graubert TA, Zou L (2017) Functions of Replication Protein A as a sensor of R loops and a regulator of RNaseH1. Mol Cell 65: 832-847 e834.
Ohouo PY, Bastos de Oliveira FM, Liu Y, Ma CJ, Smolka MB (2013) DNA-repair scaffolds dampen checkpoint signalling by counteracting the adaptor Rad9. Nature 493: 120-124.
Papouli E, Chen S, Davies AA, Huttner D, Krejci L, Sung P, Ulrich HD (2005) Crosstalk between SUMO and ubiquitin on PCNA is mediated by recruitment of the helicase Srs2p. Mol Cell 19: 123-133.
Petermann E, Lan L, Zou L (2022) Sources, resolution and physiological relevance of R-loops and RNA-DNA hybrids. Nat Rev Mol Cell Biol 23: 521-540.
Pommier Y, Nussenzweig A, Takeda S, Austin C (2022) Human topoisomerases and their roles in genome stability and organization. Nat Rev Mol Cell Biol 23: 407-427.
Redon C, Pilch DR, Rogakou EP, Orr AH, Lowndes NF, Bonner WM (2003) Yeast histone 2A serine 129 is essential for the efficient repair of checkpoint-blind DNA damage. EMBO Rep 4: 678-684.
Sun Y, Saha S, Wang W, Saha LK, Huang SN, Pommier Y (2020) Excision repair of topoisomerase DNA-protein crosslinks (TOP-DPC). DNA Repair (Amst) 89: 102837.
Tercero JA, Longhese MP, Diffley JFX (2003) A central role for DNA replication forks in checkpoint activation and response. Mol Cell 11: 1323-1336.
Reviewer #1 (Recommendations For The Authors):
(1) "the srs2-ΔPIM (Δ1159-1163 amino acids)". "11" should not be italic.
Corrected.
(2) "the srs2-SIMmut (1170 IIVID 1173 to 1170 AAAAD 1173)". "1173" should be 1174.
Corrected.
(3) Can Slx4-RIM mutant rescue rfa1-zm2 CPT sensitivity?
We found that unlike srs2∆, slx4∆ failed to rescue rfa1-zm2 CPT sensitivity (picture on the right). On the other hand, slx4∆ counteracts Rad9-dependent Rad53 activation as shown by Ohouo et al (2013).
Author response image 1.
(4) One genotype (rfa1-zm2 srs2-3KR) is missing in Figure 5B.
Corrected.
(5) In Fig. S2C, FACS plots do not match the bar graph (see major concern 3).
Corrected; this is described in more detail in Major Concern #3.
Reviewer #2 (Recommendations For The Authors):
Figure 1. The colors in A are not well-conserved in B.
Colors for srs2-7AV and -2SA in panel B are now matched with those in panel A.
Figure 2. Is srs2-SIMmut the same as srs2-sim?
This mutant allele is now referred to as srs2-SIM<sup>mut</sup> throughout the text and figures.
The suppression of rfa1-zm2 and (less strongly) rfa1-t33 by the Srs2 mutants is interesting. Based on previous data, the suppression is apparently mutual, though it isn't shown here, unless we misunderstand.
We have previously shown that rfa1-zm2 and srs2∆ showed mutual suppression (Dhingra et al., 2021) and have included an example in Figure S1A. Unlike srs2∆, srs2-∆PIM and -3KR showed little damage sensitivity and DDC defects, likely due to the compensation by the Slx4-mediated checkpoint dampening (detailed in the Public Review section). Suppression is not applicable toward mutants lacking a phenotype, though the mutants could confer suppression when there is a functional relationship with another mutant, as we see here toward rfa1-zm2.
Is Srs2 interaction with PCNA dependent on its ubiquitylation or SUMO? Does PCNA mutant K164R mimic this mutation? (this may well be known; our ignorance).
It was known that Srs2 can bind unmodified PCNA, though SUMO enhances this interaction; however, a very small percentage of PCNA is sumoylated in cells and PCNA sumoylation affects both Srs2-dependent and -independent processes (e.g., Papouli et al, 2005). As such, the genetic interaction of K164R with rfa1-zm2 can be difficult to interpret.
Why srs2-7AV or srs2-sim make rfa1-zm2 even more sensitive is also not obvious. The authors take refuge in the statement that Srs2 "has multiple roles in cellular survival of genotoxic stress" but don't attempt to be more precise.
Our understanding of srs2-7AV and -sim is limited; thus, more specific speculation cannot be made at this time.
Figure 3. It is striking (Figure 3A) that all the cells have reached G2 an hour after releasing from alpha-factor arrest, even though presumably CPT treatment must impair replication. It is even more striking that there is apparently no G2/M arrest in the presumably damaged cells as the WT (Figure 3B) has the most rapid progression through the cell cycle. How does this compare with cells in the absence of CPT? The idea that CPT is triggering Rad53-mediated response is hard to understand if there is in fact no delay in the cell cycle. Instead, the several mutants appear to delay re-entry into S... Or maybe it is actually an exit from G2/M?
This phenomenon needs a better explanation.
CPT does not induce the DNA replication checkpoint nor S phase delay, explaining the apparent G2 content by the one-hour time point; however, CPT does induce the DNA damage checkpoint, and a delay (not arrest) in G2/M (Menin et al., 2018; Redon et al., 2003; Tercero et al., 2003). We confirmed these findings. In our hands, wild-type G1 cells released into the cell cycle in the absence of CPT complete the first cell cycle within 80 minutes, such that most cells are in the second G1 phase by 90 min. In contrast, when wild-type cells were treated with CPT, G2/M exit was only partial at 120 min (e.g., Figure 3B). These features differentiate CPT treatment from MMS treatment, which induces both types of checkpoints and lengthens the time cells take to reach G2. We have highlighted this unique feature of CPT in checkpoint induction.
What is "active Rad53"? If the authors mean they are using a phospho-specific Ab versus Rad53, they should explain this. It's impossible to know if total Rad53 is altered from Figure 3A. A blot with an antibody that detects both phosphorylated and nonphosphorylated Rad53 would help.
The F9 antibody used here detects phosphorylated Rad53 forms induced by Mec1 activation and does not detect unphosphorylated Rad53 (Fiorani et al, 2008). We changed “active Rad53” to “phosphorylated Rad53”. We used Pgk1 as a loading control to ensure equal loading, which helps to quantify the relative amount of “active Rad53” in cells. This method has been used widely in the field.
Also is there a doublet of Rad53 in the right two lanes and in WT? Rad53 often shows more than one slow-migrating species, so this isn't necessarily a surprise. Were both forms used in quantitation?
Both forms are used for quantification.
Figure 4A. Is there a di-SUMO form above the band marked Srs2-Su? Is this known? Is it counted?
The mono-sumoylated form of Srs2 is the most abundant form of sumoylated Srs2, though we detected a sumoylated Srs2 band that can represent its di-SUMO form. We did quantify both forms in the plot.
B. The dip at 1.5 h in Rad9-P is curious. It would be useful to know what % of Rad9 is phosphorylated in a repair-defective (rad52?) background with CPT treatment. And would such rad52 cells show a long arrest?
This dip is reproducible and may reflect that a population of cells escape G2/M delay at this timepoint.
Figure 5. It seems clear that the autophosphorylation site of Mec1, which was implicated in turning off a long-delayed G2/M arrest has no effect here, but presumably, a kinase-dead Mec1 (or deletion) does? The idea that a checkpoint is being regulated seems to come more from an assumption than from any direct data; as noted above, the only apparent delay in the cell cycle is the re-entry into S. There clearly is Rad53 and Rad9 phosphorylation so there are the attributes of a checkpoint. If PI3KK phosphorylation is important, can this be accomplished by Tel1 as well as Mec1?
A mec1 kinase-dead or null mutant would not activate the checkpoint in the first place, and therefore would not be useful for addressing whether Mec1 autophosphorylation is implicated in turning off the checkpoint. A recent study from the Haber lab provided evidence that Mec1 autophosphorylation at S1964 helps to turn off the checkpoint in a DSB situation (Memisoglu et al, 2019). The role of Tel1 in checkpoint dampening will be interesting to examine in the future.
Figure 6. Two Rfa1 phospho-sites don't appear to be important, but do the known multiple phosphorylations of Rfa2 play a role?
Figure 6D examined three Rfa2 phosphorylation sites and found no genetic interaction with srs2∆.
Summary: There are a lot of interesting data here, but they don't strongly support the author's model in the absence of a more direct way to monitor RPA binding and removal. This could be done using some site-specific damage, but hard to do with CPT or MMS (which themselves don't appear to have the same effect). The abstract suggests Srs2 is "temporally and spatially regulated to both allow timely checkpoint termination and to prevent superfluous RPA removal." But where is the checkpoint termination if there's no evident checkpoint? And "superfluous" is probably not the right word (= unnecessary); probably the authors intend "excessive"? As noted above, it also isn't clear if the displacement is of RPA or of Rad51, which normally replaces RPA and which is well-known to be itself displaced by Srs2. Again, if CPT is causing enough damage to kill orders of magnitudes of cells (are the plate and liquid concentrations comparable, we suddenly wonder) then why isn't there some stronger evidence for a cell cycle response to the DDC?
As described in the Public Review section, we have previously shown that a lack of Srs2-mediated checkpoint downregulation leads to a 4-6-fold increase of RPA on chromatin, which was rescued by rfa1-zm2 (Dhingra et al., 2021). On its own, rfa1-zm2 did not cause defective chromatin association in our assays, despite modestly reducing ssDNA binding in vitro (Dhingra et al., 2021). This discrepancy could be due to a lack of sensitivity of the chromatin fractionation assay in revealing moderate changes of RPA residence on DNA. Considering this, we decided to employ functional assays (Figures 2-3) that are more effective in identifying the specific Srs2 features pertaining to RPA regulation.
We respectfully disagree with the reviewer’s point that there is “no evident checkpoint” in CPT. Previous studies have shown that CPT induces the DNA damage checkpoint, as evidenced by Mec1 activation, phosphorylation of Rad53 and Rad9, and delayed exit from G2/M (Dhingra et al., 2021; Menin et al., 2018; Redon et al., 2003). Our data are fully consistent with these reports. It is important to note that the DNA damage checkpoint can manifest at a range of strengths depending on the genotoxic conditions and treatment, but the fundamental principles are the same. For example, we found that the Srs2-RPA antagonism not only affects checkpoint downregulation in CPT, but also does so in MMS treatment and in a DSB system. We focused on the CPT condition in this work, since CPT only induces the DNA damage checkpoint but not the DNA replication checkpoint, while MMS induces both. Further investigating the Srs2-RPA antagonism in a DSB system can be interesting to pursue in the future.
We believe that “superfluous removal” is appropriately used when discussing RPA regulation at genomic sites wherein it supports ssDNA protection and DNA repair, rather than DDC. Examples of these sites include R-loops and negatively supercoiled regions. These sites lack 3’ and 5’ DNA ends at the ss-dsDNA junctions for loading PCNA and the 9-1-1 checkpoint factors, and thus are not designated for checkpoint regulation.
We addressed the reviewer’s point regarding Rad51 in the Public Review section. We disagree with the reviewer’s view that “Rad51 normally replaces RPA”. RPA is involved in many more processes than Rad51, and in many of these it is not replaced by Rad51.
Regarding toxicity of CPT, our view is that it stems from a combination of checkpoint regulation and other processes that also involve the Srs2-RPA antagonism. While this work focused on the checkpoint aspect of this antagonism, future studies will be conducted to address the latter.
One reference is entered as Lee Zhou and Stephen J. Elledge as opposed to "Zhou and Elledge."
Corrected.
Reviewer #3 (Recommendations For The Authors):
(1) It would be nice to see the additional point mutants (srs2-Y775A, srs2-F891A) be tested, as they showed little to no phenotypes in the previously reported analyses, which did not specifically test the function surveyed here.
This point is addressed in the Public Reviews section.
(2) Maybe the caveat of using deletion versus point mutations could be discussed.
This point is addressed in the Public Reviews section.
(3) Please plot individual data points of the two independent experiments in Figures 4D and 5A so that the reader can evaluate reproducibility. N=2 does not really allow deriving SD.
This point is addressed in the Public Reviews section and three individual data points are now included in both panels.
(4) It will help the reader to have the exact strains used in each experiment listed in each figure legend. Minor point.
The strain table is now updated to address this point.
(5) Page 7 middle paragraph: The reference to Figure 4A in line 11 should probably be Figure S3A.
Corrected.
-