  1. Last 7 days
    1. Reviewer #1 (Public Review):

      The study shows a new mechanism of NFkB-p65 regulation mediated by Vangl2-dependent autophagic targeting. Autophagic regulation of p65 has been reported earlier; this study brings an additional set of molecular players involved in this important regulatory event, which may have implications for chronic and acute inflammatory conditions.

      Comments on the revised version:

      The authors have addressed the earlier concerns and I am satisfied with the revised version. I have no additional comments to make.

    2. eLife assessment

      This valuable manuscript describes a novel role of Vangl2, a core planar cell polarity protein, in linking the NF-kB pathway to selective autophagic protein degradation in myeloid cells. The mechanistic studies suggest that Vangl2 targets p65 for NDP52-mediated autophagic degradation, limiting inflammatory NF-kB response, with functional significance of the proposed mechanism in sepsis. The presented evidence is convincing. Additional studies dissecting autophagic Vangl2 functions in various myeloid subsets in the context of inflammation could be informative, and additional Vangl2 targets in the inflammatory pathway, including IKK2, could also be explored. Overall, this exciting study will likely advance our understanding of NF-kB control, particularly in the context of inflammatory diseases.

    3. Reviewer #2 (Public Review):

      Vangl2, a core planar cell polarity protein involved in Wnt/PCP signaling, mediates cell proliferation, differentiation, homeostasis, and cell migration. Vangl2 malfunctioning has been linked to various human ailments, including autoimmune and neoplastic disorders. Interestingly, it was shown that Vangl2 interacts with the autophagy regulator p62, and autophagic degradation limits the activity of inflammatory mediators, such as p65/NF-κB. However, the possible role of Vangl2 in inflammation has not been investigated. In this manuscript, Lu et al. describe that Vangl2 expression is upregulated in human sepsis-associated PBMCs and that Vangl2 mitigates experimental sepsis in mice by negatively regulating p65/NF-κB signaling in myeloid cells. Their mechanistic studies further revealed that Vangl2 recruits the E3 ubiquitin ligase PDLIM2 to promote K63-linked poly-ubiquitination of p65. Vangl2 also facilitated the recognition of ubiquitinated p65 by the cargo receptor NDP52. These molecular processes caused selective autophagic degradation of p65. Indeed, abrogation of PDLIM2 or NDP52 functions rescued p65 from autophagic degradation, leading to extended p65/NF-κB activity in myeloid cells. Overall, the manuscript presents convincing evidence for novel Vangl2-mediated control of inflammatory p65/NF-κB activity. The proposed pathway may expand interventional opportunities restraining aberrant p65/NF-κB activity in human ailments.

      IKK is known to mediate p65 phosphorylation, which instructs NF-κB transcriptional activity. In this manuscript, Vangl2 deficiency led to an increased accumulation of phosphorylated p65 and IKK as early as 30 minutes post-LPS stimulation; however, autophagic degradation of p-p65 may not have been initiated at this early time point. Therefore, this set of data puts forward the exciting possibility that Vangl2 could also be regulating the immediate early phase of the inflammatory response involving the IKK-p65 axis, a proposition that may be tested in future studies.

    4. Reviewer #3 (Public Review):

      Lu et al. describe Vangl2 as a negative regulator of inflammation in myeloid cells. The primary mechanism appears to be through binding p65 and promoting its degradation, albeit in an unusual autolysosome/autophagy dependent manner. Overall, these findings are novel, valuable and the crosstalk of PCP pathway protein Vangl2 with NF-kappaB is of interest. While generally solid, some concerns still remain about the rigor and conclusions drawn.

      Comments on the revised version:

      Lu et al. address my comments through responses and new experimental data. However, some of the explanations provided are inadequate.

      The new experimental data using phosphomutants indeed add to their claim that this is a PCP-independent function of Vangl2.

      The addition of statistics and testing of the JNK pathway is appreciated by this Reviewer.

      However, in response to my enquiry regarding directly exploring PCP effects, the authors simply assert "Our study revealed that Vangl2 recruits the E3 ubiquitin ligase PDLIM2 to facilitate K63-linked ubiquitination of p65, which is subsequently recognized by autophagy receptor NDP52 and then promotes the autophagic degradation of p65. Our findings by using autophagy inhibitors and autophagic-deficient cells indicate that Vangl2 regulates NFkB signaling through a selective autophagic pathway, rather than affecting the PCP pathway, WNT, HH/GLI, Fat-Dachsous or even mechanical tension."

      I do not agree that the use of autophagy inhibitors and autophagy-deficient cells can rule out the contributions of PCP or any other pathways. Only experimentally inhibiting the pathway(s) with adequate demonstration of target inhibition/abolition of well-known effector function and documenting unaltered p65 regulation under these conditions can be considered proof. Autophagy inhibitors and autophagy-deficient cells only prove that this particular pathway is necessary. Nonetheless, I do not want to dwell on proving a negative and agree that Vangl2 is a novel regulator of p65 through its role in promoting p65 degradation. The inclusion of a statement discussing the limitations of their approach would have sufficed. The response from the authors could have been better.

      I am also not satisfied with the explanation that "immune cells represent a minor fraction of the lungs and liver". There are lots of resident immune cells in the lungs and liver (alveolar macrophages in the lung and Kupffer cells in the liver). For example, it may be that Vangl2 is important in monocytes and not in the resident population. This might be a potential explanation, but it is not explored. The restricted tissue-specificity of the interaction between two ubiquitously present proteins is still a challenge to understand. The response from the authors is not satisfactory. There is plenty of Vangl2 in the liver in their western blot.

      I had also simply pointed out PMID: 34214490 with reference to the findings described in the manuscript. There were no suggestions of contradiction. In fact, I would refer to the publication in discussion to support the findings and stress the novelty. The response from the authors could have been better.

      The response to my enquiry regarding homo- or heterozygosity is unsupported by any reference or data.

      The listing of 8 patients and healthy controls is also appreciated. The body temperature of #6 doesn't fall in the <36 or >38 degree C SIRS criteria. The inclusion of CRP, PCT, heart rate and respiratory rate, and other lab values would have further improved the inclusion criteria. Moreover, it is difficult to understand why there are 16 value points for healthy and sepsis cohorts in Fig 1 when there are 8 patients.
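
      The temperature criterion mentioned above can be written as a simple check. This is only a minimal sketch of the "<36 or >38 degree C" rule as the reviewer states it; the function name and the example readings are hypothetical and are not taken from the manuscript's patient table.

```python
def meets_sirs_temperature(temp_c: float) -> bool:
    """Return True if a body temperature (degrees C) is below 36 or above 38."""
    return temp_c < 36.0 or temp_c > 38.0


# Hypothetical readings, for illustration only (not the study's patients).
readings = {"patient_a": 35.4, "patient_b": 37.2, "patient_c": 38.6}

# Flag which readings satisfy the temperature criterion.
flags = {pid: meets_sirs_temperature(t) for pid, t in readings.items()}
```

      A reading that sits between 36 and 38 degrees C, as the reviewer notes for patient #6, would be flagged False by this check.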

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In the manuscript titled "Vangl2 suppresses NF-κB signaling and ameliorates sepsis by targeting p65 for NDP52-mediated autophagic degradation" by Lu et al., the authors show that Vangl2, a planar cell polarity component, plays a direct role in autophagic degradation of NFkB-p65 by facilitating its ubiquitination via PDLIM2 and subsequent recognition and autophagic targeting via the autophagy adaptor protein NDP52. Conceptually it is a wonderful study with excellent execution of experiments and controls. The concerns with the manuscript are mainly on two counts. The first issue is the kinetics of p65 regulation reported here, which do not fit the kinetics of the proposed mechanism, i.e., Vangl2-mediated ubiquitination followed by autophagic degradation of p65. The second issue is more technical: an absolute lack of quantitative analyses. The authors rely mostly on visual qualitative interpretation to assess an increase or decrease in associations between partner molecules throughout the study. While the overall mechanism is interesting, the authors should address these concerns as highlighted below:

      Major points:

      (1) Kinetics of p65 regulation by Vangl2: As mentioned above, the authors report that LPS stimulation leads to higher IKK and p65 activation in the absence of Vangl2. The mechanism of action the authors subsequently work out is that Vangl2 helps recruit the E3 ligase PDLIM2 to p65, which causes K63 ubiquitination, which is recognised by NDP52 for autophagic targeting. Curiously, peak p65 activation is achieved within 30 minutes of LPS stimulation. The time scale of all other assays is much longer. It is not clear that in WT cells p65 could be targeted for autophagic degradation in a Vangl2-dependent manner within 30 minutes. The HA-Myc-Flag-based overexpression and Co-IP studies do confirm the interactions as proposed. However, they do not prove that this mechanism was responsible for the Vangl2-mediated modulation of p65 activation upon LPS stimulation. Moreover, the Vangl2 KO line also shows increased IKK activation. The authors do not show the cause behind increased IKK activation, which in itself can trigger increased p65 phosphorylation.

      We thank the reviewer for this valuable suggestion.

      Indeed, we agree with the reviewer that peak p65 activation is achieved within 30 minutes of LPS stimulation in vitro, and that p65 could not be targeted for autophagic degradation in a Vangl2-dependent manner within 30 minutes. Given that the protein and mRNA levels of Vangl2 were elevated at 3-6 h of LPS stimulation (Fig. S1 C-E), we extended the stimulation time scale in the revised manuscript. The data (Fig. 2A-D in the revised manuscript) demonstrated that IKK phosphorylation was enhanced in Vangl2 KO myeloid cells during the early phase (within 3 h) of LPS stimulation, but not during prolonged LPS stimulation. The underlying mechanism may be complex. Only p65 phosphorylation was continuously enhanced after long-term LPS stimulation in Vangl2 KO cells, compared to WT cells. Furthermore, the overexpression of Vangl2 in A549 cells also demonstrated a reduction of phosphorylated and total endogenous p65 (Fig. 2 I, J in the revised manuscript). These findings were corroborated by overexpression and Co-IP experiments, which collectively indicated that Vangl2 regulates the stability of p65 by promoting its interaction with NDP52 and autophagic degradation. (Page 7, lines 183-185).

      (2) The other major concern is regarding the lack of quantitative assessments. For Co-IP experiments, I can understand it is qualitative observation. However, when the authors infer that there is an increase or decrease in the association through co-IP immunoblots, it should also be quantified, especially since the differences are quite marginal and could be easily misinterpreted.

      We are grateful to the reviewer for this suggestion. The quantitative analysis has been updated in the revised version.
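
      As an illustration of the kind of quantification requested here, the sketch below shows one common way to express a co-IP result as a number: the co-precipitated (prey) band is divided by the immunoprecipitated (bait) band, and that ratio is then expressed relative to a control lane. This is an assumption about a typical densitometry workflow, not a description of the authors' analysis; the function name and all intensity values are hypothetical.

```python
def normalized_coip_ratio(prey_ip: float, bait_ip: float,
                          ref_prey_ip: float, ref_bait_ip: float) -> float:
    """Prey/bait band-intensity ratio for a lane, relative to a control lane.

    All arguments are band intensities from densitometry (arbitrary units).
    """
    if bait_ip <= 0 or ref_bait_ip <= 0 or ref_prey_ip <= 0:
        raise ValueError("band intensities used as denominators must be positive")
    lane_ratio = prey_ip / bait_ip          # co-precipitated prey per unit bait
    ref_ratio = ref_prey_ip / ref_bait_ip   # same ratio in the control lane
    return lane_ratio / ref_ratio


# Hypothetical intensities: the treated lane pulls down twice as much prey
# per unit of bait as the control lane.
fold_change = normalized_coip_ratio(300.0, 100.0, 150.0, 100.0)
```

      Reporting such ratios across independent replicates, with a measure of spread, is what would allow marginal differences between lanes to be judged rather than read off by eye.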

      (3) Figure 4E and F: It is evident that inhibiting Autolysosome (CQ or BafA1) or autophagy (3MA) led to the recovery of p65 levels and inducing autophagy by Rapamycin led to faster decay in p65 levels. Did the authors also note/explore the possibility that Vangl2 itself may be degraded via the autophagy pathway? IB of WCL upon CQ/BAF/3MA or upon Rapa treatment does indicate the same. If true, how would that impact the dynamics of p65 activation?

      We thank the reviewer for this question. Previous studies have shown that Vangl2 is primarily degraded by the proteasome pathway, rather than by the autolysosomal pathway (doi: 10.1126/sciadv.abg2099; doi: 10.1038/s41598-019-39642-z). In our experiments, Vangl2 recruits E3 ligase PDLIM2 to enhance K63-linked ubiquitination on p65, which serves as a recognition signal for cargo receptor NDP52-mediated selective autophagic degradation. Vangl2 facilitated the interaction between p65 and NDP52, yet itself did not undergo significant autophagic degradation.

      (4) Autophagic targeting of p65 should also be shown through alternate evidence, like microscopy etc., in the LPS-stimulated WT cells.

      We thank the reviewer for this suggestion. We have added the data (co-localization of p65 and LC3 was detected by immunofluorescence) in the revised version (Fig. S4 H in the revised manuscript). (Page 9, lines 267-268)

      Reviewer #2 (Public Review):

      Vangl2, a core planar cell polarity protein involved in Wnt/PCP signaling, mediates cell proliferation, differentiation, homeostasis, and cell migration. Vangl2 malfunctioning has been linked to various human ailments, including autoimmune and neoplastic disorders. Interestingly, Vangl2 was shown to interact with the autophagy regulator p62, and indeed, autophagic degradation limits the activity of inflammatory mediators such as p65/NF-κB. However, if Vangl2, per se, contributes to restraining aberrant p65/NF-kB activity remains unclear.

      In this manuscript, Lu et al. describe that Vangl2 expression is upregulated in human sepsis-associated PBMCs and that Vangl2 mitigates experimental sepsis in mice by negatively regulating p65/NF-κB signaling in myeloid cells. Vangl2 recruits the E3 ubiquitin ligase PDLIM2 to promote K63-linked poly-ubiquitination of p65. Vangl2 also facilitates the recognition of ubiquitinated p65 by the cargo receptor NDP52. These molecular processes cause selective autophagic degradation of p65. Indeed, abrogation of PDLIM2 or NDP52 functions rescued p65 from autophagic degradation, leading to extended p65/NF-κB activity.

      As such, the manuscript presents a substantial body of interesting work and a novel mechanism of NF-κB control. If found true, the proposed mechanism may expand therapeutic opportunities for inflammatory diseases. However, the current draft has significant weaknesses that need to be addressed.

      We appreciate the reviewer’s comments on our manuscript, and we have further improved the manuscript as suggested.

      Specific comments

      (1) Vangl2 deficiency did not cause a discernible increase in the cellular level of total endogenous p65 (Fig 2A and Fig 2B) but also led to the accumulation of phosphorylated IKK.

      Even Fig 4D reveals that Vangl2 exerts a rather modest effect on the total p65 level, and the figure does not provide any standard error for the quantified data. Therefore, these results do not fully support the proposed model (Figure 7), which is a significant drawback. Instead, these data provoke an alternate hypothesis that Vangl2 could be specifically mediating autophagic removal of phosphorylated p65 and phosphorylated IKK, leading to exacerbated inflammatory NF-κB response in Vangl2-deficient cells. One may need to use phosphorylation-defective mutants of p65, at least in the over-expression experiments, to distinguish between these possibilities.

      We appreciate the reviewer’s comments on our manuscript, and we have further improved the manuscript as suggested.

      (1) Indeed, we agree with the reviewer that Vangl2 deficiency did not cause a discernible increase in the cellular level of total p65 after a short period of LPS stimulation in vitro, and that p65 could not be targeted for autophagic degradation in a Vangl2-dependent manner within 30 minutes. Given that the protein and mRNA levels of Vangl2 were elevated at 3-6 h of LPS stimulation (Fig. S1 C-E), we extended the stimulation time scale in the revised manuscript. The data (Fig. 2A-D in the revised manuscript) demonstrated that IKK phosphorylation was enhanced in Vangl2 KO myeloid cells during the early phase (within 3 h) of LPS stimulation, but not during prolonged LPS stimulation. The underlying mechanism may be complex. Only phosphorylated p65 and total endogenous p65 were continuously enhanced after long-term LPS stimulation in Vangl2 KO cells, compared to WT cells. Furthermore, the overexpression of Vangl2 in A549 cells also demonstrated a reduction of phosphorylated and total endogenous p65 (Fig. 2 I, J in the revised manuscript). These findings were corroborated by overexpression and Co-IP experiments, which collectively indicated that Vangl2 regulates the stability of p65 by promoting its interaction with NDP52 and autophagic degradation. (Page 7, lines 183-185).

      (2) Similarly, the stimulation time scale in Fig 4D was extended, and it was demonstrated that p65 was more stable in Vangl2-deficient cells.

      (3) Moreover, we constructed phosphorylation-defective mutants of p65 (S536A), and found that Vangl2 could also promote the degradation of the p65 phosphorylation mutants (Fig. S4 A, B in the revised manuscript). Thus, Vangl2 promotes the degradation of the basal/unphosphorylated p65. (Page 8, lines 237-240)

      (2) Fig 1A: The data indicates the presence of two subgroups within the sepsis cohort - one with high Vangl2 expressions and the other with relatively normal Vangl2 expression. Was there any difference with respect to NF-κB target inflammatory gene expressions between these subgroups?

      As suggested, we conducted an analysis of NF-kB target inflammatory gene expressions between the high and relatively low Vangl2 expression groups in sepsis patients. The results showed that the serum of the high Vangl2 expression group exhibited lower levels of IL-6, WBC, and CRP than the low Vangl2 expression group, which suggested an inverse correlation between Vangl2 and the inflammatory response (Fig. S1 A in the revised manuscript) (Page 5, lines 126-128).

      (3) The effect of Vangl2 deficiency was rather modest in the neutrophil. Could it be that Vangl2 mediates its effect mostly in macrophages?

      As shown in Fig. S1C-E, the induction of Vangl2 by LPS stimulation is more rapid in macrophages than in neutrophils. This may contribute to its dominant effect in macrophages. Consequently, we primarily focused our investigation on the role of Vangl2 in macrophages.

      (4) Fig 1D and Figure 1E: Data for unstimulated Vangl2 cells should be provided. Also, the source of the IL-1β primary antibody has not been mentioned.

      Thank you for the suggestion. We have updated the data for unstimulated cells in the revised manuscript (Fig. 1 D, E in the revised manuscript). Also, IL-1β primary antibody was purchased from Cell Signaling Technology and the information has been included in the Materials and Methods section (Table S1).

      (5) The relevance and the requirement of RNA-seq analysis are not clear in the present draft. Figure 1E already reveals upregulation of the signature NF-κB target inflammatory genes upon Vangl2 deficiency.

      We agree with the reviewer that the data presented in Figure 1E demonstrated the upregulation of the signature NF-κB target inflammatory genes upon Vangl2 deficiency in a murine model of LPS-induced sepsis. Subsequently, we proceeded to investigate the mechanism by which Vangl2 regulates NF-κB target inflammatory genes at the cellular level in Figure 2. To this end, we performed RNA-seq analysis to screen signaling pathways involved in LPS-induced septic shock by comparing LPS-stimulated BMDMs from Vangl2ΔM and WT mice, and found that the TNF signaling pathway and cytokine-cytokine receptor interaction were significantly enriched in Vangl2ΔM BMDMs upon LPS stimulation. This analysis provides further evidence that Vangl2 plays a role in regulating NF-κB signaling pathways and the release of related inflammatory cytokines.

      (6) Fig 2A reveals an increased accumulation of phosphorylated p65 and IKK in Vangl2-deficient macrophages upon LPS stimulation within 30 minutes. However, Vangl2 accumulates at around 60 minutes post-stimulation in WT cells. Similar results were obtained for neutrophils (Fig 2B). There appears to be a temporal disconnect between Vangl2 and phosphorylated p65 accumulation - this must be clarified.

      This concern has been addressed above (see the response to question 1 from reviewer #2).

      (7) Figure 2E and 2F do not have untreated controls. Presentations in Fig 2E may be improved to more clearly depict IL6 and TNF data, preferably with separate Y-axes.

      Thank you for the suggestion. We have added untreated controls and separated Y-axes for IL-6 and TNF data in the revised manuscript (Fig. 2 E, F in the revised manuscript).

      (8) Line 219: "strongly with IKKα, p65 and MyD88, and weak" - should be revised.

      We have improved the manuscript as suggested in the revised manuscript (Page 7; Line 203).

      (9) It is not clear why IKKβ was excluded from interaction studies in Fig S3G.

      We added the Co-IP experiment and showed that HA-tagged Vangl2 interacted only with Flag-tagged p65, and not with Flag-tagged IKKβ, in 293T cells (Fig S3H). Furthermore, endogenous co-IP immunoblot analyses showed that Vangl2 did not associate with IKKβ (Fig. S3I).

      (10) Fig 3F- In the text, authors mentioned that Vangl2 strongly associates with p65 upon LPS stimulation in BMDM. However, no controls, including input or another p65-interacting protein, were used.

      As the reviewer suggested, we have added input and a positive control (IκBα) in this experiment (Fig. 3F in the revised manuscript). The results demonstrated that the interaction between p65 and IκBα was attenuated, although the total IκBα did not undergo significant degradation over the long-term course of LPS stimulation.

      (11) Figure 4D - Authors claim that Vangl2-deficient BMDMs stabilized the expression of endogenous p65 after LPS treatment. However, p65 levels were constitutively elevated in knockout cells, and LPS signaling did not cause any further upregulation. This again indicates the role of Vangl2 in the basal state. The authors need to explain this and revise the text accordingly.

      Thank you for the reviewer's comments. We repeated the experiment to ascertain whether Vangl2 could stabilize the expression of endogenous p65 before and after LPS treatment. It was found that, due to the extremely low expression of Vangl2 in WT cells in the absence of stimulation, there was no observable difference in the basal level of p65 between WT and Vangl2ΔM cells. However, upon prolonged LPS stimulation, Vangl2 expression was induced, resulting in p65 degradation in WT cells. In contrast, p65 protein was more stable in Vangl2-deficient cells after LPS stimulation (Fig. 4D in the revised manuscript).

      Reviewer #3 (Public Review):

      Lu et al. describe Vangl2 as a negative regulator of inflammation in myeloid cells. The primary mechanism appears to be through binding p65 and promoting its degradation, albeit in an unusual autolysosome/autophagy dependent manner. Overall, the findings are novel and the crosstalk of PCP pathway protein Vangl2 with NF-kappaB is of interest. … Regardless, Vangl2 as a negative regulator of NF-kappaB is an important finding. There are, however, some concerns about methodology and statistics that need to be addressed.

      Thank you for your comments on our manuscript, and we have further improved the manuscript as suggested.

      (1) Whether PCP is in any way relevant or if this is a PCP-independent function of Vangl2 is not directly explored (the latter appears more likely from the manuscript/discussion). PCP pathways intersect often with developmentally important pathways such as WNT, HH/GLI, Fat-Dachsous and even mechanical tension. It might be of importance to investigate whether Vangl2-dependent NF-kappaB is influenced by developmental pathways.

      Thank you for the reviewer's insightful comments. Our study revealed that Vangl2 recruits the E3 ubiquitin ligase PDLIM2 to facilitate K63-linked ubiquitination of p65, which is subsequently recognized by autophagy receptor NDP52 and then promotes the autophagic degradation of p65. Our findings by using autophagy inhibitors and autophagic-deficient cells indicate that Vangl2 regulates NF-kB signaling through a selective autophagic pathway, rather than affecting the PCP pathway, WNT, HH/GLI, Fat-Dachsous or even mechanical tension. Moreover, a discussion section has been added to the revised version. (Page 12, lines 377-393)

      (2) Are Vangl2 phosphorylations (S5, S82 and S84) in any way necessary for the observed effects on NF-kappaB or would a phospho-mutant (alanine substitution mutant) Vangl2 phenocopy WT Vangl2 for regulation of NF-kappaB?

      As suggested, we generated phospho-mutants of Vangl2 (S82/84A) and observed that Vangl2 (S82/84A) could still facilitate the degradation of p65 (Fig. S4 B in the revised manuscript), suggesting that Vangl2 regulates the NF-kB pathway independently of its phosphorylation.

      (3) Another area to strengthen might be with regards to specificity of cell types where this phenomenon may be observed. LPS treatment in mice resulted in Vangl2 upregulation in spleen and lymph nodes, but not in lung and liver. What explains the specificity of organ/cell-type Vangl2 upregulation and its consequences observed here? Why is NF-kappaB signaling not more broadly or even ubiquitously affected in all cell types in a Vangl2-dependent manner, rather than being restricted to macrophages, neutrophils and peritoneal macrophages, or, for that matter, in spleen and LN and not liver and lung? After all, one may think that the PCP proteins, as well as NF-kappaB, are ubiquitous.

      Thank you for the reviewer's comments.

      (1) LPS is an important mediator to trigger sepsis with excessive immune activation. As is well known, the spleen and lymph nodes are important peripheral immune organs, where immune cells (e.g., macrophages) are abundant and respond sensitively to LPS stimulation. Nevertheless, immune cells represent a minor fraction of the lungs and liver. Consequently, Vangl2 represents a pivotal regulator of immune function, exhibiting a more pronounced increase in the immune organs and cells.

      (2) Induction of Vangl2 expression by LPS stimulation is cell specific. Given that different cells exhibit varying protein abundances, the molecular events involved may also differ. Moreover, we observed high Vangl2 expression in the liver at the basal state (Author response image 1), whereas it was not induced after 12 h of LPS stimulation. Therefore, the functional role of Vangl2 shows a significant phenotype in macrophages and neutrophils/spleen and LN, rather than in liver or lung cells.

      Author response image 1.

      Vangl2 showed no significant changes in the liver after LPS treatment.

      Mice (n≥3) were treated with LPS (30 mg/kg, i.p.). Livers were collected at 12 h after LPS treatment. Immunoblot analysis of Vangl2.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      General points:

      Figure 4G - panels appear mislabeled. Please correct.

      We have corrected this mislabeling as you suggested.

      The dynamics of Vangl2 interaction with p65 and autophagy adaptors is not clear/apparent. For example, Vangl2 expression destabilises p65 levels (as in Fig. 4), but in Fig. 5, it seems there is no decline in the p65 protein level, and a large fraction of it coprecipitates with NDP52.

      We appreciate the reviewer’s comments. In the co-IP assay, we used the lysosomal inhibitor CQ to inhibit p65 degradation to observe the interaction between p65 and NDP52 or Vangl2.

      Fig 5E- I would expect p65 levels to be lower in WT cells than Vangl2 KO cells. But as such, there is no difference between the two.

      We appreciate the reviewer’s comments. We repeated the experiments and updated the data. Firstly, Vangl2 was not induced in WT cells in the absence of LPS stimulation, thus there was no difference in p65 expression between the two groups at the basal level. Secondly, we used CQ/Baf-A1 to inhibit the degradation of Vangl2 in the co-IP assay to observe the interaction between p65 and other molecules.

      Reviewer #2 (Recommendations For The Authors):

      A few points that can be looked at and revised.

      (1) Quantification of the presented data is needed for Fig 4D and Fig 4E.

      We added the quantification analysis as suggested.  

      (2) The labeling of Fig 4G should be scrutinized.

      We have corrected this mislabeling as you suggested.

      (3) Fig 6B and Fig 6C should be explained in the result section more elaborately.

      We thank the reviewer for the suggestion, and we have rephrased this sentence to better describe the results. (Page 10, lines 306-313)

      (4) Line 85: "Vangl2 mediated downstream of Toll-like or interleukin (IL)-1" - unclear.

      We appreciate the reviewer’s comments on our manuscript, and we have further improved the manuscript as suggested in the revised manuscript. (Page 3, line 68)

      (5) Line 181: "mice. Differentially expression analysis" - this should be revised.

      We appreciate the reviewer’s comments on our manuscript, and we have further improved the manuscript as suggested in the revised manuscript. (Page 11, line 323)

      (6) Line 261-264- CHX-chase assay showed the degradation rate of p65 in Vangl2-deficient BMDM was slower compared with WT cells. However, Vangl2 is not induced in WT BMDMs upon CHX treatment (Fig. S4B).

      We appreciate the reviewer’s comments on our manuscript, and we have further improved the manuscript as suggested in the revised manuscript (Fig. S4D).

      (7) Finally, some editing to provide data only critical for the conclusions could improve the ease of reading.

      We have further improved the manuscript as suggested in the revised manuscript.

      Reviewer #3 (Recommendations For The Authors):

      Comments (general, please address at least in Discussion. Some experimental data, for example the role, if any, of Vangl2 phosphorylations will be very useful):

      (1) It might be interesting to explore whether there are any potential effects of developmental pathways on the observed effect mediated by Vangl2 or if the effects are entirely a PCP-independent function of Vangl2. Please see above public review.

      Thank you for the reviewer's insightful comments. Our study revealed that Vangl2 recruits the E3 ubiquitin ligase PDLIM2 to facilitate K63-linked ubiquitination of p65, which is subsequently recognized by autophagy receptor NDP52 and then promotes the autophagic degradation of p65. Our findings by using autophagy inhibitors and autophagic-deficient cells indicate that Vangl2 regulates NF-kB signaling through a selective autophagic pathway, rather than affecting the PCP pathway, WNT, HH/GLI, Fat-Dachsous or even mechanical tension. Furthermore, we generated phospho-mutants of Vangl2 (S82/84A) and observed that Vangl2 (S82/84A) could still facilitate the degradation of p65 (Fig. S4 B), suggesting that Vangl2 regulates the NF-kB pathway independently of its phosphorylation. In addition, a discussion section has been added to the revised version. (Page 12, lines 377-393)

      (2) What explains the specificity of organ/cell-type Vangl2 upregulation and its consequences observed here? Why is NF-kappaB signaling not more broadly or even ubiquitously affected in all cell types in a Vangl2-dependent manner, rather than being restricted to macrophages, neutrophils and peritoneal macrophages, or, for that matter, in spleen and LN and not liver and lung? After all, one may think that the PCP proteins, as well as NF-kappaB, are ubiquitous.

      Thank you for the reviewer's comments. A similar question has been addressed above (refer to the response to question 3 of reviewer 3).

      (3) Another specificity-related question that comes to mind is whether the Vangl2 function in autolysomal/autophagic degradation is restricted to p65 as the exclusive substrate? The cytosolic targeting of p65 as opposed to the more well-known nuclear-targeting is interesting.

      Our previous finding demonstrated that Vangl2 inhibits antiviral IFN-I signaling by targeting TBK1 for autophagic degradation (doi: 10.1126/sciadv.adg2339), thereby indicating that p65 is not the sole substrate for Vangl2. However, in the NF-kB pathway, p65 is a specific substrate for Vangl2. Moreover, our findings indicate that the interaction between Vangl2 and p65 occurs predominantly in the cytoplasm, rather than in the nucleus (Fig. S4 C).

      (4) Pharmacological approach is used to tease apart autolysosome versus proteasome pathway. What is the physiological importance of autophagic degradation? It is interesting to note that Vangl2 was already previously implicated in degrading LAMP-2A and increasing chaperone-mediated autophagy (CMA)-lysosome numbers (PMID: 34214490).

      Previous literature has demonstrated that Vangl2 can inhibit CMA degradation (PMID: 34214490). However, in our study, we found that Vangl2 can promote the selective autophagic degradation of p65. It is important to note that CMA degradation and selective autophagic degradation are two distinct degradation modes, so these findings are not contradictory.

      (5) Are these phenotypes discernable in heterozygotes or only when ablated in homozygosity? Any phenotypes recapitulated in the looptail heterozygote mice?

      We found that these phenotypes are discernible only in homozygosity.

      (6) What is the conservation of the Vangl2 p65-interaction site between Vangl2 and Vangl1? PDLIM2 recruitment between Vangl2 and Vangl1?

      We appreciate the reviewer’s comments on our manuscript. Previous studies have shown that human Vangl1 and Vangl2 share only 72% identity and exhibit distinct functional properties (doi: 10.1530/ERC-14-0141). Thus, the interaction of Vangl2 with p65 and PDLIM2 recruitment may not necessarily occur in Vangl1.

      Comments (specific to experiments and data analyses. Please address the following):

      (7) The patient population used in Fig 1 is not described in the Methods. This is a critical omission. Were age, sex etc. controlled for between healthy and disease? How was the diagnosis made? What times during sepsis were the samples collected? As presented, this data is impossible to evaluate and interpret.

      We appreciate the reviewer’s comments on our manuscript, and we have further improved the manuscript as suggested in the revised supplementary materials. (Supplementary information, Page 12, lines 146-147)

      (8) In general, the statistical method should be described for each experiment presented in the figures. Comparisons should not be made only at the time point with maximal difference (such as in Fig 1F or Fig 2C, but at all time points using appropriate statistical methods). The sample size should also be included to allow determination of the appropriateness of parametric or non-parametric tests.

      We appreciate the reviewer’s comments on our manuscript, and we have further improved the manuscript as suggested in the revised manuscript (Figures 1F and 2C).

      (9) PCP pathways can activate p62/SQSTM1 or JNK via RhoA. JNK activation should be tested experimentally.

      According to the reviewer's comments, we further examined the effect of Vangl2 on the JNK pathway. The results showed that Vangl2 did not affect the JNK pathway (Author response image 2). This suggests that Vangl2 functions independently of the PCP pathway.

      Author response image 1.

      Vangl2 did not affect the JNK pathway. WT and Vangl2-deficient (n≥3) BMDMs were stimulated with LPS (100 ng/ml) for the indicated times. Immunoblot analysis of total and phosphorylated JNK.

      (10) Why are different cells such as A549, HEK293, CHO, 293T, THP-1 used during the studies for different experiments? Consistency would improve rigor. At least, logical explanation driving the cell type of choice for each experiment should be included in the manuscript. Nonetheless, one aspect of using a panel of cell lines indicates that the effect of Vangl2 on NF-kappa B is pleiotropic.

      We are grateful to the reviewer for their comments on our manuscript. A549, HEK293, CHO, and 293T cells are commonly utilized in protein-protein interaction studies. The selection of cell lines for overexpression (exogenous) experiments depends on their transfection efficiency and their ability to express TLR4 (the receptor for LPS). Additionally, we conducted endogenous experiments using THP-1 cells and BMDMs, which are a human macrophage cell line and murine primary macrophages, respectively. Moreover, we generated Vangl2f/f lyz-cre mice by specifically knocking out Vangl2 in myeloid cells, and investigated the effect of Vangl2 on NF-kB signaling in vivo.

    1. eLife assessment

      This valuable study presents findings on the role of the ubiquitin-conjugating enzyme UBE2D/eff in maintaining proteostasis during aging. The evidence supporting the conclusions is solid, although one reviewer had concerns about the readout for protein aggregation and the loss-of-function studies. In the future, mechanistic insights explaining the impact of UBE2D/eff deficiency on the accumulation of poly-ubiquitinated proteins and in shortening lifespan would be interesting. The present study is of broad interest to cell biologists working in aging and age-related diseases.

    2. Reviewer #1 (Public Review):

      In this study, Hunt et al investigated the role of the ubiquitin-conjugating enzyme UBE2D/effete (eff) in maintaining proteostasis during aging. Utilizing Drosophila as a model, the researchers observed diverse roles of E2 ubiquitin-conjugating enzymes in handling the aggregation-prone protein huntingtin-polyQ in the retina. While some E2s facilitated aggregate assembly, UBE2D/eff and other E2s were crucial for degradation of htt-polyQ. The study also highlights the significance of UBE2D/eff in skeletal muscle, showing that declining levels of eff during aging correlate with proteostasis disruptions. Knockdown of eff in muscle led to accelerated accumulation of poly-ubiquitinated proteins, shortened lifespan, and mirrored proteomic changes observed in aged muscles. The introduction of human UBE2D2, analogous to eff, partially rescued the deficits in lifespan and proteostasis caused by eff-RNAi expression in muscles.

      Comments on revised version:

      In this revised manuscript, the authors have addressed some of my concerns, yet several significant caveats remain unaddressed.

      One major concern stems from the unexpected outcome observed in the UBE2D/eff loss-of-function experiment. Despite its known role as a ubiquitin-conjugating enzyme (E2), reducing UBE2D/eff levels led to an increase in poly-ubiquitinated proteins and p62 accumulation, suggesting a more complex and multifaceted phenotype seemingly unrelated to the expected role of UBE2D/eff. The authors proposed that an overall disruption of protein quality control, indirectly caused by effRNAi, could explain these phenotypes. However, while the authors noted that effRNAi does not affect proteasome activity, they have not explored other possibilities, leaving a mechanistic explanation still missing.

      Furthermore, the comparative analysis of the old versus young proteome identified 10 out of 21 E2 enzymes, suggesting that other E2s may also contribute to age-related changes in proteostasis and lifespan. In this context, the authors mentioned that overexpression of human UBE2D2 in skeletal muscle does not influence lifespan, indicating that the reduced Eff levels observed during aging may not necessarily contribute to the aging phenotype.

      At this point, I believe the manuscript remains largely descriptive.

    3. Reviewer #2 (Public Review):

      The authors screened 21 E2 enzymes for their role in HTTExon1Q72-mCherry (HTT) aggregation in the Drosophila eye. They identified UBE2D, whose knockdown leads to increased HTT aggregation that can be rescued by ectopic expression of the human homolog. The protein levels of UBE2D decrease with aging and knockdown of UBE2D leads to an accumulation of ubiquitinated proteins and a shortened lifespan that can be rescued by ectopic expression of the human homolog. Knockdown of UBE2D leads to proteomic changes with up- and down-regulated proteins that include both components of the proteostasis network.

      Comments on revised version:

      The authors have not addressed a single critical point experimentally. Their explanations are not resolving my concerns and hence the following critical points remain:

      • The readout of HTT aggregation (with methods that are not suitable) as proxy for the role of UBE2D in proteostasis is not convincing.

      • UBE2D knockdown increases the number of HTT foci (Fig. 1A), but the quantification is less convincing as depicted in Fig. 1B and other E2 enzymes show a stronger effect (e.g. Ubc6 that is only studied in Figs. 1 + 2 without an explanation and Ubc84D). It does not help or add anything to this study that the authors refer to a previous publication. This review assesses this manuscript.

      • The quantification of the HTT fluorescence cannot be used as proxy for HTT aggregation. The authors should assess HTT aggregation by e.g. SDD-AGE, FRAP, filter retardation etc. The quantification of the higher MW species of HTT in the SDS-PAGE is not ideal either as this simply reflects material that is stuck in the wells that could not enter the gel. Aggregation and hence high MW size could be one reason, but it can also be HTT trapped in cell debris etc. This point is critical and I disagree with the response of the authors.

      • Does UBE2D ubiquitinate HTT? And thus, is HTT accumulation a suitable readout for the functional assessment of the E2 enzyme UBE2D? The authors state that UBE2D does not ubiquitinate HTT. Thus, HTT accumulation is an indirect consequence of perturbed proteostasis. There are certainly better readouts for the role of UBE2D once they have identified substrates.

      • The proteomic analyses could help to identify potential substrates for UBE2D. I think it is a missed chance to not follow up on the proteomic analysis to identify substrates and define the role of UBE2D in maintaining proteostasis.

      • Are there mutants available for UBE2D or conditional mutants? The caveats of RNAi are, first, incomplete knockdown and, second, variable knockdown efficiencies that increase variability. So mutants are available and yet the authors refuse to use those.

      • The analysis of the E3 enzymes does not add anything to this manuscript and the authors' response that this manuscript is a follow-up study on a previous publication of the lab is certainly not a valid argument.

      • The manuscript remains at this stage rather descriptive.

    4. Reviewer #3 (Public Review):

      This is an interesting paper that defines E2 and E3 genes in Drosophila that can impact the accumulation of the Q72-GFP protein in the fly eye. The authors then focus on the eff gene, showing which human homolog can rescue fly knockdown. They extend to skeletal muscle during natural aging to show that eff by TMT mass spec decreases with age normally in the fly muscle and that there is a significant overlap of proteins that are disrupted with eff knockdown in young animals in muscle vs aged animals normally in muscle.

      Overall these data suggest that eff decrease with age may contribute to the increase in ubiquitinated proteins in muscle with age, and that upregulation of eff activity might be of interest to extend lifespan. Because eff function can be performed by a human homologue the findings may also apply to human situations of aging.

      These data are overall interesting and of relevance for those interested in neurodegenerative disease and aging.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):  

      In this study, Hunt et al investigated the role of the ubiquitin-conjugating enzyme UBE2D/effete (eff) in maintaining proteostasis during aging. Utilizing Drosophila as a model, the researchers observed diverse roles of E2 ubiquitin-conjugating enzymes in handling the aggregation-prone protein huntingtin-polyQ in the retina. While some E2s facilitated aggregate assembly, UBE2D/eff and other E2s were crucial for degradation of htt-polyQ. The study also highlights the significance of UBE2D/eff in skeletal muscle, showing that declining levels of eff during aging correlate with proteostasis disruptions. Knockdown of eff in muscle led to accelerated accumulation of poly-ubiquitinated proteins, shortened lifespan, and mirrored proteomic changes observed in aged muscles. The introduction of human UBE2D2, analogous to eff, partially rescued the deficits in lifespan and proteostasis caused by eff-RNAi expression in muscles.

      The conclusions of this paper are mostly well supported by data, although a more precise mechanistic explanation of phenotypes associated with UBE2D/eff deficiency would have strengthened the study. Additionally, some aspects of image quantification and data analysis need to be clarified and/or extended.  

      We thank reviewer #1 for the thoughtful assessment of our work. We have amended the discussion to better explain the phenotypes associated with UBE2D/eff deficiency. We have also improved the methods describing the procedures for image quantification and data analysis.

      Reviewer #2 (Public Review):  

      Important findings: 

      - Knockdown of UBE2D increases HTT aggregation. 

      - Knockdown of UBE2D leads to an accumulation of ubiquitinated proteins and reduces the lifespan of Drosophila, which is rescued by an ectopic expression of the human homolog. 

      - UBE2D protein levels decline with aging. 

      - UBE2D knockdown is associated with an up- and downregulation of several different cellular pathways, including proteostasis components. 

      Thank you for reviewing our manuscript.

      Caveats: 

      - The readout of HTT aggregation (with methods that are not suitable) as a proxy for the role of UBE2D in proteostasis is not convincing. It would probably improve the manuscript to start with the proteomic analysis of UBE2D to demonstrate that its protein levels decrease with aging. The authors could then induce UBE2D in aged animals to assess the role of UBE2D in the proteome with aging.  

      While presenting the data in a different order would be possible, we prefer to keep the current order: starting from a general screen with a proteostasis readout (HTT aggregates; see the answer below for a discussion on the methods), we identify a candidate (UBE2D), which is then studied in more detail with additional focused analyses in the retina and skeletal muscle during aging. Concerning the induction of UBE2D in aged animals, our analyses in Figure 4E demonstrate that muscle-specific induction of UBE2D2 throughout life does not by itself increase lifespan. This could be explained by UBE2D2 only partially recapitulating the function and substrate diversity of Drosophila eff/UBE2D, owing to divergence from a single Drosophila UBE2D enzyme (eff) to multiple UBE2D enzymes in humans (UBE2D1/2/3/4).

      - UBE2D knockdown increases the number of HTT foci (Figure 1A), but the quantification is less convincing as depicted in Figure 1B, and other E2 enzymes show a stronger effect (e.g. Ubc6 that is only studied in Figures 1 and 2 without an explanation and Ubc84D). The graph is hard to interpret. What is the sample size and which genetic conditions show a significant change? P values and statistical analyses are missing.  

      The full data underlying this genetic screen is reported in Supplementary Table 1. The role of UBC6/UBE2A/B is thoroughly examined in Hunt et al 2021 (PMID: 33658508). We agree that Ubc84D has an important effect and that it should be considered for future studies. We have amended the legend of Figure 1 to indicate that each data point in the graph represents a single RNAi line targeting the corresponding gene. The mean of 5 biological replicates is shown for each RNAi, with each biological replicate representing a single eye imaged from a distinct fly. Therefore, the data points that do not show large magnitude changes may indicate RNAi lines that were not effective at knocking down the target protein (or that did not affect HTT aggregates). The E2s worth pursuing were identified because of multiple RNAi lines scoring consistently: this is the case of UBC6 (studied previously in PMID: 33658508) and eff/UBE2D (pursued in this study). This screen was therefore utilized to identify and select candidate genes (i.e. eff/UBE2D) for more in-depth studies on proteostasis.

      - The quantification of the HTT fluorescence cannot be used as a proxy for HTT aggregation. The authors should assess HTT aggregation by e.g. SDD-AGE, FRAP, filter retardation, etc. The quantification of the higher MW species of HTT in the SDS-PAGE is not ideal either as this simply reflects material that is stuck in the wells that could not enter the gel. Aggregation and hence high MW size could be one reason, but it can also be HTT trapped in cell debris, etc.  

      We agree that the use of multiple methods is a good way to assess the impact of E2 enzymes on HTT protein aggregation. In this regard, we estimated HTT aggregates by fluorescence microscopy and by western blot. Microscopy-based analyses demonstrate both the accumulation of the HTT-GFP pathogenic protein into aggregates (HTT polyQ polypeptides aggregating into one spatial region; Fig. 1 and Fig. 2B) as well as their potential cytotoxicity, resulting in the disruption of the ommatidial ultrastructure and cellular degeneration (Fig. 2A). Similar to native gels and filter retardation, we have utilized SDS-PAGE and western blotting of cellular samples isolated with strong chaotropic and denaturing reagents (8M urea plus detergents and reducing reagents used in the lysis). These experimental conditions maintain the higher-order organization of HTT into high-molecular-weight aggregates that are not broken down into individual polypeptides and that therefore do not readily travel through a gel or filter. Therefore, the biochemical methods we have used are equivalent to those proposed by the reviewer. In addition to combining microscopy-based and biochemical approaches to examine the impact of eff/UBE2D on the HTT aggregates, we have analyzed eff/UBE2D during skeletal muscle aging and found phenotypes consistent with those observed in the HTT model: RNAi for eff/UBE2D leads to the accumulation of detergent-insoluble ubiquitinated proteins that associate with protein aggregates.

      - Does UBE2D ubiquitinate HTT? And thus, is HTT accumulation a suitable readout for the functional assessment of the E2 enzyme UBE2D? 

      We propose that the accumulation of HTT in response to eff/UBE2D RNAi may be due to a generalized loss of protein quality control rather than to a direct decline in the ubiquitination of HTT by eff/UBE2D. In a previous study that examined the UBE2D interactome (Hunt et al. 2023; PMID: 37963875), we did not find an interaction between UBE2D and HTT, suggesting that HTT may not be directly modulated by eff/UBE2D via ubiquitination.

      - The proteomic analyses could help to identify potential substrates for UBE2D.

      The proteomic analyses in Figure 5 identify several proteins that are modulated by RNAi for eff and by its human homolog, UBE2D2. Such eff/UBE2D2-modulated proteins may indeed be potential substrates for UBE2D-mediated ubiquitination. For example, this is the case for Pex11 and Pex13, which were found to be upregulated upon UBE2D RNAi also in human cells, where they are ubiquitinated in a UBE2D-dependent manner (Hunt et al. 2023; PMID: 37963875).

      - Are there mutants available for UBE2D or conditional mutants? One caveat of RNAi is: first not complete knockdown and second, variable knockdown efficiencies that increase variability.

      There are potential hypomorphic alleles of eff/UBE2D that may be available, but they would present the same caveats of incomplete loss of eff/UBE2D function as RNAi. Given the strong phenotype that we find with partial eff knockdown, a caveat of full eff/UBE2D knockout is that this could be lethal.

      - The analysis of the E3 enzymes does not add anything to this manuscript. 

      The analysis of E3 enzymes relates to our recent publication (Hunt et al. 2023; PMID: 37963875) that reports the physical interactions between E2 and E3 enzymes. Analysis of these E2-E3 pairs in the genetic screen in Fig.1 therefore follows this IP-MS study to provide insight into the functional interaction between these E2-E3 pairs in proteostasis.

      - Figure 2B: the fluorescence intensities in images 2 and 4 are rather similar, yet the quantification shows significant differences. 

      Please note that some of the GFP fluorescence in image 4 is not punctate, but rather diffuse fluorescence that is not related to HTT-GFP aggregates. Our image quantitation methods utilized thresholding to identify GFP-positive puncta while eliminating background fluorescence not corresponding to HTT-GFP puncta.

      - The proteomic analyses could provide insights into the functional spectrum of UBE2D or even the identification of substrates. Yet apart from a DAVID analysis, none of the hits were followed up. In addition, only a few hits were labelled in the volcano plots (Figure 5). On what basis did the authors select those?

      Please see the previous answer above regarding the identification of eff/UBE2D protein substrates from our proteomic analysis in Fig. 5. Only some of the top-regulated hits could be labeled in Fig.5 to avoid overcrowding.

      - The manuscript remains at this stage rather descriptive. 

      Our study has demonstrated a key role for the eff/UBE2D ubiquitin-conjugating enzyme in regulating protein quality control during aging in the Drosophila retina and skeletal muscle. Our study has identified key proteins that are modulated by eff/UBE2D RNAi in Drosophila muscle, that are rescued by expression of human UBE2D2, and that may underlie the accelerated decline in proteostasis that occurs upon eff/UBE2D RNAi. While more could be known about the regulation of these eff/UBE2D-modulated proteins in Drosophila, we have previously demonstrated that some of the proteins that are upregulated by UBE2D RNAi in human cells (e.g. some peroxins) are indeed direct ubiquitination targets of UBE2D via associated E3 ubiquitin ligases (Hunt et al. 2023; PMID: 37963875).

      Reviewer #3 (Public Review):  

      This is a potentially quite interesting paper that defines E2 and E3 genes in Drosophila that can impact the accumulation of the Q72-GFP protein in the fly eye. The authors then focus on the eff gene, showing which human homolog can rescue fly knockdown. They extend to skeletal muscle, from the htt protein, to show that eff by TMT mass spec decreases with age normally in the fly muscle and that there is a significant overlap of proteins that are disrupted with eff knockdown in young animals in muscle vs aged animals normally in muscle.

      Overall these data suggest eff decrease with age may contribute to the increase in ubiquitinated proteins in muscle with age, and that upregulation of eff activity might be of interest to extending lifespan. Because eff function can be performed by a human homologue, the findings may also apply to human situations of aging. 

      These data are overall interesting and are of relevance for those interested in neurodegenerative disease and aging, although a number of points from the figures seem confusing and need more explanation or clarity. 

      Thank you for reviewing our manuscript, we have improved the explanations and clarity of the manuscript.

      Recommendations for the authors:

      We would like to keep the manuscript title as it is currently to report the partial overlap in the proteomic changes induced by aging and effRNAi (Fig. 6).

      Reviewer #1 (Recommendations For The Authors): 

      (1) A significant concern arises from the unexpected outcome observed in the UBE2D/eff loss-of-function experiments. Despite its role as a ubiquitin-conjugating enzyme (E2), the reduction in UBE2D/eff levels paradoxically increased polyubiquitinated proteins and p62 accumulation, presenting a more intricate and seemingly unrelated phenotype to its anticipated function. 

      eff/UBE2D represents one out of 21 different Drosophila E2 ubiquitin-conjugating enzymes and therefore eff RNAi alone is unlikely to reduce the total pool of ubiquitinated proteins. The generalized increase in insoluble polyubiquitinated proteins results from an overall derangement of protein quality control caused by effRNAi. In agreement with this scenario, the protein categories that were found to be modulated by effRNAi (Fig. 5) include proteins associated with protein quality control such as proteasome components and chaperones. Therefore, derangement in the levels of a wide range of regulators of proteostasis may lead to a generalized loss of protein quality control upon effRNAi.

      I believe elucidating the mechanisms underlying the impact of UBE2D/eff deficiency on the observed phenotypes would contribute to a more comprehensive understanding of the study's implications. For instance, investigating whether the loss of UBE2D/eff influences muscle proteostasis by impeding proteasome assembly or function, modulating autophagy, etc. 

      We have previously utilized luciferase assays to measure the proteolytic activity of the proteasome in human cells treated with siRNAs targeting UBE2D1/2/3/4 but found no effect of UBE2D knockdown compared to control nontargeting siRNAs (Hunt et al. 2023; PMID: 37963875). In Drosophila muscles, we have examined the levels of GFP-CL1 (a GFP fused with a proteasomal degron) and found that effRNAi does not impact GFP-CL1 levels (data shown in author response image 1). Overall, these results suggest that effRNAi reduces protein quality control without affecting proteasome activity.

      Author response image 1.

      (2) Related to Figures 1B-C: It is not clear to this reviewer the quantification methodology used in the experiment. Does each point represent the Average +/- SD for each replicate? If so, it appears that not all cases align with the n=5 as indicated in the figure legend. Additionally, how many animals per replicate were quantified? 

      We have amended the legend of Figure 1 to indicate that each data point in the graph represents a single RNAi line targeting the corresponding gene. The mean of 5 biological replicates is shown for each RNAi line, with each biological replicate representing a single eye imaged from a distinct fly. Therefore, the data points that do not show large magnitude changes may indicate RNAi lines that were not effective at knocking down the target protein (or that had no effect on HTT aggregates).

      (3) Related to the previous point: The analysis of pathogenic Huntingtin aggregation in the Materials and Methods section lacks information regarding the number of individuals, replicates, etc. 

      Please see the response above.

      (4) Related to Figure 1 B: In the case of eff/UBE2D, it appears that 3 out of 9 replicates demonstrate a significant increase in Htt-polyQ aggregates. Considering the strength of this result, it raises questions about whether it justifies using eff for future analyses.

      Please see the response to point (2) above. These results indicate that 3 distinct UAS-RNAi lines targeting eff/UBE2D produced the same effect whereas 6 other effRNAi lines did not, possibly because they are less efficacious in knocking down eff/UBE2D. We have now amended the legend of Fig. 1B to better explain these results.

      (5) Related to Figure 1 D-E: Could the authors provide clarification regarding the tissue type and animal age utilized in these experiments? 

      Whole flies were utilized at 1 week of age.

      (6) Related to Figure 3: Incorporating the normal accumulation of poly-ubiquitinated proteins during aging could provide context to better interpret the effect of eff/UBE2D KD at 3 weeks of age. 

      Several papers from us and others have previously demonstrated a progressive increase in the insoluble levels of poly-ubiquitinated proteins during aging in Drosophila skeletal muscle (PMID: 36640359; PMID: 31249065; PMID: 33773104; PMID: 33658508; PMID: 24092876; PMID: 21111239; PMID: 24244197; PMID: 25199830; PMID: 28878259; PMID: 36213625). Our analyses now indicate that such age-related loss of protein quality control is accelerated by eff/UBE2D knockdown.

      (7) Related to Figure 3: Would it be possible for the authors to include a list or table detailing the specific E2, deubiquitinating enzymes, and E3s identified in the comparative analysis of the old vs young proteome? This would provide a clear reference for the identified regulatory proteins involved in the age-related proteomic changes. 

      We have added a tab to Supplementary Table 2 to report the list of age-regulated deubiquitinating enzymes (DUBs) and E1, E2, and E3 enzymes.

      (8) Related to Figures 3 and 4: Given that the comparative analysis of the old versus young proteome identified 10 out of 21 E2 ubiquitin-conjugating enzymes, exploring the impact of eff/UBE2D overexpression becomes pivotal to understanding its role in age-related changes in proteostasis and lifespan. Conducting an experiment involving eff overexpression could provide valuable insights into whether restoring eff levels mitigates aging-related phenotypes. 

      Although we have not done this experiment with eff overexpression, Fig. 4E reports that the overexpression of human UBE2D2 in skeletal muscle does not appear to influence lifespan by itself (green line in Fig. 4E), although it can partially rescue the short lifespan of flies with muscle-specific effRNAi (purple line in Fig. 4E).

      (9) Providing a more detailed description of the Supplementary Tables would significantly enhance the reader's comprehension of their content. 

      A description has been added at the end of the methods.

      Reviewer #2 (Recommendations For The Authors): 

      In addition to the points listed above:

      - The title does not reflect the content of the manuscript and should be changed. There is no evidence that UBE2D maintains a "youthful" (needs to be changed as well) proteome. Rather, its expression declines with aging and its depletion leads to an increase of ubiquitinated proteins. This is true for essentially the entire proteostasis network. 

      While proteostasis generally declines with aging, it is incompletely understood what specific components of the proteostasis network are dysregulated with aging. Our study now identifies the E2 ubiquitin-conjugating enzyme eff/UBE2D as a key regulator of proteostasis that is transcriptionally downregulated with aging. Comparison of the proteomic changes induced by aging versus those induced by effRNAi in young age indicates a partial overlap (Fig. 6), indicating that eff/UBE2D is, at least in part, necessary to maintain the proteome composition that is found in young age (“youthful”). On this basis, we would like to keep the current title but have amended the manuscript to indicate that such regulation of the proteome composition is only in part dependent on eff/UBE2D.

      - Molecular weight markers are missing for the gels/western blot depicted in Fig 1E, 2C, 3E, and 4A. 

      Thank you for pointing this out; these have been added.

      - Fig. 4A, the Ponceau staining for the detergent insoluble samples shows almost no signal for lane 7 and the data should hence not be analyzed. 

      The western blot membrane in Fig. 4A shows a reliable signal in all lanes (including lane 7) when probed with antibodies for ubiquitin, Ref(2)P, and tubulin. Therefore, there is no reason for excluding lane 7 from the analysis. Ponceau S staining is provided as an additional loading control but was not used to normalize the data.

      Reviewer #3 (Recommendations For The Authors): 

      There are a number of confusing or not sufficiently explained points in the figures that require clarity. 

      In Figure 1, panels B and C, one assumes the gray broad line across means no difference from control. For the genes, many have points that are scattered both above and below that control line. What do the dots and range represent for each gene, and why are the data so scattered? How do the authors explain data ranging from no effect, to a negative effect to a positive effect, all for the same gene? Akt1 and Hsp83 are controls but are not quantitated to appreciate how variable the assay is. Can they explain the figure better, and also why the data for any one gene are so variable?

      We have amended the legend of Figure 1 to indicate that each data point in the graph represents a single RNAi line targeting the corresponding gene. The mean of 5 biological replicates is shown for each RNAi line, with each biological replicate representing a single eye imaged from a distinct fly. Data points that do not show large-magnitude changes may therefore indicate RNAi lines that were not effective at knocking down the target protein (or that did not affect HTT aggregates). The variability in the analysis of a single gene thus arises because different RNAi lines targeting that gene may have different efficacy. RNAi lines for Akt1 and Hsp83 are merely used as controls (these have been quantified in Jiao et al. 2023; PMID: 36640359).

      In Figure 2A, it is not clear which animals have the hL-Q72-GFP (which eyes are "rough eyes"?). Also, do ubc6-RNAi and eff-RNAi have an impact on the normal eye? That is, can they explain the images and genotypes more clearly. 

      UBC6 and eff RNAi produce these rough eye phenotypes in the absence of HTT-polyQ and these are rescued by the expression of their human homologs. The panel images indicated in bold here below are those that have “rough eye” phenotypes: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 (a green R has been added to these panels in Fig. 2A).

      In Figure 2B, panel 3 looks very different from 1 and 4 and yet is not different from them by quantitation. Can they replace it with a more representative panel or is 3 lower (but not significantly so)? 

      Please note that some of the GFP fluorescence in image 4 is not punctate, but rather diffuse fluorescence that is not related to HTT-GFP aggregates. Our image quantitation methods utilized thresholding to identify GFP-positive puncta while eliminating background fluorescence not corresponding to HTT-GFP puncta.
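      As a rough illustration of the thresholding idea described above, the following minimal Python sketch counts bright, compact regions while ignoring dimmer diffuse signal. It is not the authors' quantitation pipeline: the function name `count_puncta`, the `threshold` and `max_area` parameters, and the toy image are all assumptions made for illustration.

```python
from collections import deque


def count_puncta(image, threshold, max_area):
    """Count 4-connected regions of pixels >= threshold with area <= max_area.

    Hypothetical helper: `image` is a list of rows of intensities.
    Dim diffuse signal falls below `threshold`; large bright blobs that
    exceed `max_area` pixels are not counted as puncta.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    puncta = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Flood-fill one connected region of above-threshold pixels.
            area = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                area += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area <= max_area:
                puncta += 1
    return puncta


# Toy image: diffuse background of 3 with two bright spots of 9.
toy_image = [
    [3, 3, 3, 3, 3],
    [3, 9, 3, 3, 3],
    [3, 3, 3, 9, 9],
    [3, 3, 3, 3, 3],
]
print(count_puncta(toy_image, threshold=5, max_area=4))
```

      With a threshold above the diffuse level, only the two bright spots are counted; lowering the threshold below the background merges everything into one large region, which the area filter then rejects.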

      In Figures 3E and F, it would be helpful in F to put the detergent soluble bar graphs all on the left so that those data are on the left in both E and F, and then detergent-insoluble in E and F to the right. This would make the figure and quantitation easier to follow. 

      Done.

      The same point as above for Figures 4 A and B. 

      Done.

      In Figure 3A, CG7656 is nearly as reduced with age as eff. One wonders if that gene would give a different or similarly overlapping proteome with age as eff. Was CG7656 not focused on because not conserved? 

      As indicated in Figure 1B, CG7656 is orthologous to UBE2R1 (also called CDC34) and UBE2R2 in humans. In this screen, however, RNAi targeting CG7656 did not appear to influence HTT aggregates and therefore was not selected for further analyses. However, it may play a role in skeletal muscle proteostasis during aging.

      In Figure 6, the R² value correlating age with eff-RNAi is weak. Although they discuss this in the text, it might also be helpful to include Venn diagrams for gene overlaps and the significance to make the argument more clear that there is a significant correlation in proteins up and down to indicate that eff largely recapitulates the changes of aging. Correlating this with proteins that are restored with UBE2D in muscle in a more clear manner may also be helpful for readers interested in aging. 

      We have amended the text to indicate that this relatively low correlation (R² ≈ 0.2, but corresponding to a consistent regulation of 70% of proteins by aging and effRNAi) could indicate that eff/UBE2D is only in part responsible for maintaining a youthful composition of the muscle proteome during aging. Other changes that occur with aging likely account for non-correlated alterations in protein levels. We have also added Venn diagrams (Fig. 6E) to further display the overlap in protein regulation by aging vs. effRNAi.
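      To illustrate how a low coefficient of determination can coexist with a high fraction of proteins changing in the same direction, here is a minimal Python sketch. The fold-change values are invented for illustration and are not taken from the study; the helper names `r_squared` and `sign_concordance` are likewise hypothetical.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def sign_concordance(xs, ys):
    """Fraction of pairs whose values share the same sign (both up or both down)."""
    same = sum(1 for x, y in zip(xs, ys) if x * y > 0)
    return same / len(xs)


# Invented log fold changes: every protein moves in the same direction in
# both conditions, but the magnitudes are poorly matched.
aging_lfc = [0.1, 3.0, -0.1, -3.0]
rnai_lfc = [3.0, 0.1, -3.0, -0.1]

print(sign_concordance(aging_lfc, rnai_lfc))
print(r_squared(aging_lfc, rnai_lfc))
```

      In this toy case every pair agrees in direction while the squared correlation stays close to zero, because R² is sensitive to how well the magnitudes line up, not only to the direction of change.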

      In Figure 7, they might indicate that the accumulated insoluble protein is ubiquitinated. That is left out of the figure, although indicated in the legend. 

      Done.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Our revised version of the manuscript addresses all the comments and suggestions raised, as clarified in our point-by-point answer to the reviewers. We have performed additional experiments regarding the effects on proliferation and differentiation of additional cell types in the muscle, such as myogenic and mesenchymal progenitors, as well as on chondrogenesis in parental hMSCs that did not express exogenous ACVR1. Moreover, as suggested by reviewer #2, we performed all the chondrogenic experiments with the addition of TGFβ to the differentiation media and analyzed chondrogenesis by both Alcian blue staining and qPCR analysis of gene markers (Sox9, Acan, Col2a1 and Mmp3). We also extended our RNA-seq analysis and included new data using hMSCs expressing either the wild-type or the R206H ACVR1 receptor, with or without different ACVR1 ligands (BMP6 and Activin A), and treated or not with the inhibitor BYL719. The new data suggest that BYL719 is able to inhibit the expression of genes involved in ossification and osteoblast differentiation irrespective of the presence of the mutation. We also discuss the effect of BYL719 on mTOR signaling and have addressed all the minor comments suggested by both reviewers.

      We addressed the specific comments of the reviewers as follows:

      Reviewer # 1:

      Specific points:

      Point #1 and #2. The authors showed that BYL719 inhibited HO in FOP model mice. Did they have HO not only in the muscle but also in the bone marrow? The progenitor cells of chondrocytes and osteoblasts may differ between the muscle and bone marrow. The authors should examine the effects of BYL719 on some other types of cells in the muscle, such as myoblasts and fibro-adipogenic cells, in addition to the bone marrow-derived MSCs. Furthermore, it was unclear whether they were human or murine MSCs in the text.

      The inhibitory effect of BYL719 on HO in FOP mice was clear, but the molecular mechanisms or target cells were still unclear because BYL719 affected multiple types of cells and molecules. The authors are encouraged to show more clearly the mechanisms and target cells that are critical for the inhibition of HO. Again, this reviewer believes that in vivo and in vitro experiments using muscle and bone marrow and cells prepared from them should provide additional critical information.

      As detailed in the introduction, it is known that heterotopic ossification develops in the skeletal muscle and connective tissues. Consistent with the current knowledge of the field, none of the mice showed HO in the bone marrow. Additionally, since activation of the mutant allele is achieved by injection of CRE-expressing adenovirus and cardiotoxin in the hindlimb muscle, it is unlikely that mesenchymal progenitors in the bone marrow would be strongly affected. Interestingly, single-cell RNA sequencing from multiple mouse tissues identified a very strong transcriptional similarity between FAPs and non-muscle mesenchymal progenitors (PMID: 37599828). As suggested, we examined the effects of BYL719 on proliferation and differentiation in additional cell types such as muscle progenitors. In this new version of the manuscript, we show that BYL719 reduces the proliferation of muscle and mesenchymal progenitors while it blocks myoblast differentiation in vitro (Figure 7, Figure Supplement 1). The MSCs used in the experiments shown in Figure 3 were murine, whereas those used in the assays shown in Figures 5 and 6 were of human origin. We have further clarified this in the respective Figure legends.

      All the data generated strongly suggest that there is not a single mechanism supporting all the effects of BYL719 in HO. Instead, BYL719 affects multiple cell types involved in efficient HO (e.g. reduction in proliferation and osteochondrogenic specification of mesenchymal precursors (MPs), reduction in proliferation, migration, and inflammatory gene expression in monocytes, etc.). Interestingly, our data suggest that BYL719 is able to inhibit these effects on MPs and monocytes irrespective of the presence of the ACVR1-R206H mutation (Figures 5, 6 and 7). Additionally, there are several signaling mechanisms affected. BYL719 reduces SMAD1/5, p38, AKT and mTOR signaling in parental MPs or in MPs with mutations in ACVR1 (Figure 3 and our previous publication PMID: 31373426), and all of these pathways are required for efficient osteochondrogenic specification of MPs. We consider that the different detailed mechanisms by which BYL719 inhibits osteochondrogenic specification enhance the robustness of the findings in this study.

      Point #3. In FOP model mice, ACVR1 was mutated as Q207D. However, R206H was used in in vitro experiments. Do they have the same characteristics? This reviewer would like to recommend examining the effect of BYL719 on wild-type ACVR1, R206H, and Q207D simultaneously in each experiment.

      We already performed these experiments, assaying in parallel ACVR1-WT, ACVR1-Q207D and ACVR1-R206H, in the transcriptional responses of MPs in our previous work (PMID: 31373426). Both mutations produced similar responses, with ACVR1-Q207D being stronger than ACVR1-R206H, as has been shown in vivo in mouse models of HO (PMID: 34633114). In any case, BYL719 inhibits the transcriptional responses induced by both mutant alleles.

      Point #4. Figure 5: What was the effect of BYL719 on the differentiation of parental cells that did not express exogenous ACVR1?

      We performed new assays of chondrogenic differentiation of hMSCs that are shown in the new Figure 5. BYL719 inhibits chondrogenic differentiation of parental hMSCs and also inhibits chondrogenic specification irrespective of the expression of either wild type or mutant ACVR1.

      Point #5. Figure 6: In this experiment, gene expression was examined in pretreated MSCs-ALK2 (ACVR1?) R206H with and without BYL719. It was unclear whether suppression of gene expression by BYL719 was specifically caused in cells expressing R206H. What were the effects of BYL719 on parental cells that did not express exogenous ACVR1?

      To be consistent, we relabeled ALK2 to ACVR1 in the figure. We expanded the conditions analyzed in the RNA-sequencing. We included conditions where we activate ACVR1 (either WT or R206H) with its known physiological ligand BMP6. In both human MSCs expressing ACVR1-R206H and human MSCs expressing wild-type ACVR1, we observed a downregulation of differentially expressed genes upon addition of BYL719, irrespective of ligand (BMP6 or Activin A) or receptor (RH or WT) (added new Figure 6: B and C).

      Point #6. Figure 7: BYL719 suppressed cell proliferation of all cells examined partially at 2 uM and almost completely at 10 uM, respectively. There is a possibility that BYL719 inhibits HO by inhibiting osteochondroprogenitor proliferation. The authors are encouraged to show data on the effect of BYL719 on the proliferation of other types of cells, such as myoblasts, fibro-adipogenic cells, or bone marrow cells.

      We examined the effects of BYL719 on proliferation in additional cell types such as muscle and mesenchymal progenitors. BYL719 slightly reduced the proliferation of myoblasts and mesenchymal cells in vitro (Figure 7, Figure Supplement 1). However, the reduction in proliferation in myoblasts or MPs did not reach the extent observed in monocytes or macrophages (Figure 7).

      Point #7. Figure 8: How was the effect of BYL719 on muscle regeneration in wild-type? It was reported that mTOR signaling is important in HO in FOP. The authors are encouraged to show the effect of BYL719 on mTOR signaling.

      Muscle regeneration in wild-type mice has also been shown in our previous results (PMID: 31373426). In addition, we included images of muscle regeneration after 23 days of treatment with BYL719 in ACVR1Q207D mice with or without PI3Kα deletion after induction of HO in the new Figure 2, Figure Supplement 2. These mice showed full muscle regeneration or small calcifications surrounded by muscle at most. The effects of PI3Kα inhibitors, either BYL719 or A66, on mTOR signaling had been previously shown by our group (PMID: 31373426). Both inhibitors strongly reduced signaling of mTOR, visualized by activation of p70 S6-kinase, a surrogate marker of mTOR activity.

      Minor points:

      (9) SMAD 1/5 should be SMAD1/5.

      (10) The source of human MSCs should be indicated in the text.

      (11) ALK2 should be ACVR1 in Figure 6A.

      (12) The protein levels of each receptor should be examined in Fig. 4.

      We introduced the suggested changes in the manuscript and Figure 6 and indicated the source of human MSCs in Materials and Methods. We also examined the levels of each receptor that are shown in the new Figure 4, Figure Supplement 1.

      Reviewer # 2:

      Specific points:

      Point #1. Because the involvement of PI3K in HO of FOP was already reported by the authors' group and also others (Hino et al, Clin Invest, 2017), the main purpose of this study was to disclose the mechanism of how PI3K was activated in FOP cells. In the published study (Hino et al, Clin Invest, 2017), PI3K was activated by the ENPP2-LPA-LPR cascade. Unfortunately, there were no new data for this important issue.

      The main purpose of this study is to demonstrate that the pharmacological and genetic inhibition of PI3Kα in HO progenitors at injury sites reduces HO in vivo, to extend the insights into the molecular and cellular mechanisms responsible for the therapeutic effect of PI3K inhibition, and to optimize the timing of the administration of BYL719. Class I PI3Ks are heterodimers of a p110 catalytic subunit in complex with a regulatory subunit. They engage in signaling downstream of tyrosine kinases, G protein-coupled receptors and monomeric small GTPases. Therefore, a plethora of growth factors, cytokines, inflammatory agents, hormones and additional external and internal stimuli are able to activate PI3Kα (PMID: 31110302). In fact, TGF-β family members, including activin A, are able to activate PI3K and mediate some of their non-canonical responses (PMID: 19114990). Multiple factors with known increased expression in the ossifying niche in HO and FOP (e.g. activin A, TGF-β, inflammatory agents such as TNFα, IL6, IL3, etc.) are known activators of PI3K (PMID: 30429363). Interestingly, in our RNA-seq analysis in hMSCs we did not observe increased expression levels of Enpp2 when comparing wild type and R206H mutated cells treated with activin A.

      Point #2. The HO formation of ACVR1/Q207D model mice in this study is extremely unstable (Figure 1B, DMSO). Even the bone volume of some red symbols, which indicate the presence of HO, is located on the base (0.00) line. I would examine carefully the credibility of the data. Also, it is well known that the molecular behavior of mice Acvr1/Q207D and human ACVR1/R206H was different.

      We agree with the reviewer that induction of HO is variable between mice, with variations in penetrance and intensity of the ossifying lesions. This variability is a known common trend that appears in all the models of HO published so far (e.g. PMID: 28758906, PMID: 26333933). Accordingly, we did not exclude from the μCT analysis any animal that had been injected with CRE-expressing adenovirus plus cardiotoxin. Regarding the behavior of mice Acvr1/Q207D and human ACVR1/R206H, it is well known that Q207D produces more robust and stronger responses in terms of signaling and formation of heterotopic ossification (PMID: 34633114). Therefore, reduction of HO by BYL719 would be more stringent in the Acvr1/Q207D model.

      Point #3. The experimental design of Figure 5 experiments is confusing. Although the authors mentioned that the data in Figure 5A were taken seven days after chondrogenic induction, I am skeptical whether the chondrogenic induction was successful. Based on the description of Material and Methods, the authors did not include TGFβ in their "Differentiation Medium", which is an essential growth factor to induce chondrogenic differentiation of human MSC. Why did the ALP activity increase after chondrogenic induction? The authors should demonstrate the evidence of successful chondrogenic induction by showing the expression of key chondrogenic genes such as SOX9, ACAN, or COL2A1. The data in Figure 5B-E are also confusing. The addition of Activin A showed no difference between ACVR1/WT and ACVR1/R206H cells, suggesting that these cells did not reproduce the situation of FOP.

      We performed new assays of chondrogenic differentiation of hMSCs that are shown in the new Figure 5. We included TGFβ1 in the differentiation medium and also included the parental cell line in the analysis. In addition to being a marker of osteoblast differentiation, alkaline phosphatase (ALPL) has also been shown to be induced during chondroblast differentiation in vitro (PMID: 19855136; PMID: 9457080; PMID: 18377198; PMID: 23388029). Moreover, expression data for SOX9, COL2A1, ACAN and MMP13 in cells after chondrogenic differentiation are included in the new Figure 5. Expression of some markers (e.g. ACAN) is increased by the expression of ACVR1R206H; however, we did not observe significant differences in chondroblast differentiation gene expression between ACVR1wt and ACVR1R206H expressing cells. In any case, BYL719 could inhibit chondrogenic differentiation of parental hMSCs and also the chondrogenic specification irrespective of the expression of either wild type or mutant ACVR1.

      Point #4. The experimental design and data analyses of RNA-seq were inappropriate and insufficient, which is disappointing for the reviewer because this will be a key experiment in this study. Because the most important point is to identify the signal for PI3Kα induced by Activin A via ACVR1/R206H, they should also use hMSC-ACVR1/WT for this experiment. Because the authors clearly demonstrated that TGFBR were not targets of BYL719, they should compare the expression profiles between MSC-ACVR1/WT and MSC-ACVR1/WT with BYL719 to identify the targets of BYL719 unrelated to Activin A signal. Then the expression profiles of ACVR1/R206H cells treated with Activin A and Activin A plus BYL719 were compared. Among down-regulated signals by BYL719, those found also in MSC-ACVR1/WT should be discarded. It is important to investigate whether the GO term of ossification or osteoblast differentiation is found also in MSC-ACVR1/WT. If it is so, the effect of BYL719 is not specific for FOP cells.

      We extended our RNA sequencing analysis with additional experimental conditions and comparisons. In new Figure 6, we now compare hMSCs expressing wild type or R206H receptors, with or without BYL719 inhibition, and with or without different ligand activations (BMP6 or Activin A) (New Figure 6A). New Figure 6B shows the Gene Ontology analysis of the differentially expressed genes between cells expressing WT and RH receptors under control conditions. Ossification (GO:0001503) and osteoblast differentiation (GO:0001649) were detected within the top 10 significantly differentially regulated biological processes between these conditions. Therefore, we analyzed these relevant GO terms in 5 different comparisons upon GO enrichment analysis (Figure 6C). In addition to the comparison between cells expressing WT and RH receptors under control conditions explained above, we also compared cells expressing WT or RH receptor, with different ACVR1 ligands (BMP6 and Activin A), and with or without BYL719 inhibitor. The addition of BYL719 resulted in a downregulation of the GO terms “ossification” and “osteoblast differentiation” (new Figure 6C). These results confirm the inhibitory effect of BYL719 on ossification and osteoblast differentiation biological processes, and indicate that this inhibitory effect remains consistent upon BMP6 or Activin A ligand activation, and with ACVR1 WT and RH expression.

      Point #5. The data in Figure 7 were not related to the aim of this study because cell lines used in these experiments did not have ACVR1/R206H mutations. It is not appropriate to extrapolate these data in the FOP situation.

      We utilized immune cell lines in which we could activate ACVR1 with its known physiological ligand BMP6. Mutated ACVR1 gains a response to activin A while maintaining the physiological response to BMP6 seen with the wild-type form. Therefore, in these assays we interrogated in vitro, with addition of BMP6, the effects of BYL719 on growth, migration and inflammatory gene expression under conditions of activated ACVR1 receptor downstream signaling. We consider that understanding the effects of PI3Kα inhibition on the regulation of proliferation, migration and inflammatory cytokine expression in monocytes, macrophages and mast cells is essential to better define the potential outcome of BYL719 treatment for heterotopic ossifications.

      Minor comments:

      (1) The legends for Figure 1C were those for Figure 1D, and there were no descriptions for Figure 1C in the legends and methods section. The reviewer was unable to understand the meaning of BV/TV. What is TV?

      (2) “However, in PI3Kα deficient mice ACVR1Q207D expression only led to minor ectopic calcifications that were already surrounded by fully regenerated muscle tissue on the 23rd day after injury (Figure 2D, Figure 2-Figure Supplement 1B)”: There were no histological data in either Figure 2D or Figure 2-Figure Supplement 1B that showed muscle tissues.

      (3) "The overexpression of Acvr1R206H increased basal and activin dependent expression of canonical (Id1 and Sp7) and non-canonical (Ptgs2) BMP target genes (Figure 3C),": There was no increase of Ptgs2 gene in basal level.

      (4) Materials and Methods. Production of human fetal mesenchymal stem cells expressing ACVR1.: Is it derived from a fetus?

      (5) Figure 6C: There was no description of the meaning of each column. What does AA mean and what is the number?

      We introduced the missing information in the manuscript, Figure legends and Materials and Methods section for points #1, 4 and 5. AA was Activin A, and the number was the number of replicates. This has been detailed in the figure legend. We included images of the muscle regeneration after 23 days of treatment with BYL719 in mice after induction of HO in the new Figure 2, Figure Supplement 2 (point #2). We corrected the mistake in the manuscript and now refrain from suggesting an increase of Ptgs2 gene expression by ACVR1-R206H at the basal level (point #3).

    2. eLife assessment

      This study provides valuable insights by demonstrating that BYL719 is a promising therapeutic agent for the treatment of heterotopic ossification (HO), with inhibition of PI3Ka via BYL719 appearing to be a critical factor. However, the results of the study are incomplete because BYL719 affects multiple intracellular signaling pathways beyond PI3Ka, and it thus remains uncertain whether BYL719 attenuates HO exclusively through suppression of the PI3Ka pathway or through modulation of alternative signaling pathways. A detailed elucidation of the molecular mechanisms of action of BYL719 is essential for a thorough understanding of its effects.

    3. Reviewer #1 (Public Review):

      Summary:

      In the present study, the authors examined the possibility of using phosphatidylinositol 3-kinase alpha (PI3Ka) inhibitors for heterotopic ossification (HO) in fibrodysplasia ossificans progressiva (FOP). Administration of BYL719, a chemical inhibitor of PI3Ka, prevented HO in a mouse model of FOP that expressed a mutated ACVR1 receptor. Genetic ablation of PI3Ka (p110a) also suppressed HO in mice. BYL719 blocked osteochondroprogenitor specification and reduced inflammatory responses, such as pro-inflammatory cytokine expression and migration/proliferation of immune cells. The authors claimed that inhibition of PI3Ka is a safe and effective therapeutic strategy for HO.

      This is a revision of the original manuscript by Valer et al. The authors performed new experiments and added those data to the manuscript to respond to this reviewer's comments and questions.

      Strengths:

      It has now become clear that BYL719 inhibits multiple signaling pathways in multiple types of cells.

      Weaknesses:

      However, the critical role of PI3K in the inhibition of HO by this compound was not clear.

    4. Reviewer #2 (Public Review):

      Summary:

      The authors of this study previously reported that BYL719, an inhibitor of PI3Kα, suppressed heterotopic ossification in a mouse model of a human genetic disease, fibrodysplasia ossificans progressiva, which is caused by the activation of mutant ACVR1/R206H by Activin A. The aim of this study is to identify the mechanism by which BYL719 inhibits heterotopic ossification. They found that BYL719 suppressed heterotopic ossification in two ways: one is to inhibit the specification of precursor cells for chondrogenic and osteogenic differentiation and the other is to suppress the activation of inflammatory cells.

      Strengths:

      This study is based on the authors' previous reports, and the experimental procedures, including the animal model, are established. In addition, to confirm the role of PI3Kα, the authors used conditional knock-out mice for the subunit of PI3Kα. Using a newly established experimental method, they clearly demonstrated that the targets of PI3Kα are not members of TGFBR.

      Weaknesses:

      Overall, the presented data were closely related to those previously published by authors' group or others and there were very few new findings. The molecular mechanisms through which BYL719 inhibits HO remain unclear, even in the revised manuscript.

      Heterotopic ossification in the mouse model was not stable and was inappropriate for scientific evaluation.

      The method for chondrogenic differentiation was not appropriate, and the scientific evidence of successful differentiation was lacking.

      The design of the gene expression profile comparison was not appropriate and failed to obtain the data for the main aim of this study.

      The experiments of inflammatory cells were performed in cell lines without ACVR1/R206H mutation, and therefore the obtained data were not precisely related to the inflammation in FOP.

    1. eLife assessment

      This study presents the cryo-EM structures of two human biotin-dependent mitochondrial carboxylases involved in various biological pathways, including the metabolism of certain amino acids, cholesterol, and odd chain fatty acids. The cryo-EM structures offer a valuable addition to the structural description of biotin-dependent carboxylases and provide solid evidence to support the major conclusions of this study. This paper would be of interest to biochemists and structural biologists working on biotin-dependent carboxylases.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript by Zhou et al offers new high-resolution Cryo-EM structures of two human biotin-dependent enzymes: propionyl-CoA carboxylase (PCC) and methylcrotonyl-CoA carboxylase (MCC). While X-ray crystal structures and Cryo-EM structures have previously been reported for bacterial and trypanosomal versions of MCC and for bacterial versions of PCC, this marks one of the first high resolution Cryo-EM structures of the human version of these enzymes. Using the biotin cofactor as an affinity tag, this team purified a group of four different human biotin-dependent carboxylases from cultured human Expi 293F (kidney) cells (PCC, MCC, acetyl-CoA carboxylase (ACC), and pyruvate carboxylase). Following further enrichment by size-exclusion chromatography, they were able to vitrify the sample and pick enough particles of MCC and PCC to separately refine the structures of both enzymes to relatively high average resolutions (the Cryo-EM structure of ACC also appears to have been determined from these same micrographs, though this is the subject of a separate publication). To determine the impact of substrate binding on the structure of these enzymes and to gain insights into substrate selectivity, they also separately incubated with propionyl-CoA and acetyl-CoA and vitrified the samples under active turnover conditions, yielding a set of cryo-EM structures for both MCC and PCC in the presence and absence of substrates and substrate analogues.

      Strengths:

      The manuscript has several strengths. It is clearly written, the figures are clear and the sample preparation methods appear to be well described. This study demonstrates that Cryo-EM is an ideal structural method to investigate the structure of these heterogeneous samples of large biotin-dependent enzymes. As a consequence, many new Cryo-EM structures of biotin-dependent enzymes are emerging, thanks to the natural inclusion of a built-in biotin affinity tag. While the authors report no major differences between the human and bacterial forms of these enzymes, it remains an important finding that they demonstrate whether and how the structures of the human enzymes differ from those of the bacterial enzymes. The MCC structures also provide evidence for a transition for BCCP-biotin from an exo-binding site to an endo-binding site in response to acetyl-CoA binding. This contributes to a growing number of biotin-dependent carboxylase structures that reveal BCCP-biotin binding at locations both inside (endo-) and outside (exo-) of the active site.

      Weaknesses:

      There are some minor weaknesses. Notably, there are not a lot of new insights coming from this paper. The structural comparisons between MCC and PCC have already been described in the literature and there were not a lot of significant changes (outside of the exo- to endo- transition) in the presence vs. absence of substrate analogues. There is not a great deal of depth of analysis in the discussion. For example, no new insights were gained with respect to the factors contributing to substrate selectivity (the factors contributing to selectivity for propionyl-CoA vs. acetyl-CoA in PCC). The authors state that the longer acyl group in propionyl-CoA may mediate stronger hydrophobic interactions that stabilize the alpha carbon of the acyl group at the proper position. This is not a particularly deep analysis and doesn't really require a cryo-EM structure to invoke. The authors did not take the opportunity to describe the specific interactions that may be responsible for the stronger hydrophobic interaction nor do they offer any plausible explanation for how these might account for an astounding difference in the selectivity for propionyl-CoA vs. acetyl-CoA. This suggests, perhaps, that these structures do not yet fully capture the proper conformational states. The authors also need to be careful with their over-interpretation of structure to invoke mechanisms of conformational change. A snapshot of the starting state (apo) and final state (ligand-bound) is insufficient to conclude *how* the enzyme transitioned between conformational states. I am constantly frustrated by structural reports in the biotin-dependent enzymes that invoke "induced conformational changes" with absolutely no experimental evidence to support such statements. Conformational changes that accompany ligand binding may occur through an induced conformational change or through conformational selection and structural snapshots of the starting point and the end point cannot offer any valid insight into which of these mechanisms is at play.

      Some of these minor deficiencies aside, the overall aim of contributing new cryo-EM structures of the human MCC and PCC has been achieved. While I am not a cryo-EM expert, I see no flaws in the methodology or approach. While the contributions from these structures are somewhat incremental, it is nevertheless important to have these representative examples of the human enzymes and it is noteworthy to see a new example of the exo-binding site in a biotin-dependent enzyme.

    3. Reviewer #2 (Public Review):

      Summary:

      This paper reports the structures of two human biotin-dependent carboxylases. The authors used endogenously purified proteins and solved the structures at high resolution. Based on the structures, they defined the binding sites for acyl-CoA and biotin and reported the potential conformational changes in biotin position.

      Strengths:

      The authors effectively utilized the biotin of the two proteins and obtained homogeneous proteins from human cells. They determined the high-resolution structures of the two enzymes in apo and substrate-bound states.

      Comments and questions on the manuscript:

      (1) I'm quite impressed with the protein purification and structure determination, but I think some functional characterization of the purified proteins should be included in the manuscript. The activity of enzymes should be the foundation of all structures and other speculations based on structures.

      (2) In Figure 1B, the structure of MCC is shown as two layers of beta units and two layers of alpha units, while there is only one layer of alpha units resolved in the density maps. I suggest the authors show the structures resolved based on the density maps and show the complete structure with the docked layer in the supplementary figure.

      (3) In the introduction, I suggest the authors provide more information about the previous studies on the structure and reaction mechanisms of BDCs, what the knowledge gap is, and what problem you will resolve with a higher resolution structure. For example, you mentioned in line 52 that G437 and A438 are catalytic residues; are these residues reported as catalytic residues, or is this based on your structures? Has the catalytic mechanism been reported before? Has the role of biotin in catalytic reactions been revealed in previous studies?

      (4) In the discussion, the authors indicate that the movement of biotin could be related to the recognition of acyl-CoA in BDCs, however, they didn't observe a change in the propionyl-CoA bound MCC structure, which is contradictory to their speculation. What could be the explanation for the exception in the MCC structure?

      (5) In the discussion, the authors indicate that the selectivity of PCC to different acyl-CoA is determined by the recognition of the acyl chain. However, there are no figures or descriptions about the recognition of the acyl chain by PCC and MCC. It will be more informative if they can show more details about substrate recognition in Figures 3 and 4.

      (6) How do the solved structures compare with the latest AlphaFold3 prediction?

    4. Author response:

      Reviewer #1 (Public Review):

      Weaknesses:

      There are some minor weaknesses.

      Notably, there are not a lot of new insights coming from this paper. The structural comparisons between MCC and PCC have already been described in the literature and there were not a lot of significant changes (outside of the exo- to endo- transition) in the presence vs. absence of substrate analogues.

      We agree that the structures of the human MCC and PCC holoenzymes are similar to their bacterial homologs. That is due to the conserved sequences and functions of MCC and PCC across different species.

      There is not a great deal of depth of analysis in the discussion. For example, no new insights were gained with respect to the factors contributing to substrate selectivity (the factors contributing to selectivity for propionyl-CoA vs. acetyl-CoA in PCC). The authors state that the longer acyl group in propionyl-CoA may mediate stronger hydrophobic interactions that stabilize the alpha carbon of the acyl group at the proper position. This is not a particularly deep analysis and doesn't really require a cryo-EM structure to invoke. The authors did not take the opportunity to describe the specific interactions that may be responsible for the stronger hydrophobic interaction nor do they offer any plausible explanation for how these might account for an astounding difference in the selectivity for propionyl-CoA vs. acetyl-CoA. This suggests, perhaps, that these structures do not yet fully capture the proper conformational states.

      We appreciate this comment. Unfortunately, in the cryo-EM maps of the PCC holoenzymes, the acyl groups were not resolved (fig. S6), so we were unable to analyze the specific interactions between the acyl-CoAs and PCC. We will discuss this limitation in our revised manuscript.

      The authors also need to be careful with their over-interpretation of structure to invoke mechanisms of conformational change. A snapshot of the starting state (apo) and final state (ligand-bound) is insufficient to conclude *how* the enzyme transitioned between conformational states. I am constantly frustrated by structural reports in the biotin-dependent enzymes that invoke "induced conformational changes" with absolutely no experimental evidence to support such statements. Conformational changes that accompany ligand binding may occur through an induced conformational change or through conformational selection and structural snapshots of the starting point and the end point cannot offer any valid insight into which of these mechanisms is at play.

      Point accepted. We will revise our manuscript to use "conformational differences" instead of "conformational changes" to describe the differences between the apo and ligand-bound states.

      Reviewer #2 (Public Review):

      Comments and questions on the manuscript:

      I'm quite impressed with the protein purification and structure determination, but I think some functional characterization of the purified proteins should be included in the manuscript. The activity of enzymes should be the foundation of all structures and other speculations based on structures.

      We appreciate this comment. However, since we purified the endogenous BDCs and the sample we obtained was a mixture of four BDCs, the enzymatic activity of this mixture cannot accurately reflect the catalytic activity of PCC or MCC holoenzyme. We will acknowledge this limitation in the discussion section of our revised manuscript.

      In Figure 1B, the structure of MCC is shown as two layers of beta units and two layers of alpha units, while there is only one layer of alpha units resolved in the density maps. I suggest the authors show the structures resolved based on the density maps and show the complete structure with the docked layer in the supplementary figure.

      We appreciate this comment. We have shown the cryo-EM maps of the PCC and MCC holoenzymes in fig. S8 to indicate the unresolved regions in these structures. The BC domains in one layer of MCCα in the MCC-apo structure were not resolved. However, we think it would be better to show a complete structure in Fig. 1 to provide an overall view of the MCC holoenzyme. We will revise Fig. 1B and the figure legend to clearly point out which domains were not resolved in the cryo-EM map and were built in the structure through docking.

      In the introduction, I suggest the authors provide more information about the previous studies on the structure and reaction mechanisms of BDCs, what the knowledge gap is, and what problem you will resolve with a higher resolution structure. For example, you mentioned in line 52 that G437 and A438 are catalytic residues; are these residues reported as catalytic residues, or is this based on your structures? Has the catalytic mechanism been reported before? Has the role of biotin in catalytic reactions been revealed in previous studies?

      Point accepted. It was reported that G419 and A420 in S. coelicolor PCC, corresponding to G437 and A438 in human PCC, were the catalytic residues (PMID: 15518551). The same study also reported the catalytic mechanism of the carboxyl transfer reaction. The role of biotin in the BDC-catalyzed carboxylation reactions has been extensively studied (PMIDs: 22869039, 28683917). We will include this information in the introduction section of our revised manuscript.

      In the discussion, the authors indicate that the movement of biotin could be related to the recognition of acyl-CoA in BDCs, however, they didn't observe a change in the propionyl-CoA bound MCC structure, which is contradictory to their speculation. What could be the explanation for the exception in the MCC structure?

      We appreciate this comment. We do not have a good explanation for why we did not observe a change in the propionyl-CoA-bound MCC structure. It is noteworthy that neither acetyl-CoA nor propionyl-CoA is the natural substrate of MCC. Recently, a cryo-EM structure of the human MCC holoenzyme in complex with its natural substrate, 3-methylcrotonyl-CoA, has been resolved (PDB code: 8J4Z). In this structure, the binding site of biotin and the conformation of the CT domain closely resemble those in our acetyl-CoA-bound MCC structure. Therefore, the movement of biotin induced by acetyl-CoA binding mimics that induced by the binding of MCC's natural substrate, 3-methylcrotonyl-CoA, indicating that, in comparison with propionyl-CoA, acetyl-CoA is closer to 3-methylcrotonyl-CoA regarding its ability to bind to MCC. We will discuss this possibility in our revised manuscript.

      In the discussion, the authors indicate that the selectivity of PCC to different acyl-CoA is determined by the recognition of the acyl chain. However, there are no figures or descriptions about the recognition of the acyl chain by PCC and MCC. It will be more informative if they can show more details about substrate recognition in Figures 3 and 4.

      We appreciate this comment. Unfortunately, in the cryo-EM maps of the PCC holoenzymes, the acyl groups were not resolved (fig. S6), so we were unable to analyze the specific interactions between the acyl-CoAs and PCC. We will discuss this limitation in our revised manuscript.

      How do the solved structures compare with the latest AlphaFold3 prediction?

      Since AlphaFold3 had not been released when our manuscript was submitted, we did not compare the solved structures with the AlphaFold3 predictions. We have now carried out the predictions using AlphaFold3. Due to the token limitation of the AlphaFold3 server, we can only include two α and six β subunits of human PCC or MCC in the prediction. The overall assembly patterns of the AlphaFold3-predicted structures are similar to those of the cryo-EM structures. The RMSDs between PCCα, PCCβ, MCCα, and MCCβ in the apo cryo-EM structures and those in the AlphaFold3-predicted structures are 7.490 Å, 0.857 Å, 7.869 Å, and 1.845 Å, respectively. The PCCα and MCCα subunits adopt an open conformation in the cryo-EM structures but adopt a closed conformation in the AlphaFold3-predicted structures, resulting in large RMSDs.
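
      As a rough illustration of why a rigid-body domain movement alone can produce RMSDs of several ångströms, the following minimal sketch computes the RMSD between two sets of matched atoms. It assumes the coordinates have already been superposed and paired atom-for-atom; the function name `rmsd` and the toy coordinates are hypothetical and are not taken from the manuscript or from any structure-analysis package.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumes the two sets are already superposed and matched atom-for-atom.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Accumulate the squared distance between each matched pair of atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy data (hypothetical): four matched atoms along a line.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
# Same atoms, but the last two (one "domain") are displaced by 10 units along z.
shifted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 10.0), (3.0, 0.0, 10.0)]

print(rmsd(reference, reference))  # identical sets
print(rmsd(reference, shifted))    # half of the atoms displaced as a rigid block
```

      In this toy case only half of the atoms move, yet the overall value is dominated by the displaced block, which is the same qualitative effect as an open versus closed subunit conformation inflating a whole-chain RMSD.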

    1. eLife assessment

      This study presents an important contribution to cardiac arrhythmia research by demonstrating that long noncoding RNA Dachshund homolog 1 (lncDACH1) tunes sodium channel functional expression and affects cardiac action potential conduction and rhythms. The evidence supporting the major claims is convincing. The work will be of broad interest to cell biologists and cardiac electrophysiologists.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, the authors show that a long-non coding RNA lncDACH1 inhibits sodium currents in cardiomyocytes by binding to and altering the localization of dystrophin. The authors use a number of methodologies to demonstrate that lncDACH1 binds to dystrophin and disrupts its localization to the membrane, which in turn downregulates NaV1.5 currents. Knockdown of lncDACH1 upregulates NaV1.5 currents. Furthermore, in heart failure, lncDACH1 is shown to be upregulated which suggests that this mechanism may have pathophysiological relevance.

      Strengths:

      (1) This study presents a novel mechanism of Na channel regulation which may be pathophysiologically important.

      (2) The experiments are comprehensive and systematically evaluate the physiological importance of lncDACH1.

    3. Reviewer #2 (Public Review):

      This manuscript by Xue et al. describes the effects of a long noncoding RNA, lncDACH1, on the localization of Nav channel expression, the magnitude of INa, and arrhythmia susceptibility in the mouse heart. Because lncDACH1 was previously reported to bind and disrupt membrane expression of dystrophin, which in turn is required for proper Nav1.5 localization, much of the findings are inferred through the lens of dystrophin alterations.

      The results report that cardiomyocyte-specific transgenic overexpression of lncDACH1 reduces INa in isolated cardiomyocytes; measurements in the whole heart show a corresponding reduction in conduction velocity and enhanced susceptibility to arrhythmia. The effect on INa was confirmed in isolated WT mouse cardiomyocytes infected with a lncDACH1 adenoviral construct. Importantly, reducing lncDACH1 expression via either a cardiomyocyte-specific knockout or using shRNA had the opposite effect: INa was increased in isolated cells, as was conduction velocity in the heart. Experiments were also conducted with a fragment of lncDACH1 identified by its conservation with other mammalian species. Overexpression of this fragment resulted in reduced INa and greater proarrhythmic behavior. Alteration of expression was confirmed by qPCR.

      The mechanism by which lncDACH1 exerts its effects on INa was explored by measuring protein levels from cell fractions and immunofluorescence localization in cells. In general, overexpression was reported to reduce Nav1.5 and dystrophin levels and knockout or knockdown increased them.

      The strengths of this manuscript include convincing evidence of a link between lncDACH1 and Na channel function. The identification of a lncDACH1 segment conserved among mammalian species is compelling. The observation that lncDACH1 is increased in a heart failure model provides a plausible hypothesis for disease mechanism.

    4. Reviewer #3 (Public Review):

      Summary:

      In this manuscript, the authors report the first evidence of Nav1.5 regulation by a long noncoding RNA, LncRNA-DACH1, and suggest its implication in the reduction in sodium current observed in heart failure. Since no direct interaction is observed between Nav1.5 and the LncRNA, they propose that the regulation is via dystrophin and targeting of Nav1.5 to the plasma membrane.

      Strengths:

      (1) First evidence of Nav1.5 regulation by a long noncoding RNA.

      (2) Implication of LncRNA-DACH1 in heart failure and mechanisms of arrhythmias.

      (3) Demonstration of LncRNA-DACH1 binding to dystrophin.

      (4) Potential rescuing of dystrophin and Nav1.5 strategy.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This study presents an important contribution to cardiac arrhythmia research by demonstrating that long noncoding RNA Dachshund homolog 1 (lncDACH1) tunes sodium channel functional expression and affects cardiac action potential conduction and rhythms. The evidence supporting the major claims is solid. The work will be of broad interest to cell biologists and cardiac electrophysiologists.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, the authors show that a long-non coding RNA lncDACH1 inhibits sodium currents in cardiomyocytes by binding to and altering the localization of dystrophin. The authors use a number of methodologies to demonstrate that lncDACH1 binds to dystrophin and disrupts its localization to the membrane, which in turn downregulates NaV1.5 currents. Knockdown of lncDACH1 upregulates NaV1.5 currents. Furthermore, in heart failure, lncDACH1 is shown to be upregulated which suggests that this mechanism may have pathophysiological relevance.

      Strengths:

      (1) This study presents a novel mechanism of Na channel regulation which may be pathophysiologically important.

      (2) The experiments are comprehensive and systematically evaluate the physiological importance of lncDACH1.

      Reviewer #2 (Public Review):

      This manuscript by Xue et al. describes the effects of a long noncoding RNA, lncDACH1, on the localization of Nav channel expression, the magnitude of INa, and arrhythmia susceptibility in the mouse heart. Because lncDACH1 was previously reported to bind and disrupt membrane expression of dystrophin, which in turn is required for proper Nav1.5 localization, much of the findings are inferred through the lens of dystrophin alterations.

      The results report that cardiomyocyte-specific transgenic overexpression of lncDACH1 reduces INa in isolated cardiomyocytes; measurements in the whole heart show a corresponding reduction in conduction velocity and enhanced susceptibility to arrhythmia. The effect on INa was confirmed in isolated WT mouse cardiomyocytes infected with a lncDACH1 adenoviral construct. Importantly, reducing lncDACH1 expression via either a cardiomyocyte-specific knockout or using shRNA had the opposite effect: INa was increased in isolated cells, as was conduction velocity in the heart. Experiments were also conducted with a fragment of lncDACH1 identified by its conservation with other mammalian species. Overexpression of this fragment resulted in reduced INa and greater proarrhythmic behavior. Alteration of expression was confirmed by qPCR.

      The mechanism by which lncDACH1 exerts its effects on INa was explored by measuring protein levels from cell fractions and immunofluorescence localization in cells. In general, overexpression was reported to reduce Nav1.5 and dystrophin levels and knockout or knockdown increased them.

      The strengths of this manuscript include convincing evidence of a link between lncDACH1 and Na channel function. The identification of a lncDACH1 segment conserved among mammalian species is compelling. The observation that lncDACH1 is increased in a heart failure model provides a plausible hypothesis for disease mechanism.

      One limitation of the fractionation approach is the uncertain disposition of Na channel protein deemed "cytoplasmic." It seems likely that the membrane fraction includes ER membrane. The signal may reasonably be attributed to Na channel protein in stalled transport vesicles, or alternatively in stress granules, but this was not directly addressed.

      Reviewer #3 (Public Review):

      Summary:

      In this manuscript, the authors report the first evidence of Nav1.5 regulation by a long noncoding RNA, LncRNA-DACH1, and suggest its implication in the reduction in sodium current observed in heart failure. Since no direct interaction is observed between Nav1.5 and the LncRNA, they propose that the regulation is via dystrophin and targeting of Nav1.5 to the plasma membrane.

      Strengths:

      (1) First evidence of Nav1.5 regulation by a long noncoding RNA.

      (2) Implication of LncRNA-DACH1 in heart failure and mechanisms of arrhythmias.

      (3) Demonstration of LncRNA-DACH1 binding to dystrophin.

      (4) Potential rescuing of dystrophin and Nav1.5 strategy.

      Weaknesses:

      (1) The fact that total Nav1.5 protein is reduced by 50%, which is similar to the reduction at the membrane, calls into question the authors' main conclusion implicating dystrophin in the reduced Nav1.5 targeting. The reduction in membrane Nav1.5 could simply be due to the reduction in total protein.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Weaknesses:

      (1) What is indicated by the cytoplasmic level of NaV1.5, a transmembrane protein?

      This is still confusing. Since Nav1.5 is an integral membrane protein, I am not sure what is really meant here by cytosolic fraction. From the workflow, it seems a separate organelle fraction is also collected. Is the amount of Nav1.5 in this fraction (which I assume includes, e.g., lysosomes) also increased with lncDACH1? I recommend that the authors refer to the Nav channels not at the plasma membrane as 'intracellular' rather than 'cytoplasmic'.

      Thanks for the insightful comment. We completely agree. Accordingly, we have changed “cytoplasmic” to “intracellular”.

      Line 226. "In consistent with the results" Perhaps unnecessary to have "in"

      Thank you for the insightful comment. We have corrected it.

      Line 228. Is it optimal or optical?

      Sorry for the mistake, it should be optical. We have corrected it.

      Reviewer #3 (Recommendations For The Authors):

      I still have an issue with the total reduction in Nav1.5, which is about the same as the reduction in membrane and currents. The authors argue that there is an increase in cytoplasmic Nav1.5. However, the controls that they provide for membrane and cytoplasmic fractions are not convincing.

      Thank you for the insightful comment. We cannot rule out the possibility that the reduction in membrane Nav1.5 may be due to the reduction in total protein. Our data indicate that the membrane and total protein levels of Nav1.5 were reduced by 50%. However, intracellular Nav1.5 was not decreased but increased in the hearts of lncDACH1-TG mice compared with WT controls, which indicates that the intracellular Nav1.5 failed to traffic to the membrane.

    1. eLife assessment

      In this study, the authors provide valuable evidence that the LGE is not a significant source of oligodendrocytes for the cortex. The reviewers did find some technical considerations that call for some modulation of the strength of the authors' conclusions and also pointed out some aspects of the data that were incomplete as presented.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, the authors generated a novel transgenic mouse line OpalinP2A-Flpo-T2A-tTA2 to specifically label mature oligodendrocytes, and at the same time their embryonic origins by crossing with a progenitor cre mouse line. With this clever approach, they found that LGE/CGE-derived OLs make minimal contributions to the neocortex, whereas MGE/POA-derived OLs make a small but lasting contribution to the cortex. These findings are contradictory to the current belief that LGE/CGE-derived OPCs make a sustained contribution to cortical OLs, whereas MGE/POA-derived OPCs are completely eliminated. Thus, this study provides a revised and more comprehensive view on the embryonic origins of cortical oligodendrocytes.

      Strengths:

      The authors have generated a novel transgenic mouse line to specifically label mature differentiated oligodendrocytes, which is very useful for tracing the final destiny of mature myelinating oligodendrocytes. Also, the authors carefully compared the distribution of three progenitor cre mouse lines and suggested that Gsh-cre also labeled dorsal OLs, contrary to the previous suggestion that it only marks LGE-derived OPCs. In addition, the author also analyzed the relative contributions of OLs derived from three distinct progenitor domains in other forebrain regions (e.g. Pir, ac). Finally, the new transgenic mouse lines and established multiple combinatorial genetic models will facilitate future investigations of the developmental origins of distinct OL populations and their functional and molecular heterogeneity.

      Comments on latest version: In this revised and improved manuscript, the authors have adequately addressed my concerns, and I have no further issues to raise.

    3. Reviewer #2 (Public Review):

      In this manuscript, Cai et al use a combination of mouse transgenic lines to re-examine the question of the embryonic origin of telencephalic oligodendrocytes (OLs). Their tools include a novel Flp mouse for labelling mature oligodendrocytes and a number of pre-existing lines (some previously generated by the last author in Josh Huang's lab) that allowed combinatorial or subtractive labelling of oligodendrocytes with different origins. The conclusion is that cortically-derived OLs are the predominant OL population in the motor and somatosensory cortex and underlying corpus callosum, while the LGE/CGE generates OLs for the piriform cortex and anterior commissure rather than the cerebral cortex. Small numbers of MGE-derived OLs persist long-term in the motor, somatosensory and piriform cortex.

      Strengths:

      The strength and novelty of the manuscript lie in the elegant tools generated and used. These have enabled the resolution of the issue regarding the contribution of different telencephalic progenitor zones to the cortical oligodendrocyte population.

      Comments on latest version:

      The revised manuscript by Cai et al has addressed all the issues raised. I have some minor comments:

      Figure 2: The y axis in figure 2L should be the same as the y axis in 2M to make the contribution to Mo and SS more clear.

      Figure 3: Although this is clear in the figure, A and B should be labelled as classical model and new model to help the reader understand immediately what the two figures show.

      Suppl Fig 2: It is not clear what 1-7 represent. It should be made clear in the legend which areas have been pooled into the different bins. The X axis should be labelled.

    4. Reviewer #3 (Public Review):

      In the manuscript entitled "Embryonic Origins of Forebrain Oligodendrocytes Revisited by Combinatorial Genetic Fate Mapping," Cai et al. used an intersectional/subtractional strategy to genetically fate-map the oligodendrocyte populations (OLs) generated from medial ganglionic eminence (NKX2.1+), lateral ganglionic eminences, and dorsal progenitor cells (EMX1+). Specifically, they generated an OL-expressing reporter mouse line OpalinP2A-Flpo-T2A-tTA2 and bred with region-specific neural progenitor-expressing Cre lines EMX1-Cre for dOL and NKX2.1-Cre for MPOL. They used a subtractional strategy in the OpalinFlp::Emx1Cre::Nkx2.1Cre::RC::FLTG mouse line to predict the origins of OLs from lateral/caudal ganglionic eminences (LC). With their genetic tools, the authors concluded that neocortical OLs primarily consist of dOLs. Although the populations of OLs (dOLs or MP-OLs) from Emx1+ or Nkx2.1+ progenitors are largely consistent with previous findings, they observed that MP-OLs contribute minimally but persist into adulthood without elimination as in the previous report (PMID: 16388308).

      Intriguingly, by using an indirect subtraction approach, they hypothesize that both Emx1-negative and Nkx2.1-negative cells represent the progenitors from lateral/caudal ganglionic eminences (LC), and conclude that neocortical OLs are not derived from the LC region. This is in contrast to the previous observation for the contribution of LC-expressing progenitors (marked by Gsx2-Cre) to neocortical OLs (PMID: 16388308). The authors claim that Gsh2 is not exclusive to progenitor cells in the LC region (PMID: 32234482). However, Gsh2 exhibits high enrichment in the LC during early embryonic development. The presence of a small population of Gsh2-positive cells in the late embryonic cortex could originate/migrate from Gsh2-positive cells in the LC at earlier stages (PMID: 32234482). Consequently, the possibility that cortical OLs derived from Gsh2+ progenitors in LC could not be conclusively ruled out. Notably, a population of OLs migrating from the ventral to the dorsal cortical region was detected after eliminating dorsal progenitor-derived OLs (PMID: 16436615).

      The indirect subtraction data for LC progenitors drawn from the OpalinFlp-tdTOM reporter in Emx1-negative and Nkx2.1-negative cells in the OpalinFlp::Emx1Cre::Nkx2.1Cre::RC::FLTG mouse line present some caveats that could influence their conclusion. The extent of activity from the two Cre lines in the OpalinFlp::Emx1Cre::Nkx2.1Cre::RC::FLTG mice remains uncertain. The OpalinFlp-tdTOM expression could occur in the presence of either Emx1Cre or Nkx2.1Cre, raising questions about the contribution of the individual Cre lines. To clarify, the authors should compare the tdTOM expression from each individual Cre line, OpalinFlp::Emx1Cre::RC::FLTG or OpalinFlp::Nkx2.1Cre::RC::FLTG, with the combined OpalinFlp::Emx1Cre::Nkx2.1Cre::RC::FLTG mouse line. This comparison is crucial as the results from the combined Cre lines could appear similar to those obtained with only one Cre line active.

      Overall, the authors provided intriguing findings regarding the origin and fate of oligodendrocytes from different progenitor cells in embryonic brain regions. However, further analysis is necessary to substantiate their conclusion about the fate of LC-derived OLs convincingly.

      Comments on latest version: The overall responses by the authors are satisfactory.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In this study, the authors generated a novel transgenic mouse line OpalinP2A-Flpo-T2A-tTA2 to specifically label mature oligodendrocytes, and at the same time their embryonic origins by crossing with a progenitor cre mouse line. With this clever approach, they found that LGE/CGE-derived OLs make minimal contributions to the neocortex, whereas MGE/POA-derived OLs make a small but lasting contribution to the cortex. These findings are contradictory to the current belief that LGE/CGE-derived OPCs make a sustained contribution to cortical OLs, whereas MGE/POA-derived OPCs are completely eliminated. Thus, this study provides a revised and more comprehensive view on the embryonic origins of cortical oligodendrocytes.

      Strengths:

      The authors have generated a novel transgenic mouse line to specifically label mature differentiated oligodendrocytes, which is very useful for tracing the final destiny of mature myelinating oligodendrocytes. Also, the authors carefully compared the distribution of three progenitor cre mouse lines and suggested that Gsh-cre also labeled dorsal OLs, contrary to the previous suggestion that it only marks LGE-derived OPCs. In addition, the author also analyzed the relative contributions of OLs derived from three distinct progenitor domains in other forebrain regions (e.g. Pir, ac). Finally, the new transgenic mouse lines and established multiple combinatorial genetic models will facilitate future investigations of the developmental origins of distinct OL populations and their functional and molecular heterogeneity.

      Weaknesses:

      Since OpalinP2A-Flpo-T2A-tTA2 only labels mature oligodendrocytes but not OPCs, the authors cannot suggest that the lack of LGE/CGE-derived OLs in the neocortex is less likely caused by competitive postnatal elimination and more likely due to limited production and/or allocation (lines 118-119). It remains possible that LGE/CGE-derived OPCs migrate into the cortex but are later eliminated.

      We are glad that the reviewer appreciates our work and are grateful for the positive comments and the constructive suggestion. We agree with the reviewer that our methodology by itself cannot determine whether the lack of LGE/CGE-derived OLs in the neocortex is caused by competitive postnatal elimination. That is why we cited a parallel work by Li et al. (ref [17] in the original manuscript; ref [19] in the revised manuscript), in which in utero electroporation (IUE) failed to label LGE-derived OL lineage cells in both embryonic and early postnatal brains. Although they did not directly explore the CGE using IUE, their fate mapping results using Emx1-Cre; Nkx2.1-Cre; H2B-GFP at P0 and P10 revealed a very low percentage of LGE/CGE-derived OL lineage cells. The lack of adult labeling in our study, together with the lack of developmental labeling in the other study, prompted us to hypothesize that the lack of LGE/CGE-derived OLs in the neocortex is less likely caused by competitive postnatal elimination and more likely due to limited production and/or allocation. In the revised manuscript, we have expanded the discussion to explain this point more clearly.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Cai et al use a combination of mouse transgenic lines to re-examine the question of the embryonic origin of telencephalic oligodendrocytes (OLs). Their tools include a novel Flp mouse for labelling mature oligodendrocytes and a number of pre-existing lines (some previously generated by the last author in Josh Huang's lab) that allowed combinatorial or subtractive labelling of oligodendrocytes with different origins. The conclusion is that cortically-derived OLs are the predominant OL population in the motor and somatosensory cortex and underlying corpus callosum, while the LGE/CGE generates OLs for the piriform cortex and anterior commissure rather than the cerebral cortex. Small numbers of MGE-derived OLs persist long-term in the motor, somatosensory and piriform cortex.

      Strengths:

      The strength and novelty of the manuscript lie in the elegant tools generated and used, which have the potential to accurately resolve the contribution of different progenitor zones to telencephalic regions.

      We are glad that the reviewer appreciates our work and are grateful for the overall positive comments.

      Weaknesses:

      (1) Throughout the manuscript (with one exception, lines 76-78), the authors quantified OL densities instead of contributions to the total OL population (as a % of ASPA for example). This means that the reader is left with only a rough estimation of the different contributions.

      We thank the reviewer for this constructive suggestion. We have replaced the density quantification (Figure 2F and 3D in the original manuscript) with contributions to the total OL population (% of ASPA) (Figure 2J and 2N in the revised manuscript).

      (2) All images and quantifications have been confined to one level of the cortex and the potential of the MGE and the LGE/CGE to produce oligodendrocytes for more anterior and more posterior cortical regions remains unexplored.

      The quantifications were not confined to one level of the cortex but were performed in brain sections ranging from Bregma +1.94 to -2.80 mm, as shown in Supplementary Figure 2A-B in the original manuscript. We apologize for not having stated and presented this information clearly enough, and for the confusion it may have caused. In the revised manuscript, we have added relevant descriptions in the “Material and Methods” section (line 199-200) and schematics along with representative images of more anterior and more posterior cortical regions (Supplementary Figure 2A-D).

      (3) Hence, the statement that "In summary, our findings significantly revised the canonical model of forebrain OL origins (Figure 4A) and provided a new and more comprehensive view (Figure 4B)." (lines 111, 112) is not really accurate as the findings are neither new nor comprehensive. Published manuscripts have already shown that (a) cortical OLs are mostly generated from the cortex [Tripathi et al 2011 (https://doi.org/10.1523/JNEUROSCI.6474-10.2011), Winkler et al 2018 (https://doi.org/10.1523/JNEUROSCI.3392-17.2018) and Li et al 2024 (https://doi.org/10.1101/2023.12.01.569674)] and (b) MGE-derived OLs persist in the cortex [Orduz et al 2019 (https://doi.org/10.1038/s41467-019-11904-4) and Li et al 2024 (https://doi.org/10.1101/2023.12.01.569674)]. Extending the current study to different rostro-caudal regions of the cortex would greatly improve the manuscript.

      As explained in the response to comment (2), our original quantifications included different rostro-caudal regions of the cortex. In the revised manuscript, we have added more schematics and representative images in the Supplementary Figure 2 for better illustration to resolve the concern of comprehensiveness.

      We thank the reviewer for listing and summarizing highly relevant published research along with the parallel study by Li et al. submitted to eLife. We apologize for the omission of the first two references in our original manuscript and have cited them in appropriate places (ref [10] and ref [11] in the revised manuscript). However, we believe these works do not compromise the novelty and significance of our work for the following reasons:

      (1) Tripathi et al. 2011 (ref [10] in the revised manuscript) analyzed OL lineage cells in the corpus callosum and the spinal cord, but not in the cortex and anterior commissure. Their analysis was performed in juvenile mice (P12/13), not in adulthood. Most importantly, their analysis of ventrally derived OL lineage cells relied on lineage tracing using Gsh2Cre, which in fact also labels OLs derived from Gsh2+ dorsal progenitors. In contrast, we analyzed mature OLs in the cortex, corpus callosum and anterior commissure in 2-month-old adult mice. We used an intersectional and subtractive strategy to label OLs derived from dorsal, LGE/CGE and MGE/POA origins. Our strategy differentiated the two different ventral lineages (LGE/CGE vs. MGE/POA) and avoided mixed labeling of OLs from ventral and dorsal Gsh2+ progenitors.

      (2) Winkler et al. 2018 (ref [11] in the revised manuscript) analyzed OLs derived from dorsal progenitors but only quantified those in the gray matter and the white matter of somatosensory cortex. Their quantification relied on co-staining with Olig2/Sox10, and thereby included both oligodendrocyte precursors (OPCs) and OLs. In contrast, we analyzed mature OLs from three origins and quantified not only neocortical regions (Mo and SS) but also an archicortical region (Pir). Our analysis revealed that although dorsally derived OLs dominate neocortex, ventrally derived OLs, especially the LGE/CGE-derived ones, dominate piriform cortex.

      (3) Orduz et al. 2019 (ref [7] in the original manuscript and the revised manuscript) mainly focused on POA-derived OLs in the somatosensory cortex. Although they performed limited analysis on MGE/POA-derived OPCs at postnatal day 10 and 19, no quantification of MGE/POA-derived OLs was performed in terms of their density, contribution to the total OL population and spatial distribution in the cortex. In contrast, we performed systematic quantification of these aspects to demonstrate that MGE/POA-derived OLs make a small but sustained contribution to the cortex, with a distribution pattern distinct from that of OLs derived from the dorsal origin.

      (4) Li et al. 2024 (ref [17] in the original manuscript and [19] in the revised manuscript) is a parallel study submitted to eLife. Their and our independent discoveries nicely complemented each other. Using different sets of techniques and experiments but some shared genetic mouse models, we both found that LGE/CGE made a minimal contribution to neocortical OLs. Their analysis in the prenatal and early postnatal stages together with our analysis in the adult brain painted a more comprehensive picture of cortical oligodendrogenesis. The uniqueness of our work is that we performed systematic quantification of all three origins and uncovered the differential contributions to neocortex, piriform cortex, corpus callosum and anterior commissure.

      In summary, our work developed novel strategies to faithfully trace OLs from the three different origins and performed systematic analysis in the adult brain. Our data uncovered their differential contributions to neocortex, piriform cortex and the two commissural white matter tracts, which significantly differ not only from the canonical view but also from other previous studies in aspects discussed above. We believe our discoveries did significantly revise the canonical model of forebrain OL origins and provided a new and more comprehensive view.

      Reviewer #3 (Public Review):

      In the manuscript entitled "Embryonic Origins of Forebrain Oligodendrocytes Revisited by Combinatorial Genetic Fate Mapping," Cai et al. used an intersectional/subtractional strategy to genetically fate-map the oligodendrocyte populations (OLs) generated from medial ganglionic eminence (NKX2.1+), lateral ganglionic eminences, and dorsal progenitor cells (EMX1+). Specifically, they generated an OL-expressing reporter mouse line OpalinP2A-Flpo-T2A-tTA2 and bred with region-specific neural progenitor-expressing Cre lines EMX1-Cre for dOL and NKX2.1-Cre for MPOL. They used a subtractional strategy in the OpalinFlp::Emx1Cre::Nkx2.1Cre::RC::FLTG mouse line to predict the origins of OLs from lateral/caudal ganglionic eminences (LC). With their genetic tools, the authors concluded that neocortical OLs primarily consist of dOLs. Although the populations of OLs (dOLs or MP-OLs) from Emx1+ or Nkx2.1+ progenitors are largely consistent with previous findings, they observed that MP-OLs contribute minimally but persist into adulthood without elimination as in the previous report (PMID: 16388308).

      Intriguingly, by using an indirect subtraction approach, they hypothesize that both Emx1-negative and Nkx2.1-negative cells represent the progenitors from lateral/caudal ganglionic eminences (LC), and conclude that neocortical OLs are not derived from the LC region. The authors claim that Gsh2 is not exclusive to progenitor cells in the LC region (PMID: 32234482). However, Gsh2 exhibits high enrichment in the LC during early embryonic development. The presence of a small population of Gsh2-positive cells in the late embryonic cortex could originate/migrate from Gsh2-positive cells in the LC at earlier stages (PMID: 32234482). Consequently, the possibility that cortical OLs derived from Gsh2+ progenitors in LC could not be conclusively ruled out. Notably, a population of OLs migrating from the ventral to the dorsal cortical region was detected after eliminating dorsal progenitor-derived OLs (PMID: 16436615).

      The indirect subtraction data for LC progenitors drawn from the OpalinFlp-tdTOM reporter in Emx1-negative and Nkx2.1-negative cells in the OpalinFlp::Emx1Cre::Nkx2.1Cre::RC::FLTG mouse line present some caveats that could influence their conclusion. The extent of activity from the two Cre lines in the OpalinFlp::Emx1Cre::Nkx2.1Cre::RC::FLTG mice remains uncertain. The OpalinFlp-tdTOM expression could occur in the presence of either Emx1Cre or Nkx2.1Cre, raising questions about the contribution of the individual Cre lines. To clarify, the authors should compare the tdTOM expression from each individual Cre line, OpalinFlp::Emx1Cre::RC::FLTG or OpalinFlp::Nkx2.1Cre::RC::FLTG, with the combined OpalinFlp::Emx1Cre::Nkx2.1Cre::RC::FLTG mouse line. This comparison is crucial as the results from the combined Cre lines could appear similar to only one Cre line active.

      Overall, the authors provided intriguing findings regarding the origin and fate of oligodendrocytes from different progenitor cells in embryonic brain regions. However, further analysis is necessary to substantiate their conclusion about the fate of LC-derived OLs convincingly.

      We thank the reviewer for these thoughtful comments. We agree with the reviewer that the presence of Gsh2-positive cells in the late embryonic cortex by itself could not rule out the possibility that they originate/migrate from Gsh2-positive cells in the LC at earlier stages. Staining dorsal-lineage intermediate progenitors with Gsh2, or performing intersectional lineage tracing using Gsh2Cre along with a dorsal-specific Flp driver, would provide more direct evidence on this issue. Nonetheless, as our lineage tracing of LGE/CGE-derived OLs did not employ Gsh2Cre, the doubt about the identity of Gsh2+ cortical progenitors should not affect the interpretation of our data.

      Regarding the subtractional LCOL labeling strategy used in our study, we wonder whether there was a misunderstanding by the reviewer. As stated in our manuscript (line 59-61) and reiterated by the reviewer, OpalinFlp::Emx1Cre::Nkx2.1Cre::RC::FLTG labels OLs derived from progenitors that express neither Emx1Cre nor Nkx2.1Cre. As these two progenitor pools do not overlap with each other, their actions are purely additive. If there is any concern about efficiency and specificity, it would be that inadequate Cre-mediated recombination leads to mislabeling of dOLs or MPOLs as LCOLs (i.e., OLs derived from Emx1- or Nkx2.1-expressing progenitors were not successfully “subtracted” and thereby “wrongly” retained RFP expression). Therefore, the bona fide LGE/CGE-derived OLs could only be fewer, not more, than the RFP+ LCOLs labeled by our subtractional strategy, even if either of the Cre lines did not work efficiently enough. In any case, this would not affect our conclusion that LGE/CGE-derived OLs make a minimal contribution to the neocortex, as the “ground truth” contribution by LGE/CGE could only be less, not more, than what we have observed using the current strategy.
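
      The bounding argument above can be sketched in a few lines of Python. This is only an illustration of the reporter logic as described in this response (Flp alone leaves a cell RFP+; Flp together with either Cre switches it to GFP). The function names, the assumption that Flp is fully efficient, and the example fractions are hypothetical and are not data from the study.

```python
def reporter_color(flp: bool, emx1_cre: bool, nkx21_cre: bool) -> str:
    """Readout of an RC::FLTG-style dual reporter in a single cell.

    Assumed logic: no Flp -> no label; Flp only -> RFP;
    Flp plus at least one Cre -> GFP (the cell is "subtracted").
    """
    if not flp:
        return "none"
    return "GFP" if (emx1_cre or nkx21_cre) else "RFP"


def observed_rfp_fraction(true_dorsal: float, true_mge: float,
                          true_lc: float, cre_efficiency: float) -> float:
    """Expected fraction of mature OLs scored as RFP+ (apparent LCOLs).

    Cells from Emx1+ or Nkx2.1+ progenitors escape subtraction with
    probability (1 - cre_efficiency) and are then miscounted as LCOLs.
    """
    escaped = (true_dorsal + true_mge) * (1.0 - cre_efficiency)
    return true_lc + escaped


# Hypothetical lineage fractions, chosen only to illustrate the inequality.
for eff in (1.0, 0.9, 0.5):
    print(eff, observed_rfp_fraction(0.90, 0.08, 0.02, eff))
```

      Under these assumptions the apparent LCOL fraction equals the true one only when both Cre lines are fully efficient and is larger otherwise, which is the sense in which the observed RFP+ population is an upper bound.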

      In support of our conclusion, a parallel study by Li et al. 2024 (ref [17] in the original manuscript; ref [19] in the revised manuscript) also provided independent experimental evidence that “any contribution of oligodendrocyte precursors to the developing cortex from the lateral ganglionic eminence is minimal in scope (quoted from its eLife assessment).” In addition, in their revision, they performed Gsh2 immunostaining in P0 Emx1Cre::HG-loxP mouse and found nearly all Gsh2+ cells in the cortical SVZ were derived from the Emx1+ lineage. We are glad that this additional piece of evidence further clarified the case, but still want to emphasize that the subtractional strategy we took was designed purposefully to avoid the potential uncertainty of Gsh2Cre and to more faithfully label LGE/CGE-derived OLs. Therefore, the validity of our conclusion about the fate of LC-derived OLs should be independent from the question on the identity of Gsh2+ cortical progenitors and stands well by itself.

      We hope that these explanations have adequately addressed the reviewer’s concerns. 

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      In Figures 2C, 2D, 2E and 3D, the authors should provide counts of labelled cells as a % of ASPA+ cells. This will give an accurate picture of the contribution of the different progenitor regions to OLs.

      The graphs in Figure 2F are unnecessary since they are simply repeats of C-E but re-arranged.

      We thank the reviewer for the valuable suggestions. These two recommendations are related, and we therefore made the following changes. We replaced the density quantification in Figure 2F and 3D with % of ASPA (Figure 2J and 2N in the revised manuscript) to give an accurate picture of the contribution of the different progenitor regions to OLs, as suggested by the reviewer. We still retained the density counts in Figure 2C-E (Figure 2G-I in the revised manuscript). Together with quantifications of rostral-caudal and laminar distributions presented in Supplementary Figure 2, these data demonstrated that OLs from different origins display distinct spatial distribution patterns.

      At what ages were the quantifications performed in all the figures?

      We apologize for the omission of this information in the original manuscript. All quantifications were performed in 2-month-old adult mice. We have added this information in the “Material and Methods” section of the revised manuscript.

      In 2D, and 3B the GFP should have been activated but the authors do not show it or quantify it presumably because GFP would flood the sections in the presence of Emx1Cre. Nevertheless, since eGFP is shown in the diagram in 2B, the authors should mention why they chose not to show it.

      We thank the reviewer for the helpful comment and the suggestion. We have modified the schematic in Figure 2B and added explanation in the figure legend (line 308-313). We also added a schematic in Supplementary Figure 1A along with images of GFP channel in Supplementary Figure 1D (line 338-350).

      All the main figures and supplementary figures are too small to see properly.

      We are sorry that there was severe compression of images in the combined manuscript file at the conversion step during the initial submission. We apologize for the compromised image quality and have re-uploaded full-size figures as individual files on BioRxiv soon after receiving the reviews. For the revised manuscript, we also take care to upload full-size figures at high resolution as individual files to ensure their quality of presentation.

      Supplementary Figure 2E is unnecessary and perhaps misleading the reader that cortical-derived OLs have a preference for the lower layers whereas the distribution may simply reflect the distribution of OLs in the cortex.

      We thank the reviewer for the helpful comment and the suggestion. We have removed this panel and replaced it with quantifications of relative laminar distributions of the total (ASPA+) OLs along with those from the three different origins (Supplementary Figure 2G in the revised manuscript). Indeed, the preference for the lower layers of dorsally-derived OLs mirrored the distribution of total OLs in the cortex, while the MGE/POA-derived OLs deviate significantly from others and exhibit higher preference towards layer 4.

      Quantification of labelled cells as a % of ASPA should also be performed in Supplementary Figure 3.

      We thank the reviewer for this suggestion. In the revised manuscript, we have included quantifications of labelled cells as % of ASPA for both OpalinFlp::Emx1Cre::Ai65 and OpalinFlp::Nkx2.1Cre::Ai65 (Figure 2J and N). The sum of these two data sets will be equivalent to those of OpalinFlp::Emx1Cre::Nkx2.1Cre::Ai65 shown in Supplementary Figure 3, and we therefore did not perform additional quantifications, to avoid redundant effort.
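
      The additivity assumed here can be made explicit with a small sketch. It presumes, as stated elsewhere in this response, that the Emx1+ and Nkx2.1+ progenitor pools do not overlap, so no cell is counted in both single-Cre data sets. The helper name and the cell counts are hypothetical and serve only to show the arithmetic.

```python
def percent_of_aspa(labeled_aspa_cells: int, total_aspa_cells: int) -> float:
    """Contribution of a labeled lineage as a percentage of all ASPA+ OLs."""
    if total_aspa_cells <= 0:
        raise ValueError("total_aspa_cells must be positive")
    return 100.0 * labeled_aspa_cells / total_aspa_cells


# Hypothetical counts within one region of interest (not data from the study).
total_aspa = 50
emx1_labeled = 45      # RFP+ASPA+ cells in a single-Cre Emx1 sample
nkx21_labeled = 4      # RFP+ASPA+ cells in a single-Cre Nkx2.1 sample

dorsal_pct = percent_of_aspa(emx1_labeled, total_aspa)
mge_pct = percent_of_aspa(nkx21_labeled, total_aspa)

# With non-overlapping lineages, the double-Cre value is the plain sum.
combined_pct = dorsal_pct + mge_pct
print(dorsal_pct, mge_pct, combined_pct)
```

      If the two lineages overlapped, the sum would double-count shared cells and exceed the value measured in the double-Cre line, so the shortcut holds only under the non-overlap assumption.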

      Imaging and quantification should be extended to more posterior regions of the cortex to find out whether the contribution is different from the areas already examined.

      We thank the reviewer for the suggestion on imaging and apologize for the confusion about the range of quantification. As explained in the response to comment (2) of weakness, the quantifications were not confined to one level of the cortex but were performed in brain sections ranging from Bregma +1.94 to -2.80 mm, as shown in Supplementary Figure 2A-B in the original manuscript. In the revised manuscript, we have added relevant descriptions in the “Material and Methods” section (line 199-200) and schematics along with representative images of more anterior and more posterior cortical regions (Supplementary Figure 2A-D).

      Reviewer #3 (Recommendations For The Authors):

      (1) The authors should provide Opalin reporter expression data across various brain regions at different developmental stages to clarify the expression pattern of the reporter.

      We appreciate the reviewer’s comment. We chose to perform all quantifications in adult mice because Opalin is a well-established marker for differentiated OLs and the recombinase-dependent reporter expression is cumulative and irreversible. If there were any non-specific labeling at an earlier developmental stage, it would be retained and manifested at the timepoint we examined as well. In other words, the fact that we detected no non-specific labeling in the current dataset, only labeling confined to mature OLs, indicates that no non-OL labeling was present at earlier timepoints. As shown in Figure 1D-F, reporter expression activated by the Opalin driver shows high OL specificity in all analyzed brain regions. This is further corroborated by results from combinatorially labeled samples (Figure 2 and Supplementary Figure 2), in which only OLs, and no other cell types, were labeled in all analyzed brain regions. Following the reviewers’ suggestions, we have added representative images of more rostral and more caudal cortical regions (Supplementary Figure 2B-D), which also showed highly specific OL labeling.

      (2) In Figure 1D, please specify the developmental stage of the mice used for staining.

      We apologize for the omission of this information in the original manuscript. All quantifications were performed in 2-month-old adult mice. We have added this information in the “Material and Methods” section (line 199-200) of the revised manuscript.

      (3) The authors should clarify if the Opalin reporter expressed in OPCs and astrocytes at developmental stages of mice, such as P0, P7, and P30.

      We appreciate the reviewer’s comment, but as explained in response to comment (1), Opalin is a well-established marker for differentiated OLs which is not expressed in OPCs or astrocytes. As shown in Figure 1D-E, reporter expression is confined to CC1+ differentiated OLs with no colocalization with Sox9 (astrocyte marker). In support of this observation, only ASPA+ differentiated OLs, and no OPCs or astrocytes, were labeled in any of the combinatorial lineage tracing samples generated using this line combined with progenitor-Cre lines. In addition to marker staining, we also did not observe any RFP+ cells with OPC or astrocyte morphology. As the recombinase-dependent reporter expression is cumulative and irreversible, the fact that no non-specific labeling was observed in the adult brain retrospectively supports the specificity of Opalin-Flp at earlier developmental stages.

      (4) In Figure 1E, authors should address why the efficiency of the tdTomato line is notably lower compared to that of H2B-GFP and whether the stability of reporters could impact the conclusions drawn.

      The difference in reporting efficiency is mainly caused by differences inherent to the two reporting systems. The TRE-RFP reporter is derived from Ai62, composed of a Tet response element and tdTomato inserted into the T1 TIGRE locus. The tdTomato expression is driven by tTA-TRE transcriptional activation. The HG-loxP reporter is derived from HG-Dual, composed of a CAG promoter, a frt-flanked STOP cassette, and H2B-GFP inserted into the Rosa26 locus. The H2B-GFP expression is driven by CAG promoter after Flp-mediated removal of the STOP cassette. A Flp-dependent tdTomato reporter designed in the same way as the HG-FRT reporter would have similar efficiency. In fact, the RC::FLTG reporter can be viewed as such a reporter in the absence of Cre, which did show similarly high efficiency as HG-FRT and supported efficient subtractive labeling of LGE/CGE-derived OLs. We apologize for a typo in the title of the Y-axis of the right panel in the original Figure 1F which may have caused potential misunderstanding. The “RFP+CC1+/CC1” should be “XFP+CC1/CC1”. We have corrected this mistake and revised the figure legend for clearer description of the data (Line 293-302 in the revised manuscript).

      (5) In Figure 2, please clarify the developmental stage of the mice used for staining. Authors should present the eGFP image in addition to tdTOM.

      We apologize for the omission of the age information in the original manuscript. All quantifications were performed in 2-month-old adult mice. We have added this information in the “Material and Methods” section (line 199-200) of the revised manuscript. We thank the reviewer for the suggestion on eGFP image and have presented it in supplementary Figure 1 in the revised manuscript.

      (6) In Figure 2D, authors should display the eGFP image alongside the tdTomato image. It is difficult to assess the efficiency of Emx-Cre and Nkx2.1-Cre.

      We thank the reviewer for the suggestion on the eGFP image and have presented it in Supplementary Figure 1D in the revised manuscript. There are two reasons why we chose to present it in the supplementary figure instead of the main figure. First, we added ASPA staining in the green channel along with quantifications of RFP cells as % of ASPA in Figure 2 in the revised manuscript, following reviewer #2’s suggestion. Second, as pointed out by reviewer #2, GFP would flood the sections in the presence of Emx1Cre and could be quite distracting if it were shown together with RFP.

      We were not entirely sure what exactly the reviewer means by “assess the efficiency of Emx-Cre and Nkx2.1-Cre”, but we believe that the quantifications of RFP cells as % of ASPA clarified the contribution of each origin to the total OLs (Figure 2J and 2N in the revised manuscript).

      (7) Figure 3 depicts the entire brain, replicating the image presented in Figure 2. It would be beneficial to consolidate Figures 2 and 3, as they showcase identical brain scans of different regions.

      We thank the reviewer for the constructive suggestion and have consolidated Figures 2 and 3 in the original manuscript into Figure 2 in the revised manuscript.

    1. eLife assessment

      This important study provides a new perspective on how human immunity shapes the antigenic evolution of pathogens. By combining theory and simulation the authors make a solid case for the importance of eco-evolutionary interactions in population-level virus-host dynamics, which arise due to coupling between the dynamics of immune memories and viral variants. Although the work does not propose improved data-driven viral forecasting methods, it makes a conceptual contribution that advances the field's understanding of this problem's intrinsic difficulty.

    2. Reviewer #1 (Public Review):

      In this work, the authors study the dynamics of fast-adapting pathogens under immune pressure in a host population with prior immunity. In an immunologically diverse population, an antigenically escaping variant can perform a partial sweep, as opposed to a sweep in a homogeneous population. In a certain parameter regime, the frequency dynamics can be mapped onto a random walk with zero mean, which is reminiscent of neutral dynamics, albeit with differences in higher order moments. Next, they develop a simplified effective model of time dependent selection with expiring fitness advantage, and posit that the resulting partial sweep dynamics could explain the behaviour of influenza trajectories empirically found in earlier work (Barrat-Charlaix et al. Molecular Biology and Evolution, 2021). Finally, the authors put forward an interesting hypothesis: the mode of evolution is connected to the age of a lineage since ingression into the human population. A mode of meandering frequency trajectories and delayed fixation has indeed been observed in one of the long-established subtypes of human influenza, albeit so far only over a limited period from 2013 to 2020. The paper is overall interesting and well-written. Some aspects, detailed below, are not yet fully convincing and should be treated in a substantial revision.

      Major points

      (1) The quasi-neutral behaviour of amino acid changes above a certain frequency (reported in Fig. 3), which is the main overlap between influenza data and the authors' model, is not a specific property of that model. Rather, it is a generic property of travelling wave models and more broadly, of evolution under clonal interference (Rice et al. Genetics 2015, Schiffels et al. Genetics 2011). The authors should discuss in more detail the relation to this broader class of models with emergent neutrality. Moreover, the authors' simulations of the model dynamics are performed up to the onset of clonal interference \rho/s_0 = 1 (see Fig. 4). Additional simulations deeper in the regime of clonal interference (e.g. \rho / s_0 = 5) would show more clearly the behaviour in this regime.

      In this context, I also note that the modelling results of this paper, in particular the stalling of frequency increase and the decrease in the number of fixations, are very similar to established results obtained from similar dynamical assumptions in the broader context of consumer resource models; see, e.g., Good et al. PNAS 2018. The authors should place their model in this broader context.

      (2) The main conceptual problem of this paper is the inference of generic non-predictability from the quasi-neutral behaviour of influenza changes. There is no question that new mutations limit the range of predictions, this problem being most important in lineages with diverse immune groups such as influenza A(H3N2). However, inferring generic non-predictability from quasi-neutrality is logically problematic because predictability refers to individual trajectories, while quasi-neutrality is a property obtained by averaging over many trajectories (Fig. 3). Given an SIR dynamical model for trajectories, as employed here and elsewhere in the literature, the up and down of individual trajectories may be predictable for a while even though allele frequencies do not increase on average. The authors should discuss this point more carefully.

      (3) To analyze predictability and population dynamics (section 5), the authors use a Wright-Fisher model with expiring fitness dynamics. While here the two sources of the emerging neutrality are easily tuneable (expiring fitness and clonal interference), the connection of this model to the SIR model needs to be substantiated: what is the starting selection s_0 as a function of the SIR parameters (f, b, M, \epsilon), the selection decay \nu = \nu(f, b, M, \epsilon, \gamma)? This would enable the comparison of the partial sweep timing in both models and corroborate the mapping of the SIR onto the simplified W-F model. In addition, the authors' point would be strengthened if the SIR partial sweeps in Fig.1 and Fig.2 were obtained for a combination of parameters that results in a realistic timescale of partial sweeps.
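
      To illustrate what the two effective parameters control, here is a minimal Python sketch of a partial sweep under expiring fitness. It assumes, purely for illustration, that the selection coefficient decays exponentially as s(t) = s_0 exp(-\nu t) and that drift, new mutations and clonal interference are ignored (the deterministic limit of a Wright-Fisher update); the manuscript's actual model and its mapping to the SIR parameters may differ. Under these assumptions the logit of the variant frequency rises by approximately s_0/\nu in total, so the sweep stalls at an intermediate frequency instead of fixing.

```python
import math


def logit(x: float) -> float:
    return math.log(x / (1.0 - x))


def expit(y: float) -> float:
    return 1.0 / (1.0 + math.exp(-y))


def plateau_frequency(x0: float, s0: float, nu: float) -> float:
    """Continuous-time estimate of the final frequency of a partial sweep.

    From dx/dt = s0 * exp(-nu * t) * x * (1 - x), the logit of x gains
    s0 / nu in total as t goes to infinity.
    """
    return expit(logit(x0) + s0 / nu)


def simulate_frequency(x0: float, s0: float, nu: float, generations: int) -> float:
    """Deterministic discrete-generation update with decaying selection."""
    x = x0
    for t in range(generations):
        s = s0 * math.exp(-nu * t)          # fitness advantage expires over time
        x = x * (1.0 + s) / (1.0 + s * x)   # selection step, no sampling noise
    return x


# Hypothetical parameter values, chosen only to show a partial sweep.
x0, s0, nu = 0.01, 0.05, 0.01
print(plateau_frequency(x0, s0, nu), simulate_frequency(x0, s0, nu, 2000))
```

      In this toy setting the ratio s_0/\nu alone sets how far a variant rises, which is why an explicit expression for both quantities in terms of (f, b, M, \epsilon, \gamma) would allow a direct comparison of sweep timing and amplitude between the two models.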

    3. Reviewer #2 (Public Review):

      Summary:

      This work addresses a puzzling finding in the viral forecasting literature: high-frequency viral variants evince signatures of neutral dynamics, despite strong evidence for adaptive antigenic evolution. The authors explicitly model interactions between the dynamics of viral adaptations and of the environment of host immune memory, making a solid theoretical and simulation-based case for the essential role of host-pathogen eco-evolutionary dynamics. While the work does not directly address improved data-driven viral forecasting, it makes a valuable conceptual contribution to the key dynamical ingredients (and perhaps intrinsic limitations) of such efforts.

      Strengths:

      This paper follows up on previous work from these authors and others concerning the problem of predicting future viral variant frequency from variant trajectory (or phylogenetic tree) data, and a model of evolving fitness. This is a problem of high impact: if such predictions are reliable, they empower vaccine design and immunization strategies. A key feature of this previous work is a "traveling fitness wave" picture, in which absolute fitnesses of genotypes degrade at a fixed rate due to an advancing external field, or "degradation of the environment". The authors have contributed to these modeling efforts, as well as to work that critically evaluates fitness prediction (references 11 and 12). A key point of that prior work was the finding that fitness metrics performed no better than a baseline neutral model estimate (Hamming distance to a consensus nucleotide sequence). Indeed, the apparent good performance of their well-adopted "local branching index" (LBI) was found to be an artifact of its tendency to function as a proxy for the neutral predictor. A commendable strength of this line of work is the scrutiny and critique the authors apply to their own previous projects. The current manuscript follows with a theory and simulation treatment of model elaborations that may explain previous difficulties, as well as point to the intrinsic hardness of the viral forecasting inference problem.

      This work abandons the mathematical expedience of traveling fitness waves in favor of explicitly coupled eco-evolutionary dynamics. The authors develop a multi-compartment susceptible/infected model of the host population, with variant cross-immunity parameters, immune waning, and infectious contact among compartments, alongside the viral growth dynamics. Studying the invasion of adaptive variants in this setting, they discover dynamics that differ qualitatively from the fitness wave setting: instead of a succession of adaptive fixations, invading variants have a characteristic "expiring fitness": as the immune memories of the host population reconfigure in response to an adaptive variant, the fitness advantage transitions to quasi-neutral behavior. Although their minimal model is not designed for inference, the authors have shown how an elaboration of host immunity dynamics can reproduce a transition to neutral dynamics. This is a valuable contribution that clarifies previously puzzling findings and may facilitate future elaborations for fitness inference methods.

      The authors provide open access to their modeling and simulation code, facilitating future applications of their ideas or critiques of their conclusions.

      Weaknesses:

      The current modeling work does not make direct contact with data. I was hoping to see a more direct application of the model to a data-driven prediction problem. In the end, although the results are compelling as is, this disconnect leaves me wondering if the proposed model captures the phenomena in detail, beyond the qualitative phenomenology of expiring fitness. I would imagine that some data is available about cross-immunity between strains of influenza and SARS-CoV-2, so hopefully some validation of these mechanisms would be possible.

      After developing the SIR model, the authors introduce an effective "expiring fitness" model that avoids the oscillatory behavior of the SIR model. I hoped this could be motivated more directly, perhaps as a limit of the SIR model with many immune groups. As is, the expiring fitness model seems to lose the eco-evolutionary interpretability of the SIR model, retreating to a more phenomenological approach. In particular, it's not clear how the fitness decay parameter nu and the initial fitness advantage s_0 relate to the key ecological parameters: the strain cross-immunity and immune group interaction matrices.
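
      To make the roles of s_0 and nu concrete, here is a minimal deterministic sketch. It assumes (this is not stated in the review) that the fitness advantage decays exponentially, s(t) = s_0 exp(-nu t), and that the variant frequency follows dx/dt = s(t) x (1 - x) with no drift or competing variants. The function names are hypothetical. Under these assumptions the logit frequency gains at most s_0/nu, so the variant plateaus at an intermediate frequency instead of fixing.

```python
import math


def logit(x):
    """Log-odds of a frequency x in (0, 1)."""
    return math.log(x / (1.0 - x))


def expiring_sweep_frequency(x0, s0, nu, t):
    """Variant frequency at time t for dx/dt = s(t) x (1 - x).

    Assumes s(t) = s0 * exp(-nu * t); this exponential form is an
    illustrative assumption, not taken from the reviewed paper.
    """
    # Integrating s(t) from 0 to t gives the total gain in logit frequency.
    gained = (s0 / nu) * (1.0 - math.exp(-nu * t))
    return 1.0 / (1.0 + math.exp(-(logit(x0) + gained)))


def plateau_frequency(x0, s0, nu):
    """Long-time frequency: the logit gain saturates at s0 / nu."""
    return 1.0 / (1.0 + math.exp(-(logit(x0) + s0 / nu)))


if __name__ == "__main__":
    # Illustrative values only: 1% initial frequency, s0 = 0.1, nu = 0.02.
    for t in (0, 50, 100, 500):
        print(t, round(expiring_sweep_frequency(0.01, 0.1, 0.02, t), 4))
    print("plateau", round(plateau_frequency(0.01, 0.1, 0.02), 4))
```

      In this sketch only the ratio s_0/nu sets the plateau, which is one way to phrase the reviewer's question: how that ratio would map onto the cross-immunity and immune group interaction matrices of the SIR model.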

    4. Reviewer #3 (Public Review):

      Summary:

      In this work, the authors start by presenting a multi-strain SIR model in which viruses circulate in a heterogeneous population with different groups characterized by different cross-immunity structures. They argue that this model can be reformulated as a random walk characterized by new variants saturating at intermediate frequencies. Then they recast their microscopic description as an effective formalism in which viral strains lose fitness independently from one another. They study several features of this process numerically and analytically, such as the average variant frequency, the probability of fixation, and the coalescent time. They qualitatively compare the dynamics of this model to variant dynamics in RNA viruses such as flu and SARS-CoV-2.

      Strengths:

      The idea that a vanishing fitness mechanism that produces partial sweeps may explain important features of flu evolution is very interesting. Its simplicity and potential generality make it a powerful framework. As noted by the authors, this may have important implications for the predictability of virus evolution, and such a framework may be beneficial when trying to build predictive models for vaccine design. The vanishing fitness model is well analyzed and produces interesting structures in the strain coalescent. Even though the comparison with data is largely qualitative, this formalism would be helpful when developing more accurate microscopic ingredients that could reproduce viral dynamics quantitatively. This general framework has the potential to apply beyond human RNA viruses, in situations where invading mutants would saturate at intermediate frequencies.

      Weaknesses:

      The authors build the narrative around a multi-strain SIR model in which viruses circulate in a heterogeneous population, but the connection of this model to the rest of the paper is not well supported by the analysis. When presenting the random walk coarse-grained description in section 3 of the Results, there is no quantitative relation between the random walk ingredients (importantly P(\beta)) and the SIR model, only the qualitative reasoning that strains would initially grow exponentially and saturate at intermediate frequencies. So essentially any other microscopic description with these two features would give rise to the same random walk.

      Currently it's unclear whether the specific choices for population heterogeneity and cross-immunity structure in the SIR model matter for the main results of the paper. In section 2, it seems that the main effects of these ingredients are reduced oscillations in variant frequencies and a rescaled initial growth rate. But ultimately a homogeneous population would also produce steady-state coexistence between strains, and oscillation amplitude likely depends on parameter choices. Thus a homogeneous population may lead to a similar coarse-grained random walk.

      Similarly, it's unclear how the SIR model relates to the vanishing fitness framework, other than on a qualitative level given by the fact that both descriptions produce variants saturating at intermediate frequencies. Other microscopic ingredients may lead to a similar description, yet with quantitative differences.

      At the same time, from the current analysis the reader cannot appreciate the impact of such a mean-field approximation, in which strains lose fitness independently from one another, or under what conditions such an assumption may be valid.

      In summary, the central and most thoroughly supported results in this paper refer to a vanishing fitness model for human RNA viruses. The current narrative, built around the SIR model as a general work on host-pathogen eco-evolution in the abstract, introduction, discussion and even title, does not seem to match the key results and may mislead readers. The SIR description rather seems one of the several possible models, featuring a negative frequency dependent selection, that would produce coarse-grained dynamics qualitatively similar to the vanishing fitness description analyzed here.

    1. eLife assessment

      This study reports on the in vivo dynamics of insulin-producing cells (IPCs) in Drosophila. IPC activity is shown to be modulated by the nutritional state and age of the animal, with convincing evidence for an incretin-like effect. These important findings establish IPCs in Drosophila as a system to study circuits governing behaviors related to the internal state in competition with the feeding state, and will be of interest to both neuroscientists and cell biologists.

    2. Reviewer #1 (Public Review):

      Summary:

      This study presents useful insights into the in vivo dynamics of insulin-producing cells (IPCs), key cells regulating energy homeostasis across the animal kingdom. The authors provide compelling evidence using adult Drosophila melanogaster that IPCs, unlike neighboring DH44 cells, do not respond to glucose directly, but that glucose can indirectly regulate IPC activity after ingestion, supporting an incretin-like mechanism in flies similar to that in mammals. The authors link the decreased activity of IPCs to hyperactivity observed in starved flies, a locomotive behavior aimed at increasing food search.

      Furthermore, there is supporting evidence in the paper that IPCs receive inhibitory inputs from Dh44 neurons, which are linked to increased locomotor activity. However, although the electrophysiological data underlying the dynamics of IPCs in vivo is compelling, the link between IPCs and other potential elements of the circuitry (e.g. octopaminergic neurons) regulating locomotive behaviors is not clear and would benefit from more rigorous approaches.

      This paper is of interest to cell biologists and electrophysiologists, and in particular to scientists aiming to understand circuit dynamics pertaining to internal state-linked behaviors competing with the feeding state, shown here to be primarily controlled by the IPCs.

      Strengths:

      (1) By using whole-cell patch clamp recording, the authors convincingly showed the activity pattern of IPCs and neighboring DH44 neurons under different feeding states.

      (2) The paper provides compelling evidence that IPCs are not directly and acutely activated by glucose, but rather through a post-ingestive incretin-like mechanism. In addition, the authors show that Dh44 neurons located adjacent to the IPCs respond to bath application of glucose contrary to the IPCs.

      (3) The paper provides useful data on the firing pattern of 2 key cell populations regulating food-related brain function and behavior, IPCs and Dh44 neurons, results which are useful to understand their in vivo function.

      Weaknesses:

      (1) The term nutritional state generally refers to the nutrients which are beneficial to the animal. In Figure 1, the authors showed that IPCs respond to glucose but not proteins. To validate the term nutritional state the authors could test the effect of a non-nutritive sugar (e.g. D-arabinose or L-Glucose) on the post-ingestive physiological responses of the IPCs.

      (2) It is difficult to grasp the main message from the figures in the result section as some figures have several results subsections referring to different points the authors want to make. The key results of a figure will be easier to understand if they are summarized in one section of the results. Alternatively, a figure can be split into 2 figures if there are several key messages in those figures, e.g. Figures 2 and 3.

      (3) The prime investigation of the paper is about the physiological response and locomotive behavioral readout linked to IPCs. The authors do not show a link between OANs and IPCs in terms of functional or behavioral readouts. In Figure 2 the authors first start with stating a link between OAN neurons and locomotion changes resulting from internal feeding states. The flow of the paper would be better if the authors focused on the effect of optogenetic activation of IPCs under different feeding states and their impact on fly locomotion. If the experiments done on optogenetic activation of OANs were to validate the experimental approach the data on OAN neurons is better suited for the supplement without the need of a subsection in the result section on the OANs.

      (4) Figure 2F shows that optogenetic activation of IPCs in fed flies does not influence their locomotor output. In the text, the conclusion linked to Figure 2F-H states that IPC activation reduces starvation-induced hyperactivity which is a statement more suited to Figure 2I-K.

      (5) The authors show activation of Dh44 neurons leads to hyperpolarisation of the IPCs. What is the functional link between non-PI Dh44 neurons and the IPCs? Do IPCs express DH44R or is DH44 required for this effect on IPCs? Investigating a potential synaptic or peptidergic link between DH44 neurons and IPCs and its effect on behavior would benefit the paper, as it is so far not well connected.

    3. Reviewer #2 (Public Review):

      Summary:

      In this study, Bisen et al. characterized the state-dependency of insulin-producing cells in the brain of *Drosophila melanogaster*. They successfully established that IPC activity is modulated by the nutritional state and age of the animal. Interestingly, they demonstrate that IPCs respond to the ingestion of glucose, rather than to perfusion with it, an observation reminiscent of the incretin effect in mammals. The study is well conducted and presented and the experimental data convincingly support the claims made.

      Strengths:

      The study makes great use of the tools available in *Drosophila* research, demonstrating the effect that starvation and subsequent refeeding have on the physiological activity of IPCs as well as on the behavior of flies to then establish causal links by making use of optogenetic tools.

      It is particularly nice to see how the authors put their findings in context to published research and use for example TDC2 neuron activation or DH44 activity to establish baselines to relate their data to.

      Weaknesses:

      I find the inability of SD to rescue the IPC starvation effect in Figure 1G&H surprising, given that the fully fed flies were raised and kept on that exact diet. Did the authors try to refeed flies with SD for longer than 24 hours? I understand that at some point the age effect would also kick in and counteract potential IPC activity rescue. I think the manuscript would benefit if the authors could indicate the exact age of the SD refed flies and expand a bit on the discussion of that point.

      The incretin-like effect is exciting and it will be interesting in the future to find out what might be the signal mediating this effect. It is interesting that IPCs in explants seem to be responsive to glucose. I think it would help if the authors could briefly discuss possible sources for the different findings between these in fact very different preparations. Could the absence of the inhibitory DH44 feedback in the *ex-vivo* recordings, for example, play a role?

      The incretin-like effect the authors observed seems to start only after 5h which seems longer than in mammals where, as far as I know, insulin peaks around 1h. Do the authors have ideas on how this timescale relates to ingestion and glucose dynamics in flies?

      The authors mention "a decrease in the FV of IPC-activated starved flies even before the first optogenetic stimulation (Figure 2I),". Could this be addressed by running an experiment in darkness, only using the IR illumination of their behavioral assay?

      The authors show an inhibitory effect of DH44 neuron activation on IPC activity. They further demonstrate that DH44PI neurons are not the ones driving this and thus conclude that "...IPCs are inhibited by DH44Ns outside the PI.". As the authors mentioned the broad expression of the DH44-Gal4 line, can they be sure that the cells labeled outside the PI are actually DH44+? If so they should state this more clearly, if not they should adapt the discussion accordingly.

    4. Reviewer #3 (Public Review):

      Although insulin release is essential in the control of metabolism, adjusted to nutritional state, and plays major roles in normal brain function as well as in aging and disease, our knowledge about the activity of insulin-producing (and releasing) cells (IPCs) in vivo is limited.

      In this technically demanding study, IPC activity is studied in the Drosophila model system by fine in vivo patch clamp recordings with parallel behavioral analyses and optogenetic manipulation.

      The data indicate that IPC activity is increased with a slow time course after feeding a high-glucose diet. By contrast, IPC activity is not directly affected by increasing blood glucose levels. This is reminiscent of the incretin effect known from vertebrates and points to a conserved mechanism in insulin production and release upon sugar feeding.

      Moreover, the data confirm earlier studies that nutritional state strongly affects locomotion. Surprisingly, IPC activity makes only a negligible contribution to this. Instead, other modulatory neurons that are directly sensitive to blood glucose levels strongly affect modulation. Together, these data indicate a network of multiple parallel and interacting neuronal layers to orchestrate the physiological, metabolic, and behavioral responses to nutritional state. Together with the data from a previous study, this work sets the stage to dissect the architecture and function of this network.

      Strengths:

      State-of-the-art current clamp in situ patch clamp recordings in behaving animals are a demanding but powerful method to provide novel insight into the interplay of nutritional state, IPC activity, and locomotion. The patch clamp recordings and the parallel behavioral analyses are of high quality, as are the optogenetic manipulations. The data showing that starvation silences IPC activity in young flies (younger than 1 week) are compelling. The evidence for the claim that locomotor activity is not increased upon IPC activity but upon the activity of other blood glucose-sensitive modulatory neurons (Dh44) is strong. The study provides a great system to experimentally dissect the interplay of insulin production and release with metabolism, physiology, and behavior.

      Weaknesses:

      Neither the mechanisms underlying the incretin effect, nor the network to orchestrate physiological, metabolic, and behavioral responses to nutritional state have been fully uncovered. Without additional controls, some of the conclusions would require significant downtoning. Controls are required to exclude the possibility that IPCs sense other blood sugars than glucose. The claim that IPC activity is controlled by the nutritional state would require that starvation-induced IPC silencing in young animals can be recovered by feeding a normal diet. At present, firing in starvation-silenced IPCs can only be induced by feeding a high-glucose diet that lacks other important ingredients and reduces vitality. Therefore, feasible controls are needed to exclude that diet-induced increases in IPC firing rate are caused by stress rather than nutritional changes in normal ranges. The finding that refeeding starved flies with a standard diet had no effect on IPC activity but a strong effect on the locomotor activity of starved flies contradicts the statement that locomotor activity is affected by the same dietary manipulations that affect IPC activity. The compelling finding that starvation induces IPC firing would benefit from determining the time course of the effect. The finding that IPCs are not active in fed animals older than 1 week is surprising and should be further validated.

    1. eLife assessment

      This study reports valuable insights into the interactome of the RNA-binding protein SERBP1 and possible links through PARylation to a diverse set of processes including splicing, cell division, and ribosome biogenesis. The diversity of processes SERBP1 may regulate means this work would be of very broad interest to the cell biology community. However, whereas the proteomics data are solid, the functional connection to downstream processes and the link to Alzheimer's disease are still incomplete, as they rely on a very limited set of experiments and patient samples.

    2. Reviewer #1 (Public Review):

      Summary:

      Here the authors convincingly identify and characterize the SERBP1 interactome and further define its role in the nucleus, where it is associated with complexes involved in splicing, cell division, chromosome structure, and ribosome biogenesis. Many of the SERBP1-associated proteins are RNA-binding proteins and SERBP1 exerts its impact, at least in part, through these players. SERBP1 is mostly disordered but, along with its associated proteins, displays a preference for G4 binding and can bind to PAR and be PARylated. They present data that strongly suggest that complexes in which SERBP1 participates are assembled through G4 or PAR binding. The authors suggest that because SERBP1 lacks traditional functional domains yet is clearly involved in distinct regulatory complexes, SERBP1 likely acts in the early steps of assembly through the recognition of interacting sites present in RNA, DNA, and proteins.

      Strengths:

      The data is very convincing and demonstrated through multiple approaches.

      Weaknesses:

      No weaknesses were identified by this reviewer.

    3. Reviewer #2 (Public Review):

      Summary:

      In this study the authors have used pull-down experiments in a cell line overexpressing tagged SERPINE1 mRNA binding protein 1 (SERBP1) followed by mass spectrometry-based proteomics, to establish its interactome. Extensive analyses are performed to connect the data to published resources. The authors attempt to connect SERBP1 to stress granules and Alzheimer's disease-associated tau pathology. Based on the interactome, the authors propose a cross-talk between SERBP1 and PARP1 functions.

      Strengths:

      The main strength of this study lies in the proteomics data analysis, and its effort to connect the data to published studies.

      Weaknesses:

      While the authors propose a feedback regulatory model for SERBP1 and PARP1 functions, strong evidence for PARylation modulating SERBP1 functions is lacking. PARP inhibition decreasing the amount of PARylated proteins associated with SERBP1 and likely all other PARylated proteins is expected. This study is also incomplete in its attempt to establish a connection to Alzheimer's disease related tauopathy. A single AD case is not sufficient, and frozen autopsy tissue shows unexplained punctate staining likely due to poor preservation of cellular structures for immunohistochemistry. There is a lack of essential demographic data, source of the tissue, brain regions shown, and whether there was an IRB protocol for the human brain tissue. The presence of phase-separated transient stress granules in an autopsy brain is unlikely, even if G3BP1 staining is present. Normally, stress granule proteins move to the cytoplasm under cellular stress, whereas SERBP1 becomes nuclear. The co-localization of abundant cytoplasmic G3BP1 and SERBP1 under normal conditions does not indicate an association with stress granules.

    4. Reviewer #3 (Public Review):

      Summary:

      A survey of SERBP1-associated functions and their impact on the transcriptome upon gene depletion, as well as the identification of chemical inhibitors upon gene over-expression.

      Strengths:

      (1) Provides a valuable resource for the community, supported by statistical analyses.

      (2) Offers a survey of different processes with correlation data, serving as a good starting point for the community to follow up.

      Weaknesses:

      (1) The authors provided numerous correlations on diverse topics, from cell division to RNA splicing and PARP1 association, but did not follow up their findings with experiments, offering little mechanistic insight into the actual role of SERBP1. The model in Figure 5D is entirely speculative and lacks data support in the manuscript.

      (2) Following up with experiments to demonstrate that their findings are real (e.g., those related to splicing defects and the PARylation/PAR-binding association) would be beneficial. For example, whether the association between PARP1 and SERBP1 is sensitive to PAR-degrading enzymes is unclear.

      (3) They did not clearly articulate how experiments were performed. For instance, the drug screen and even the initial experiment involving the pull-down were poorly described. Many in the community may not be familiar with vectors such as pSBP or pUltra without looking up details.

      (4) The co-staining of SERBP1 with pTau, PARP1, and G3BP1 in the brain is interesting, but it would be beneficial to follow up with immunoprecipitation in normal and patient samples to confirm the increased physical association.

      (5) The combination index of 0.7-0.9 for PJ34 + siSERBP1 is weak. Could this be due to the non-specific nature of the drug against other PARPs? Have the authors looked into this possibility?
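
      For readers unfamiliar with the scale behind point (5), the sketch below computes a two-agent combination index in the Chou-Talalay form, CI = d1/Dx1 + d2/Dx2, where CI < 1 indicates synergy, CI = 1 additivity, and CI > 1 antagonism. That the manuscript used this definition is an assumption, and the function name and doses are hypothetical; how an siRNA "dose" would be defined is left open.

```python
def combination_index(d1, d2, dx1, dx2):
    """Two-agent combination index, CI = d1/dx1 + d2/dx2.

    d1, d2   : doses of each agent used together to reach a given effect.
    dx1, dx2 : dose of each agent alone that reaches the same effect.
    All values here are hypothetical and in arbitrary (matching) units.
    """
    if d1 < 0 or d2 < 0:
        raise ValueError("combination doses must be non-negative")
    if dx1 <= 0 or dx2 <= 0:
        raise ValueError("single-agent doses must be positive")
    return d1 / dx1 + d2 / dx2


if __name__ == "__main__":
    # Hypothetical example: each agent is used at 40% of the dose it
    # would need alone, so the index falls in the 0.7-0.9 range discussed.
    ci = combination_index(d1=2.0, d2=1.0, dx1=5.0, dx2=2.5)
    print(round(ci, 2))
```

      On this scale, values of 0.7-0.9 sit close to additivity, which is consistent with the reviewer calling the effect weak.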

    1. eLife assessment

      This important study shows that age-related gut microbiota modulates uric acid metabolism through the NLRP3 inflammasome pathway and thereby regulates susceptibility to age-related gout. Whereas some of the data are compelling, several experimental approaches and methods are currently incomplete, which could be remedied with more rigorous approaches. If strengthened, this paper would be of broad interest to researchers working on gout and microbiota.

    2. Reviewer #1 (Public Review):

      Gout, a prevalent form of arthritis among the elderly, exhibits an intricate relationship with age and gut microbiota. The authors found that gut microbiota plays a crucial role in determining susceptibility to age-related gout. They observed that age-related gut microbiota regulated the activation of the NLRP3 inflammasome pathway and modulated uric acid metabolism. "Younger" microbiota has a positive impact on the gut microbiota structure of old or aged mice, enhancing butanoate metabolism and butyric acid content. Finally, they found butyric acid exerts a dual effect, inhibiting inflammation in acute gout and reducing serum uric acid levels. This work's insight emphasizes the potential of a "young" gut microbiome in mitigating senile gout. The whole study was interesting, but there were some minor errors in the overall writing of the paper. The author should carefully check the spelling of the words in the text and the case consistency of the group names.

    3. Reviewer #2 (Public Review):

      Summary:

      In their manuscript titled "Microbiota from Young Mice Counteracts Susceptibility to Age-Related Gout through Modulating Butyric Acid Levels in Aged Mice," the authors report that fecal transplantation from young mice into old mice alleviates susceptibility to gout. The gut microbiota in young mice is found to inhibit activation of the NLRP3 inflammasome pathway and reduce uric acid levels in the blood in the gout model.

      Strengths:

      They focused on the butanoate metabolism pathway based on the results of metabolomics analysis after fecal transplantation and identified butyrate as the key factor in mitigating gout susceptibility. In general, this is a well-performed study.

      Weaknesses:

      The discussion on the current results and previous studies regarding the effect of butyrate on gout symptoms is insufficient. The authors need to provide a more thorough discussion of other possible mechanisms and relevant literature.

    4. Reviewer #3 (Public Review):

      Summary:

      This manuscript addresses an important and emerging area of research: the relationship between gut microbiota and age-related gout. The innovative aspect of this research is the demonstration that transplanting gut microbiota from young to aged mice can alleviate gout symptoms and modulate uric acid levels by increasing butyric acid levels. However, significant problems remain in the overall experimental design and manuscript writing.

      Some critical comments are provided below:

      (1) The data quality still needs to be improved. There are many outliers in the experimental data shown in some figures, e.g. Figure 2D-G. The presence of these outliers makes the results unreliable. The author should thoroughly review the data analysis in the manuscript. In addition, a couple of western blot bands, such as IL-1β in Figure 3C, are not clear enough; please provide clearer western blot results to support the conclusion.

      (2) As shown in Figure 1G-I, foot thickness and IL-1β content in foot tissues of the Aged+Abx group were significantly reduced, but there was no difference in serum uric acid level. In addition, the Abx-untreated group should be included at all ages.

      (3) Since FMT (Figure 4) and butyrate supplementation (Figure 8) have different effects on uric acid synthesis enzyme and excretion, different mechanisms may lie behind these two interventions. Transplantation with significantly enriched single strains from young mice, such as Bifidobacterium and Akkermansia, is the more reliable approach to reveal the underlying mechanism between gut microbiota and gout.

      (4) In Figure 2F, the results showed the IL-1β, IL-6, and TNF-α content in serum, which was inconsistent with the authors' manuscript description (Line 171).

      (5) Figures 2F-H duplicate Supplementary Figures S1B-D. The authors should prepare the article more carefully to avoid such mistakes.

      (6) In lines 202-206, the authors state that serum uric acid levels were elevated in the Young+Old or Young+Aged groups, but the results shown in Figure 4A show no difference.

      (7) Please visualize the results in Table 2 in a more intuitive manner.

      (8) The heatmap in Figure 7A cannot strongly support the conclusion "the butyric acid content in the faeces of Young+PBS group was significantly higher than that in the Aged+PBS group". The author should re-represent the visual results and provide a reasonable explanation. In addition, please provide the ordinate unit of Supplementary Figure 7A-H.

      (9) Uncropped original full-length western blot should be provided.

    1. eLife assessment

      This important paper addresses the role of fluid flows in nutrient uptake by microorganisms propelled by the action of cilia or flagella. Using a range of mathematical models for the flows created by such appendages, the authors provide convincing evidence that the two strategies of swimming and sessile motion can be competitive. These results will have significant implications for our understanding of the evolution of multicellularity in its various forms.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript studies nutrient intake rates for stationary and motile microorganisms to assess the effectiveness of swim vs. stay strategies. This work provides valuable insights on how the different strategies perform in the context of a simplified mathematical model that couples hydrodynamics to nutrient advection and diffusion. The swim and stay strategies are shown to yield similar nutrient flux under a range of conditions.

      Strengths:

      Strengths of the work include (i) the model prediction in Fig. 3 of nutrient flux applied to a range of microorganisms including an entire clade that are known to use different feeding strategies and (ii) a study of the interaction between cilia and absorption coverage showing the robustness of their predictions provided these regions have sufficient overlap.

      Weaknesses: To improve the work, the authors should further expand their discussion of the following points:

      (1) The authors comment that a number of species alternate between sessile and motile behavior. It would be helpful to discuss what is known about what causes switching between these modes and whether this provides insights regarding the advantages of the different behaviors.

      (2) An encounter zone of R=1.1a appears to be used throughout the manuscript, but I could not find a biological justification for this particular value. The results appear to be quite sensitive to this choice, as shown in Supplement Fig. 3(B). In the Discussion, it is mentioned that using a much larger exclusion zone leads to significantly different nutrient flux, and it is implied that such a large exclusion zone is not biologically plausible, but this was not explained sufficiently.

      (3) In the schematic in Fig. 2(B), it was unclear if the encounter zone in the envelope model is defined analogously to the Stokeslet model or if a different formulation is used.

      (4) The force balance argument should be clarified. Equation (3) of the supplement gives the force-velocity relation in the motile case. Since equation (4), which the authors state is the net force in the sessile case, seems to involve the same expression, would it not follow from U=0 in the sessile case that one would simply obtain quiescent flow with Fcilia=0?

    3. Reviewer #2 (Public Review):

      Summary:

      The authors have collected a significant amount of data from the literature on the flow regimes associated with microorganisms whose propulsion is achieved through the action of cilia or flagella, with particular interest in the competition between sessile and motile lifestyles. They then use several distinct hydrodynamic models for the cilia-driven flows to quantify the nutrient uptake and clearance rate, reported as a function of the Peclet number. One of the interesting conclusions the authors draw concerns the question of whether, for certain ciliates, there is a clear difference in nutrient uptake rates in the sessile versus motile forms. The authors show that this is not the case, thereby suggesting that the evolutionary pressure associated with such a difference is not present. The analysis also includes numerical calculations of the uptake rate for spherical swimmers in the regime of large Peclet numbers, where the authors note an enhancement due to advection-generated thinning of the solutal boundary layer around the organism.
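
      As a reminder of the control parameter in this summary, the Peclet number compares advective to diffusive transport, Pe = U a / D. The sketch below assumes the organism radius a is the characteristic length (a convention that may differ from the manuscript's), and the function name and numerical values are illustrative assumptions, not data from the paper.

```python
def peclet_number(speed, radius, diffusivity):
    """Pe = U * a / D, the ratio of advective to diffusive transport.

    speed       : characteristic flow or swimming speed U (m/s)
    radius      : organism radius a, taken as the length scale (m)
    diffusivity : nutrient diffusivity D (m^2/s)
    """
    if diffusivity <= 0:
        raise ValueError("diffusivity must be positive")
    return speed * radius / diffusivity


if __name__ == "__main__":
    # Illustrative values only: U = 100 um/s, a = 10 um, D = 1e-9 m^2/s.
    print(peclet_number(speed=1e-4, radius=1e-5, diffusivity=1e-9))
```

      With these assumed values advection and diffusion are comparable; larger or faster organisms move into the large-Pe regime where the boundary-layer thinning mentioned above becomes relevant.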

      Strengths:

      In addressing the whole range of organism sizes and Peclet numbers the authors have achieved an important broad perspective on the problem of nutrient uptake of ciliates, with implications for understanding evolutionary driving forces toward particular lifestyles (e.g. sessile versus motile).

      Weaknesses:

      The authors appear to be unaware of rather similar calculations that were done some years ago in the context of Volvox, in which the issue of the boundary layer size and nutrient uptake enhancement were clearly recognized [M.B. Short, et al., Flows Driven by Flagella of Multicellular Organisms Enhance Long-Range Molecular Transport, PNAS 103, 8315-8319 (2006)]. This reference also introduced the model of a fixed shear stress at the surface of the sphere as a representation of the action of the cilia, which may be more realistic than the squirmer-type boundary condition, although the two lead to similar large-Pe scalings.

      The findings reported in Figure 4, that the uptake rate is robust to variations in cilia coverage and absorption fraction, are similar in spirit to an observation made recently in the context of the somatic cell neighbourhood areas in Volvox [Day, et al., eLife 11, e72707 (2022)]. There, it was found that while there is a broad distribution of those areas, and hence of the coarse-grained tangential flagellar force acting on the fluid, the propulsion speed is rather insensitive to those variations.

    1. Author response:

      First we thank the reviewers for a thorough reading of our paper and some useful comments. A recurrent remark of the reviewers concerns the appearance of kRas-expressing cells (labelled by a nuclear blue fluorescent marker) which we attribute to the progeny of the initially induced cell. The reviewers suggest that these cells may have been obtained through activation of the Cre-recombinase in other cells by cyclofen released from light scattering, via diffusion, leakiness, etc. These remarks are perfectly reasonable from people not familiar with the cyclofen uncaging approach that we are using but are unwarranted as we shall show below.

      We have been using cyclofen uncaging with subsequent activation of a Cre-recombinase (or some other proteins) since 2010 (see ref.34, Sinha et al., Zebrafish 7, 199-204 (2010) and our 2018 review (ref.35, Zhang et al., ChemBioChem 19,1-8 (2018))). In our experiments, the embryos are incubated in the dark in 6M caged cyclofen (cCyc) and washed in E3 medium (or transferred to a new medium with no cCyc). In these conditions, over many years we never observed activation of the recombinase, i.e. the appearance of the associated fluorescent label in cells of embryos grown in E3 medium. Hence leakiness can be ruled out (in presence of cCyc or in its absence).

      Following transfer of the embryos to new E3 medium we illuminate the embryos locally with light at 405nm. In these conditions, cCyc is only partially uncaged and results in activation of Cre-recombinase in only a few cells (1, 2, 3, …) within the illuminated region only, namely in the appearance of the kRas-associated nuclear blue fluorescent label in usually one cell (and sometimes in a few more; data and statistics will be incorporated in a revised manuscript). In absence of any further treatment (e.g. activation of a reprogramming factor) these fluorescently labelled cells disappear within a few days (either via shut-down of their promoter, apoptosis or some other mechanism). The crucial point here is that we see fewer and not more kRas expressing cells (i.e. with nuclear blue fluorescence). This observation rules out activation of Cre-recombinase in other cells days after illumination due to leakiness, cyclofen released by light or diffusing from the illumination spot.

      To observe many more fluorescent cells days after activation of the initial cell, one needs to transiently activate VentX-GR by overnight incubation in dexamethasone (DEX) (Injecting the embryos at 1-cell stage with VentX-GR or incubating them in DEX does not result in the appearance of more blue fluorescent cells). Following activation of VentX-GR, the fluorescent cells observed a couple of days after initiation are visualized in E3 medium (i.e. in absence of cyclofen) and are localized to the vicinity of the otic vesicle (the region where the initial cell was activated). In a revised manuscript we will present images of these fluorescent cells taken a few days apart from the same embryo in which a single cell was initially activated. Hence, we attribute these cells to the progeny of the activated cell. Obviously, single cell tracking via time-lapse microscopy would nail down this issue and provide fascinating insight into the initial stages of tumor growth. Unfortunately, immobilization of embryos in the usual medium (e.g. MS222, tricaine) over 5-6 days to track the division and motion of single cells is not possible. We are considering some other possibilities (immobilization in bungarotoxin or via photo-activation of anionic channels), but these challenging experiments are for a future paper.

      Reviewer #1 (Public Review):

      The authors then performed allotransplantations of allegedly single fluorescent TICs in recipient larvae and found a large number of fluorescent cells in distant locations, claiming that these cells have all originated from the single transplanted TIC and migrated away. The number of fluorescent cells shown in the recipient larvae just after two days is not compatible with a normal cell cycle length and more likely represents the progeny of more than one transplanted cell.

      As mentioned in the manuscript, we measure the density of cells/nl and inject in the yolk of 2dpf Nacre embryos a volume containing about 1 cell, following published protocols (S.Nicoli and M.Presta, Nat.Prot. 2,2918 (2007)). We further image the injected cell(s) by fluorescence microscopy immediately following injection, as shown in Fig.4A and Fig.S8B. We might miss a few cells but not many. With a typical cell cycle of ~10h the images of tumors in larvae at 3dpt (and not 2dpt as misunderstood by this reviewer) correspond to ~100 cells. In any case the purpose of this experiment was not to study tumorigenesis upon transplantation but to show that the progeny of the initially induced cells is capable of developing into a tumor in a naïve fish, which is the operational definition of cancer that we adopted here.
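
      The dilution step described above can be sketched numerically. This is a minimal illustration, not the authors' protocol: it assumes that the number of cells in each injected volume follows a Poisson distribution, and the density value and function names are hypothetical.

```python
import math


def injection_volume_nl(cells_per_nl, target_cells=1.0):
    """Volume (in nl) expected to contain `target_cells` at the measured density."""
    return target_cells / cells_per_nl


def poisson_pmf(k, mean):
    """Probability of drawing exactly k cells when the expected count is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def prob_more_than_one(mean):
    """Probability that an injection carries two or more cells."""
    return 1.0 - poisson_pmf(0, mean) - poisson_pmf(1, mean)


# Hypothetical measured density; the source does not report a value.
density_cells_per_nl = 0.5
volume = injection_volume_nl(density_cells_per_nl)
print(f"inject {volume:.1f} nl for an expected count of 1 cell")
print(f"P(0 cells)  = {poisson_pmf(0, 1.0):.3f}")
print(f"P(1 cell)   = {poisson_pmf(1, 1.0):.3f}")
print(f"P(>1 cells) = {prob_more_than_one(1.0):.3f}")
```

      Under this Poisson assumption, an expected count of one cell still leaves roughly a quarter of injections with two or more cells, which is why imaging immediately after injection is the relevant check.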

      The ability to migrate from the injection site should be documented by time-lapse microscopy.

      As stated above, our purpose here is not to study tumor formation from transplanted cell(s) but to use that assay as an operational test of cancer. Besides, as mentioned earlier, single cell tracking in larvae over 3-4dpt is not a trivial task.

      Then, the authors conclude that "By allowing for specific and reproducible single cell malignant transformation in vivo, their optogenetic approach opens the way for a quantitative study of the initial stages of cancer at the single cell level". However, the evidence for these claims is weak and further characterization should be performed to:

      (1) show that they are actually activating the oncogene in a single cell (the magnification is too low and it is difficult to distinguish a single nucleus, labelling of the cell membrane may help to demonstrate that they are effectively activating the oncogene in, or transplanting, a single cell)

      In a revised manuscript we will provide larger magnification of the initial induced cell and show examples of oncogene activation in more than one cell.

      (2) the expression of the genes used as markers of tumorigenesis is performed in whole larvae, with only a few transformed cells in them. Changes should be confirmed in FACS sorted fluorescent cells

      When the oncogene is activated in a whole larva all cells are fluorescent and thus FACS is of no use for cell sorting. Sorting could be done in larvae where single cells are activated, but then the efficiency of FACS is not good enough to isolate the few fluorescent cells among the many more non-fluorescent ones. We agree that the change in expression of the genes used as markers of tumorigenesis is an underestimate of their true change, but our goal at this time is not to precisely measure the change in expression level, but to show that the pattern of change is different from the controls and corresponds to what is expected in tumorigenesis.

      (3) the histology of the so called "tumor masses" is not showing malignant transformation, but at the most just hyperplasia.

      The histology of the hyperplastic tissues displays cellular proliferation with a higher density of nuclear material, which is characteristic of tumors (Fig.S4C). Besides, the increased expression of pERK in these tissues (Fig.S4A,B) is also a hallmark of cancer.

      In the brain, the sections are not perfectly symmetrical and the increase of cellularity on one side of the optic tectum is compatible with this asymmetry.

      The expected T-shape formed by the sections of the tegmentum and hypothalamus is compatible with the symmetric sections shown in Fig.2D. The asymmetry in the optic tectum is a result of the hyperplastic growth.

      (4) The number of fluorescent cells found dispersed in the larvae transplanted with one single TIC after 48 hours will require a very fast cell cycle to generate over 50 cells. Do we have an idea of the cell cycle features of the transplanted TICs?

      As answered above, the transplanted larvae are shown at 3dpt (and not 2dpt as misunderstood by this reviewer). With a cell cycle of about 10h, a single cell can give rise to about 100 cells in that time lapse.
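
      The arithmetic behind this estimate can be checked with a short sketch. It assumes idealized synchronous doubling with a fixed cycle length and no cell death or lag, which real tumor cells need not follow; the function names are illustrative only.

```python
import math


def cells_after(hours, cycle_h, start_cells=1):
    """Cell count after `hours` of uninterrupted doubling every `cycle_h` hours."""
    return start_cells * 2 ** (hours / cycle_h)


def max_cycle_for(target_cells, hours, start_cells=1):
    """Longest cycle (hours) that still reaches `target_cells` within `hours`."""
    return hours / math.log2(target_cells / start_cells)


# 3 days post transplantation (3dpt) versus the 48 h assumed by the reviewer.
print(f"72 h, 10 h cycle: {cells_after(72, 10):.0f} cells")
print(f"48 h, 10 h cycle: {cells_after(48, 10):.0f} cells")
print(f"cycle needed for 50 cells in 48 h: {max_cycle_for(50, 48):.1f} h")
```

      Under these assumptions a 10h cycle yields about 147 cells from one cell at 72h, the same order of magnitude as the quoted figure of about 100, but only about 28 cells at 48h; reaching 50 cells within 48h would require a cycle of roughly 8.5h or shorter.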

      Reviewer #2 (Public Review):

      Summary:

      This paper describes a genetically tractable and modifiable system …which could be used to study an array of combinations and temporal relationships of these cancer drivers/modifiers.

      We thank this referee for their positive comments. We would also like to point out that our approach provides the first quantitative means to estimate the probability of tumorigenesis from a single cell, an estimate which is crucial in any assessment of cancer malignancy and the effectiveness of prophylactics.

      Weaknesses:

      There is minimal quantitation of … the efficiency of activation of the Ras-TFP fusion (Fig 1) in, purportedly, a single cell. …, such information seems essential.

      In a revised manuscript we will add more images of induction of a single cell (or a few cells) and a table where the efficiency of RAS activation is detailed.

      The authors indicate that a single cell is "initiated" (Fig 2) using the laser optogenetic technique, but without definitive genetic lineage tracing, it is not possible to conclude that cells expressing TFP distant from the target site near the ear are daughter cells of the claimed single "initiated" cell. A plausible alternative explanation is 1) that the optogenetic targeting is more diffuse (i.e. some of the light of the appropriate wavelength hits other cells nearby due to reflection/diffraction), so these adjacent cells are additional independent "initiated" cells or 2) that the uncaged tamoxifen analogue can diffuse to nearby cells and allow for CreER activation and recombination.

      We have addressed this point in our general comments to the reviewers’ remarks. The possibilities mentioned by this reviewer would result in cells expressing TFP in absence of VentX activation, which is not the case. Cells expressing TFP away from the initial site are observed days after activation of the oncogene (and TFP) in a single cell and only upon activation of VentX.

      In Fig 2B, the claim is made that "the activated cell has divided, giving rise to two cells" - unless continuously imaged or genetically traced, this is unproven.

      We have addressed this remark previously. Tracking of larvae over many days is not possible with the usual protocol using tricaine to immobilize the larvae. Nonetheless, in a revised version we will present images of an embryo imaged at various times post activation where proliferation of the cells can be observed. We are pursuing other alternatives for time-lapse microscopy over many days since, besides convincing the sceptics, a single cell tracking experiment (possibly coupled with in-situ spatial transcriptomics) will shed a new and fascinating light on the initial stages of tumor growth.

      In addition, it appears that Figures S3 and S4 are showing that hyperplasia can arise in many different tissues (including intestine, pancreas, and liver, S4C) with broad Ras + Ventx activation …. This should be clarified in the manuscript).

      This is true and will be clarified in the new version.

      In Fig S7 where single cell activation and potential metastasis is discussed, similar gut tissues have TFP+ cells that are called metastatic, but this seems consistent with the possibility that multiple independent sites of initiation are occurring even when focal activation is attempted.

      As mentioned previously this is ruled out by the fact that these cells are observed days after cyclofen uncaging (and TFP activation) and if and only if VentX is activated.

      Although the hyperplastic cells are transplantable (Fig 4), the use of the term "cells of origin of cancer" or metastatic cells should be viewed with care in the experiments showing TFP+ cells (Fig 1, 2, 3) in embryos with targeted activation for the reasons noted above.

      The purpose of this transplantation experiment was to show that cells in which both kRas and VentX have been activated possess the capacity to metastasize and develop a tumor mass when transplanted in a naïve zebrafish. This - to the best of our knowledge - is the operational definition of a malignant tumor.

      Reviewer #3 (Public Review):

      Summary:

      This study employs an optogenetics approach … to examine tumourigenesis probabilities under altered tissue environments.

      We thank this reviewer for this remark, since we believe that the opportunity to assess the probability of tumorigenesis from a single cell is possibly the most significant contribution of this work. To the best of our knowledge this has never been done before.

      Weaknesses:

      Lack of Methodological Clarity: The manuscript lacks detailed descriptions of methodologies,

      In a revised manuscript we will include additional detail of our methodology.

      Sub-optimal Data Presentation and Quality:

      Lack of quantitative data and control condition data obtained from images of higher magnification limits the ability to robustly support the conclusions.

      In a revised version we will include more images at higher magnification and quantitative data to support the main report of targeted single cell induction.

      Here are some details:

      Authors might want to provide more evidence to support their claim on the single cell KRAS activation.

      More images and data on activation of single or few cells in the illumination field will be provided in a revised version.

      · Stability of cCYC: The manuscript does not provide information on the half-life and stability of cCYC. Understanding these properties is crucial for evaluating the system's reliability and the likelihood of leakiness, which could significantly influence the study's outcomes.

      We have been using the cCyc system for about 14 years. We refer the reader to our previous papers and reviews on this methodology (e.g. ref. 34,35). Briefly, cCyc is stable when not illuminated with light around 375nm. Typically, we incubate our embryos in the dark for about 1h before transferring them into E3 medium and illuminating them. Assessing the leakiness of the system is easy as expression of the fluorescent marker is permanently turned on. We have observed none in the conditions of our experiment.

      · Metastatic Dissemination claim: However, the absence of a supportive cellular compartment within the fin-fold tissue makes the presence of mTFP-positive metastatic cells there particularly puzzling. This distribution raises concerns about the spatial specificity of the optogenetic activation protocol … The unexpected locations of these signals suggest potential ectopic activation of the KRAS oncogene,

      We have addressed this remark in the introduction and above. Specifically, metastatic and proliferative mTFP-positive cells are observed if and only if VentX is also activated concomitant with activation of kRAS in a single cell. No proliferative cells are observed in absence of VentX activation, or in presence of VentX or Dex alone, or if kRAS has not been activated by cyclofen uncaging.

      · Image Resolution Concerns: The cells depicted in Figure 3C β, which appear to be near the surface of the yolk sac and not within the digestive system as suggested in the MS, underscore the necessity for higher-resolution imaging. Without clearer images, it is challenging to ascertain the exact locations and states of these cells, thus complicating the assessment of experimental results.

      Better images will be provided in the revised version.

      · The cell transplantation experiment is lacking protocol details:

      Details will be provided in the revised version. We have followed regular protocols for transplantation: S.Nicoli and M.Presta, Nat.Prot. 2,2918 (2007).

      • If the cells are obtained from whole larvae with induced RAS + VX expression, it is notable and somewhat surprising that the larvae survived up to six days post-induction (6dpi) before cells were harvested for transplantation. This survival rate and the subsequent ability to obtain single cell suspensions raise questions about the heterogeneity of the RAS + VX expressing cells that were transplanted.

      From Fig.S4D, about 50% of the embryos survive at 6dpi. Though an interesting question by itself we have not (yet) addressed the important issue of the heterogeneity of the outgrowth obtained from a single cell. Our purpose here was just to show that cells in which both kRAS and VentX have been activated possess the capacity to metastasize and develop a tumor mass when transplanted in a naïve zebrafish. This - to the best of our knowledge - is the operational definition of a malignant tumor.

      · Unclear Experimental Conditions in Figure S3B: …It is not specified whether the activation of KRAS was targeted to specific cells or involved whole-body exposure.

      This was whole body (global) illumination and will be specified in the revised version.

      · Contrasting Data in Figure S3C compared to literature: The graph in Figure S3C indicates that KRAS or KRAS + DEX induction did not result in any form of hyperplastic growth. Providing detailed descriptions of the conditions under which the experiments in Figure S3B were conducted and clarifying the reasons for the discrepancies observed in Figure S3C are crucial. The authors should discuss potential reasons for the deviation from previous reports.

      This discrepancy will be discussed in the revised version. First, the previous reports consider the development of tumors over a longer time-span (4-5 weeks) which we have not studied here. Second, the expression of the oncogene in these reports might be stronger than in ours. Third, the stochastic appearance of tumors in these reports suggests that some other mechanism (transient stress-induced reprogramming?) might have activated the oncogene in the initial cell.

      Further comments:

      Throughout the study, KRAS-activated cell expansion and metastasis are two key phenotypes discussed that Ventx is promoting. However, the authors did not perform any experiments to directly show that KRAS+ cells proliferate only in Ventx-activated conditions.

      Yes, we did. See Fig. S1 and compare with Fig.S3B, or Fig.S8A in comparison with Fig.2A,B.

      The authors also did not show any morphological features or time-lapse videos demonstrating that KRAS+ cells are motile, even though zebrafish is an excellent model for in vivo live imaging. This seems to be a missed opportunity for providing convincing evidence to support the authors' conclusions.

      Performing single cell time-lapse microscopy on larvae over many (4-5) days is not possible with the regular tricaine protocol for immobilization. We are definitely planning such experiments, but they will require some other protocol, perhaps using bungarotoxin or some optogenetic inhibitory channels. Nonetheless, in the revised version we will show images of the same embryos at various times post single cell induction displaying proliferation of cells.

      There were minimal experimental details provided for the qPCR data presented in the supplementary figures S5 and S6; therefore, it is hard to evaluate the results obtained.

      More details will be given in the revised version.

    1. eLife assessment

      In this study, Tutak and colleagues set out to identify factors that mediate Repeat Associated Non-AUG (RAN) translation of CGG repeats in the FMR1 mRNA which are implicated in toxic protein accumulation that underpins ensuing neurological pathologies. This is a useful article that suggests that RPS26 may be implicated in mediating the RAN translation of FMR1 mRNA. However, the evidence supporting the proposed mechanism is incomplete, since the provided data only partially support the authors' conclusion.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Tutak et al use a combination of pulldowns, analyzed by mass spectrometry, reporter assays, and fluorescence experiments to decipher the mechanism of protein translation in fragile X-related diseases. The topic is interesting and important.

      Although a role for Rps26-deficient ribosomes in toxic protein translation is plausible based on already available data, the authors' data are not carefully controlled and thus do not support the conclusions of the paper.

      Strengths:

      The topic is interesting and important.

      Weaknesses:

      In particular, there is very little data to support the notion that Rps26-deficient ribosomes are even produced under the circumstances, and no data that indicate that they are involved in the RAN translation. Essential controls (for ribosome numbers) are lacking, no information is presented on the viability of the cells (Rps26 is an essential protein), and the differences in protein levels could well arise from a block in protein synthesis and cell division, coupled to differential stability of the proteins.

      Specific points:

      (1) Analysis of the mass spec data in Supplemental Table S3 indicates that for many of the proteins that are differentially enriched in one sample, a single peptide is identified. So the difference is between 1 peptide and 0. I don't understand how one can do a statistical analysis on that, or how it would give out anything of significance. I certainly do not think it is significant. This is exacerbated by the fact that the contaminants in the assay (keratins) are many, many-fold more abundant, and so are proteins that are known to be mitochondrial or nuclear, and therefore likely not actual targets (e.g. MCCC1, PC, NPM1; this includes many proteins "of significance" in Table S1, including Rrp1B, NAF1, Top1, TCEPB, DHX16, etc...).

      The data in Table S6/Figure 3A suffer from the same problem.

      I am not convinced that the mass spec data is reliable.

      (2) The mass-spec data however claims to identify Rps26 as a factor binding the toxic RNA specifically. The rest of the paper seeks to develop a story of how Rps26-deficient ribosomes play a role in the translation of this RNA. I do not consider that this makes sense.

      (3) Rps26 is an essential gene, and I am sure the same is true for DHX15. What happens to cell viability? Protein synthesis? The yeast experiments were carefully carried out under conditions where Rps26 was reduced, not fully depleted, to give small growth defects.

      (4) Knockdown efficiency must be shown for all tested genes so that the results can be evaluated.

      (5) The data in Figure 1E have just one mock control, but two cell types (control si and Rps26 depletion).

      (6) The authors' data indicate that the effects are not specific to Rps26 but indeed also observed upon Rps25 knockdown. This suggests strongly that the effects are from reduced ribosome content or blocked protein synthesis. Additional controls should deplete a core RP to ascertain this conclusion.

      (7) Supplemental Figure S3 demonstrates that the depletion of S26 does not affect the selection of the start codon context. Any other claim must be deleted. All the 5'-UTR logos are essentially identical, indicating that "picking" happens by abundance (background).

      (8) Mechanism is lacking entirely. There are many ways in which ribosomes could have mRNA-specific effects. The authors tried to find an effect from the Kozak sequence, unsuccessfully (however, they also did not do the experiment correctly, as they failed to recognize that the Kozak sequence differs between yeast, where it is A-rich, and mammalian cells, where it is GGCGCC). Collisions could be another mechanism.

    3. Reviewer #2 (Public Review):

      Summary:

      Translation of CGG repeats leads to the accumulation of polyG, which is associated with neurological disorders. This is a valuable paper in which the authors sought out proteins that modulate RAN translation. They determined which proteins in HeLa cells bound to CGG repeats and affected levels of polyG encoded in the 5'UTR of the FMR1 mRNA. They then showed that siRNA depletion of ribosomal protein RPS26 results in less production of FMR1polyG than in control. There are data supporting the claim that RPS26 depletion modulates RAN translation in this RNA, although for some results, the Western results are not strong. The data to support increased aggregation by polyG expression upon S26 KD are incomplete.

      Strengths:

      The authors have proteomics data that show the enrichment of a set of proteins on FMR1 RNA but not a related RNA.

      Weaknesses:

      -It is insinuated that RPS26 binds the RNA to enhance CGG-containing protein expression. However, RPS26 reduction was also shown previously to affect ribosome levels, and reduced ribosome levels can result in ribosomes translating very different RNA pools.

      -A significant claim is that RPS26 KD alleviates the effects of FMR polyG expression, but those data aren't presented well.

    4. Reviewer #3 (Public Review):

      Tutak et al provide interesting data showing that RPS26 and relevant proteins such as TSR2 and RPS25 affect RAN translation from CGG repeat RNA in fragile X-associated conditions. They identified RPS26 as a potential regulator of RAN translation by RNA-tagging system and mass spectrometry-based screening for proteins binding to CGG repeat RNA and confirmed its regulatory effects on RAN translation by siRNA-based knockdown experiments in multiple cellular disease models and patient-derived fibroblasts. Quantitative mass spectrometry analysis found that the expressions of some ribosomal proteins are sensitive to RPS26 depletion while approximately 80% of proteins including FMRP were not influenced. Since the roles of ribosomal proteins in RAN translation regulation have not been fully examined, this study provides novel insights into this research field. However, some data presented in this manuscript are limited and preliminary, and their conclusions are not fully supported.

      (1) While the authors emphasized the importance of ribosomal composition for RAN translation regulation in the title and the article body, the association between RAN translation and ribosomal composition is apparently not evaluated in this work. They found that specific ribosomal proteins (RPS26 and RPS25) can have regulatory effects on RAN translation (Figures 1C, 2B, 2C, 2E, 4A, 5A, and 5B), and that the expression levels of some ribosomal proteins can be changed by RPS26 knockdown (Figure 3B); however, the change in the composition of the ribosomes involved in the actual translation has not been elucidated. Therefore, their conclusive statement, that is, "ribosome composition affects RAN translation", is not fully supported by the presented data and is misleading.

      (2) The study provides insufficient data on the mechanisms of how RPS26 regulates RAN translation. Although authors speculate that RPS26 may affect initiation codon fidelity and regulate RAN translation in a CGG repeat sequence-independent manner (Page 9 and Page 11), what they really have shown is just identification of this protein by the screening for proteins binding to CGG repeat RNA (Figure 1A, 1B), and effects of this protein on CGG repeat-RAN translation. It is essential to clarify whether the regulatory effect of RPS26 on RAN translation is dependent on CGG repeat sequence or near-cognate initiation codons like ACG and GUG in the 5' upstream sequence of the repeat. It would be better to validate the effects of RPS26 on translation from control constructs, such as one composed of the 5' upstream sequence of FMR1 with no CGG repeat, and one with an ATG substitution in the 5' upstream sequence of FMR1 instead of near-cognate initiation codons.

      (3) The regulatory effects of RPS26 and other molecules on RAN translation have all been investigated as effects on the expression levels of FMRpolyG-GFP proteins in cellular models expressing CGG repeat sequences (Figures 1C, 2B, 2C, 2E, 4A, 5A, and 5B). In these cellular experiments, there are multiple confounding factors affecting the expression levels of FMRpolyG-GFP proteins other than RAN translation, including template RNA expression, template RNA distribution, and FMRpolyG-GFP protein degradation. Although authors evaluated the effect on the expression levels of template CGG repeat RNA, it would be better to confirm the effect of these regulators on RAN translation by other experiments such as in vitro translation assay that can directly evaluate RAN translation.

      (4) While the authors state that RPS26 modulated the FMRpolyG-mediated toxicity, they presented limited data on apoptotic markers, not cellular viability (Figure 1E), not fully supporting this conclusion. Since previous work showed that FMRpolyG protein reduces cellular viability (Hoem G et al., Front Genet 2019), additional evaluations for cellular viability would strengthen this conclusion.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Tutak et al use a combination of pulldowns, analyzed by mass spectrometry, reporter assays, and fluorescence experiments to decipher the mechanism of protein translation in fragile X-related diseases. The topic is interesting and important.

      Although a role for Rps26-deficient ribosomes in toxic protein translation is plausible based on already available data, the authors' data are not carefully controlled and thus do not support the conclusions of the paper.

      Strengths:

      The topic is interesting and important.

      Weaknesses:

      In particular, there is very little data to support the notion that Rps26-deficient ribosomes are even produced under the circumstances, and no data that indicate that they are involved in the RAN translation. Essential controls (for ribosome numbers) are lacking, no information is presented on the viability of the cells (Rps26 is an essential protein), and the differences in protein levels could well arise from a block in protein synthesis and cell division, coupled to differential stability of the proteins.

      We agree that the presented data could benefit from the addition of the suggested experiments. We will address the ribosome content, global translation rate and cell viability upon RPS26 depletion. We are also planning to apply polysome profiling to determine if RPS26-depleted ribosomes are translationally active.

      Specific points:

      (1) Analysis of the mass spec data in Supplemental Table S3 indicates that for many of the proteins that are differentially enriched in one sample, a single peptide is identified. So the difference is between 1 peptide and 0. I don't understand how one can do a statistical analysis on that, or how it would give out anything of significance. I certainly do not think it is significant. This is exacerbated by the fact that the contaminants in the assay (keratins) are many, many-fold more abundant, and so are proteins that are known to be mitochondrial or nuclear, and therefore likely not actual targets (e.g. MCCC1, PC, NPM1; this includes many proteins "of significance" in Table S1, including Rrp1B, NAF1, Top1, TCEPB, DHX16, etc...).

      The data in Table S6/Figure 3A suffer from the same problem.

      Tables S3 and S6 show the mass spectrometry output data from MaxQuant analysis without any filtering. Certain identifications, i.e. those denoted as contaminants (such as keratins), were removed during statistical analysis in Perseus software. Regarding the data presented in Table S6 (SILAC data), we argue that these data are of very good quality. More than 2000 proteins were identified in a 125 min gradient, and over 80% of them were identified with at least 2 unique peptides. However, we acknowledge that the description of Tables S3 and S6 may lead to misunderstanding, thus we will clarify their explanation.

      I am not convinced that the mass spec data is reliable.

      (2) The mass-spec data however claims to identify Rps26 as a factor binding the toxic RNA specifically. The rest of the paper seeks to develop a story of how Rps26-deficient ribosomes play a role in the translation of this RNA. I do not consider that this makes sense.

      Indeed, we identified RPS26 as a protein co-precipitated with FMR1 RNA containing expanded CGG repeats. However, we do not claim that they interact directly. Downregulation of FMRpolyG biosynthesis could be an outcome of the alteration of ribosomal assembly, changes in efficiency and fidelity of PIC scanning, impeded elongation or, more likely, a combination of some of these processes. We will provide a better explanation regarding those issues in the revised version of the manuscript.

      (3) Rps26 is an essential gene, I am sure the same is true for DHX15. What happens to cell viability? Protein synthesis? The yeast experiments were carefully carried out under experiments where Rps26 was reduced, not fully depleted to give small growth defects.

      We agree with Reviewer 1 that RPS26 is an essential protein. Previously, it was shown that cell viability is decreased in cells carrying a C-terminal deletion mutant of RPS26 (Havkin-Solomon T, Nucleic Acids Res 2023). We will address the question regarding the suppression of FMRpolyG in models with partial RPS26 knock-down.

      (4) Knockdown efficiency for all tested genes must be shown to evaluate knockdown efficiency.

      Missing experiments showing efficiency of knock-down will be included in the revised version of the manuscript.

      (5) The data in Figure 1E have just one mock control, but two cell types (control si and Rps26 depletion).

      We will clarify this ambiguity in the revised version of the manuscript.

      (6) The authors' data indicate that the effects are not specific to Rps26 but indeed also observed upon Rps25 knockdown. This suggests strongly that the effects are from reduced ribosome content or blocked protein synthesis. Additional controls should deplete a core RP to ascertain this conclusion.

      We agree that the observed effect may stem partially from reduced ribosome content; however, we argue that this is not the only explanation. In the publication concerning RPS25 regulation of G4C2-related RAN translation (Yamada SB, 2019, Nat Neurosci), it was shown that RPS25 KO does not affect global translation. Our experiments (SUnSET assay, unpublished) indicated that RPS26 KD also did not reduce the global translation rate significantly. We will present these data in the revised version of the manuscript.

      (7) Supplemental Figure S3 demonstrates that the depletion of S26 does not affect the selection of the start codon context. Any other claim must be deleted. All the 5'-UTR logos are essentially identical, indicating that "picking" happens by abundance (background).

      The results shown in Fig. S3 do not imply that RPS26 has no effect whatsoever on the selection of start codon context. We tested only a few hypotheses. We decided to test the -4 position, because this position was indicated as the most sensitive to RPS26 regulation in yeast (Ferretti M, 2017, Nat Struct Mol Biol). Regarding the WebLOGO analysis, we wrote in the manuscript that we did not identify any specific motif or enrichment within the analysed transcripts in comparison to background. We will clarify this ambiguity in the revised version of the manuscript.

      (8) Mechanism is lacking entirely. There are many ways in which ribosomes could have mRNA-specific effects. The authors tried to find an effect from the Kozak sequence, unsuccessfully (however, they also did not do the experiment correctly, as they failed to recognize that the Kozak sequence differs between yeast, where it is A-rich, and mammalian cells, where it is GGCGCC). Collisions could be another mechanism.

      As in (7).

      Reviewer #2 (Public Review):

      Summary:

      Translation of CGG repeats leads to the accumulation of polyG, which is associated with neurological disorders. This is a valuable paper in which the authors sought out proteins that modulate RAN translation. They determined which proteins in HeLa cells bound to CGG repeats and affected levels of polyG encoded in the 5'UTR of the FMR1 mRNA. They then showed that siRNA depletion of ribosomal protein RPS26 results in less production of FMR1polyG than in control. There are data supporting the claim that RPS26 depletion modulates RAN translation in this RNA, although for some results, the Western results are not strong. The data to support increased aggregation by polyG expression upon S26 KD are incomplete.

      Strengths:

      The authors have proteomics data that show the enrichment of a set of proteins on FMR1 RNA but not a related RNA.

      Weaknesses:

      - It is insinuated that RPS26 binds the RNA to enhance CGG-containing protein expression. However, RPS26 reduction was also shown previously to affect ribosome levels, and reduced ribosome levels can result in ribosomes translating very different RNA pools.

      We agree that the presented data could benefit from the addition of some experiments. Therefore, we will address questions regarding the ribosome content, global translation rate and cell viability upon RPS26 depletion. We are also planning to apply polysome profiling to determine whether RPS26-depleted ribosomes are translationally active. However, we did not state that RPS26 binds directly to RNA with expanded CGG repeats or that this interaction is crucial for translational regulation of the studied RNA. We only tested such hypotheses. We will improve the narrative in the revised version of the manuscript to make the major conclusions clearer.

      - A significant claim is that RPS26 KD alleviates the effects of FMRpolyG expression, but those data aren't presented well.

      We thank Reviewer 2 for this comment. We will show the data derived from a few different cell models that we have already obtained. Moreover, we will include results of experiments with luminescence readout for FMRpolyG fused with luciferase upon RPS26 KD.

      Reviewer #3 (Public Review):

      Tutak et al provide interesting data showing that RPS26 and relevant proteins such as TSR2 and RPS25 affect RAN translation from CGG repeat RNA in fragile X-associated conditions. They identified RPS26 as a potential regulator of RAN translation by RNA-tagging system and mass spectrometry-based screening for proteins binding to CGG repeat RNA and confirmed its regulatory effects on RAN translation by siRNA-based knockdown experiments in multiple cellular disease models and patient-derived fibroblasts. Quantitative mass spectrometry analysis found that the expressions of some ribosomal proteins are sensitive to RPS26 depletion while approximately 80% of proteins including FMRP were not influenced. Since the roles of ribosomal proteins in RAN translation regulation have not been fully examined, this study provides novel insights into this research field. However, some data presented in this manuscript are limited and preliminary, and their conclusions are not fully supported.

      (1) While the authors emphasized the importance of ribosomal composition for RAN translation regulation in the title and the article body, the association between RAN translation and ribosomal composition is apparently not evaluated in this work. They found that specific ribosomal proteins (RPS26 and RPS25) can have regulatory effects on RAN translation (Figures 1C, 2B, 2C, 2E, 4A, 5A, and 5B), and that the expression levels of some ribosomal proteins can be changed by RPS26 knockdown (Figure 3B); however, the change in the composition of the ribosomes involved in actual translation has not been elucidated. Therefore, their conclusive statement, that is, "ribosome composition affects RAN translation", is not fully supported by the presented data and is misleading.

      We thank Reviewer 3 for critical comments and suggestions. We agree that the proposed title may be misleading and the presented data does not fully support the aforementioned statement regarding ribosomal composition affecting FMRpolyG synthesis. Hence, we will change the title together with a narrative regarding these unfortunate statements that go beyond the presented results.

      (2) The study provides insufficient data on the mechanisms of how RPS26 regulates RAN translation. Although authors speculate that RPS26 may affect initiation codon fidelity and regulate RAN translation in a CGG repeat sequence-independent manner (Page 9 and Page 11), what they really have shown is just identification of this protein by the screening for proteins binding to CGG repeat RNA (Figure 1A, 1B), and effects of this protein on CGG repeat-RAN translation. It is essential to clarify whether the regulatory effect of RPS26 on RAN translation is dependent on CGG repeat sequence or near-cognate initiation codons like ACG and GUG in the 5' upstream sequence of the repeat. It would be better to validate the effects of RPS26 on translation from control constructs, such as one composed of the 5' upstream sequence of FMR1 with no CGG repeat, and one with an ATG substitution in the 5' upstream sequence of FMR1 instead of near-cognate initiation codons.

      We will address the question regarding the influence of the content of CGG repeats and START codon selection (including different near-cognate start codons) on RPS26-sensitive translation, and present these data in the revised version of the manuscript.

      (3) The regulatory effects of RPS26 and other molecules on RAN translation have all been investigated as effects on the expression levels of FMRpolyG-GFP proteins in cellular models expressing CGG repeat sequences (Figures 1C, 2B, 2C, 2E, 4A, 5A, and 5B). In these cellular experiments, there are multiple confounding factors affecting the expression levels of FMRpolyG-GFP proteins other than RAN translation, including template RNA expression, template RNA distribution, and FMRpolyG-GFP protein degradation. Although authors evaluated the effect on the expression levels of template CGG repeat RNA, it would be better to confirm the effect of these regulators on RAN translation by other experiments such as in vitro translation assay that can directly evaluate RAN translation.

      We agree that there are multiple factors affecting final translation of investigated mRNA including aforementioned processes. We evaluated the level of FMR1 mRNA, which turned out not to be affected upon RPS26 depletion (Figure 2B&C), however, we will address other possibilities as well.

      (4) While the authors state that RPS26 modulated the FMRpolyG-mediated toxicity, they presented limited data on apoptotic markers, not cellular viability (Figure 1E), not fully supporting this conclusion. Since previous work showed that FMRpolyG protein reduces cellular viability (Hoem G et al., Front Genet 2019), additional evaluations for cellular viability would strengthen this conclusion.

      We thank Reviewer 3 for this suggestion. We addressed the effect of RPS26 KD on apoptotic process induced by FMRpolyG. We will perform other experiments regarding different aspects of FMRpolyG-mediated cell toxicity as well.

    1. eLife assessment

      This fundamental work has completed our understanding of the singular binding profile of the Rhino HP1 protein to chromatin, a key step in converting certain genomic regions into piRNA source loci. The evidence supporting the conclusions is compelling. Phylogenetic analyses, structure prediction, rigorous biochemical assays and in vivo genetics emphasize the importance of the Rhino chromodomain in the recognition of both a histone mark and a DNA-binding protein, and highlight the importance of a single chromodomain residue in the protein-protein interaction.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript focuses on an unexpected finding that a tiny change in a protein's amino acid sequence can redefine its biological function. The authors' data and analyses explain how a chromodomain, typically implicated in interactions with histones, can also mediate binding of HP1 homolog Rhino to the non-histone partner protein Kipferl. They elegantly pinpoint the capacity for such interaction to a single amino acid substitution (in fact, a single-nucleotide substitution!).

      Strengths:

      Both genetic and biochemical approaches are applied to rigorously probe the proposed explanation. The authors find their predictions to be borne out both in vivo, in mutant animals, and in biochemical experiments. The manuscript also features phylogenetic comparisons that put the finding into a broader evolutionary perspective.

      Weaknesses pointed out in the original submission were addressed in the revised manuscript.

    3. Reviewer #3 (Public Review):

      Summary:

      This article is a direct follow-up to the paper published last year in eLife by the same group. In the previous article, the authors discovered a zinc finger protein, Kipferl, capable of guiding the HP1 protein Rhino towards certain genomic regions enriched in GRGGN motifs and packaged in heterochromatin marked by H3K9me3. Unlike other HP1 proteins, Rhino recruitment activates the transcription of heterochromatic regions, which are then converted into piRNA source loci. The molecular mechanism by which Kipferl interacts specifically with Rhino (via its chromodomain) and not with other HP1 proteins remained enigmatic.

      In this latest article, the authors go a step further by elucidating the molecular mechanisms important for the specific interaction of Rhino and not other HP1 proteins with Kipferl. A phylogenetic study carried out between the HP1 proteins of 5 Drosophila species led them to study the importance of an AA Glycine at position 31 located in the Rhino chromodomain, an AA different from the AA (aspartic acid) found at the same position in the other HP1 proteins. The authors then demonstrate, through a series of structure predictions, biochemical and genetic experiments, that this specific AA in the Rhino-specific chromodomain explains the difference in the chromatin binding pattern between Rhino and the other Drosophila HP1 proteins. Importantly, the G31D conversion of the Rhino protein prevents interaction between Rhino and Kipferl, phenocopying a Kipferl mutant.

      Strengths:

      The strength of this study is to test at the molecular and genetic level whether the difference in the AA sequence, uncovered by phylogenetic analysis of HP1 proteins including Rhino combined with structure prediction, can explain the difference in chromatin binding patterns between HP1 proteins and Rhino. To do so, they have created a Rhino mutant by introducing a point mutation into the endogenous rhino gene, reverting the Glycine in position 31 to the aspartic acid found in all other HP1 proteins. Even if the Rhino G31D mutant retains its ability to interact with H3K9me3 (predictive and biochemistry approaches that I'm less familiar with), it does not localize correctly on the chromatin, preventing certain regions such as locus 80F from being converted into piRNA source loci. However, other regions such as satellite regions attract the Rhino mutant protein, converting them into super piRNA source loci, phenocopying the effects observed in a Kipferl mutant. Why Rhino when not bound to Kipferl concentrates in satellite regions is a question that remains unanswered.

      Weaknesses:

      In this new version of the manuscript, the authors have answered all the questions and weaknesses raised previously.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Review:

      This article is a direct follow-up to the paper published last year in eLife by the same group. In the previous article, the authors discovered a zinc finger protein, Kipferl, capable of guiding the HP1 protein Rhino towards certain genomic regions enriched in GRGGN motifs and packaged in heterochromatin marked by H3K9me3. Unlike other HP1 proteins, Rhino recruitment activates the transcription of heterochromatic regions, which are then converted into piRNA source loci. The molecular mechanism by which Kipferl interacts specifically with Rhino (via its chromodomain) and not with other HP1 proteins remained enigmatic. 

      In this latest article, the authors go a step further by elucidating the molecular mechanisms important for the specific interaction of Rhino and not other HP1 proteins with Kipferl. A phylogenetic study carried out between the HP1 proteins of 5 Drosophila species led them to study the importance of an AA Glycine at position 31 located in the Rhino chromodomain, an AA different from the AA (aspartic acid) found at the same position in the other HP1 proteins. The authors then demonstrate, through a series of structure predictions, biochemical, and genetic experiments, that this specific AA in the Rhino-specific chromodomain explains the difference in the chromatin binding pattern between Rhino and the other Drosophila HP1 proteins. Importantly, the G31D conversion of the Rhino protein prevents interaction between Rhino and Kipferl, phenocopying a Kipferl mutant. 

      Strengths: 

      The authors' effective use of phylogenetic analyses and protein structure predictions to identify a substitution in the chromodomain that allows Rhino's specific interaction with Kipferl is very elegant. Both genetic and biochemical approaches are applied to rigorously probe the proposed explanation. They used a point mutation in the endogenous locus that replaces the Rhino-specific residue with the aspartic acid residue present in all other HP1 family members. This novel allele largely phenocopies the defects in hatch rate, chromatin organization, and piRNA production associated with kipferl mutants, and does not support Kipferl localization to clusters. The data are of high quality, the presentation is clear and concise, and the conclusions are generally well-supported.

      Weaknesses: 

      The reviewers identified potential ways to further strengthen the manuscript.

      (1) The one significant omission is RNAseq on the rhino point mutant, which would allow direct comparison to cluster, transposon, and repeat expression in kipferl mutants. 

      In this eLife Advances submission, we aim to elucidate the molecular interaction between Rhino and the zinc finger protein Kipferl and how it evolved. Using various assays, of which piRNA sequencing is the most relevant and comprehensive, we show that the rhino[G31D] mutation phenocopies a rhino loss-of-function situation for Kipferl and a kipferl loss-of-function situation for Rhino. Further confirmation of this statement by additional RNA-seq experiments to probe the extent of selective TE de-repression would indeed be a possibility. We decided to test for TE de-repression phenotypes using sensitive RNA-FISH experiments of a handful of TEs that are deregulated in kipferl loss-of-function flies (Baumgartner et al. 2022). This showed that the same TEs are also deregulated in rhino[G31D] flies, further confirming the similarity of the two genotypes. We have added these data to the text and to Figure 5-figure supplement 2, which shows representative RNA-FISH images.

      (2) The manuscript would benefit from adding more evolutionary comparisons. The following or similar analyses would help put the finding into a broader evolutionary perspective:

      i) Is Kipferl's surface interacting with Rhino also conserved in Kipferl orthologs? In other words, are the Rhino-interacting amino acids of Kipferl under any pressure to be conserved?

      We performed an analysis of the Kipferl interface that interacts with the Rhino chromodomain in those species where Kipferl could be unambiguously identified. This showed that the residues involved in the Rhino interaction are generally conserved. We have added this analysis to Figure 1-figure supplement 4.

      ii) The remarkable conservation of Rhino's G31 is at odds with the arms race that is proposed to be happening between the fly's piRNA pathway proteins and transposons. Does this mean that Rhino's chromodomain is "untouchable" for such positive selection? 

      We agree that the conservation of the G31 residue argues against this binding interface being under positive selection in Rhino. Without understanding the pressures acting on Rhino that underlie the previously published positive selection, we find it difficult to draw firm conclusions. Mutating G31 in fly species that lack Kipferl would be an interesting experiment.

      Recommendations for the authors:

      (1) RNAseq is important to the full characterization of the phenotype and should be included. It's now clear that the major piRNA clusters are not required for fertility, so I would also include an analysis of piRNA production and Rhino binding to regions flanking isolated insertions. 

      See our response to raised weakness #1 above. Briefly, we have now added an analysis of TE de-repression based on RNA-FISH experiments (Figure 5-figure supplement 2). Regarding the proposed analysis of piRNA production and Rhino binding to regions flanking isolated TE insertions: this is an important issue that we carefully analysed in our previous work characterising the kipferl mutant (Baumgartner et al. 2022). In the present work, we focused on generating a rhino mutant that uncouples Rhino from Kipferl.

      (2) The authors do not provide direct biochemical evidence that the chromodomain substitution blocks Rhino binding to Kipferl. However, Rhino protein is very low abundance, making analysis of the endogenous protein very difficult.

      Based on our previous work (Baumgartner et al. 2022), the Rhino chromodomain interacts directly with the fourth zinc finger of Kipferl. Mutation of a single residue in the predicted interface (Rhino[G31D]) phenocopies a kipferl mutant, strongly suggesting that this mutation disrupts the Rhino-Kipferl interaction. Definitive evidence will have to await the reconstitution of this interaction using recombinant proteins. Our attempts to purify recombinant Kipferl (expressed in bacteria or in insect cells) or the protein fragments relevant to the interaction have been unsuccessful so far. While we obtained soluble fractions of the first ZnF array, there was always a high level of co-purifying nucleic acids that we were not able to remove.

      (3) Even if the Rhino G31D mutant retains its ability to interact with H3K9me3 it does not localize correctly on the chromatin preventing certain regions such as locus 80F from being converted into piRNA source loci. However other regions such as satellite regions attract the Rhino mutant protein converting them into super piRNA source loci, phenocopying the effects observed in a Kipferl mutant. Why Rhino when not bound to Kipferl concentrates in satellite regions is a question that remains unanswered.

      This is a very interesting question indeed. We have not been able to elucidate the molecular basis of how Rhino is recruited to satellite repeats in Kipferl mutants. For example, we performed a proximity biotinylation experiment with GFP-Rhino in Kipferl mutant ovaries, but this experiment did not reveal any protein that would explain the observed accumulation of Rhino at the complex satellite repeats.

      (4) In the phylogenetic analysis the authors identified two residues as Rhino-specific and conserved sequence alterations, the D31G mutation and the G62 insertion. However, the authors limit their study to the D31G mutation, and nothing is performed on the G62 insertion. It would have been interesting to know the impact of this insertion on Rhino's biology.

      The role, if any, of the Rhino-specific G62 insertion and its effect on Rhino localisation or function is an interesting topic for further study. We have not investigated the G62 residue experimentally. In the current manuscript, we limited our efforts to the analysis of the G31D mutation, as the goal was to identify the mode of interaction with Kipferl, and the G62 residue is not predicted to contact Kipferl according to AlphaFold.

      (5) The authors report that the G31D mutation of Rhino phenocopies the Kipferl mutant. Rhino is wrongly localized in the nucleus, and Rhino G31D recruitment in certain Kipferl-enriched regions is affected, as at the 80F locus, which correlates with a strong drop in piRNA production from this locus. To go a step further in demonstrating that G31D phenocopies the Kipferl mutant, it would have been informative to analyse how much TE piRNAs are affected and whether TEs are deregulated.

      See our response to similar comments above. We have added RNA-FISH experiments to illustrate that the TE de-repression phenotypes are comparable between rhino[G31D] and kipferl loss of function ovaries (Figure 5-figure supplement 2). Analyses of TE-mapping piRNAs also show well correlated phenotypes (Figure 5-figure supplement 1).

      (6) Figure 3A: To homogenize with the immunostaining presented in Figure 3B, can the authors add on the bar graph depicting female fertility the results obtained with kipferl-/- and rhino-/- genotype? 

      rhino mutants are completely (100%) sterile and the fertility of kipferl mutants was previously measured to range between 15% and 40% (Baumgartner et al. 2022).

      (7) Figure 4A: It would have been interesting to show Venn diagrams showing the overlap of genomic regions enriched for Kipferl versus regions enriched for Rhi in a WT and in a Rhi G31D mutant. 

      We consider the analysis presented in Figure 4 to be more meaningful, as a Venn diagram would require binary cut-offs.

      (8) Figure 1B: In the phylogenetic analysis for Rhino/HP1d two D. simulans lines are presented. Can the authors clarify this point?

      There are two Rhino paralogs in D. simulans: one paralog (NCBI: AAY34025.1) is more similar to D. melanogaster Rhino, contains one intron and is located on chr2R (assembly Apr. 2005, WUGSC mosaic 1.0/droSim1: 12256895-12258668). The second paralog (XP_002106478.1) is located on chromosome X (6734493-6735248) and does not contain an intron. We have added a clarifying statement to the corresponding figure legend.

      (9) To determine whether Rhino G31D point mutation affects the overall function of Rhino, the authors analysed Kipferl-independent piRNA source loci by looking at Responder and 1,688 family satellites. I'm not sure that these loci can be classified as Kipferl-independent piRNA source loci since a strong increase of piRNA production from these loci in Kipferl mutant is observed. In my point of view, the 42AB and 38C are real Kipferl-independent piRNA source loci as piRNA production from these loci is not affected by Kipferl KD. 

      Indeed, the Rsp and 1,688 family satellites are not completely independent of Kipferl, as their expression and Rhino occupancy differ between wild-type and kipferl loss-of-function phenotypes (including rhino[G31D]). However, we believe that this increase is due to a strong dependence on different sequestration mechanisms and is not mediated by a direct function of Kipferl at these sites. Similarly, we observe slight differences in piRNA production for the peripheral parts of cluster 42AB, as well as differences in Rhino occupancy despite an unaltered piRNA profile at cluster 38C (Baumgartner et al. 2022). Thus, different flavours of Kipferl-independence exist, with the only truly Kipferl-independent piRNA sources likely to be the piRNA clusters in the testis. A clear classification is further complicated by previously observed compensatory effects in the piRNA pathway, leading us to adopt the current definition of "requiring Kipferl for Rhino recruitment" to distinguish Kipferl-dependent from Kipferl-independent sites.

      (10) The authors report that the G31D mutation of Rhino phenocopies the Kipferl mutant. Rhino is wrongly localized in the nucleus, and Rhino G31D recruitment in certain Kipferl-enriched regions is affected, as at 80F locus, which correlates with a strong drop in piRNA production from this locus. To go a step further in demonstrating that G31D phenocopies the Kipferl mutant, it would have been interesting to look at how much TE piRNAs are affected and whether TEs (and which class of TE) are deregulated by RNAseq and/or in situ hybridization. 

      See our response to similar comments above. Our new RNA-FISH experiments and TE-mapping piRNA analysis extend the comparison of phenotypes between kipferl mutants and rhino[G31D] mutants and are consistent with our previous conclusions (Figure 5-figure supplements 1 and 2).

    1. eLife assessment

      Schafer et al. investigate the extremely interesting and important claim that the human hippocampus represents the interactions with multiple social interaction partners on two relatively abstract social dimensions – and that this ability correlates with the social network size of the participant. This research potentially demonstrates the intricate role of the hippocampus in navigating our social world. While some results are tantalizing, the empirical evidence for the main claims is currently incomplete and requires clarifications and substantial revisions.

    2. Reviewer #1 (Public Review):

      Summary:

      Schafer et al. tested whether the hippocampus tracks social interactions as sequences of neural states within an abstract social space defined by dimensions of affiliation and power, using a task in which participants engaged in narrative-based social interactions. The findings of this study revealed that individual social relationships are represented by unique sequences of hippocampal activity patterns. These neural trajectories corresponded to the history of trial-to-trial affiliation and power dynamics between participants and each character, suggesting an extended role of the hippocampus in encoding sequences of events beyond spatial relationships.

      The current version provides limited detail on the decoding and clustering analyses, which can be improved in a future revision.

      Strengths:

      (1) Robust Analysis: The research combined representational similarity analysis with manifold analyses, enhancing the robustness of the findings and the interpretation of the hippocampus's role in social cognition.

      (2) Replicability: The study included two independent samples, which strengthens the generalizability and reliability of the results.

      Weaknesses:

      I appreciate the authors for utilizing contemporary machine-learning techniques to analyze neuroimaging data and examine the intricacies of human cognition. However, the manuscript would benefit from a more detailed explanation of the rationale behind the selection of each method and a thorough description of the validation procedures. Such clarifications are essential to understand the true impact of the research. Moreover, refining these areas will broaden the manuscript's accessibility to a diverse audience.

    3. Reviewer #2 (Public Review):

      Summary:

      Using an innovative task design and analysis approach, the authors set out to show that the activity patterns in the hippocampus related to the development of social relationships with multiple partners in a virtual game. While I found the paper highly interesting (and would be thrilled if the claims made in the paper turned out to be true), I found many of the analyses presented either unconvincing or slightly unconnected to the claims that they were supposed to support. I very much hope the authors can alleviate these concerns in a revision of the paper.

      Strengths & Weaknesses:

      (1) The innovative task design and analyses, and the two independent samples of participants are clear strengths of the paper.

      (2) The RSA analysis is not what I expected after I read the abstract and title of the result section "The hippocampus represents abstract dimensions of affiliation and power". To me, the title suggests that the hippocampus has voxel patterns, which could be read out by a downstream area to infer the affiliation and power value, independent of the exact identity of the character in the current trial. The presented RSA analysis however presents something entirely different - namely that the affiliation trials and power trials elicit different activity patterns in the area indicated in Figure 3. What is the meaning of this analysis? It is not clear to me what is being "decoded" here and alternative explanations have not been considered. How do affiliation and power trials differ in terms of the length of sentences, complexity of the statements, and reaction time? Can the subsequent decision be decoded from these areas? I hope in the revision the authors can test these ideas - and also explain how the current RSA analysis relates to a representation of the "dimensions of affiliation and power".

      (3) Overall, I found that the paper was missing some more fundamental and simpler RSA analyses that would provide a necessary backdrop for the more complicated analyses that followed. Can you decode character identity from the regions in question? If you trained a simple decoder for power and affiliation values (using the LLE, but without consideration of the sequential position as used in the spline analysis), could you predict left-out trials? Are affiliation and power represented in a way that is consistent across participants - i.e. could you train a model that predicts affiliation and power from N-1 subjects and then predict the Nth subject? Even if the answer to these questions is "no", I believe that they are important to report for the reader to get a full understanding of the nature of the neural representations in these areas. If the claim is that the hippocampus represents an "abstract" relationship space, then I think it is important to show that these representations hold across relationships. Otherwise, the claim needs to be adjusted to say that it is a representation of a relationship-specific trajectory, but not an abstract social space.
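      The cross-participant test proposed in point (3), training on N-1 subjects and predicting the Nth, can be sketched as a leave-one-subject-out split. This is only an illustration of the splitting logic: the subject labels are hypothetical, and the decoder that would be fit on each training set is not specified in the review.

```python
from typing import Iterator, List, Tuple


def leave_one_subject_out(subject_ids: List[str]) -> Iterator[Tuple[List[str], str]]:
    """Yield (training_subjects, held_out_subject), holding out each subject once."""
    for held_out in subject_ids:
        training = [s for s in subject_ids if s != held_out]
        yield training, held_out


# Hypothetical subject labels, for illustration only.
splits = list(leave_one_subject_out(["s01", "s02", "s03"]))
```

      Each split keeps the held-out subject entirely out of training, so any above-chance prediction would have to rest on structure shared across participants.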

      (4) To determine that the location of a specific character can be decoded from the hippocampal activity patterns, the authors use a sequential analysis in a low-dimensional space (using local linear embedding). In essence, each trial is decoded by finding the pair of two temporally sequential trials that is closest to this pattern, and then interpolating the power/affiliation values linearly between these two points. The obvious problem with this analysis is that fMRI patterns will have temporal autocorrelation and the power and affiliation values have temporal autocorrelation. Successful decoding could just reflect this smoothness in both time series. The authors present a series of control analyses, but I found most of them to not be incisive or convincing and I believe that they (and their explanation of their rationale) need to be improved. For example, the circular shifting of the patterns preserves some of the autocorrelation of the time series - but not entirely. In the shifted patterns, the first and last items are considered to be neighboring and used in the evaluation, which alone could explain the poor performance. The simplest way that I can see is to also connect the first and last item in a circular fashion, even when evaluating the veridical ordering. The only really convincing control condition I found was the generation of new sequences for every character by shuffling the sequence of choices and re-creating new artificial trajectories with the same start and endpoint. This analysis performs much better than chance (circular shuffling), suggesting to me that a lot of the observed decoding accuracy is indeed simply caused by the temporal smoothness of both time series.
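      The decoding step as paraphrased in point (4) (find the closest pair of temporally sequential trials, then interpolate linearly between their values) can be sketched as a nearest-segment projection. This is a simplified reading with straight segments and made-up coordinates, not the authors' spline implementation; in practice the trial being decoded would be left out of the reference sequence.

```python
from typing import List, Sequence


def decode_by_segment_projection(
    points: List[Sequence[float]], labels: List[float], query: Sequence[float]
) -> float:
    """Project `query` onto each segment joining consecutive `points` and
    return the label linearly interpolated along the nearest segment."""
    if len(points) < 2 or len(points) != len(labels):
        raise ValueError("need at least two points and one label per point")
    best_dist_sq = None
    best_label = None
    for i in range(len(points) - 1):
        a, b = points[i], points[i + 1]
        ab = [bj - aj for aj, bj in zip(a, b)]
        aq = [qj - aj for aj, qj in zip(a, query)]
        seg_len_sq = sum(c * c for c in ab)
        # Fractional position along the segment, clamped to [0, 1].
        t = 0.0 if seg_len_sq == 0 else sum(u * v for u, v in zip(aq, ab)) / seg_len_sq
        t = max(0.0, min(1.0, t))
        projection = [aj + t * c for aj, c in zip(a, ab)]
        dist_sq = sum((qj - pj) ** 2 for qj, pj in zip(query, projection))
        if best_dist_sq is None or dist_sq < best_dist_sq:
            best_dist_sq = dist_sq
            best_label = labels[i] + t * (labels[i + 1] - labels[i])
    return best_label


# Hypothetical 2-D embedding of three sequential trials and their affiliation values.
embedded = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)]
affiliation = [0.0, 4.0, 8.0]
decoded = decode_by_segment_projection(embedded, affiliation, (1.0, 0.5))
```

      Because neighbouring segments come from neighbouring trials, any smoothness shared by the embedding and the labels feeds directly into the decoded value, which is the concern raised here.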

      (5) Overall, I found the analysis of the brain-behavior correlation presented in Figure 5 unconvincing. First, the correlation is mostly driven by one individual with a large network size and a 6.5 cluster. I suspect that the exclusion of this individual would lead to the correlation losing significance. Secondly, the neural measure used for this analysis (determining the number of optimal clusters that maximize the overlap between neural clustering and behavioral clustering) is new, non-validated, and disconnected from all the analyses that had been reported previously. The authors need to forgive me for saying so, but at this point of the paper, would it not be much more obvious to use the decoding accuracy for power and affiliation from the main model used in the paper thus far? Does this correlate? Another obvious candidate would be the decoding accuracy for character identity or the size of the region that encodes affiliation and power. Given the plethora of candidate neural measures, I would appreciate if the authors reported the other neural measures that were tried (and that did not correlate). One way to address this would have been to select the method on the initial sample and then test it on the validation sample - unfortunately, the measure was not pre-registered before the validation sample was collected. It seems that the correlation was only found and reported on the validation sample?
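      The single-participant influence raised in point (5) can be checked with a leave-one-out recomputation of the correlation. The sketch below uses plain Pearson correlation and invented numbers; it is neither the authors' data nor their analysis.

```python
import math
from typing import List


def pearson_r(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def leave_one_out_r(x: List[float], y: List[float]) -> List[float]:
    """Correlation recomputed with each observation removed in turn."""
    return [
        pearson_r(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        for i in range(len(x))
    ]


# Invented values: the last participant has an unusually large network.
network_size = [1.0, 2.0, 3.0, 4.0, 20.0]
n_clusters = [2.0, 1.0, 2.0, 1.0, 10.0]
full_r = pearson_r(network_size, n_clusters)
loo_r = leave_one_out_r(network_size, n_clusters)
```

      In this toy case the correlation is strongly positive with all five points and turns negative once the extreme point is dropped, which is the pattern a reader would want ruled out for Figure 5.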

    4. Author response:

      a) that the investigation is very interesting and inventive, and has the potential to reveal some novel insights.

      We thank the reviewers and are excited to improve upon the manuscript through their suggestions.

      b) that the problem of temporal autocorrelation in the fMRI and behavioral data has not been dealt with clearly and convincingly

      We agree that convincingly accounting for fMRI temporal autocorrelation is important to our claims. To reduce its effects, we used field standard methods: prewhitening and autocorrelation modeling with SPM’s FAST algorithm (shown by Olszowy et al. 2019 to be superior to SPM’s default setting), as well as a high-pass filter with a 128 s cutoff. There is still some first-order autocorrelation structure present across voxels in the left hippocampal beta series: across participants there is slightly positive autocorrelation between the betas of decision trials on successive trials, that decays to ~0 at subsequent lags. We note that our task is a narrative, and some patterns over time are expected; instead of attempting to fully eliminate all temporal structure in the data, we aim to show that the temporal distance between trials is unlikely to explain our effects.
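      The lag-wise autocorrelation profile described above can be made concrete with a short sketch. It uses the standard sample autocorrelation on a toy series; the response does not state which estimator was applied to the beta series.

```python
from typing import Sequence


def autocorrelation(series: Sequence[float], lag: int) -> float:
    """Sample autocorrelation of `series` at a positive integer `lag`."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(series)")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series has zero variance")
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom


# Toy series: a steady ramp is positively autocorrelated at lag 1,
# an alternating series is negatively autocorrelated.
ramp_lag1 = autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0], 1)
alternating_lag1 = autocorrelation([1.0, -1.0, 1.0, -1.0], 1)
```

      Computing this for lags 1, 2, 3 and so on per voxel or per participant would give the kind of decay profile the response refers to.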

      In the within versus between social dimension representational similarity analysis, the average temporal distance between trials is the same within and between dimensions. The clustering analysis is a between subject analysis about individual differences–and the same overall temporal structure is experienced by all participants.

      The trajectory analysis does not focus on consecutive trials across characters, but rather on consecutive trials within characters, where the time gap between successive trials is relatively large and highly variable. An average of over a minute of time elapses between successive decision trials for a given character (versus ~20 seconds across characters), which is on average almost 11 narrative slides and 3 decision trials. Across characters, the temporal gap between decision trials ranges from 12 seconds to more than 10 minutes, reducing the likelihood that temporal autocorrelation drives character-related estimates. We also highlight the shuffled choices control model, which shares the same temporal autocorrelation structure as the model of interest but had significantly poorer social location decoding–a strong indication that temporal autocorrelation alone can’t explain these results. For each participant, we shuffled their choices and re-computed trajectories that preserved the origin and end locations but produced different locations along the way. Our model decoded location significantly better than this null model, and this difference in performance can't be explained by differences in temporal autocorrelation in the neural or behavioral data.
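      The shuffled-choices null can be illustrated with a minimal sketch. It assumes that each choice moves a character by one unit along either affiliation or power and that a trajectory is the running sum of those steps; this encoding is an assumption made for illustration, not something stated in the response.

```python
import random
from typing import List, Tuple

Step = Tuple[int, int]  # (change in affiliation, change in power); assumed encoding


def trajectory(steps: List[Step], origin: Tuple[int, int] = (0, 0)) -> List[Tuple[int, int]]:
    """Cumulative positions visited when applying `steps` in order from `origin`."""
    x, y = origin
    path = [(x, y)]
    for dx, dy in steps:
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


def shuffled_trajectory(steps: List[Step], rng: random.Random) -> List[Tuple[int, int]]:
    """Null trajectory: the same steps in a random order."""
    shuffled = list(steps)
    rng.shuffle(shuffled)
    return trajectory(shuffled)


# Hypothetical choices for one character.
choices = [(1, 0), (0, 1), (-1, 0), (1, 0), (0, -1), (1, 0)]
observed = trajectory(choices)
null = shuffled_trajectory(choices, random.Random(0))
```

      Because addition is order-independent, every shuffle starts and ends at the same locations as the observed trajectory, while the intermediate locations can differ.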

      In the revision, we will further address this concern. For example, we will report more details on the task structure to aid in interpretation and will more precisely characterize the temporal autocorrelation profile. Where appropriate, we will also improve on and/or add more control analyses that preserve the autocorrelation structure.
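      One control raised in the review concerns circular shifting: after a shift, the originally first and last trials become neighbours. The sketch below, using hypothetical trial indices, shows that the set of neighbouring pairs only matches between the veridical and shifted orderings when both are treated as circular.

```python
from typing import List, Tuple


def circular_shift(items: List[int], k: int) -> List[int]:
    """Rotate `items` to the right by `k` positions."""
    k %= len(items)
    return list(items) if k == 0 else items[-k:] + items[:-k]


def sequential_pairs(items: List[int], wrap: bool = False) -> List[Tuple[int, int]]:
    """Neighbouring pairs in order; with `wrap`, also join the last item to the first."""
    pairs = [(items[i], items[i + 1]) for i in range(len(items) - 1)]
    if wrap:
        pairs.append((items[-1], items[0]))
    return pairs


trials = [0, 1, 2, 3]  # hypothetical trial indices
shifted = circular_shift(trials, 1)

linear_original = set(sequential_pairs(trials))
linear_shifted = set(sequential_pairs(shifted))
wrapped_original = set(sequential_pairs(trials, wrap=True))
wrapped_shifted = set(sequential_pairs(shifted, wrap=True))
```

      With linear evaluation the shifted ordering contains the pair (3, 0), which the veridical ordering never evaluates; with wrap-around evaluation the two orderings are scored on the same pairs.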

      c) that a number of important interesting questions have not been addressed: Are the differences between social partners encoded in the hippocampus? Are the social dimensions encoded in a consistent manner across social partners?

      We believe that we should be able to decode other interesting task- and relationship-related features from the hippocampal patterns, as suggested by the reviewers. In the revision, we will attempt several such analyses, while taking care to control for temporal autocorrelation.

      d) that the cluster analysis in the brain-behavior correlation analysis is not well motivated or validated and should be clarified.

      We agree with the reviewers that this clustering analysis should be better described and validated. We aimed to ask whether less diverse and distinctive cognitive representations of the relationship trajectories relate to smaller real-world social networks. This question of impoverished cognitive maps was first raised by Edward Tolman; we think it is relevant here, as well. In the revision, we will clarify its motivations and implications, and better evaluate it for its robustness. Here, we address a few comments made by the reviewers.

      Reviewer 2 noted that other analyses could be used to ask whether social cognitive map complexity relates to real-world social network complexity. While the proposed alternatives are interesting (e.g., correlating decoding accuracy with social network size), we believe these analyses ask different questions. The current co-clustering analysis was intended to estimate map complexity jointly from the behavioral and neural signatures of the social map across characters. In contrast, the spline location decoding is within character; the accuracy of this decoding does not say much about representations across characters. And although we think character decoding is an interesting possible addition to this manuscript, its accuracy may reflect other aspects of the relationships, beyond just spatial representation. Thus, we will provide a clearer and better validated version of the current analysis to address this question.

      We would also like to clarify that we did not collect the Social Network Index questionnaire in the Initial sample; as such these results are more tentative than the other analyses, due to the inability to confirm them in a separate sample. Reviewer 2 also suggests that a single outlier could drive this effect; but estimating the effect with robust regression also returns a right-tailed p < 0.05, showing that the relationship is robust to outliers.

      References

      Olszowy, W., Aston, J., Rua, C. & Williams, W.B. Accurate autocorrelation modeling substantially improves fMRI reliability. Nature Communications. (2019).

    1. Reviewer #1 (Public Review):

      Summary:

      In this study, Jellinger et al. performed engram-specific sequencing and identified genes that were selectively regulated in positive/negative engram populations. In addition, they performed chronic activation of the negative engram population over 3 months and observed several effects on fear/anxiety behavior and cellular events such as upregulation of glial cells and decreased GABA levels.

      Strengths:

      They provide useful engram-specific GSEA data and the main concept of the study, linking negative valence/memory encoding to cellular level outcomes including upregulation of glial cells, is interesting and valuable.

      Comments on the revised manuscript:

      The revised manuscript still does not adequately address the primary technical concern regarding long-term DREADD manipulation. The authors reference their previous work (Suthard et al., 2023) as evidence; however, this earlier paper only presents fluorescence intensity in a non-quantitative manner with merely three samples (Supplementary Figure 7). This limited evidence does not sufficiently support the claim of potent long-term activation. The discussion in the revision stating "...even if our manipulation is only working for 1 month, rather than 3 months..." is unconvincing, particularly given that the title and abstract still claim "chronic activation of...". To substantiate the technical validity of the study, at least cFos staining at various time points is necessary, which is less burdensome compared to more direct demonstrations such as slice physiology. Thus, although I believe it could be an interesting study for some audiences, I cannot support the strength of the evidence presented in the study.

      Furthermore, in response to all reviewers' concerns regarding the quantification of GABA, the authors have removed the data from the study rather than providing properly acquired images or quantified data. This action diminishes the significance of the study.

    2. eLife assessment

      This useful study reports the behavioural and physiological effects of the longitudinal activation of neurons associated with negative experiences. The main claims of the paper are supported by solid experimental evidence, although the specificity of the long-term manipulation could have benefitted from additional validation. This study will be of interest to neuroscientists working on memory.

    3. Reviewer #2 (Public Review):

      Summary:

      Jellinger, Suthard, et al. investigated the transcriptome of positive and negative valence engram cells in the ventral hippocampus, revealing anti- and pro-inflammatory signatures of these respective valences. The authors further reactivated the negative valence engram ensembles to assay the effects of chronic negative memory reactivation in young and old mice. This chronic re-activation resulted in differences in aspects of working memory, fear memory, and caused morphological changes in glia. Such reactivation-associated changes are putatively linked to GABA changes and behavioral rumination.

      Strengths:

      Much of the content of this manuscript is of benefit to the community, such as the discovery of differential engram transcriptomes dependent on memory valence. The chronic activation of neurons, and the resultant effects on glial cells and behavior, also provide the community with important data. Laudable points of this manuscript include the comprehensiveness of behavioral experiments, as well as the cross-disciplinary approach.

      Weaknesses:

      Weaknesses noted in the previous version of the manuscript have been accounted for.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors note that negative ruminations can lead to pathological brain states and mood/anxiety dysregulation. They test this idea by using mouse engram-tagging technology to label dentate gyrus ensembles activated during a negative experience (fear conditioning). They show that chronic chemogenetic activation of these ensembles leads to behavioral (increased anxiety, increased fear generalization, reduced fear extinction) and neural (increases in neuroinflammation, microglia and astrocytes) changes.

      Strengths:

      The question the authors ask here is an intriguing one, and the engram activation approach is a powerful way to address the question. Examination of a wide range of neural and behavioral dependent measures is also a strength.

      Weaknesses:

      The major weakness is that the authors have found a range of changes that are correlates of chronic negative engram reactivation. However, they do not manipulate these outcomes to test whether microglia, astrocytes, neuroinflammation are causally linked to the dysregulated behaviors.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, Jellinger et al. performed engram-specific sequencing and identified genes that were selectively regulated in positive/negative engram populations. In addition, they performed chronic activation of the negative engram population over 3 months and observed several effects on fear/anxiety behavior and cellular events such as upregulation of glial cells and decreased GABA levels.

      Strengths:

      They provide useful engram-specific GSEA data and the main concept of the study, linking negative valence/memory encoding to cellular level outcomes including upregulation of glial cells, is interesting and valuable.

      Weaknesses:

      A number of experimental shortcomings make the conclusion of the study largely unsupported. In addition, the observed differences in behavioral experiments are rather small, inconsistent, and the interpretation of the differences is not compelling.

      Major points for improvement:

      (1) Lack of essential control experiments

      With the current set of experiments, it is not certain that the DREADD system they used was potent and stable throughout the 3 months of manipulations. Basic confirmatory experiments (e.g., slice physiology at 1m vs. 3m) to show that the DREADD effects on these vHP are stable would be an essential bottom line to make these manipulation experiments convincing.

      In previous work from our lab performing long-term activation of Gq DREADD receptors in the vHPC, we quantify the presence of Gq receptor expression over 3-, 6- and 9-month timepoints and show that there is no decrease in receptor expression, as measured via fluorescence intensity (Suthard et al., 2023). In this study, we also address that even if our manipulation is only working for 1 month, rather than 3 months, we are observing the long-term effects of this shorter-term stimulation. This is still relevant, and only changes how we interpret these findings, as shorter-term stimulation or disruption of neuronal activity can still have detrimental effects on behavior.

      Furthermore, although the authors use the mCherry vector as a control, they did not have a vehicle/saline control for the hM3Dq AAV. Thus, the long-term effects such as the increase in glial cells could simply be due to the toxicity of DREADD expression, rather than an induced activity of these cells.

      For chemogenetic studies, our experimental rationale utilized a standard approach in the field, which includes one of two control options: 1) active receptor vs. control vector + ligand or 2) active receptor + ligand or saline control. We chose the first option, as this more properly controls for the potential off-target effects of the ligand itself, as shown in other previous work (Xia et al., 2017). This is particularly important for studies using CNO, as many off-target effects have been noted as a limitation (Manvich et al., 2018). We chose to use DCZ as it is closely related to CNO and newer ligands, but comes with added benefits of high specificity, low off-target effects, high potency and brain penetrance (Nagai et al., 2020), but any potential off-target effects of DCZ are yet to be completely investigated as this ligand is very new.

      Evidence of DREADD toxicity has been shown at high titer levels of AAV2/7-CamKIIα-hM4D(Gi)-mCherry in the hippocampus at 5 weeks, as the reviewer pointed out in their above comment (Goossens et al., 2021). Our viral strategy is targeted to a much smaller number of cells using AAV9-DIO-Flex-hM3Dq-mCherry at a lower titer, unlike expression within a much larger population of CaMKII+ excitatory neurons in that study. Additionally, visual comparison of their viral load and expression with ours shows much more intense expression that spans a larger area of the hippocampus (Goossens et al, 2021; Figure 1D), whereas ours is isolated to a smaller region of vHPC (see Figure 1B).

      Further, we attempted to quantify a decrease in neuronal health (Yousef et al., 2017) resulting from DREADD expression via NeuN counts within multiple hippocampal subregions for the 6- and 14-month groups across active Gq receptor and mCherry conditions and did not observe significant decreases in NeuN as a result (Supplemental Figure 1). However, immunohistochemistry of an individual marker may not be sufficient to capture the entire health profile of an individual neuron and future work should consider other markers of cell death or inflammation, which we have added to the Limitations & Future Work section of our Discussion.

      (2) Figure 1 and the rest of the study are disconnected

      The authors used the cFos-tTA system to label positive/negative engram populations, while the TRAP2 system was used for the chronic activation experiments. Although both genetic tools are based on the same IEG Fos, the sensitivity of the tools needs to be validated. In particular, the sensitivity of the TRAP2 system can be arbitrarily altered by the amount of tamoxifen (or 4OHT) and the administration protocols. The authors should at least compare and show the percentage of labeled cells in both methods and discuss that the two experiments target (at least slightly) different populations. In addition, the use of TRAP2 for vHP is relatively new; the authors should confirm that this method actually captures negative engram populations by checking for reactivation of these cells during recall by overlap analysis of Fos staining or by artificial activation.

      We thank the reviewer for their comments and opportunity to discuss the marked differences between TRAP2 and DOX systems. In particular, we agree that while both systems rely on the Fos promoter to drive an effector of interest, their efficacy and temporal resolution vary substantially depending on genetic cell-type, brain region, temporal parameters of Dox or 4-OHT delivery, subject-by-subject metabolic variability, and threshold to Fos induction given the promoter sequences inherent to each system. For example, recent studies have reported the following:

      - The TRAP2 line labels a subset of endogenously active CA1 pyramidal cells (e.g. 5-18%) while the DOX system labels 20-40% of CA1 pyramidal cells (DeNardo et al, 2019; Monasterio et al, BioRxiv 2024).

      - The temporal windows for each range from hours in TRAP2 to 24-48 hours for DOX (DeNardo et al, 2019; Denny et al, 2014; Liu & Ramirez et al, 2012).

      - The efficacy of “tagging” a population of cells with TRAP2 vs with DOX will constrain the number of possible cells that may overlap with cFos upon re-exposure to a given experience (e.g. see the observed overlaps in vCA1 - BLA circuits (Kim & Cho, 2020), compared to vCA1 in general (Ortega-de San Luis et al, 2023) and valence-specific vCA1 populations (Shpokayte et al, 2022)).

      - Tagging vCA1 cells with both the TRAP2 and DOX systems is nonetheless sufficient to drive corresponding behaviors (e.g. vCA1 terminal stimulation drives behavioral changes with the DOX and TRAP2 system (Shpokayte et al, 2022) and vCA1 stimulation of an updated fear-linked ensemble drives light-induced freezing in a neutral context utilizing the TRAP2 and DOX systems (Ortega-de San Luis et al, 2023)).

      Finally, and promisingly, as more studies continue to link the in vivo physiological dynamics of these cell populations tagged using each system (e.g. compare Pettit et al, 2022 with Tanaka et al, 2018) and correlate their activity with behavioral phenotypes, our field is in a prime position to uncover deeper principles governing hippocampus-mediated engrams in the brain. Together, we believe a more comprehensive understanding of these systems is fully warranted, especially in the service of further cataloging cellular similarities and differences within such tagged populations.

      (3) Interpretation of the behavior data

      In Figures 3a and b, the authors show that the experimental group showed higher anxiety based on time spent in the center/open area. However, there were no differences in distance traveled and center entries, which are often reduced in highly anxious mice. Thus, it is not clear what the exact effect of the manipulation is. The authors may want to visualize the trajectories of the mice's locomotion instead of just showing bar graphs.

      Our findings show that our experimental group displays higher levels of anxiety-like behaviors as measured via time spent in center/open area, while there are no differences in distance traveled or center entries. For distance traveled, our interpretation is in line with complementary research (Jimenez et al, 2018; Kheirbek et al, 2013) that shows no changes in distance traveled/distance traveled in the center coupled with changes in anxiety levels as a result of manipulation within anxiety-related circuits. More broadly, any locomotion-related deficit could cause a change in distance traveled that is unrelated to anxiety-like behaviors alone. For example, a reduction in distance traveled could be coupled with a decrease in time spent in the center, but could also result only from motor or exploratory deficits. We hope that this explanation clarifies our interpretation of the open field and elevated plus maze findings in light of other literature.

      In addition, the data shown in Figure 4b is somewhat surprising - the 14MO control showed more freezing than the 6MO control, which can be interpreted as "better memory in old". As this is highly counterintuitive, the authors may want to discuss this point. The authors stated that "Mice typically display increased freezing behavior as they age, so these effects during remote recall are expected" without any reference. This is nonsense, as just above in Figure 4a, older mice actually show less freezing than young mice. Overall, the behavioral effects are rather small and random. I would suggest that these data be interpreted more carefully.

      In Figure 4B, we present our findings from remote recall and observe increased freezing levels in control mice with age, as mentioned by the reviewer, indicating increased memory. This is in line with previous work from Shoji & Miyakawa, 2019 which has been added as a reference for the quotation described above; we thank the reviewer for pointing this error out. As the reviewer has pointed out, above in Figure 4A, we measured freezing levels across all groups during contextual fear conditioning before the start of chronic stimulation, as this was the session we ‘tagged’ a negative memory in. Although it appears that there may be slightly lower levels of freezing in older (14-month old) mice, our analysis does not show a statistically significant difference between age groups, only effects of time and subject, which are expected as freezing increases within the session and animals display high levels of variability in freezing levels across many experiments (Figure 4A i-iii). We also find in previous work that control mice receiving 3-, 6- and 9-months of chronic DCZ stimulation in the vHPC with empty vector (mCherry) receptor show an increase in freezing with age (Suthard et al, 2023; Figure 2A ii).

      (4) Lack of citation and discussion of relevant study

      Khalaf et al. 2018 from Gräff lab showed that experimental activation of recall-induced populations leads to fear attenuation. Despite the differences in experimental details, the conceptual discrepancy should be discussed.

      As mentioned by the reviewer, Khalaf et al. 2018 showed that experimental activation of recall-induced populations in the dentate gyrus leads to fear attenuation. Specifically, they pose that this fear attenuation occurs in these ensembles through updating or unlearning of the original memory trace via the engagement, rather than suppression, of an original traumatic experience. Despite the differences in experimental details with our current study and this work, we agree that the conceptual discrepancy should be discussed. First, one major difference is that we are reactivating an ensemble that was tagged during fear memory encoding, while Khalaf et al. are activating a remote recall-induced ensemble that was tagged one month after encoding. Although there is high overlap between the encoding and recall ensembles when mice are exposed to the conditioning context, these ensembles are not identical and may result in different behavioral phenotypes when chronically reactivated. Further, Khalaf et al rely on reactivation of the recall-induced ensemble during extinction to facilitate rapid fear attenuation. This differs from our current work, as their reactivation is occurring during the extinction process in the previously conditioned context, while we are reactivating chronically in the animal’s home cage over the course of a longer time period. It may be necessary that the memory is first reactivated, and thus, more liable to re-contextualization, in the original context compared to an unrelated homecage environment where there are presumably no related cues present. Importantly, this previous work tests the attenuation of fear shortly after an extinction process, while we are not traditionally extinguishing the context with aid of the memory reactivation. Finally, we are testing remote recall (3 months post-conditioning), while they are testing at a shorter time interval (28 days). 
      In line with these ideas, future work may seek to tease out the mechanistic differences between recent and remote memory extinction both in terms of natural memory recall and chronically manipulated memory-bearing cells.

      Reviewer #2 (Public Review):

      Summary:

      Jellinger, Suthard, et al. investigated the transcriptome of positive and negative valence engram cells in the ventral hippocampus, revealing anti- and pro-inflammatory signatures of these respective valences. The authors further reactivated the negative valence engram ensembles to assay the effects of chronic negative memory reactivation in young and old mice. This chronic re-activation resulted in differences in aspects of working memory, and fear memory, and caused morphological changes in glia. Such reactivation-associated changes are putatively linked to GABA changes and behavioral rumination.

      Strengths:

      Much of the content of this manuscript is of benefit to the community, such as the discovery of differential engram transcriptomes dependent on memory valence. The chronic activation of neurons, and the resultant effects on glial cells and behavior, also provide the community with important data. Laudable points of this manuscript include the comprehensiveness of behavioral experiments, as well as the cross-disciplinary approach.

      Weaknesses:

      There are several key claims made that are unsubstantiated by the data, particularly regarding the anthropomorphic framing of "rumination" on a mouse model and the role of GABA. The conclusions and inferences in these areas need to be carefully considered.

      (1) There are many issues regarding the arguments for the behavioural data's human translation as "rumination." There is no definition of rumination provided in the manuscript, nor how rumination is similar/different to intrusive thoughts (which are psychologically distinct but used relatively interchangeably in the manuscript), nor how rumination could be modelled in the rodent. The authors mention that they are attempting to model rumination behaviours by chronically reactivating the negative engram ("To understand if our experimental model of negative rumination..."), but this occurs almost at the very end of the results section, and no concrete evidence from the literature is provided to attempt to link the behavioural results (decreased working memory, increased fear extinction times) to rumination-like behaviours. The arguments in the final paragraph of the Discussion section about human rumination appear to be unrelated to the data presented in the manuscript and contain some uncited statements. Finally, the rumination claims seem to be based largely upon a single data figure that needs to be further developed (Figure 6, see also point 2 below).

      (2) The staining and analysis in Figure 6 are challenging to interpret, and require more evidence to substantiate the conclusions of these results. The histological images are zoomed out, and at this resolution, it appears that only the pyramidal cell layer is being stained. A GABA stain should also label the many sparsely spaced inhibitory interneurons existing across all hippocampal layers, yet this is not apparent here. Moreover, both example images in the treatment group appear to have lower overall fluorescence intensity in both DAPI and GABA. The analysis is also unclear: the authors mention "ROIs" used to measure normalized fluorescence intensity but do not specify what the ROI encapsulates. Presumably, the authors have segmented each DAPI-positive cell body and assessed fluorescence; however, this is neither explicated nor demonstrated, making the results difficult to interpret.

      Based on the collective discussion from all reviewers on the completeness of our GABA quantification and its implications, we have decided to remove this figure and perform more substantive analysis of this E/I imbalance in future work.

      (3) A smaller point, but more specific detail is needed for how genes were selected for GSEA analysis. As GSEA relies on genes to be specified a priori, to avoid a circular analysis, these genes need to be selected in a blind/unbiased manner to avoid biasing downstream results and conclusions. It's likely the authors have done this, but explicitly noting how genes were selected is an important context for this analysis.

      As mentioned in our Methods section, gene sets were selected based on pre-existing biology and understanding of genes canonically involved in “neurodegeneration” such as those related to apoptotic pathways and neuroinflammation or “neuroprotection” such as brain-derived neurotrophic factor, to name a few. A limitation of this method is that we must avoid making strong claims about the actual function of these up- or down-regulated genes without performing proper knock-in or knock-out studies, but we hope that this provides an unbiased inventory for future experiments to perform causal manipulations.

      Reviewer #3 (Public Review):

      Summary:

      The authors note that negative ruminations can lead to pathological brain states and mood/anxiety dysregulation. They test this idea by using mouse engram-tagging technology to label dentate gyrus ensembles activated during a negative experience (fear conditioning). They show that chronic chemogenetic activation of these ensembles leads to behavioral (increased anxiety, increased fear generalization, reduced fear extinction) and neural (increases in neuroinflammation, microglia, and astrocytes) changes.

      Strengths:

      The question the authors ask here is an intriguing one, and the engram activation approach is a powerful way to address the question. Examination of a wide range of neural and behavioral dependent measures is also a strength.

      Weaknesses:

      The major weakness is that the authors have found a range of changes that are correlates of chronic negative engram reactivation. However, they do not manipulate these outcomes to test whether microglia, astrocytes, or neuroinflammation are causally linked to the dysregulated behaviors.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      - Figure 2c should include Month0, the BW before the start of the manipulation.

      Regrettably, we do not have access to the Month 0 body weights at this time as this project changed hands over the course of the past year or so. This is an inherent limitation that we missed during analysis and we pose this as a limitation in the Results section after describing this finding. Therefore, it is possible that over the first month of stimulation (Month 0-1), there may have been a drop in body weight that rebounded by the first measurement at Month 1 that continued to increase normally through Months 2-3, as shown in our Figure 1. Thank you for this note.

      - Figure 6a looks confusing - the background signal in the green channel is very different between control and experimental groups. Were representative images taken with different microscope settings?

      The representative images were taken with the same microscope power settings, but were adjusted in brightness/contrast within FIJI for clarity in the Figure – we apologize that this was misleading in any way and thank the reviewer for their feedback. Further, based on the collective discussion from all reviewers on the completeness of our GABA quantification and its implications, we have decided to remove this figure and perform more substantive analysis of this E/I imbalance in future work.

      - Typo mChe;try

      This typo was fixed

      - "During this contextual... mice in the 6- and 14- month groups..." Isn't it 3- and 11- month respectively at the time of fear conditioning? Throughout the manuscript, this point was written very confusingly.

      Yes, we thank the reviewer for pointing this out. It has been corrected to 3- and 11-month old mice at the timing of fear conditioning and clarified throughout the manuscript where applicable.

      - "GABAergic eYFP fluorescence" Where does the eYFP come from? The methods state that GABA quantification is based on IHC staining.

      Based on the collective discussion from all reviewers on the completeness of our GABA quantification and its implications, we have decided to remove this figure and perform more substantive analysis of this E/I imbalance in future work. We discuss this E/I balance not being directly assessed in the Limitations & Future Directions section of our Discussion, noting the importance of detailed quantification of both excitatory and inhibitory markers within the hippocampus.

      Reviewer #2 (Recommendations For The Authors):

      (1) There is a full methods section ("Analysis of RNA-seq data") that mostly describes RNA-seq analysis that seemingly does not appear in the paper. This section should be reviewed.

      We have included this portion of the methods that explain the previous workflow from Shpokayte et al., 2022 where this dataset was generated and this has been noted in the “Analysis of RNA-seq data” section of the methods.

      (2) Figure 6: GABA staining should be more critically analyzed, as discussed above, and validated with another GABA antibody for rigor. From the representative images provided in Figure 6, it looks possibly as though the hM3Dq images were simply not fully in the focal plane when being imaged or were over-washed, as DAPI staining also appears to be lower in these images.

      Based on the collective discussion from all reviewers on the completeness of our GABA quantification and its implications, we have decided to remove this figure and perform more substantive analysis of this E/I imbalance in future work. Specifically, it will be necessary to rigorously investigate both excitatory and inhibitory markers within this region to ensure these claims are substantiated. Thank you for this suggestion.

      (3) The first claim that human GABAergic interneurons cause rumination is uncited. (Page 19, first sentence beginning with: "Evidence from human studies suggests...").

      Based on the collective discussion from all reviewers on the completeness of our GABA quantification and its implications, we have decided to remove this figure and perform more substantive analysis of this E/I imbalance in future work. Apologies for the lack of citation in-text, the proper citation for this finding is Schmitz et al, 2017.

      (4) Gene names throughout the manuscript and figure are written in the wrong format for mice (eg: Page 13, second line: SPP1, TTR, and C1QB1 instead of Spp1, Ttr, C1qb1).

      This was corrected throughout the manuscript.

      (5) Tense on Page 15 third sentence of the second paragraph: "...spatial working memory was assessed...".

      This was corrected throughout the manuscript.

      (6) Supplemental Figure 1 would benefit from normalization of the NeuN+ cell counts. The inclusion of an excitatory and inhibitory neuron marker in this figure might benefit the argument that there is a change in the excitation/inhibition of the hippocampus - as the numbers of excitatory neurons outweigh the numbers of inhibitory neurons that would be assayed here.

      In an effort to normalize the NeuN+ cell counts, for each of our ROIs (6-8 single tiles for each brain region (DG, vCA1, vSub) x 3-5 coronal slices = ~18 single tiles per mouse x 3-4 mice) we captured a 300 x 300 micrometer, single-tile z-stack at 20x magnification. These ROIs were matched for dimensions and brain regions across all groups for each hippocampal subregion quantified. We initially proposed to normalize these NeuN counts over DAPI, but because DAPI includes all nuclei (microglia, oligodendrocytes, astrocytes and neurons), we weren’t sure this was the most optimal tool. We do agree that further quantification of excitatory and inhibitory cell markers would be vital to more concrete interpretation of our findings and we have added this to our Limitations & Future Work section of the Discussion.

      Reviewer #3 (Recommendations For The Authors):

      (1) The DOX tagging window lacks temporal precision. I suggest the authors note this as a limitation.

      We thank the reviewer for noting this, and we have added this limitation to the Methods section with the context of the 24-48 hour DOX window being longer than other methods like TRAP.

      (2) Is there a homeostatic response to chronic engram stimulation? That is, is DCZ as effective in increasing neuronal excitability on day 90 as it is on day 1. This could be addressed with electrophysiology, or with IEG induction. Alternatively, the authors could refer to previous literature-- for example, Xia et al (2017) eLife-- that examined whether there was any blunting of the effects of DREADD ligands after sustained delivery via drinking water. There, of course, may be other papers as well.

      As noted by the reviewer, it is important to determine if DCZ maintains its effects on neuronal excitability throughout the 3 month administration period. To address this, previous work has shown that CNO administration in drinking water over one month consistently inhibited hM4Di+ neurons without altering baseline neuronal excitability as measured by firing rate and potassium currents (Xia et al, 2017). Although this is only for one month, it is administered via the same oral route as our DCZ protocol and suggests that at least for that amount of time we are likely producing consistent effects. In our reply above to Reviewer #1’s comment, we also note that even if DCZ is only having an effect for one month, rather than 3 months, we are still observing enduring changes that resulted from this short-term disturbance.

      (3) Please double check there is no group effect on weight in 6-month-old mice in Figure 2C.

      Two-way RM ANOVA showed no main effect of Group within the 6-month-old control and hM3Dq groups.

      Group: F(1,17) = 1.361, p=0.2594.
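
      As a rough arithmetic check of how a p-value follows from an F statistic of this form, the sketch below uses the identity that an F(1, v) variable is the square of a t variable with v degrees of freedom, and integrates the t density numerically with Simpson's rule. The function names and the step count are illustrative assumptions, and this is not the software used for the reported analysis.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def f_test_p_value_df1(f_stat, df_den, steps=2000):
    """P(F > f_stat) for an F(1, df_den) distribution.

    Uses F(1, v) = t_v^2, so the p-value equals the two-tailed t probability
    beyond sqrt(f_stat). `steps` must be even (Simpson's rule).
    """
    t = math.sqrt(f_stat)
    h = t / steps
    total = t_pdf(0.0, df_den) + t_pdf(t, df_den)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df_den)
    mass_zero_to_t = total * h / 3
    return 1.0 - 2.0 * mass_zero_to_t

# Values taken from the line above: F(1,17) = 1.361.
p_group = f_test_p_value_df1(1.361, 17)
```

      Under these assumptions `p_group` should land close to the reported p = 0.2594; small differences would reflect numerical integration and rounding of the F statistic.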

      (4) The shock intensity is much higher than is typical for fear conditioning studies in mice. Why was this the case?

      Yes, we do agree that this shock intensity is on the higher side of typical paradigms in mice, however, our lab has utilized 0.75mA to 1.5mA intensity foot shocks for contextual fear conditioning in the past (Suthard & Senne et al, 2023; 2024; Dorst & Senne et al, 2023; Grella et al., 2022; Finkelstein et al., 2022) and we maintained this protocol for internal consistency. However, it would be interesting to systematically investigate how differing intensities of foot shock, subsequent tagging of this ensemble and reactivation would uniquely impact behavioral state acutely and chronically in mice.

      (5) Remote freezing is very low. The authors should comment on this-- perhaps repeated testing has led to some extinction?

      A reviewer above suggested a similar phenomenon may be occurring, specifically fear attenuation as a result of chronic stimulation. They referenced previous work from Khalaf et al. 2018, where they reactivated a recall-induced ensemble, while we reactivated an ensemble tagged during encoding. We expand upon this work in light of our findings within the Limitations & Future Work section of our Discussion. However, we do appreciate the lower levels of freezing observed in remote recall and sought out other literature to understand the typical range of remote freezing levels. One thing that we note is that our remote recall is occurring 3 months after conditioning, which is much longer than typical 14-28 day protocols. However, we find that remote recall at 21-45 days yields contextual freezing levels of approximately 20-50% (Kol et al., 2020), and of approximately 40-75% in a variety of 28-day remote recall experiments (Lee et al., 2023). This information, together with our current experimental protocol, demonstrates a wide range of remote freezing levels that may depend heavily on the foot shock intensity, the number of days after conditioning, and animal variability.

      (6) "mice display increased freezing with age": please add a reference.

      Apologies, we missed the citation for that claim and it has been added in-text and in the references list (Shoji & Miyakawa, 2019).

      (7) Related to the low freezing levels for remote memory, why is generalization minimal? Many studies have shown that there is a time-dependent emergence of generalized fear, yet here this is not seen. Is it linked to extinction (as above)? Or genetic background?

      Previous work has shown that rats receiving multiple foot shocks during conditioning displayed a time-dependent generalization of context memory, while those receiving fewer shocks did not (Poulos et al., 2016), as the reviewer noted in their comment. In our current study, we observe low levels of generalization in all of our groups compared to freezing levels displayed in the conditioned context at the remote timepoint, in opposition to this time-dependent enhancement of generalization. It is possible that the genetic background of our C57BL/6J mice compared to the Long-Evans rat strain in this previous work accounts for some of this difference. In addition, it is possible that the longer duration of time (3 months) compared to their remote timepoint (28 days) resulted in a time-dependent decrease in generalization with greater durations of time from the original conditioning. As noted above, it is indeed plausible that the reactivation of a contextual fear ensemble over time is attenuating freezing levels for both the original and similar contexts (Khalaf et al, 2018). We discuss the differences in our study and this 2018 work more comprehensively above.

      (8) Morphological phenotypes of astrocytes/microglia. Would be great to do some transcriptomic profiling of microglia/astrocytes to couple with the morphological characterization (but appreciate this is beyond the scope of current work).

      We thank the reviewer for this suggestion. We agree that this would be an incredibly informative future experiment and have added it to our Limitations & Future Experiments section of the Discussion.

      (9) The authors could consider including a limitations section in their discussion which discusses potential future directions for this work:

      - causal experiments.

      - E/I balance is not assessed directly (interestingly, in this regard, expanded engrams are linked to increased generalization [e.g., Ramsaran et al 2023]).

      Thank you for this suggestion, we have added a Limitations & Future Directions section to our Discussion and have expanded upon these suggested points.

      (10) For Figure 10, consider adding an experimental design/timeline.

      We are making the assumption that the reviewer meant Figure 1 instead of Figure 10 here, but note that there is a description of the viral expression duration (D0-D10), followed by an off Dox period of 48 hours (D10-D12), with subsequent engram tagging of a negative (foot shock) or positive (male-to-female exposure) experience on D12. In our experiments (Shpokayte et al., 2022), Dox was administered for 24 hours (D12-D13), which was followed by sacrificing the animal for cell suspension and sequencing of the positive and negative engram populations. This figure also shows the viral strategy for the Tet-tag system (Figure 1A), as well as representative viral expression in vHPC (Figure 1B). We are happy to add additional experimental design/timeline information to this figure that would be helpful to the reviewer.

    1. eLife assessment

      This fundamental work proposes a novel mechanism for memory consolidation where short-term memory provides a gating signal for memories to be consolidated into long-term storage. The work combines extensive analytical and numerical work applied to three different scenarios and provides a convincing analysis of the benefits of the proposed model, although some of the analyses are limited to the type of memory consolidation the authors consider (and don't consider), which limits the impact. The work will be of interest to neuroscientists and many other researchers interested in the mechanistic underpinnings of memory.

    2. Reviewer #2 (Public Review):

      Summary:

      In the manuscript the authors suggest a computational mechanism called recall-gated consolidation, which prioritizes the storage of previously experienced synaptic updates in memory. The authors investigate the mechanism with different types of learning problems including supervised learning, reinforcement learning, and unsupervised auto-associative memory. They rigorously analyse the general mechanism and provide valuable insights into its benefits.

      Strengths:

      The authors establish a general theoretical framework, which they translate into three concrete learning problems. For each, they define an individual mathematical formulation. Finally, they extensively analyse the suggested mechanism in terms of memory recall, consolidation dynamics, and learnable timescales.

      The presented model of recall-gated consolidation covers various aspects of synaptic plasticity, memory recall, and the influence of gating functions on memory storage and retrieval. The model's predictions align with observed spaced learning effects.

      The authors conduct simulations to validate the recall-gated consolidation model's predictions, and their simulated results align with theoretical predictions. These simulations demonstrate the model's advantages over consolidating any memory and showcase its potential application to various learning tasks.

      The suggestion of a novel consolidation mechanism provides a good starting point to investigate memory consolidation in diverse neural systems and may inspire artificial learning algorithms.

      Weaknesses:

      I appreciate that the authors devoted a specific section to the model's predictions, and point out how the model connects to experimental findings in various model organisms. However, the connection is rather weak and the model needs to make more specific predictions to be distinguishable from other theories of memory consolidation (e.g. those that the authors discuss) and verifiable by experimental data.

      The model is not compared to other consolidation models in terms of performance and how much it increases the signal-to-noise ratio. It is only compared to a simple STM or a parallel LTM, which I understand to be essentially the same as the STM but with a different timescale (so not really an alternative consolidation model). It would be nice to compare the model to an actual or more sophisticated existing consolidation model to allow for a fairer comparison.

      The article is lengthy and dense and it could be clearer. Some sections are highly technical and may be challenging to follow. It could benefit from more concise summaries and visual aids to help convey key points.

    3. Reviewer #3 (Public Review):

      Summary:

      In their article Jack Lindsey and Ashok Litwin-Kumar describe a new model for systems memory consolidation. Their idea is that a short-term memory acts not as a teacher for a long-term memory - as is common in most complementary learning systems -, but as a selection module that determines which memories are eligible for long term storage. The criterion for the consolidation of a given memory is a sufficient strength of recall in the short term memory.

      The authors provide an in-depth analysis of the suggested mechanism. They demonstrate that it allows substantially higher SNRs than previous synaptic consolidation models, provide an extensive mathematical treatment of the suggested mechanism, show that the required recall strength can be computed in a biologically plausible way for three different learning paradigms, and illustrate how the mechanism can explain spaced training effects.

      Strengths:

      The suggested consolidation mechanism is novel and provides a very interesting alternative to the classical view of complementary learning systems. The analysis is thorough and convincing.

      Weaknesses:

      The main weakness of the paper is the equation of recall strength with the synaptic changes brought about by the presentation of a stimulus. In most models of learning, synaptic changes are driven by an error signal and hence cease once the task has been learned. The suggested consolidation mechanism would stop at that point, although recall is still fine. The authors should discuss other notions of recall strength that would allow memory consolidation to continue after the initial learning phase. Aside from that, I have only a few technical comments that I'm sure the authors can address with a reasonable amount of work.

    4. Author response:

      The following is the authors’ response to the original reviews.

      In light of some reviewer comments requesting more clarity on the relationship between our model and prior theoretical studies of systems consolidation, we propose a modification to the title of our manuscript: “Selective consolidation of learning and memory via recall-gated plasticity.” We believe this title better reflects the key distinguishing feature of our model, that it selectively consolidates only a subset of memories, and also highlights the model’s applicability to task learning as well as memory storage.

      Major comments:

      Reviewer #3’s primary concern with the paper is the following: “The main weakness of the paper is the equation of recall strength with the synaptic changes brought about by the presentation of a stimulus. In most models of learning, synaptic changes are driven by an error signal and hence cease once the task has been learned. The suggested consolidation mechanism would stop at that point, although recall is still fine. The authors should discuss other notions of recall strength that would allow memory consolidation to continue after the initial learning phase.”

      We thank the reviewer for drawing attention to this issue, which primarily results from a poor choice of notation on our part. Our decision to denote memories as ∆w gives the impression that memories should be interpreted as actual synaptic weight updates, and thus in the context of supervised learning would go to zero when the task is learned. However, in the formalism of our model, memories are in fact better interpreted as target values of synaptic weights, and the synaptic model/plasticity rule is responsible for converting these target values into synaptic weight updates. We were unclear on this point in our initial submission, because our paper primarily considers binary synaptic weights, where target synaptic weights have a one-to-one correspondence with candidate synaptic weight updates. We have updated the paper to use w* to refer to memories, which we hope resolves this confusion, and have updated our introduction to the term “memory” to reflect their interpretation as target synaptic weight values. We have also updated the paper’s language to more clearly disambiguate between the “learning rule,” which determines how the memory vectors (target synaptic weight vectors) are derived from task variables, and the “plasticity rule,” which governs how these are translated into actual synaptic weight updates. We acknowledge that our manuscript still does not explicitly consider a plasticity rule that is sensitive to continuous error signals, as our analysis is restricted to binary weights. However, we believe that the updated notation and exposition make it clearer that our model could be applied in such a case.
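
      The distinction between a memory as a target weight vector w* and the weight update produced by a plasticity rule can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the function names, the normalized-overlap recall factor, the fixed threshold, and the deterministic update that moves every binary (±1) weight onto its target. It is not the model as specified in the manuscript.

```python
def recall_factor(weights, target):
    """Normalized overlap between current weights and a target memory w*."""
    return sum(w * t for w, t in zip(weights, target)) / len(target)

def plasticity_update(weights, target):
    """Weight changes that would move the weights onto the target values."""
    return [t - w for w, t in zip(weights, target)]

def present(target, w_stm, w_ltm, threshold):
    """One presentation of a memory; returns updated STM and LTM weights."""
    recall = recall_factor(w_stm, target)      # read out before any update
    gate_open = recall > threshold             # recall gates consolidation
    delta_stm = plasticity_update(w_stm, target)
    new_stm = [w + d for w, d in zip(w_stm, delta_stm)]
    new_ltm = list(target) if gate_open else list(w_ltm)
    return new_stm, new_ltm, recall, delta_stm

# Hypothetical example: the same memory is presented twice.
target = [1, -1, 1, -1]
w_stm, w_ltm = [1, 1, 1, 1], [-1, -1, -1, -1]
history = []
for _ in range(2):
    w_stm, w_ltm, recall, delta = present(target, w_stm, w_ltm, threshold=0.5)
    history.append((recall, delta))
```

      In this toy setting the second presentation produces no further change in the short-term weights, yet the recall factor is at its maximum, so the gate opens. This is the sense in which a target-based notion of memory lets consolidation proceed after the weight updates themselves have gone to zero.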

      Reviewer #1 brought up that our framework cannot capture “single-shot learning, for example, under fear conditioning or if a presented stimulus is astonishing.” Reviewer #2 raised a related question of how our model “relates to the opposite more intuitive idea, that novel surprising experiences should be stored in memory, as the familiar ones are presumably already stored.”

      We agree that the built-in inability to consolidate memories after a single experience is a limitation of our model, and that extreme novelty is one factor (among others, such as salience or reward) that might incentivize one-shot consolidation. We have added a comment to the discussion to acknowledge these points (added text in bold): “Moreover, in real neural circuits, additional factors besides recall, such as reward or salience, are likely to influence consolidation as well. For instance, a sufficiently salient event should be stored in long-term memory even if encountered only once. Furthermore, while in our model familiarity drives consolidation, certain forms of novelty may also incentivize consolidation, raising the prospect of a non-monotonic relationship between consolidation probability and familiarity.” We agree that future work should address the combined influence of recall (as in our model) and other factors on the propensity to consolidate a memory.

      Reviewer #1 requested, “a comparison/discussion of the wide range of models on synaptic tagging for consolidation by various types of signals. Notably, studies from Wulfram Gerstner's group (e.g., Brea, J., Clayton, N. S., & Gerstner, W. (2023). Computational models of episodic-like memory in food-caching birds. Nature Communications, 14(1); and studies on surprise).”

      We thank the reviewer for the reference, which we have added to the manuscript. The model of Brea et al. (2023) is similar to that of Roxin & Fusi (2013), in that consolidation consists of “copying” synaptic weights from one population to another. As a result, just like the model of Roxin & Fusi (2013), this model does not provide the benefit that our model offers in the context of consolidating repeatedly recurring memories. However, the model of Brea et al. does have other interesting properties – for instance, it affords the ability to decode the age of a memory, which our model does not. We have added a comment on this point in the subsection of the Discussion titled “Other models of systems consolidation.”

      Reviewer #2 noted, “While the article extensively discusses the strengths and advantages of the recall-gated consolidation model, it provides a limited discussion of potential limitations or shortcomings of the model, such as the missing feature of generalization, which is part of previous consolidation models. The model is not compared to other consolidation models in terms of performance and how much it increases the signal-to-noise ratio.”

      We agree that our work does not consider the notion of generalization and associated changes to representational geometry that accompany consolidation, which is the focus of many other studies on consolidation. We have further highlighted this limitation in the discussion. Regarding the comparison to other models, this is a tricky point as the desideratum we emphasize in this study (the ability to recall memories that are intermittently reinforced) is not the focus of other studies. Indeed, our focus is primarily on the ability of systems consolidation to be selective in which memories are consolidated, which is somewhat orthogonal to the focus of many other theoretical studies of consolidation. We have updated some wording in the introduction to emphasize this focus.

      Additional comments made by reviewer #1

      Reviewer #1 pointed out issues in the clarity of Fig. 2A. We have added substantial clarifying text to the figure caption.

      Reviewer #1 pointed out lack of clarity in our introduction to the terms “reliability” and “reinforcement.” We have now made it more clear what we mean by these terms the first time they are used.

      We have updated our definition of “recall” to use the term “recall factor,” which is how we refer to it subsequently in the paper.

      We have made explicit in the main text our simplifying assumption that memories are mean-centered.

      We have made consistent our use of “forgetting curve” and “memory trace”.

      Additional comments made by reviewer #2

      We have added a comment in the discussion acknowledging alternative interpretations of the result of Terada et al. (2021)

      We have significantly expanded the discussion of findings about the mushroom body to make it accessible to readers who do not specialize in this area. We hope this clarifies the nature of the experimental finding, which uncovered a circuit that performs a strikingly clean implementation of our model.

      The reviewer expresses concern that the songbird study (Tachibana et al., 2022) does not provide direct evidence for consolidation being gated by familiarity of patterns of activity. Indeed, the experimental finding is one-step removed from the direct predictions of our model. That said, the finding – that the rate of consolidation increases with performance – is highly nontrivial, and is predicted by our model when applied to reinforcement learning tasks. We have added a comment to the discussion acknowledging that this experimental support for our model is behavioral and not mechanistic.

      We do not regard it as completely trivial that the parallel LTM model performs roughly the same as the STM model, since a slower learning rate can achieve a higher SNR (as in Fig. 2C). Nevertheless we have added wording to the main text around Fig. 4B to note that the result is not too surprising.

      We have added a sentence that clarifies the goal / question of our paper earlier on in the introduction.

      We have updated Figure 3 by labeling the key components of the schematics and adding more detail to the legend, as suggested by the reviewer. We also reordered the figure panels as suggested.

      Additional comments made by reviewer #3:

      We have clarified in the main text that Fig. 2C and all results from Fig. 4 onward are derived from an ideal observer model (which we also more clearly define).

      We have now emphasized in the main text that the recall factors for specific learning rules are derived in the Supplementary Information.

      We have highlighted more clearly in the main text that the recall factors associated with specific learning rules may correspond to other notions that do not intuitively correspond to “recall,” and have added a pointer to Fig. 3A where these interpretations are spelled out.

      We have added references corresponding to the types of learning rules we consider.

      The cutoffs / piecewise-looking behavior of plots in Fig. 4 are primarily the result of finite N, which limits the maximum SNR of the system, rather than coarse sampling of parameter values.

      Thank you for pointing out the error in the legend in Fig. 5D (also affected Supp Fig. S7/S8), which is now fixed.

      The reference to the nonexistent panel Fig. 5G has been removed.

      As the reviewer points out, the use of a binary action output in our reinforcement learning task renders it quite similar to the supervised learning task, making the example less compelling. In the revised manuscript we have updated the RL simulation to use three actions. Note also that in our original submission the network outputs represented action probabilities directly (which is straightforward to do for binary actions, but not for more than two available actions). In order to parameterize a policy when more than two actions are available, we sample actions using a softmax policy, as is more standard in the field and as the reviewer suggested. The associated recall factor is still a product of reward and a “confidence factor,” and the confidence factor is still the value of the network output in the unit corresponding to the chosen action, but in the updated implementation this factor is equal to , similar (though with a sign difference) to the reviewer’s suggestion. We believe these updates make our RL implementation and simulation more compelling, as it allows them to be applied to tasks with arbitrary numbers of actions.
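      As a purely illustrative sketch of the softmax policy mentioned above (not the authors' implementation), the following shows how three network outputs could be converted to action probabilities and sampled. The function names, the example output values, and the fixed random seed are all assumptions introduced here; the confidence factor itself is not reproduced because its expression is not given in the response.

```python
import math
import random


def softmax(logits):
    """Convert raw network outputs into action probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits, rng):
    """Sample one action index from the softmax policy."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical network outputs for a task with three available actions.
rng = random.Random(0)
logits = [1.0, 0.5, -0.2]
probs = softmax(logits)
action = sample_action(logits, rng)
```

      Because the probabilities are normalized over all outputs, the same sketch extends to any number of actions, which is the property the response emphasizes.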

      Additional minor comments

      The reviewers made a number of other specific line-by-line wording suggestions, typo corrections,

    1. eLife assessment

      This study provides valuable insights into the mechanism of axonal directional changes, utilizing the pacemaker neurons of the circadian clock, the sLNVs, as a model system. The data were collected and analysed using solid methodology, resulting in valuable data on the interplay of signalling pathways and the growth of the axon. The study holds potential interest for neurobiologists focusing on axonal growth and development.

    2. Reviewer #1 (Public Review):

      The mechanisms by which axonal projections find their correct target require the interplay of signalling pathways and cell adhesion that act over short and long distances. The current study aims to use the small ventral lateral clock neurons (s-LNvs) of the Drosophila clock circuit as a model to study axon projections. These neurons are born during embryonic stages and are part of the core of the clock circuit in the larval brain. Moreover, these neurons are maintained through metamorphosis and become part of the adult clock circuit. The authors use the axon length, visualized by anti-Pdf antibody or Pdf>GFP, as a read-out. Using ablation of the MB, the overall target region of the s-LNvs, the authors find defects in the projections. Next, by using Dscam mutants or knock-down they observe defects in the projections. Manipulations of the DNs, another group of clock neurons, can induce defects in the s-LNvs axonal form, suggesting an active role of these neurons in the morphology of the s-LNvs.

    3. Reviewer #2 (Public Review):

      The paper from Liu et al shows a mechanism by which axons can change direction during development. They use the sLNv neurons as a model. They find that the appearance of a new group of neurons (DNs) during post-embryonic proliferation secretes netrins and repels horizontally towards the midline, the axonal tip of the LNvs. The experiments are well done and the results are conclusive.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The mechanisms by which axonal projections find their correct target require the interplay of signalling pathways and cell adhesion that act over short and long distances. The current study aims to use the small ventral lateral clock neurons (s-LNvs) of the Drosophila clock circuit as a model to study axon projections. These neurons are born during embryonic stages and are part of the core of the clock circuit in the larval brain. Moreover, these neurons are maintained through metamorphosis and become part of the adult clock circuit. The authors use the axon length, visualized by anti-Pdf antibody or Pdf>GFP, as a read-out. Using ablation of the MB, the overall target region of the s-LNvs, the authors find defects in the projections. Next, by using Dscam mutants or knock-down they observe defects in the projections. Manipulations of the DNs, another group of clock neurons, can induce defects in the s-LNvs axonal form, suggesting an active role of these neurons in the morphology of the s-LNvs.

      Strengths:

      The use of Drosophila genetics and a specific neural type allows targeted manipulations with high precision.

      Proposing a new model for a small group of neurons for axonal projections allows us to explore the mechanism with high precision.

      Weaknesses:

      It is unclear how far the proposed model can be seen as developmental.

      The study of changes in fully differentiated and functioning neurons may affect the interpretation of the findings.

      We appreciate the reviewer's feedback on the strengths and weaknesses of our study.

      We acknowledge the strengths of our research, particularly the precision afforded by using Drosophila genetics and a specific neural type for targeted manipulations, as well as the proposal of a new model for studying axonal projections in a small group of neurons.

      We understand the concerns about the developmental aspects of our proposed model and the use of Pdf-GAL4 >GFP as a read-out for the axonal length (revised manuscript Figure 1--figure supplement 1). However, even with the use of Clk856-GAL4 that began to be expressed at the embryonic stage (revised manuscript Figure 3--figure supplement 1) to suppress Dscam expression, the initial segment of the dorsal projection of s-LNvs (the vertical part) remained unaffected. Instead, the projection distance is severely shortened towards the midline, and this defect persists until the adult stage. It is for this reason that we delineate the dorsal projections of s-LNvs into two distinct phases: the vertical and horizontal parts, rather than a mere expansion in correspondence with the development of the larval brain.

      Thank you for your valuable feedback, and we have incorporated these considerations into our revised manuscript to enhance the clarity and depth of our research.

      Reviewer #2 (Public Review):

      Summary:

      The paper from Li et al shows a mechanism by which axons can change direction during development. They use the sLNv neurons as a model. They find that the appearance of a new group of neurons (DNs) during post-embryonic proliferation secretes netrins and repels horizontally towards the midline, the axonal tip of the LNvs.

      Strengths:

      The experiments are well done and the results are conclusive.

      Weaknesses:

      The novelty of the study is overstated, and the background is understated. Both things need to be revised.

      We appreciate your acknowledgment that the experiments were well-executed and the results conclusive. This validation reinforces the robustness of our findings.

      We take note of your feedback regarding the novelty of the study being overstated and the background being understated. When axonal projections navigate without distinct landmarks, such as the midline or layers, columns, and segments, they pose more challenges and uncertainties. As highlighted, our key contribution lies in elucidating how axonal projections without clear landmarks are guided, with our research demonstrating how a newly formed cluster of cells at a specific time and location provides the necessary guidance cues for axons.

      We value your insights, and we have carefully addressed these points in our manuscript revision to improve the overall quality and presentation of our research.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      The overall idea of using the s-LNvs as a model is indeed intriguing. There are genetic tools available to tackle these cells with great precision.

      However, the stage at which these cells are investigated raises some issues that I feel are critical to address.

      These neurons develop their axonal projections during embryogenesis and are fully functioning when the larvae hatch, thus to investigate axonal pathfinding one would have to address embryonic development.

      The larval brain indeed continues to grow during larval life, however extensive work from the Hartenstein lab, Truman lab, and others have shown that the secondary (larval born) neurons do not yet wire into the brain, but stall their axonal projections.

      It is thus quite unclear, what the authors are actually studying.

      One interpretation could be that the authors observe changes in axon length due to morphological changes in the brain. Indeed, as the MB expands, the anatomy of the surrounding neuropil changes too.

      Moreover, it is unclear when exactly the Pdf-Gal4 (and other drivers) are active, thus how far (embryonic) development of s-LNvs is affected, or if it's all happening in the differentiated, functioning neuron. (Gal4 temporal delay and dynamics during embryonic development may further complicate the issue). As far as I am aware the MB drivers might already be active during embryonic stages.

      Since the raised issue is quite fundamental, I am not sure what might be the best and most productive fashion to address this.

      E.g., either to completely re-focus the topic on "neural morphology maintenance" or to study the actual development of these cells.

      We thank the reviewer for the detailed and insightful feedback on our study. We have tested whether Pdf-Gal4 could effectively label s-LNv, and tracked the s-LNv projection in the early stage after larvae hatching. We did not observe the PDF antibody staining signal and the GFP signal driven by Pdf-GAL4 when the larvae were newly hatched. At 2-4 hours ALH, PDF signals were primarily concentrated at the end of axons, while GFP signals were mainly concentrated at the cell body. Helfrich-Förster initially detected immunoreactivity for PDF in the brains approximately 4-5 hours ALH. The GFP signal expressed by Pdf-GAL4 driver does have signal delay. However, at 8 hours ALH, the GFP signal strongly co-localized with the PDF signal within the axons (see revised manuscript lines 98-101) (Figure 1—figure supplement 1).

      Based on previous research findings and our staining of Clk856-GAL4 >GFP, it is indeed confirmed that the dorsal projection of s-LNvs in Drosophila is formed during the embryonic stage (Figure 3—figure supplement 1). The s-LNvs in first-instar larval Drosophila are capable of detecting signal output and may play a role in regulating certain behaviors. Our selection of tools for characterizing the projection pattern of s-LNv was not optimal, leading us to overlook the crucial detail that the projection had already formed during its embryonic stage.

      However, even when employing Clk856-GAL4 to suppress Dscam expression from the embryonic stage, the initial segment of the dorsal projection of s-LNvs (the vertical part) remains unaffected. Instead, the projection distance is severely shortened towards the midline, and this defect persists until the adult stage. It is for this reason that we delineate the dorsal projections of s-LNvs into two distinct phases: the vertical and horizontal parts, rather than a mere expansion in correspondence with the development of the larval brain.

      From the results searched in the Virtual Fly Brain (VFB) database (https://www.virtualflybrain.org/), it is clear that the neurons that form synaptic connections with s-LNvs at the adult stage are essentially completely different from the neurons that are associated with them at the L1 larval stage. Thus, most neurons that form synapses with s-LNvs in the early larvae either cease to exist after metamorphosis or assume other roles in the adult stage. Similar to the scenario where Cajal-Retzius cells and GABAergic interneurons establish transient synaptic connections with entorhinal axons and commissural axons, respectively, these cells form a transient circuit with presynaptic targets and subsequently undergo cell death during development. In our model, the neurons that synapse with s-LNvs in early development serve as "placeholders," offering positive or negative cues to guide the axonal targeting of s-LNvs towards their ultimate destination.

      Thank you again for your valuable feedback, and we have incorporated these considerations into our revised manuscript to enhance the clarity and depth of our research.

      Reviewer #2 (Recommendations For The Authors):

      Major:

      In the introduction too many reviews are cited and very few actual research papers. This should be corrected and the most significant papers in the field should be cited. For example, there is no reference to the pioneering work from the Christine Holt lab or the first paper looking at axon guidance and guideposts by Klose and Bentley, Isbister et al 1999.

      The introduction should encapsulate the actual knowledge based on actual research papers.

      We acknowledge your concern regarding the citation of review papers rather than primary research papers in the introduction. Following your suggestion, we have revised the introduction section to incorporate references to relevant research papers.

      In the introduction and discussion: The authors cite reviews where the signals that guide axons across different regions including turning are shown and they end up saying: "However, how the axons change their projection direction without well-defined landmarks is still unclear." I think the sentence should be changed. Many things are still not clear but this is not a good phrasing. Maybe they could focus on their temporal finding?

      We appreciate the reviewer's feedback and insightful suggestions. We agree that emphasizing the temporal aspect is crucial in our study. However, we also recognize the significance of understanding the origin of signals that guide axonal reorientation at specific locations. While axonal projections navigating without distinct landmarks pose more challenges and uncertainties compared to those guided by prominent landmarks like the midline, our research demonstrates the crucial role of a specific cell population near turning points in providing accurate guidance cues to ensure precise axonal reorientation. We have revised our phrasing in the introduction and discussion to better reflect these key points (see revised manuscript lines 69-71 and 350-354). Thank you for highlighting the significance of focusing on our temporal findings and the complexities involved in studying axonal projection.

      Many rather old papers have looked into the effect of repulsive guideposts to guide axon projections. In particular, I can think of the paper from Isbister et al. 1999 (DOI: 10.1242/dev.126.9.2007) that not only shows how semaphorin guides Ti axon projection but also shows how the pattern of expression of sema 2a changes during development to guide the correct projection. I really think that the novelty of the paper should be revised in light of the actual knowledge in the field.

      We appreciate the reviewer's reference to the seminal work by Isbister et al. (1999) and the importance of guidepost cells in axon projection guidance, which we have already cited in our revised manuscript. It is crucial to recognize that segmented patterns such as the limb segment traversed by Ti1 neuron projections or neural circuits formed in a layer- or column-specific manner also serve as intrinsic "guideposts," offering valuable insights into axonal pathfinding processes. In our model, explicit guidance cues are lacking. As highlighted, our key contribution lies in elucidating how axonal projections without clear landmarks are guided, with our research demonstrating how a newly formed cluster of cells at a specific time and location provides the necessary guidance cues for axons (see revised manuscript lines 350-354). We have ensured that our revised manuscript reflects these insights and emphasizes the significance of studying axonal guidance in the absence of distinct guideposts. Thank you for underscoring these essential points, which enhance our understanding of axonal projection dynamics.

      Minors:

      Line 54, the authors start talking about floorplate at the end of a section on Drosophila. Please use “In vertebrates”, or “in invertebrates” or “in Drosophila” etc.. when needed to put things in context.

      We thank the reviewer for this suggestion and have modified this sentence. Please refer to lines 62-63 of the revised manuscript.

      Line 69: many factors change the axonal outgrowth. The authors are missing the paper from Fernandez et al. 2020, who have shown that unc5 the receptor of netrin induces the stalling for sLNvs projections before the turn. https://doi.org/10.1016/j.cub.2020.04.025

      We thank the reviewer for this suggestion and have added this research article. Please refer to line 79 of the revised manuscript.

      Line 99: "precisely at the pivotal juncture". It is hard to see how it was done in the figures shown. Can the authors add a small panel with neuronal staining showing this (please no HRP)?

      For all figures, the magenta is too strong and it is really hard to see the sLNvs projections. Can this be sorted, please?

      We have depicted the pivotal juncture in the schematic diagram on the left side of Figure 1C. Additionally, we have included a separate column of images without HRP in Figure 1A. Moreover, we have modified the pseudo-color of HRP from magenta to blue to enhance the visualization of the s-LNv projection. The figure legends have also been correspondingly modified.

      Line 407: Spatial position relationship between calyx and s-LNvs. OK107-GAL4 labels ... calyx and s-LNvs labeled by, which which.

      We have modified it according to your suggestion. Please refer to lines 430-432 of the revised manuscript.

      Line 137 typo RPRC

      We thank the reviewer for noticing this mistake, which has now been corrected. Please refer to lines 148-149 of the revised manuscript.

      Section 158-164. the paper from Zhang et al 2019 needs to be cited since they have found the same effect of decreasing Dscam even if they didn't think about horizontal projection.

      Thanks to the suggestion, we have included in the manuscript the phenotype observed by Zhang et al. (2019) upon knocking down Dscam1-L in adults. Please refer to lines 170-172 of the revised manuscript.

      Line 176: typo senses (instead of sensor).

      Thank you for pointing out our mistake. We have modified it according to your suggestion. Please refer to line 189 of the revised manuscript.

      Line 193: more than Interesting it is Notable. Add "ubiquitous" knockdown.

      Thank you for the suggestion. We have included the word "ubiquitous" to enhance the precision of the narrative. Please refer to line 206 of the revised manuscript.

      Line 224: the pattern of expression of the crz cells is not visible where the projections of sLNvs are located. Are they in that region? Or further away?

      We've changed the pseudo-color of HRP, and in the updated Figure 5- figure supplement 1, you can see the projection pattern of crz+ cells, positioned close to the end of the s-LNv axon terminal.

      Line 243: applied? Do you mean "used"

      Thank you for the suggestion. We have revised it at line 256.

      Figure 5 Sup1: the schematic shows DNs proliferation that is not visible on the GFP image. Please comment.

      We have modified Figure 5—figure supplement 1 for the 120 h per-GAL4, Pdf-GAL80 >GFP expression pattern. Due to the strong GFP intensity in some DN neurons, there was a loss of GFP signal. Additionally, in Figure 6—figure supplement 1, we have added co-localization images of DN and s-LNv at 72 h and 96 h. To better illustrate the co-localization information, we have shown only a portion of the layers in the right panel. We hope these additions clarify your concerns.

      Line 251: cite Fernandez et al. 2020 with Purohit et al 2012.

      We have modified it according to your suggestion. Please refer to line 264 of the revised manuscript.

      Line 272: you have not shown synergistic effects because you have not modulated both pathways at the same time. You should talk about complementary.

      We have modified it according to your suggestion at lines 25, 285, 439.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      (1) Point for more elaborate discussion: Apparently the timescale of negative feedback signals is conserved between endothelial cell migration in vitro (with human cells) and endothelial migration during the formation of ISVs in zebrafish. What do you think might be an explanation for such conserved timescales? Are there certain processes within cytoskeletal tension build up that require this quantity of time to establish? Or does it relate to the time that is needed to begin to express the YAP/TAZ target genes that mediate feedback?

      The underlying mechanisms responsible for the conserved timescale are a major direction that we continue to explore. Localization of YAP/TAZ to the nucleus is likely not rate-limiting. We showed previously that acute RhoA activation produced significant YAP/TAZ nuclear localization within minutes, while subsequent co-transcriptional activity aligned with the gene expression dynamics observed here (Berlew et al., 2021). We hypothesize that the dynamics of YAP/TAZ-dependent transcription and the translation of those target genes are rate-limiting for initial feedback loop completion (tic = 4 hours). This is supported by work from us and others in a variety of cell lines showing YAP/TAZ transcriptional responses take place during the first few hours after activation (Franklin et al., 2020; Mason et al., 2019; Plouffe et al., 2018). While our data identify mediators of initial feedback loop completion, the molecular effectors that determine the timescale of new cytoskeletal equilibrium establishment (teq = 8 hours) remain unclear.

      (2) Do you expect different timescales for slower endothelial migratory processes (e.g. for instance during fin vascular regeneration which takes days)?

      We selected the ISV development model because it exhibits similar migratory kinetics to our previously-explored human ECFC migration in vitro. The comparable kinetics allowed us to study dynamics of the feedback loop in vivo on similar time scales, but we have not explored models featuring either slower or faster dynamics. 

      It would be interesting to test how feedback dynamics are impacted in distinct endothelial migratory processes. Our data suggest that the feedback loop is necessary for persistent migration; however, YAP and TAZ respond to a diversity of upstream regulators in addition to mechanical signals, which might depend on the process of vascular morphogenesis. For example, after fin amputation, inflammation and tissue regeneration may impact the biochemical and mechanical environment experienced by the endothelium. Additionally, cells display different migratory behaviors in ISV morphogenesis compared to fin regeneration. During ISV formation, sprouting tip cells migrate dorsally through avascular tissue, followed by stalk cells. (Ellertsdóttir et al., 2010) In contrast, the fin vasculature regenerates by forming an intermediate vascular plexus, where some venous-derived endothelial cells migrate towards the sprouting front, while others migrate against it. (Xu et al., 2014) We are excited to study the role of this feedback loop in these different modes of neovessel formation in future studies.

      (3) Is the ~4hrs and 8hrs feedback time window a general property or does it differ between specific endothelial cell types? In the veins the endothelial cells generate fewer stress fibers and adhesions compared to in the arteries. Does this mean that there might be a difference in the feedback time window, or does that mean that certain endothelial cell types may not have such a YAP/TAZ-controlled feedback system?

      Recent studies suggest that venous endothelial cells are the primary endothelial subtype responsible for blood vessel morphogenesis. (Lee et al., 2022, 2021; Xu et al., 2014) They are highly motile and mechanosensitive, migrating against blood flow. (Lee et al., 2022) The Huveneers group has shown that the actin cytoskeleton is differently organized in adult arteries and veins in response to biomechanical properties of its extracellular matrix, rather than intrinsic differences between arterial and venous cells. (van Geemen et al., 2014) This suggests that arterial and venous cells have distinct cytoskeletal setpoints due to mechanical cues in their environment (Price et al., 2021). We expect this to impact the degree of cytoskeletal remodeling and cell migration at equilibrium, rather than the kinetics of the feedback loop per se, though we have not yet tested this hypothesis. Testing these predictions on cytoskeletal setpoint stability and adaptation is a major direction that we continue to explore. 

      (4) The experiments are based on perturbations to prove that transcriptional feedback is needed for endothelial migration. What would happen if the feedback system is always switched on? An experiment to add might be to analyse the responsiveness of endothelial cells expressing constitutively active YAP/TAZ.

      This is a problem that we are actively pursuing. Though the feedback system forms a coherent loop, we anticipate that the identity of the node of the loop selected for constitutive activation will influence the outcome, depending on whether that node is rate-limiting for feedback kinetics and the extent of intersection of that node with other signaling events in the cell. For example, we have observed that constitutive YAP activation drives profound changes to the transcriptional landscape including, but not limited to, RhoA signaling (Jones et al., 2023). We further anticipate that constitutive activation of feedback loop nodes may alter feedback dynamics, while dynamic or acute perturbation will be required to dissect these contributions in real time. For these reasons, ongoing work in the lab is pursuing these questions using optogenetic tools that enable precise spatial and temporal control (Berlew et al., 2021).   

      (5) To investigate the role of YAP-mediated transcription in an accurate time-dependent manner the authors may consider using the recently developed optogenetic YAP translocation tool: https://doi.org/10.15252/embr.202154401

      We are enthusiastic about the power of optogenetics to interrogate the nodes and timescales of this feedback system, and we are now funded to pursue this line of research. 

      Reviewer #2:

      The idea is intriguing, but it is not clear how the feedback actually works, so it is difficult to determine if the events needed could occur within 4 hrs. Specifically, it is not clear what gene changes initiated by YAP/TAZ translocation eventually lead to changes in Rho signaling and contractility. Much of the evidence to support the model is preliminary. Some of the data is consistent with the model, but alternative explanations of the data are not excluded. The fish washout data is quite interesting and does support the model. It is unclear how some of the in vitro data supports the model and excludes alternatives.

      Major strengths:

      The combination of in vitro and in vivo assessment provides evidence for timing in physiologically relevant contexts, and a rigorous quantification of outputs is provided. The idea of defining temporal aspects of the system is quite interesting.

      Major weaknesses:

      The evidence for a "loop" is not strong; rather, most of the data can also be interpreted as a linear increase in effect with time once a threshold is reached. Washout experiments are key to setting up a time window, yet these experiments are presented only for the fish model. A major technical challenge is that siRNA experiments take time to achieve depletion status, making precise timing of events on short time scales problematic. Also, Actinomycin D blocks most transcription so exposure for hours likely leads to secondary and tertiary effects and perhaps effects on viability. No RNA profiling is presented to validate proposed transcriptional changes.

      We thank the reviewer for these helpful suggestions. We have expanded our explanation of the history and known mediators of the feedback loop in the introduction. We and, independently, the Huveneers group recently reported that human endothelial cells maintain cytoskeletal equilibrium for persistent motility through a YAP/TAZ-mediated feedback loop that modulates cytoskeletal tension. (Mason et al., 2019; van der Stoel et al., 2020) Because YAP and TAZ are activated by tension of the cytoskeleton (Dupont et al., 2011), suppression of cytoskeletal tension by YAP/TAZ transcriptional target genes constitutes a negative feedback loop (Fig. 1A). We described key components of this cell-intrinsic feedback loop, which acts as a control system to maintain cytoskeletal homeostasis for persistent motility via modulation of Rho-ROCK-myosin II activity. (Mason et al., 2019) Both we and the Huveneers group found that perturbation of genes and pathways regulated by YAP/TAZ mechanoactivation can functionally rescue motility in YAP/TAZ-depleted cells (e.g., RhoA/ROCK/myosin II, NUAK2, DLC1). (Mason et al., 2019; van der Stoel et al., 2020) We further showed previously that both YAP/TAZ depletion and acute YAP/TAZ-TEAD inhibition consistently increased stress fiber and FA maturation and arrested cell motility, accounting for these limitations of siRNA. (Mason et al., 2019)

      Enduring limitations to the temporal, spatial, and cell-specific control of the genetic and pharmacologic methods have inspired us to initiate alternative approaches, which are the subject of ongoing efforts. Further research will be necessary in the zebrafish to determine the extent to which the observed migratory dynamics are driven by cytoskeletal arrest. 

      To identify early YAP/TAZ-regulated transcriptional changes, we have added RNA profiling of control and YAP/TAZ depleted cells cultured on stiff matrices for four hours. Genes upregulated by YAP/TAZ depletion were enriched for Gene Ontology (GO) terms associated with Rho protein signal transduction, vascular development, cellular response to vascular endothelial growth factor (VEGF) stimulus, and endothelial cell migration (Fig. 9B). These data support a role for YAP and TAZ as negative feedback mediators that maintain cytoskeletal homeostasis for endothelial cell migration and vascular morphogenesis.  
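
      The response does not state which statistical test produced the GO enrichment in Fig. 9B. As a minimal sketch, assuming a standard one-sided hypergeometric over-representation test, the calculation for a single GO term could look like the following. The function name and all gene counts are hypothetical and are not taken from the manuscript.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """One-sided over-representation p-value, P(X >= overlap).

    population: number of genes in the background set
    annotated:  background genes annotated with the GO term
    selected:   genes in the list of interest (e.g. upregulated genes)
    overlap:    selected genes that carry the GO term
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    # Sum the hypergeometric probabilities from the observed overlap upward.
    tail = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    p = hypergeom_enrichment_p(population=15000, annotated=120,
                               selected=400, overlap=12)
    print(f"enrichment p-value: {p:.3g}")
```

      In practice such p-values would also be corrected for multiple testing across all GO terms examined.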

      Reviewer #3:

      The authors used ECFC - endothelial colony forming cells (circulating endothelial cells that activate in response to vascular injury).

      Q: Did the authors characterize these cells and make sure that they are truly endothelial cells - for example, by examining specific endothelial markers, arterial-venous identity markers & Notch signalling status, overall morphology, etc. prior to the start of the experiment? How were ECFCs isolated from human individuals, are these "healthy" volunteers - any underlying CVD risk factors, cells from one patient or from pooled samples, what injury were these humans exposed to that triggered the release of the ECFCs into the circulation, etc.? The materials & methods on ECFCs should be expanded.

      Human umbilical cord blood-derived ECFCs were isolated at Indiana University School of Medicine and kindly provided by Dr Mervin Yoder. Cells were cultured as described by the Yoder group (Rapp et al., 2011) and our prior paper (Mason et al., 2019). We have expanded the materials and methods section to describe the source and characterization of these cells.

      The authors suggest that loss of YAP/TAZ phenocopies actinomycin-D inhibition - "both transcription inhibition and YAP/TAZ depletion impaired polarization, and induced robust ventral stress fiber formation and peripheral focal adhesion maturation". However, the cell size of actinomycin-D treated cells (Fig. 1B, top right panel), differs from the endothelial cell size upon siYAP/TAZ (Fig. 1E, top right panel) - and vinculin staining seems more pronounced in actinomycin-D treated cells (B, bottom right) when compared to siYAP/TAZ group. Cell shape is defined by acto-myosin tension.

      Q: Besides fraction of focal adhesions >1 μm and focal adhesion number, did the authors measure additional parameters related to cytoskeleton remodelling / focal adhesions that can substantiate their statement on similarity between loss of YAP/TAZ and actinomycin-D treatment? Would it be possible to make a more specific genetic intervention (besides YAP/TAZ) interfering with the focal adhesion pathway, as opposed to the broad-spectrum inhibitor actinomycin-D?

      Our previous paper (Mason et al., 2019) delineated the mechanistic relationships between YAP/TAZ signaling, focal adhesion turnover, actomyosin polymerization, and the intervening mechanisms of myosin regulation. Specifically, we demonstrated that YAP/TAZ regulate the myosin phosphatase kinase, NUAK2, and ARHGAP genes to mediate this feedback. Expanding on this work, the current study aimed to define the temporal kinetics of the cytoskeletal mechanotransductive feedback in vitro and in vivo. We used actinomycin-D and YAP/TAZ depletion to interrogate the role of transcriptional regulation and YAP/TAZ signaling, respectively. In this revision, we have added RNA profiling that identifies early YAP/TAZ-regulated transcriptional changes and further points to other molecular mediators of focal adhesions (e.g. TRIO, RHOB, THBS1) that will be the subjects of future studies.    

      Q: Does the actinomycin-D treatment affect responsiveness to Vegf? induce apoptosis or reduce survival of the ECFC?

      We have not looked specifically at the effect of actinomycin-D treatment on responsiveness to VEGF. However, actinomycin-D has been reported to reduce transcription of VEGF receptors (E et al., 2012). In contrast, we found that YAP/TAZ depletion upregulated GO terms associated with endothelial cell migration and response to VEGF stimulus (Fig. 9B), as well as receptors to angiogenic growth factors, including KDR and FLT4 (Fig. 9E). These results suggest YAP/TAZ depleted cells may be more sensitive to VEGF stimulation but remain nonmotile due to cytoskeletal arrest.

      We showed previously that long-term treatment with actinomycin-D reduces ECFC survival (Mason et al., 2019).

      Q: Which mechanism links ECM stiffness with endothelial surface area in the authors' scenario? In zebrafish, activity of the endothelial guanine exchange factor Trio specifically at endothelial cell junctions (Klems, Nat Comms, 2020) and endoglin in response to hemodynamic factors (Siekmann, Nat Cell Biol 2017) have been shown to control EC shape/surface area - do these factors play a role in the scenario proposed by the authors?

      Our new transcriptional profiling indicates both Trio and endoglin are regulated through YAP and TAZ in human ECFCs. We plan to follow up on these findings.

      Q: The authors report that EC migrate faster on stiff substrate, and concomitantly these cells have a larger surface area. What is the physiological rationale behind these observations. Did the authors observe such behaviors in their zebrafish ISV model? How do these observations integrate with the tip - stalk cell shuffling model (Jakobsson & Gerhardt, Nat Cell Biol, 2011) and Notch activity in developing ISVs.

      This question raises important distinctions between the mode of migration in ISV morphogenesis and endothelial cells adherent to substrates. Cells behave and respond to mechanical cues differently in 2D vs. 3D matrices (LaValley and Reinhart-King, 2014). Additionally, the microenvironment in vivo is much more complex, combining numerous biochemical signals and changing mechanical properties (Whisler et al., 2023). We are actively investigating the downstream targets of YAP/TAZ mechanotransduction and how they integrate with other pathways known to regulate vascular morphogenesis, such as Notch signaling.

      The authors examined the formation of arterial intersegmental vessels in the trunk of developing zebrafish embryos in vivo. They used a variety of pharmacological inhibitors of transcription and acto-myosin remodelling and linked the observed morphological changes in ISV morphogenesis with changes in endothelial cell motility.

      Q: Reduced formation and dorsal extension of ISVs may have several reasons, including reduced EC migration and proliferation. The Tg(fli1a:EGFP) reporter however is not the most suitable line to monitor migration of individual endothelial cells. Can the authors repeat the experiments in Tg(fli1a:nEGFP); Tg(kdrl:HRAS-mCherry) double transgenics to visualize movement-migration of the individual endothelial cells and EC proliferation events, in the different treatment regimes?

      So far, we have not tracked individual endothelial cells during ISV morphogenesis. We agree this is the best approach and are pursuing a similar technique for these experiments.

      ISV formation is furthermore affected by Notch signalling status and a series of (repulsive) guidance cues.

      Q: Does de novo blockade of gene expression with Actinomycin D affect Notch signalling status, expression of PlexinD - sFlt1, netrin1 or arterial-venous identity genes?

      While we have not performed gene expression analysis under the Actinomycin D condition, Actinomycin D functions as a broad transcription inhibitor. We are currently pursuing the downstream targets of YAP/TAZ mechanotransduction in both ECFCs and zebrafish.

      Remark: The authors may want to consider using the Tg(fli1:LIFEACT-GFP) reporter for in vivo imaging of actin remodelling events.

      We thank the reviewer for their helpful suggestion.

      Remark: the authors report "As with broad transcription inhibition, in situ depletion of YAP and TAZ by RNAi arrested cell motility, illustrated here by live-migration sparklines over 10 hours: siControl: , siYAP/TAZ: (25 μm scale-bar: -)". Can the authors make a separate figure panel for this, how many cells were measured?

      Please refer to our previous publication for the complete details on this data (Mason et al., 2019). We have added the citation in the text.

      Remark: in the wash-out experiments, exposure to the inhibitors is not the same in the different scenarios - could it be that the longer exposure time induces "toxic" side effect that cannot be "washed out" when compared to the short treatment regimes?

      This is a possible limitation of the pharmacological approach, and we have included it in the discussion section. We are currently exploring alternative approaches to interrogate the timescale of the feedback loop more precisely.

      References

      Berlew EE, Kuznetsov IA, Yamada K, Bugaj LJ, Boerckel JD, Chow BY. 2021. Single-Component Optogenetic Tools for Inducible RhoA GTPase Signaling. Advanced Biology 5:2100810. doi:10.1002/adbi.202100810

      Dupont S, Morsut L, Aragona M, Enzo E, Giulitti S, Cordenonsi M, Zanconato F, Le Digabel J, Forcato M, Bicciato S, Elvassore N, Piccolo S. 2011. Role of YAP/TAZ in mechanotransduction. Nature 474:179–183. doi:10.1038/nature10137

      E G, Cao Y, Bhattacharya S, Dutta S, Wang E, Mukhopadhyay D. 2012. Endogenous Vascular Endothelial Growth Factor-A (VEGF-A) Maintains Endothelial Cell Homeostasis by Regulating VEGF Receptor-2 Transcription. J Biol Chem 287:3029–3041. doi:10.1074/jbc.M111.293985

      Ellertsdóttir E, Lenard A, Blum Y, Krudewig A, Herwig L, Affolter M, Belting H-G. 2010. Vascular morphogenesis in the zebrafish embryo. Developmental Biology, Special Section: Morphogenesis 341:56–65. doi:10.1016/j.ydbio.2009.10.035

      Franklin JM, Ghosh RP, Shi Q, Reddick MP, Liphardt JT. 2020. Concerted localization-resets precede YAP-dependent transcription. Nat Commun 11:4581. doi:10.1038/s41467-020-18368-x

      Jones DL, Hallström GF, Jiang X, Locke RC, Evans MK, Bonnevie ED, Srikumar A, Leahy TP, Nijsure MP, Boerckel JD, Mauck RL, Dyment NA. 2023. Mechanoepigenetic regulation of extracellular matrix homeostasis via Yap and Taz. Proceedings of the National Academy of Sciences 120:e2211947120. doi:10.1073/pnas.2211947120

      LaValley DJ, Reinhart-King CA. 2014. Matrix stiffening in the formation of blood vessels. Advances in Regenerative Biology 1:25247. doi:10.3402/arb.v1.25247

      Lee H-W, Shin JH, Simons M. 2022. Flow goes forward and cells step backward: endothelial migration. Exp Mol Med 54:711–719. doi:10.1038/s12276-022-00785-1

      Lee H-W, Xu Y, He L, Choi W, Gonzalez D, Jin S-W, Simons M. 2021. Role of Venous Endothelial Cells in Developmental and Pathologic Angiogenesis. Circulation 144:1308–1322. doi:10.1161/CIRCULATIONAHA.121.054071

      Mason DE, Collins JM, Dawahare JH, Nguyen TD, Lin Y, Voytik-Harbin SL, Zorlutuna P, Yoder MC, Boerckel JD. 2019. YAP and TAZ limit cytoskeletal and focal adhesion maturation to enable persistent cell motility. Journal of Cell Biology 218:1369–1389. doi:10.1083/jcb.201806065

      Plouffe SW, Lin KC, Moore JL, Tan FE, Ma S, Ye Z, Qiu Y, Ren B, Guan K-L. 2018. The Hippo pathway effector proteins YAP and TAZ have both distinct and overlapping functions in the cell. J Biol Chem 293:11230–11240. doi:10.1074/jbc.RA118.002715

      Price CC, Mathur J, Boerckel JD, Pathak A, Shenoy VB. 2021. Dynamic self-reinforcement of gene expression determines acquisition of cellular mechanical memory. Biophysical Journal 120:5074–5089. doi:10.1016/j.bpj.2021.10.006

      Rapp BM, Saadatzedeh MR, Ofstein RH, Bhavsar JR, Tempel ZS, Moreno O, Morone P, Booth DA, Traktuev DO, Dalsing MC, Ingram DA, Yoder MC, March KL, Murphy MP. 2011. Resident Endothelial Progenitor Cells From Human Placenta Have Greater Vasculogenic Potential Than Circulating Endothelial Progenitor Cells From Umbilical Cord Blood. Cell Med 2:85–96. doi:10.3727/215517911X617888

      Tammela T, Zarkada G, Nurmi H, Jakobsson L, Heinolainen K, Tvorogov D, Zheng W, Franco CA, Murtomäki A, Aranda E, Miura N, Ylä-Herttuala S, Fruttiger M, Mäkinen T, Eichmann A, Pollard JW, Gerhardt H, Alitalo K. 2011. VEGFR-3 controls tip to stalk conversion at vessel fusion sites by reinforcing Notch signalling. Nat Cell Biol 13:1202–1213. doi:10.1038/ncb2331

      van der Stoel M, Schimmel L, Nawaz K, van Stalborch A-M, de Haan A, Klaus-Bergmann A, Valent ET, Koenis DS, van Nieuw Amerongen GP, de Vries CJ, de Waard V, Gloerich M, van Buul JD, Huveneers S. 2020. DLC1 is a direct target of activated YAP/TAZ that drives collective migration and sprouting angiogenesis. Journal of Cell Science 133:jcs239947. doi:10.1242/jcs.239947

      van Geemen D, Smeets MWJ, van Stalborch A-MD, Woerdeman LAE, Daemen MJAP, Hordijk PL, Huveneers S. 2014. F-Actin–Anchored Focal Adhesions Distinguish Endothelial Phenotypes of Human Arteries and Veins. Arteriosclerosis, Thrombosis, and Vascular Biology 34:2059–2067. doi:10.1161/ATVBAHA.114.304180

      Whisler J, Shahreza S, Schlegelmilch K, Ege N, Javanmardi Y, Malandrino A, Agrawal A, Fantin A, Serwinski B, Azizgolshani H, Park C, Shone V, Demuren OO, Del Rosario A, Butty VL, Holroyd N, Domart M-C, Hooper S, Szita N, Boyer LA, Walker-Samuel S, Djordjevic B, Sheridan GK, Collinson L, Calvo F, Ruhrberg C, Sahai E, Kamm R, Moeendarbary E. 2023. Emergent mechanical control of vascular morphogenesis. Science Advances 9:eadg9781. doi:10.1126/sciadv.adg9781

      Xu C, Hasan SS, Schmidt I, Rocha SF, Pitulescu ME, Bussmann J, Meyen D, Raz E, Adams RH, Siekmann AF. 2014. Arteries are formed by vein-derived endothelial tip cells. Nat Commun 5:5758. doi:10.1038/ncomms6758

    2. eLife assessment

      This valuable manuscript delineates the role of YAP/TAZ-dependent transcriptional suppression in a mechanotransductive feedback loop. The evidence presented in the manuscript is generally solid. However, compared to an earlier version, some concerns remain. In particular, the in vivo validation should be strengthened, and the in vitro and in vivo models used in this work should be carefully compared in order to improve the main message of the manuscript.

    3. Reviewer #1 (Public Review):

      This manuscript puts forward the concept that there is a specific time window during which YAP/TAZ-driven transcription provides feedback for optimal endothelial cell adhesion, cytoskeletal organization and migration. The study follows up on previous elegant findings from this group and others which established the importance of YAP/TAZ-mediated transcription for persistent endothelial cell migration. The data presented here extend the concept at two levels: first, the data may explain why there are differences between experimental setups where YAP/TAZ activity is inhibited for prolonged times (e.g. cultures of YAP knockdown cells), versus experiments in which the transient inhibition of YAP/TAZ and (global) transcription affects endothelial cell dynamics prior to their equilibrium.

      All experiments are convincing, clearly visualized and quantified.

      The strength of the paper is that it clearly indicates that there are temporally controlled feedback systems, which are important for endothelial collective cell behavior.

      A limitation of the study is that the inhibitory studies in vivo may include off-target effects as well. Future endeavors, including specific knockout models, optogenetics and/or transgenic zebrafish lines that visualize endothelial cell properties (proliferation and migration) will be informative to track individual endothelial cell responses upon feedback signals.

    4. Reviewer #2 (Public Review):

      Summary:

      Here the effect of overall transcription blockade, and then specifically depletion of YAP/TAZ transcription factors was tested on cytoskeletal responses, starting from a previous paper showing YAP/TAZ-mediated effects on the cytoskeleton and cell behaviors. Here, primary endothelial cells were assessed on substrates of different stiffness and parameters such as migration, cell spreading, and focal adhesion number/length were tested upon transcriptional manipulation. Zebrafish subjected to similar manipulations were also assessed during the phase of intersegmental vessel elongation. The conclusion was that there is a feedback loop of 4 hours that is important for the effects of mechanical changes to be translated into transcriptional changes that then permanently affect the cytoskeleton.

      The idea is intriguing and a previous paper contains data supporting the overall model. The fish washout data is quite interesting and supports the kinetics conclusions. New transcriptional profiling in this version supports that cytoskeletal genes are differentially regulated with YAP/TAZ manipulations.

      Major strengths: The combination of in vitro and in vivo assessment provides evidence for timing in physiologically relevant contexts, and rigorous quantification of outputs is provided. The idea of defining temporal aspects of the system is quite interesting. New RNA profiling supports the model.

      Weaknesses:

      Actinomycin D blocks most transcription so exposure for hours likely leads to secondary and tertiary effects and perhaps effects on viability.

    5. Reviewer #4 (Public Review):

      Summary:

      Mason DE et al. have extended their previous study on continuous migration of cells regulated by a feedback loop that controls gene expression by YAP and TAZ. The time scale of the negative feedback loop is derived from the authors' adhesion-spreading-polarization-migration (ASPM) assay. Involvement of transcription-translation in the negative feedback loop is evidenced by the experiments using Actinomycin D. The time scale of mechanotransduction-dependent feedback is demonstrated by cytoskeletal alteration in the Actinomycin D-treated endothelial colony forming cells (ECFCs) and in the ECFCs depleted of YAP/TAZ by siRNA. The authors examine the time scale when ECFCs are attached to MeHA matrices (soft, moderate, and stiff substrate) and show the conserved time scale among the conditions they use, although instantaneous migration, cell area, and circularity vary. Finally, they tried to confirm the time scale of the feedback loop-dependent endothelial migration by the effect of washout of Actinomycin D (inhibition of gene transcription), Puromycin (translational inhibition), and Verteporfin (YAP/TAZ inhibitor) on ISV extension during sprouting angiogenesis. They conclude that endothelial motility required for vascular morphogenesis is regulated by a mechanotransduction-mediated feedback loop that is dependent on YAP/TAZ-dependent transcriptional regulation.

      Strengths:

      The authors conduct the ASPM assay to find the time scale of feedback when ECFCs attach to three different matrices. They observe a common time scale of feedback. Thus, under the very specific conditions they use, the reproducibility is validated by their ASPM assay. The feedback loop mediated by inhibition of gene expression by Actinomycin D is similar to that obtained from YAP/TAZ-depleted cells, suggesting that mechanotransduction might be involved in the feedback loop. The time scale representing the inflection point might be interesting when considering the continuous motility in cultured endothelial cells, although it might not account for the migration of endothelial cells that is controlled by a wide variety of extracellular cues. In vivo, stiffness of the extracellular matrix is merely one of the cues.

      Weaknesses:

      The ASPM assay is based on an attachment-dependent phenomenon. The time scale including the inflection point determined by ASPM experiments using cultured cells and the mechanotransduction-based theory do not seem to fit in vivo ISV elongation. Although it is challenging to find a conserved theory of continuous cell motility of endothelial cells, the data is preliminary and does not support the authors' claim. There is no evidence that mechanotransduction solely determines the feedback loop during elongation of ISVs. The points to be addressed are listed in recommendations for the authors.

    1. Reviewer #3 (Public Review):

      Summary:

      In Okholm et al., the authors evaluate the functional impact of circHIPK3 in bladder cancer cells. By knocking it down and performing an RNA-seq analysis, the authors found a thousand deregulated genes which appear unaffected by miRNA-sponging function and that are, instead, enriched for an 11-mer motif. Further investigations showed that the 11-mer motif is shared with circHIPK3 and able to bind the IGF2BP2 protein. The authors validated the binding of IGF2BP2 and demonstrated that IGF2BP2 KD antagonizes the effect of circHIPK3 KD and leads to the upregulation of genes containing the 11-mer. Among the genes affected by circHIPK3 KD and IGF2BP2 KD, resulting in downregulation and upregulation respectively, the authors found the STAT3 gene, which also consistently leads to the concomitant upregulation of one of its targets, TP53. The authors propose a mechanism of competition between circHIPK3 and IGF2BP2 triggered by IGF2BP2 nucleation, potentially via phase separation.

      Strengths:

      The number of circRNAs continues to grow drastically; however, the field lacks detailed molecular investigations. The presented work critically addresses some of the major pitfalls in the field of circRNAs, and there has been a careful analysis of aspects that are frequently poorly investigated. The time-point KD followed by RNA-seq, investigation of the miRNA-sponge function of circHIPK3, identification of the 11-mer motif, identification and validation of IGF2BP2, and the analysis of the copy number ratio between circHIPK3 and IGF2BP2 in assessing the potential ceRNA mode of action have been extensively explored and are, overall, convincing.

      Weaknesses:

      The authors addressed the majority of the weak points raised initially. However, the role played by the circHIPK3 in cancer remains elusive and not elucidated in full in this study.

      Overall, the presented study surely adds some further knowledge in describing circHIPK3 function, its capability to regulate some downstream genes, and its interaction and competition for IGF2BP2. However, whereas the experimental part sounds technically logical, the overall goal of this study and its final conclusions remain unclear.

      This study is a promising step forward in the comprehension of the functional role of circHIPK3. These data could possibly help to better understand the circHIPK3 role in cancer.

    2. eLife assessment

      This work explores the role of one of the most abundant circRNAs, circHIPK3, in bladder cancer cells, showing with convincing data that circHIPK3 depletion affects thousands of genes and that those downregulated (including STAT3) share an 11-mer motif with circHIPK3, corresponding to a binding site for IGF2BP2. The experiments demonstrate that circHIPK3 can compete with the downregulated mRNA targets for IGF2BP2 binding and that IGF2BP2 depletion antagonizes the effect of circHIPK3 depletion by upregulating the genes containing the 11-mer. These important findings contribute to the growing recognition of the complexity of cancer signaling regulation and highlight the intricate interplay between circRNAs and protein-coding genes in tumorigenesis.

    3. Reviewer #1 (Public Review):

      In this work the authors propose a new regulatory role for one of the most abundant circRNAs, circHIPK3. They demonstrate that circHIPK3 interacts with an RNA binding protein (IGF2BP2), sequestering it away from its target mRNAs. This interaction is shown to regulate the expression of hundreds of genes that share a specific sequence motif (11-mer motif) in their untranslated regions (3'-UTR), identical to one present in circHIPK3 where IGF2BP2 binds. The study further focuses on the specific case of STAT3 gene, whose mRNA product is found to be downregulated upon circHIPK3 depletion. This suggests that circHIPK3 sequesters IGF2BP2, preventing it from binding to and destabilizing STAT3 mRNA. The study presents evidence supporting this mechanism and discusses its potential role in tumor cell progression. These findings contribute to the growing complexity of understanding cancer regulation and highlight the intricate interplay between circRNAs and protein-coding genes in tumorigenesis.

      Strengths:

      The authors show mechanistic insight into a proposed novel "sponging" function of circHIPK3 which is not mediated by sequestering miRNAs but rather a specific RNA binding protein (IGF2BP2). They address the stoichiometry of the molecules involved in the interaction, which is a critical aspect that is frequently overlooked in this type of study. They provide both genome-wide analysis and a specific case (STAT3) which is relevant for cancer progression. Overall, the authors have significantly improved their manuscript in their revised version.

      Weaknesses:

      There are seemingly contradictory effects of circHIPK3 and STAT3 depletion in cancer progression. However, the authors have addressed these issues in their revised manuscript, incorporating potential reasons that might explain such complexity.

    4. Reviewer #2 (Public Review):

      The manuscript by Okholm and colleagues identified an interesting new instance of ceRNA involving a circular RNA. The data are clearly presented and support the conclusions. Quantification of the copy number of circRNA and quantification of the protein were performed, and this is important to support the ceRNA mechanism.

      This is the second rebuttal and the authors further improved the manuscript. The data are of interest to the large spectrum of readers of the journal.

      Comments on revision:

      The authors explain that they have compared primer efficiencies of two linear Laccase version amplicons and their divergent primers targeting circHIPK3 using amplification standard curves (not shown). They claim that all amplicons were found to be directly comparable, ensuring that their estimation of the circRNA:linear ratio by RT-qPCR was accurate. I agree that this is not a technically trivial experiment. However, for this measurement to be valid, it is not enough to compare the efficiencies of primers using cDNA/DNA standard curves in the context of the qPCR reaction alone. Instead, one should perform the full RT-qPCR tandem reactions in the context of standard curves of the specific RNAs (for example, obtained by in vitro synthesis). RNA absolute amounts in these standard curves should be known in order to compare the different RNA species (linear or circular).

      I do not have major concerns about this issue.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Recommendations For The Authors): 

      Major points about revised manuscript 

      (1) While I acknowledge that the Laccase2 vector is probably the best available in terms of its clean circRNA-expression potential, the authors still lack an estimation of the circRNA overexpression efficiency, specifically the circular-to-linear expression ratio. In their second rebuttal letter, the authors argue that they do not have the option to use another probe and that they are limited by the Backsplicing junction (BSJ)-specific one. I assume they mean that such a probe might only partially hybridize with the linear form and therefore give a poor or no signal in the Northern blot. However, in this referee's opinion, it is precisely because of this limitation that the authors should have used another probe against both the linear and circular RNAs to simultaneously and quantitatively detect both isoforms. This would have allowed them to reliably estimate a circular-to-linear ratio. Perhaps the linear isoform is indeed not expressed or is very low for this circRNA overexpression vector, but the probe used by the authors does not prove it. I think that this addition to the manuscript is not strictly necessary at this stage, but it would certainly improve the results.  

      We fully agree with this recommendation. Our efforts to show this using northern blotting were unfortunately unsuccessful due to background signal. To accommodate the question about the circ-to-linear ratio, we instead used an RT-qPCR strategy to measure the linear vs circRNA expression derived from the Laccase-circHIPK3 expression constructs/cell lines. To be able to compare results obtained from different amplicons, we measured primer efficiencies (using amplification standard curves – not shown) of two linear Laccase version amplicons and our divergent primers targeting circHIPK3, which were found to be directly comparable. Using these primer sets in RT-qPCR on the same RNA preparation (total cellular RNA) from the northern blot (Supplementary Figure S5H) revealed a ~4-fold higher expression of circHIPK3 compared to linear precursor RNA (Supplementary Figure S5I).

      This demonstrates that the Laccase vector system efficiently produces circHIPK3 RNA as expected. 
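
      To make the arithmetic behind such a comparison concrete, the following is a minimal sketch, assuming the usual standard-curve relation in which the per-cycle amplification factor is 10^(-1/slope) and both amplicons are read at the same fluorescence threshold. The function names and all Ct values are hypothetical; they are not the authors' data or analysis code.

```python
import math


def standard_curve_slope(log10_amounts, ct_values):
    """Least-squares slope of Ct versus log10(template amount)."""
    n = len(log10_amounts)
    mean_x = sum(log10_amounts) / n
    mean_y = sum(ct_values) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_amounts, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in log10_amounts)
    return sxy / sxx


def amplification_factor(slope):
    """Fold increase per cycle; 2.0 corresponds to 100% efficiency."""
    return 10 ** (-1.0 / slope)


def circ_to_linear_ratio(ct_circ, ct_linear, factor_circ, factor_linear):
    """Relative starting amount of circular vs linear template.

    Starting amount is proportional to factor ** (-Ct), so the ratio is
    factor_linear ** ct_linear / factor_circ ** ct_circ.
    """
    return (factor_linear ** ct_linear) / (factor_circ ** ct_circ)


if __name__ == "__main__":
    # Hypothetical 10-fold dilution series for one primer pair.
    dilutions = [0.0, -1.0, -2.0, -3.0]
    cts = [20.0, 23.4, 26.7, 30.1]
    factor = amplification_factor(standard_curve_slope(dilutions, cts))
    print(f"amplification factor per cycle: {factor:.2f}")
    # Hypothetical Ct values measured on the same RNA preparation.
    print(circ_to_linear_ratio(22.0, 24.0, factor, factor))
```

      As Reviewer #2 notes, this calculation only covers the qPCR step; it does not account for differences in reverse-transcription efficiency between linear and circular templates.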

      The few changes to the manuscript (results section text and reference to Supplementary Figure S5I) have been highlighted in yellow. The materials and methods section and Table S1 have been modified to include a description of RT-qPCR and specific primers.

    1. Reviewer #1 (Public Review):

      Plasticity in the basolateral amygdala (BLA) is thought to underlie the formation of associative memories between neutral and aversive stimuli, i.e. fear memory. Concomitantly, fear learning modifies the expression of BLA theta rhythms, which may be supported by local interneurons. Several of these interneuron subtypes, PV+, SOM+, and VIP+, have been implicated in the acquisition of fear memory. However, it was unclear how they might act synergistically to produce BLA rhythms that structure the spiking of principal neurons so as to promote plasticity. Cattani et al. explored this question using small network models of biophysically detailed interneurons and principal neurons.

      Using this approach, the authors had four principal findings:
      (1) Intrinsic conductances in VIP+ interneurons generate a slow theta rhythm that periodically inhibits PV+ and SOM+ interneurons, while disinhibiting principal neurons.
      (2) A gamma rhythm arising from the interaction between PV+ and principal neurons establishes the precise timing needed for spike-timing-dependent plasticity.
      (3) Removal of any of the interneuron subtypes abolishes conditioning-related plasticity.
      (4) Learning-related changes in principal cell connectivity enhance the expression of slow theta in the local field potential.

      The strength of this work is that it explores the role of multiple interneuron subtypes in the formation of associative plasticity in the basolateral amygdala. The authors use biophysically detailed cell models that capture many of their core electrophysiological features, which helps translate their results into concrete hypotheses that can be tested in vivo. Moreover, they try to align the connectivity and afferent drive of their model with those found experimentally. However, the weakness is that their attempt to align with the experimental literature (specifically Krabbe et al. 2019) is performed inconsistently. Some connections between cell types were excluded without adequate justification (e.g. SOM+ to PV+). In addition, the construction of the afferent drive to the network does not reflect the stimulus presentations that are given in fear conditioning tasks. For instance, the authors only used a single training trial, the conditioning stimulus was tonic instead of pulsed, the unconditioned stimulus duration was artificially extended in time, and its delivery overlapped with the neutral stimulus, instead of following its offset. These deviations undercut the applicability of their findings.

      This study partly achieves its aim of understanding how networks of biophysically distinctive interneurons interact to generate nested rhythms that coordinate the spiking of principal neurons. What still remains to demonstrate is that this promotes plasticity for training protocols that emulate what is used in studies of fear conditioning.

      Setting aside the issues with the conditioning protocol, the study offers a model for the generation of multiple rhythms in the BLA that is ripe for experimental testing. The most promising avenue would be in vivo experiments testing the role of local VIP+ neurons in the generation of slow theta. That would go a long way to resolving whether BLA theta is locally generated or inherited from medial prefrontal cortex or ventral hippocampus afferents.

      The broader importance of this work is that it illustrates that we must examine the function of neurons not just in terms of their behavioral correlates, but by their effects on the microcircuit they are embedded within. No one cell type is instrumental in producing fear learning in the BLA. Each contributes to the orchestration of network activity to produce plasticity. Moreover, this study reinforces a growing literature highlighting the crucial role of theta and gamma rhythms in BLA function.

    2. Reviewer #2 (Public Review):

      The authors of this study have investigated how oscillations may promote fear learning using a network model. They distinguished three types of rhythmic activities and implemented an STDP rule in the network, aiming to understand the mechanisms underlying fear learning in the BLA. My comments are the following.

      (1) Gamma oscillations are generated locally; thus, it is appropriate to model them in any cortical structure. However, the generation of theta rhythms is based on the interplay of many brain areas; therefore, local circuits may not be sufficient to model these oscillations. Moreover, to generate the classical theta, a laminar structural arrangement is needed (where neurons form layers, as in the hippocampus and cortex) (Buzsaki, 2002), which is clearly not present in the BLA. To date, I am not aware of any study that has demonstrated that theta is generated in the BLA. All studies that recorded theta in the BLA performed the recordings referenced to a ground electrode far away from the BLA, an approach that can easily pick up volume-conducted theta rhythm generated, e.g., in the hippocampus or another layered cortical structure. To clarify whether theta rhythm can be generated locally, one should have conducted recordings referenced to a local channel (see Lalla et al., 2017 eNeuro). In summary, at present, there is no evidence that theta can be generated locally within the BLA. Although there can be BLA neurons whose firing shows theta rhythmicity, e.g., driven by hippocampal afferents at theta rhythm, this does not mean that theta rhythm per se can be generated within the BLA, as the structure of the BLA does not support the generation of rhythmic current dipoles. This questions the rationale of using theta as a proxy for BLA network function, which does not necessarily reflect the population activity of local principal neurons, in contrast to what is seen in the hippocampus.

      (2) The authors distinguished low and high theta. This may be misleading, as the low theta they refer to is basically a respiratory-driven rhythm typically present during an attentive state (Karalis and Sirota, 2022; Bagur et al., 2021, etc.). Thus, it would be more appropriate to use the term breathing-driven oscillations instead of low theta. Again, this rhythm is not generated by the BLA circuits, but is volume conducted into this region. Yet, the firing of BLA neurons can still be entrained by this oscillation. I think it is important to emphasize the difference.

      (3) The authors implemented three interneuron types in their model, ignoring a large fraction of GABAergic cells present in the BLA (Vereczki et al., 2021). Recently, the microcircuit organization of the BLA has been more thoroughly uncovered, including connectivity details for PV interneurons, firing features of neurochemically identified interneurons (instead of mRNA expression-based identification, Sosulina et al., 2010), synaptic properties between distinct interneuron types as well as principal cells and interneurons using paired recordings. These recent findings would be vital to incorporate into the model instead of using results obtained in the hippocampus and neocortex. I am not sure that a realistic model can be achieved by excluding many interneuron types.

      (4) The authors set the reversal potential of GABA-A receptor-mediated currents to -80 mV. What was the rationale for choosing this value? The reversal potential of IPSCs has been found to be -54 mV in fast-spiking (i.e., parvalbumin) interneurons and around -72 mV in principal cells (Martina et al., 2001, Veres et al., 2017).

      (5) Proposing neuropeptide VIP as a key factor for learning is interesting. Though, it is not clear why this peptide is more important in fear learning in comparison to SST and CCK, which are also abundant in the BLA and can effectively regulate the circuit operation in cortical areas.
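
To make the stakes of comment (4) concrete, the following minimal sketch computes the ohmic GABA-A current, I = g (V - E_rev), for the three reversal potentials mentioned there. The 1 nS conductance and the -60 mV membrane potential are illustrative assumptions, not values taken from the manuscript, and the function name is hypothetical.

```python
def gaba_current_pa(v_mv, e_gaba_mv, g_ns=1.0):
    """Ohmic synaptic current I = g * (V - E_rev); nS * mV gives pA.

    Positive values are outward (hyperpolarizing) currents.
    """
    return g_ns * (v_mv - e_gaba_mv)


V_MEMBRANE_MV = -60.0  # assumed membrane potential, for illustration only
REVERSALS_MV = {
    "model": -80.0,            # value used in the model under review
    "principal cell": -72.0,   # approximate value cited in comment (4)
    "fast-spiking IN": -54.0,  # value cited in comment (4)
}

currents = {
    name: gaba_current_pa(V_MEMBRANE_MV, e_rev)
    for name, e_rev in REVERSALS_MV.items()
}
for name, current in currents.items():
    print(f"{name}: {current:+.1f} pA")
```

Under these assumptions the current is outward for reversal potentials of -80 mV and -72 mV but inward for -54 mV, so the choice of reversal potential can change the sign of the GABA-A current at a given membrane potential.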

    1. eLife assessment

      This useful modeling study explores how the biophysical properties of interneuron subtypes in the basolateral amygdala enable them to produce nested oscillations whose interactions facilitate functions such as spike-timing-dependent plasticity. The strength of evidence is currently viewed as incomplete because of insufficient grounding in prior experimental results and insufficient consideration of alternative explanations. This work will be of interest to investigators studying circuit mechanisms of fear conditioning as well as rhythms in the basolateral amygdala.

    2. Reviewer #1 (Public Review):

      Plasticity in the basolateral amygdala (BLA) is thought to underlie the formation of associative memories between neutral and aversive stimuli, i.e. fear memory. Concomitantly, fear learning modifies the expression of BLA theta rhythms, which may be supported by local interneurons. Several of these interneuron subtypes, PV+, SOM+, and VIP+, have been implicated in the acquisition of fear memory. However, it was unclear how they might act synergistically to produce BLA rhythms that structure the spiking of principal neurons so as to promote plasticity. Cattani et al. explored this question using small network models of biophysically detailed interneurons and principal neurons.

      Using this approach, the authors had four principal findings:

      (1) Intrinsic conductances in VIP+ interneurons generate a slow theta rhythm that periodically inhibits PV+ and SOM+ interneurons, while disinhibiting principal neurons.

      (2) A gamma rhythm arising from the interaction between PV+ and principal neurons establishes the precise timing needed for spike-timing-dependent plasticity.

      (3) Removal of any of the interneuron subtypes abolishes conditioning-related plasticity.

      (4) Learning-related changes in principal cell connectivity enhance expression of slow theta in the local field potential.

      The strength of this work is that it explores the role of multiple interneuron subtypes in the formation of associative plasticity in the basolateral amygdala. The authors use biophysically detailed cell models that capture many of their core electrophysiological features, which helps translate their results into concrete hypotheses that can be tested in vivo. Moreover, they try to align the connectivity and afferent drive of their model with those found experimentally.

      Deficient in this study is the construction of the afferent drive to the network, which does not elicit activities that are consistent with those observed in response to similar stimuli. It still remains to be demonstrated that their mechanism promotes plasticity for training protocols that emulate the kinds of activities observed in the BLA during fear conditioning.

      Setting aside the issues with the conditioning protocol, the study offers a model for the generation of multiple rhythms in the BLA that is ripe for experimental testing. The most promising avenue would be in vivo experiments testing the role of local VIP+ neurons in the generation of slow theta. That would go a long way to resolving whether BLA theta is locally generated or inherited from medial prefrontal cortex or ventral hippocampus afferents.

      The broader importance of this work is that it illustrates that we must examine the function of neurons not just in terms of their behavioral correlates, but by their effects on the microcircuit they are embedded within. No one cell type is instrumental in producing fear learning in the BLA. Each contributes to the orchestration of network activity to produce plasticity. Moreover, this study reinforces a growing literature highlighting the crucial role of theta and gamma rhythms in BLA function.

    3. Reviewer #2 (Public Review):

      The authors of this study have investigated how oscillations may promote fear learning using a network model. They distinguished three types of rhythmic activities and implemented an STDP rule in the network, aiming to understand the mechanisms underlying fear learning in the BLA.

      After the revision, the fundamental question, namely, whether the BLA networks can or cannot intrinsically generate any theta rhythms, is still unanswered. The authors added this sentence to the revised version: "A recent experimental paper, (Antonoudiou et al., 2022), suggests that the BLA can intrinsically generate theta oscillations (3-12 Hz) detectable by LFP recordings under certain conditions, such as reduced inhibitory tone." In the cited paper, the authors studied gamma oscillations, and when they applied 10 µM Gabazine to the BLA slices, they observed rhythmic oscillations at theta frequencies. 10 µM Gabazine does not reduce the GABA-A receptor-mediated inhibition but eliminates it, resulting in rhythmic population bursts driven solely by excitatory cells. Thus, the results by Antonoudiou et al., 2022 contrast with, and do not support, the present study, which claims that rhythmic oscillations in the BLA depend on the function of interneurons. Thus, there is still no convincing evidence that BLA circuits can intrinsically generate theta oscillations in intact brain or acute slices. If one extrapolates from the hippocampal studies, then this is not surprising, as the hippocampal theta depends on extra-hippocampal inputs, including, but not limited to, the entorhinal afferents and medial septal projections (see Buzsaki, 2002). Similarly, respiratory-related 4 Hz oscillations are also driven by extrinsic inputs. Therefore, at present, it is unclear which kind of physiologically relevant theta rhythm in the BLA networks has been modelled.

    4. Author response:

      The following is the authors’ response to the current reviews. 

      eLife assessment:

      This useful modeling study explores how the biophysical properties of interneuron subtypes in the basolateral amygdala enable them to produce nested oscillations whose interactions facilitate functions such as spike-timing-dependent plasticity. The strength of evidence is currently viewed as incomplete because of insufficient grounding in prior experimental results and insufficient consideration of alternative explanations. This work will be of interest to investigators studying circuit mechanisms of fear conditioning as well as rhythms in the basolateral amygdala. 

      Response to eLife assessment:

      We disagree with the overall assessment of our paper. The current reviews published below focus on two kinds of perceived inadequacies. Reviewer 1 (R1) was concerned that the fear conditioning paradigm used in the model is not compatible with some of the experiments we are modeling. The reviewer helpfully suggested in the Recommendations for the Authors some papers, which R1 believed exposed this incompatibility. In our reading, those data are indeed compatible with our hypotheses, as we will explain in our reply. Furthermore, the point raised by R1 is an issue for the entire field.  We will suggest a solution to that issue based on published data.

      Reviewer 2 (R2) said that there is no evidence that the BLA is capable of producing, by itself, the rhythms that have been observed during fear conditioning in BLA and, furthermore, that the paper we cited to support such evidence, in fact, refutes our argument. We believe that the reasoning used by reviewer 2 is wrong and that the framework of R2 for what counts as evidence is inadequate. We spell out our arguments below in the reply to the reviewers.

      Finally, we believe this work is of interest far beyond investigators studying fear conditioning. The work shows how rhythms can create the timing necessary for spike-timing-dependent plasticity using multiple time scales that come from multiple different kinds of interneurons found both in BLA and, more broadly, in cortex. Thus, the work is relevant for all kinds of associative learning, not just fear conditioning. Furthermore, it is one of the first papers to show how rhythms can be central in mechanisms of higher-order cognition.

      Reviewer #1:

      We thank Reviewer 1 for their kind remarks about our first set of responses and their understanding of the importance of the work. There was only one remaining point to be addressed:

      Deficient in this study is the construction of the afferent drive to the network, which does not elicit activities that are consistent with those observed in response to similar stimuli. It still remains to be demonstrated that their mechanism promotes plasticity for training protocols that emulate the kinds of activities observed in the BLA during fear conditioning.

      It is true that some fear conditioning protocols involve non-overlapping US and CS, raising the question of how plasticity happens or whether behavioral effects may happen without plasticity. This is an issue for the entire field (Sun et al., F1000Research, 2020). Several papers (Quirk, Repa and LeDoux, 1995; Herry et al, 2007; Bordi and Ledoux 1992) show that the pips in auditory fear conditioning increase the activity of some BLA neurons: after an initial transient, the overall spike rate is still higher than baseline activity. The question remains as to whether the spiking is sustained long enough and at a high enough rate for STDP to take place when US is presented sometime after the stop of the CS.

      Experimental recordings cannot speak to the rate of spiking of BLA neurons during US due to recording interference from the shock. However, evidence suggests that ECS activity should increase during the US due to the release of acetylcholine (ACh) from neurons in the basal forebrain (BF) (Rajebhosale et al., 2024). Pyramidal cells of the BLA robustly express M1 muscarinic ACh receptors (Muller et al., 2013; McDonald and Mott, 2021) and M1 receptors target spines receiving glutamatergic input (McDonald et al., 2019). Thus, ACh from BF should elicit a long-lasting depolarization in pyramidal cells. Indeed, the pairing of ACh with even low levels of spiking of BLA neurons results in a membrane depolarization that can last 7–10 s (Unal et al., 2015). This implies that the release of ACh can affect the consequences of the CS in successive trials. This should include higher spiking rates and more sustained activity in the ECS neurons after the first presentation of US, thus ensuring a concomitant activation of ECS and fear (F) neurons necessary for STDP to take place. Hence, we suggest that the problem raised by R1 may be solved by considering the role of ACh release by BF. To the best of our knowledge, there is nothing in the literature that contradicts this potential solution. The model we have may be considered a “minimal” model that puts in by hand the higher frequency due to the cholinergic drive without explicitly modeling it. As R1 says, it is important for us to give the motivation of that higher frequency; in the next revision, we will be explicit about how the needed adequate firing rate can come about without an overlap of CS and US in any given trial.

      Reviewer #2:

      The authors of this study have investigated how oscillations may promote fear learning using a network model. They distinguished three types of rhythmic activities and implemented an STDP rule in the network, aiming to understand the mechanisms underlying fear learning in the BLA.

      After the revision, the fundamental question, namely, whether the BLA networks can or cannot intrinsically generate any theta rhythms, is still unanswered. The authors added this sentence to the revised version: "A recent experimental paper, (Antonoudiou et al., 2022), suggests that the BLA can intrinsically generate theta oscillations (3-12 Hz) detectable by LFP recordings under certain conditions, such as reduced inhibitory tone." In the cited paper, the authors studied gamma oscillations, and when they applied 10 µM Gabazine to the BLA slices, they observed rhythmic oscillations at theta frequencies. 10 µM Gabazine does not reduce the GABA-A receptor-mediated inhibition but eliminates it, resulting in rhythmic population bursts driven solely by excitatory cells. Thus, the results by Antonoudiou et al., 2022 contrast with, and do not support, the present study, which claims that rhythmic oscillations in the BLA depend on the function of interneurons. Thus, there is still no convincing evidence that BLA circuits can intrinsically generate theta oscillations in intact brain or acute slices. If one extrapolates from the hippocampal studies, then this is not surprising, as the hippocampal theta depends on extra-hippocampal inputs, including, but not limited to, the entorhinal afferents and medial septal projections (see Buzsaki, 2002). Similarly, respiratory-related 4 Hz oscillations are also driven by extrinsic inputs. Therefore, at present, it is unclear which kind of physiologically relevant theta rhythm in the BLA networks has been modelled.

      Reviewer 2 (R2) says “the fundamental question, namely, whether the BLA networks can or cannot intrinsically generate any theta rhythms, is still unanswered.” In our revision, we cited (Antonoudiou et al., 2022), who showed that BLA can intrinsically generate theta oscillations (3-12 Hz) detectable by LFP recordings. R2 pointed out that this paper produces such theta under conditions in which the inhibition is totally removed. R2 then states that the resulting rhythmic population bursts at theta “are driven solely by excitatory cells. Thus, the results by (Antonoudiou et al., 2022) contrast with, and do not support, the present study, which claims that rhythmic oscillations in the BLA depend on the function of interneurons. Thus, there is still no convincing evidence that BLA circuits can intrinsically generate theta oscillations in intact brain or acute slices.”

      This reasoning of R2 is faulty. With all GABAergic currents omitted, the LFP is composed of excitatory currents and intrinsic currents. Our model of the LFP includes all synaptic and membrane currents. In our model, the high theta comes from the spiking activity of the SOM cells, which increase their activity if the inhibition from VIP cells is removed. We are including a new simulation, which models the activity of the slice in the presence of kainate (as done in Antonoudiou et al., 2022), providing additional excitation to the network. If the BLA starts at high excitation, our model produces an ongoing gamma in the VIP cells that suppresses SOM cells and allows a PING gamma to form between PV and F cells; with Gabazine (modeled as the removal of all the GABAergic synapses), this PING is no longer possible and so the gamma rhythm disappears. As expected, the simulation shows that the model produces theta with Gabazine; the model also shows that a PING rhythm is produced without Gabazine, and that this rhythm goes away with Gabazine because PING requires feedback inhibition (see Author response image 1). Thus, the theta increase with Gabazine in the (Antonoudiou et al., 2022) paper can be reproduced in our model, so that paper does support the model.

      Author response image 1.

      Spectral properties of the BLA network without (black) versus with Gabazine (magenta). Power spectra of the LFP proxy, which is the linear sum of AMPA, GABA (only present in the absence of Gabazine), D-, NaP-, and H-currents. Both power spectra are represented as mean and standard deviation across 10 network realizations. Bottom: inset between 35 and 50 Hz.
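
The kind of analysis summarized in this caption can be sketched as follows: an LFP proxy is formed as the linear sum of current traces, its power spectrum is computed, and the spectra are summarized as mean and standard deviation across 10 realizations. This is only an illustration under stated assumptions. The current traces are synthetic stand-ins (a 6 Hz sinusoid plus Gaussian noise), the sampling rate and duration are arbitrary, all function names are hypothetical, and a naive DFT is used for self-containment; it is not the authors' code or their spectral estimator.

```python
import cmath
import math
import random
import statistics


def power_spectrum(signal, fs_hz):
    """Return (freqs_hz, power) from a naive DFT of a mean-subtracted signal."""
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]
    freqs, power = [], []
    for k in range(1, n // 2):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs_hz / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def lfp_proxy(currents):
    """LFP proxy as the sample-wise linear sum of the supplied current traces."""
    return [sum(samples) for samples in zip(*currents)]


def synthetic_currents(rng, fs_hz=200, dur_s=1.0, theta_hz=6.0):
    """Stand-in traces: a theta-modulated current plus a noisy current."""
    n = int(fs_hz * dur_s)
    rhythmic = [math.sin(2 * math.pi * theta_hz * t / fs_hz) for t in range(n)]
    noisy = [rng.gauss(0.0, 0.3) for _ in range(n)]
    return [rhythmic, noisy]


rng = random.Random(0)
spectra = []
for _ in range(10):  # 10 realizations, mirroring the caption
    freqs, power = power_spectrum(lfp_proxy(synthetic_currents(rng)), fs_hz=200)
    spectra.append(power)

# Summarize each frequency bin across realizations.
mean_power = [statistics.mean(col) for col in zip(*spectra)]
sd_power = [statistics.stdev(col) for col in zip(*spectra)]
peak_hz = freqs[mean_power.index(max(mean_power))]
print(f"peak of the mean spectrum: {peak_hz} Hz")
```

Removing one current type from the list passed to `lfp_proxy` is the analogue, in this toy setting, of dropping the GABA term in the Gabazine condition.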

      Nevertheless, we agree that this paper alone is not sufficient evidence that the BLA can produce a low theta. We have recently learned of a new paper (Bratsch-Prince et al., 2024) that is directly related to the issue of whether the BLA by itself can produce low theta, and in what circumstances. In this study, intrinsic BLA theta is produced in slices with ACh stimulation (without needing external glutamate input) which, in vivo, would be produced by the basal forebrain (Rajebhosale et al., eLife, 2024) in response to salient stimuli. The low-theta depends on muscarinic activation of CCK interneurons, a group of interneurons that overlaps with the VIP neurons in our model (Krabbe 2017; Mascagni and McDonald, 2003).  

      We suspect that the low theta produced in (Bratsch-Prince et al., 2024) is the same as the low theta in our model. We do not explicitly include ACh modulation of BLA in our paper, but in current work with experimentalists, we aim to show that ACh is essential to the theta by activating the BLA VIP cells. In our re-revised version, we will discuss Bratsch-Prince et al., 2024 and its connection to our hypothesis that the theta oscillations can be produced within the BLA.

      Note that we have already included a paragraph stating explicitly that our hypothesis in no way contradicts the idea that inputs to the BLA may include theta oscillations. Indeed, the following paragraphs in the revised paper describe the complexity of trying to understand the origin of brain rhythms in vivo. R2 did not appear to take this complexity, and the possible involvement of neuromodulation, into account in their current position that the theta rhythms cannot be produced intrinsically in the BLA.

      From revised paper: “Where the rhythms originate, and by what mechanisms. A recent experimental paper, (Antonoudiou et al. 2022), suggests that the BLA can intrinsically generate theta oscillations (3-12 Hz) detectable by LFP recordings under certain conditions, such as reduced inhibitory tone. They draw this conclusion in mice by removing the hippocampus, which can volume conduct to BLA, and noticing that other nearby brain structures did not display any oscillatory activity. Our model also supports the idea that intrinsic mechanisms in the BLA can support the generation of the low theta, high theta, and gamma rhythms.

      Although the BLA can produce these rhythms, this does not rule out that other brain structures also produce the same rhythms through different mechanisms, and these can be transmitted to the BLA. Specifically, it is known that the olfactory bulb produces and transmits the respiratory-related low theta (4 Hz) oscillations to the dorsomedial prefrontal cortex, where it organizes neural activity (Bagur et al., 2021). Thus, the respiratory-related low theta may be captured by BLA LFP because of volume conduction or through BLA extensive communications with the prefrontal cortex. Furthermore, high theta oscillations are known to be produced by the hippocampus during various brain functions and behavioral states, including during spatial exploration (Vanderwolf, 1969) and memory formation/retrieval (Raghavachari et al., 2001), which are both involved in fear conditioning. Similarly to the low theta rhythm, the hippocampal high theta can manifest in the BLA. It remains to understand how these other rhythms may interact with the ones described in our paper.”

      We believe our current paper is important to show how detailed biophysical modeling can unearth the functional implications of physiological details (such as the biophysical bases of rhythms), which are often (indeed, usually) ignored in models, and why rhythms may be essential to some cognitive processes (including STDP).  Indeed, for evaluating our paper it is necessary to go back to the purpose of a model, especially one such as ours, which is “hypothesis/data driven”.  The hypotheses of the model serve to illuminate the functional roles of the physiological details (such as the biophysical bases for the rhythms), giving meaning to the data.  Of course, the hypotheses must be plausible, and we think that the discussion above easily clears that bar.  Hypotheses should also be checked experimentally, and a model that explains the implications of a hypothesis, such as ours, provides motivation for doing the hard work of experimental testing.  We think that R1 understands this and has been very helpful.

      —————

      The following is the authors’ response to the original reviews.

      eLife assessment

      This useful modeling study explores how the biophysical properties of interneuron subtypes in the basolateral amygdala enable them to produce nested oscillations whose interactions facilitate functions such as spike-timing-dependent plasticity. The strength of evidence is currently viewed as incomplete because the relevance to plasticity induced by fear conditioning is viewed as insufficiently grounded in existing training protocols and prior experimental results, and alternative explanations are not sufficiently considered. This work will be of interest to investigators studying circuit mechanisms of fear conditioning as well as rhythms in the basolateral amygdala. 

      Most of our comments below are intended to rebut the sentence: “The strength of evidence is currently viewed as incomplete because the relevance to plasticity induced by fear conditioning is viewed as insufficiently grounded in existing training protocols and prior experimental results, and alternative explanations are not sufficiently considered”. 

      We believe this work will be interesting to investigators interested in dynamics associated with plasticity, which goes beyond fear learning. It will also be of interest because of its emphasis on the interactions of multiple kinds of interneurons that produce dynamics used in plasticity, in the cortex (which has similar interneurons) as well as BLA. We note that the model has sufficiently detailed physiology to make many predictions that can be tested experimentally. Details are below in the answer to reviewers.

      Reviewer #1 (Public Comments):  

      (1) … the weakness is that their attempt to align with the experimental literature (specifically Krabbe et al. 2019) is performed inconsistently. Some connections between cell types were excluded without adequate justification (e.g. SOM+ to PV+). 

      In order to constrain our model, we focused on what is reported in (Krabbe et al., 2019) in terms of functional connectivity instead of structural connectivity. Thus, we included only those connections for which there was strong functional connectivity. For example, the SOM to PV connection is shown to be small (Krabbe et al., 2019, Supp. Fig. 4, panel t). We also omitted PV to SOM, PV to VIP, SOM to VIP, VIP to excitatory projection neurons; all of these are shown in (Krabbe et al. 2019, Fig. 3 (panel l), and Supp. Fig. 4 (panels m,t)) to have weak functional connectivity, at least in the context of fear conditioning. 

      We reply with more details below to the Recommendations for the Authors, including new text.

      (2) The construction of the afferent drive to the network does not reflect the stimulus presentations that are given in fear conditioning tasks. For instance, the authors only used a single training trial, the conditioning stimulus was tonic instead of pulsed, the unconditioned stimulus duration was artificially extended in time, and its delivery overlapped with the neutral stimulus, instead of following its offset. These deviations undercut the applicability of their findings.  

      Regarding the use of a single long presentation of US rather than multiple presentations (i.e., multiple trials): in early versions of this paper, we did indeed use multiple presentations. We were told by experimental colleagues that the learning could be achieved in a single trial. We note that, if there are multiple presentations in our modeling, nothing changes; once the association between CS and US is learned, the conductance of the synapse is stable. Also, our model does not need a long period of US if there are multiple presentations.  

      We agree that, in order to implement the fear conditioning paradigm in our in-silico network, we made several assumptions about the nature of the CS and US inputs affecting the neurons in the BLA and the duration of these inputs. A Poisson spike train to the BLA is a signal that contains no structure that could influence the timing of the BLA output; hence, we used this as our CS input signal. We also note that the CS input can be of many forms in general fear conditioning (e.g., tone, light, odor), and we wished to de-emphasize the specific nature of the CS. The reference mentioned in the Recommendations for authors, (Quirk, Armony, and LeDoux 1997), uses pulses 2 seconds long. At the end of fear conditioning, the response to those pulses is brief. However, in the early stages of conditioning, the response goes on for as long as the figure shows. The authors do show that the number of cells responding decreases from early to late training, which perhaps reflects increasing specificity over training. This feature is not currently in our model, but we look forward to thinking about how it might be incorporated. Regarding the CS pulsed protocol used in (Krabbe et al., 2019), it has been shown that intense inputs (6 kHz and 12 kHz inputs) can lead to metabotropic effects that last much longer than the actual input (200 ms duration) (Whittington et al., Nature, 1995). Thus, the effective input to the BLA may indeed be more like Poisson.
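
As an illustration of what an unstructured Poisson input looks like, the sketch below draws a homogeneous Poisson spike train by summing independent exponential inter-spike intervals. The 50 Hz rate, the 2 s duration, and the function name are assumptions chosen for the example and are not parameters from the model.

```python
import random


def poisson_spike_train(rate_hz, duration_s, rng):
    """Homogeneous Poisson process: inter-spike intervals are i.i.d. exponential."""
    spikes = []
    t = rng.expovariate(rate_hz)
    while t < duration_s:
        spikes.append(t)
        t += rng.expovariate(rate_hz)
    return spikes


rng = random.Random(1)
cs_spikes = poisson_spike_train(rate_hz=50.0, duration_s=2.0, rng=rng)

# Inter-spike intervals carry no periodic structure by construction.
isis = [later - earlier for earlier, later in zip(cs_spikes, cs_spikes[1:])]
print(f"{len(cs_spikes)} spikes, mean ISI = {sum(isis) / len(isis):.4f} s")
```

Because each interval is drawn independently, such an input imposes no rhythm of its own, so any rhythmic timing in the network output has to come from the network.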

      Our model requires the effect of the CS and US inputs on the BLA neuron activity to overlap in time in order to instantiate fear learning. Although paradigms involving both overlapping (delay conditioning, where US coterminates with CS (Lindquist et al., 2004), or immediately follows CS (e.g., Krabbe et al., 2019)) and non-overlapping (trace conditioning) CS/US inputs exist in the literature, we hypothesized that concomitant activity in CS- and US-encoding neurons should be crucial in both cases. This may be mediated by the memory effect, as suggested in the Discussion of our paper, or by metabotropic effects as suggested above, or by the contribution from other brain regions. We will emphasize in our revision that the overlap in time, however instantiated, is a hypothesis of our model. It is hard to see how plasticity can occur without some memory trace of US. This is a consequence of our larger hypothesis that fear learning uses spike-timing-dependent plasticity; such a hypothesis about plasticity is common in the modeling literature.
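
Why temporal overlap matters under spike-timing-dependent plasticity can be illustrated with a generic pair-based STDP rule with exponential windows. This is a textbook form and not necessarily the rule used in the paper; the amplitudes, time constants, spike times, and function names are all illustrative assumptions.

```python
import math


def stdp_dw(dt_ms, a_plus=0.01, a_minus=0.012, tau_plus_ms=20.0, tau_minus_ms=20.0):
    """Pair-based STDP weight change for dt_ms = t_post - t_pre.

    Pre-before-post (dt > 0) potentiates; post-before-pre (dt < 0) depresses.
    """
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_plus_ms)
    if dt_ms < 0:
        return -a_minus * math.exp(dt_ms / tau_minus_ms)
    return 0.0


def total_dw(pre_spikes_ms, post_spikes_ms):
    """Sum the pairwise contributions over all pre/post spike pairs."""
    return sum(
        stdp_dw(post - pre) for pre in pre_spikes_ms for post in post_spikes_ms
    )


# Presynaptic (CS-driven) spikes a few ms before postsynaptic (US-driven) spikes.
overlapping = total_dw([10.0, 35.0, 60.0], [15.0, 40.0, 65.0])
# Postsynaptic spikes arrive about 1 s after the presynaptic spikes have stopped.
separated = total_dw([10.0, 35.0, 60.0], [1010.0, 1035.0, 1060.0])
print(f"overlapping: {overlapping:.5f}, separated: {separated:.3e}")
```

With windows of tens of milliseconds, spike pairs separated by about a second contribute essentially nothing, which is why some sustained or re-evoked activity is needed for the two populations to be co-active.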

      We reply with more details below to the Recommendations for the Authors, including new text.

      Reviewer #1 (Recommendations For The Authors): 

      Major points: 

      (1) This paper draws extensively from Krabbe et al. 2019, but it does not do so consistently. The paper would be strengthened if it tried to better match the circuit properties and activations.

      Specifically: 

      a. Krabbe found that PV interneurons were comparably activated by the US (see Supp Fig 1). Your model does not include that. The basis for the Krabbe 2019 claim that PV US responses are weaker is that they have a slightly larger proportion of cells inhibited by the US, but this is not especially compelling. In addition, their Fig 2 showed that VIP and SOM cells receive afferents from the same set of upstream regions. 

      b. The model excluded PV-SOM connections, but this does not agree with Krabbe et al. 2019, Table 2. PV cells % connectivity and IPSC amplitudes were comparable to those from VIP interneurons. 

      c. ECS to PV synapses are not included. This seems unlikely given the dense connectivity between PV interneurons and principal neurons in cortical circuits and the BLA (Woodruff and Sah 2007 give 38% connection probability in BLA). 

      We thank the Reviewer for raising these points, which allow us to clarify how we constrained our model and to do more simulations. Specifically: 

      a. (Wolff et al., Nature, 2014), cited by (Krabbe et al. 2018), reported that PV and SOM interneurons are on average inhibited by the US during the fear conditioning. However, we agree that (Krabbe et al., 2019) added to this by specifying that PV interneurons respond to both CS+ and US, although the fraction of US-inhibited PV interneurons is larger. As noted by the Reviewer, in the model we initially considered the PV interneurons responding only to CS+ (identified as “CS” in our manuscript). For the current revision, we ran new simulations in which the PV interneuron receives the US input, instead of CS+. It turned out that this did not affect the results, as shown in the figure below: all the network realizations learn the association between CS and fear. In the model, the PING rhythm between PV and F is the crucial component for establishing fine timing between ECS and F, which is necessary for learning. Having PV responding to the same input as F, i.e., US, facilitates their entrainment in PING and, thus, successful learning. 

      As for afferents of VIP and SOM from upstream regions, (Krabbe et al., 2019) reports that “[…] BLA SOM interneurons receive a different array of afferent innervation compared to that of VIP and PV interneurons, which might contribute to the differential activity patterns observed during fear learning.” Thus, in the model, we are agnostic about inputs to SOM interneurons; we modeled them to fire spontaneously at high theta.

      To address these points in the manuscript, we added some new text in what follows:

      (1) New Section “An alternative network configuration characterized by US input to PV, instead of CS, also learns the association between CS and fear” in the Supplementary information:

      “We constrained the BLA network in Fig. 2 with CS input to the PV interneuron, as reported in (Krabbe et al., 2018). However, (Krabbe et al., 2019) notes that a class of PV interneurons may be responding to US rather than CS. Fig. S3 presents the results obtained with this variation in the model (see Fig. 3 A,B for comparison) and shows that all the network realizations learn the association between CS and fear. In the model, the PING rhythm between PV and F is the crucial component for establishing fine timing between ECS and F, which is necessary for learning. Having PV responding to the same input as F, i.e., US, facilitates their entrainment in PING and, thus, successful fear learning.

      We model the VIP interneuron as affected by US; in addition, (Krabbe et al. 2019) reports that a substantial proportion of them is mildly activated by CS. Replacing the US by CS does not change the input to VIP cells, which is modeled by the same constant applied current. Thus, the VIP CS-induced activity is a bursting activity at low theta, similar to the one elicited by US in Fig. 2.”

      (2) Section “With the depression-dominated plasticity rule, all interneuron types are needed to provide potentiation during fear learning” in Results: “Finally, since (Krabbe et al., 2019) reported that a fraction of PV interneurons are affected by US, we have also run the simulations for single neuron network with the PV interneuron affected by US instead of CS. In this case as well, all the network realizations are learners (see Fig. S3). ”

      (3) Section “Conditioned and unconditioned stimuli” in Materials and Methods: “To make Fig. S3, we also considered a variation of the model with PV interneurons affected by US, instead of CS, as reported in (Krabbe et al. 2019).”

      b. Re the SOM to PV connection: As reported in the reply to the public reviews, we considered the prominent functional connections reported in (Krabbe et al., 2019), instead of structural connections. That is, we included only those connections for which there was strong functional connectivity. For example, the SOM to PV connection is shown to be small (Supp. Fig. 4, panel t, in (Krabbe et al., 2019)). We also omitted PV to SOM, PV to VIP, SOM to VIP, and VIP to excitatory projection neurons; all of these are shown in (Krabbe et al. 2019, Fig. 3 (panel l), and Supp. Fig. 4 (panels m,t)) to have weak functional connectivity, at least in the context of fear conditioning.

      In order to clarify this point, in Section “Network connectivity and synaptic currents” in Materials and Methods, we now say:

      “We modeled the network connectivity as presented in Fig. 2B, derived from the prominent functional, instead of structural, connections reported in (Krabbe et al., 2019).”

      c. Re the ECS to PV synapses: We thank the Reviewer for the reference provided; as the Reviewer says, the ECS to PV synapses are not included. Upon adding this connection in our network, we found that, unlike the connection suggested in part a above, introducing these synapses would, in fact, change the outcome. Thus, the omission of this connection must be considered an implied hypothesis. Including those synapses with a significant strength would alter the PING rhythm created by the interactions between F and PV, which is crucial for ECS and F fine timing. Thanks very much for showing us that this needs to be said. Our hypothesis does not contradict the dense connections mentioned by the Reviewer; such dense connectivity does not mean that all pyramidal cells connect to all interneurons. This hypothesis may be taken as a prediction of the model.

      The absence of this connection is now discussed at the end of a new Section of the Discussion entitled “Assumptions and predictions of the model”, which reads as follows:

      “Finally, the model assumes the absence of significantly strong connections from the excitatory projection cells ECS to PV interneurons, unlike the ones from F to PV. Including those synapses would alter the PING rhythm created by the interactions between F and PV, which is crucial for ECS and F fine timing. We note that in (Woodruff and Sah, 2007) only 38% of the pyramidal cells are connected to PV cells. The functional identity of the connected pyramidal cells is unknown. Our model suggests that successful fear conditioning requires F to PV connections and that ECS to PV must be weak or absent.”

      (2) Krabbe et al. 2019 and Davis et al. 2017 were referenced for the construction of the conditioned and unconditioned stimulus pairing protocol. The Davis citation is not applicable here because that study was a contextual, not cued, fear conditioning paradigm. Regarding Krabbe, the pairing protocol was radically different from what the authors used. Their conditioned stimulus was a train of tone pips presented at 0.9 Hz, which lasted 30 s, after which the unconditioned stimulus was presented after tone offset. The authors should determine how their network behaves when this protocol is used. Also, note that basolateral amygdala responses to tone stimuli are primarily brief onset responses (e.g. Quirk, Armony, and LeDoux 1997), and not the tonic activation used in the model.  

      We replied to this point in our responses to the Reviewer’s Public Comments as follows:

      “We agree that, in order to implement the fear conditioning paradigm in our in-silico network, we made several assumptions about the nature of the CS and US inputs affecting the neurons in the BLA and the duration of these inputs. A Poisson spike train to the BLA is a signal that contains no structure that could influence the timing of the BLA output; hence, we used this as our CS input signal. We also note that the CS input can be of many forms in general fear conditioning (e.g., tone, light, odor), and we wished to de-emphasize the specific nature of the CS. The reference mentioned in the Recommendations for authors, (Quirk, Armony, and LeDoux 1997), uses pulses 2 seconds long. At the end of fear conditioning, the response to those pulses is brief. However, in the early stages of conditioning, the response goes on for as long as the figure shows. The authors do show the number of cells responding decreases from early to late training, which perhaps reflects increasing specificity over training. This feature is not currently in our model, but we look forward to thinking about how it might be incorporated. Regarding the CS pulsed protocol used in (Krabbe et al., 2019), it has been shown that intense inputs (6kHz and 12 kHz inputs) can lead to metabotropic effects that last much longer than the actual input (200 ms duration) (Whittington et al., Nature, 1995). Thus, the effective input to the BLA may indeed be more like Poisson.”

      Current answer to the Reviewer:

      There are several distinct issues raised by the Reviewer in the more detailed critique. We respectfully disagree that the model is not applicable to context-dependent fear learning where the context acts as a CS, though we should have been more explicit. Specifically, our CS input can describe both the cue and the context. We included the following text in the Results section “Interneuron rhythms provide the fine timing needed for depression-dominated STDP to make the association between CS and fear”:

      “In our simulations, the CS input describes either the context or the cue in contextual and cued fear conditioning, respectively. For the context, the input may come from the hippocampus or other non-sensory regions, but this does not affect its role as input in the model.”

      The second major issue is whether the specific training protocols used in the cited papers need to be exactly reproduced in the signals received by the elements of our model; we note that there are many transformations that can occur between the sensory input and the signals received by the BLA. In the case of auditory fear conditioning, a series of pips, rather than individual pips, are considered the CS (e.g., (Stujenske et al., 2014; Krabbe et al. 2019)). Our understanding is that a single pip does not elicit a fear response; a series of pips is required for fear learning. This indicates that it is not the neural code of a single pip that matters, but rather the signal entering the amygdala that incorporates any history-dependent signaling that could lead to spiking throughout the sequence of pips.  Also, as mentioned above, intense inputs at frequencies about 6kHz and 12kHz can lead to metabotropic effects that last much longer than each brief pip (~200 ms), thus possibly producing continuous activity in neurons encoding the input. Thus, we believe that our use of the Poisson spike train is reasonable. 

      However, we are aware that the activity of neurons encoding CS can be modulated by the pips: neurons encoding auditory CS display a higher firing rate when each pip is presented and a Poisson-like spike train between pips (Herry et al., Journal of Neuroscience, 2007). Here we confirm that potentiation is present even in the presence of the fast transient response elicited by the pips. We said in the original manuscript that there is learning for a Poisson spike train CS input at ~50 Hz; this describes the neuronal activity in between pips. For the revision, we asked whether learning is preserved when CS is characterized by higher frequencies, which would describe the CS during and right after each pip. We show in the new Fig. S4 that potentiation is ensured for a range of CS frequencies. The figure shows the learning speed as a function of CS and US frequencies. For all the CS frequencies considered, i) there is learning, ii) learning speed increases with CS frequency. Thus, potentiation is present even when pips elicit a faster transient response.

      To better specify this in the manuscript, we added the following sentences in the Results section “With the depression-dominated plasticity rule, all interneuron types are needed to provide potentiation during fear learning”:

      “We note that the CS and US inputs modeled as independent Poisson spike trains represent stimuli with no structure. Although we have not explicitly modeled pulsating pips, as common in auditory fear conditioning (e.g., (Stujenske 2014; Krabbe 2019)), we show in Fig. S4 that potentiation can be achieved over a relatively wide range of gamma frequencies. This indicates that overall potentiation is ensured if the gamma frequency transiently increases after the pip.”

      We added the section “The full network potentiates for a range of CS frequencies” and Fig. S4 in the Supplementary Information.

      We included in Materials and Methods “Conditioned and unconditioned stimuli” the following sentences:

      “Finally, for Fig.S4, we considered a range of frequencies for the CS stimulus. To generate the three Poisson spike trains with average frequencies from 48 to 64 Hz in Fig. S4, we set 𝜆 = 800, 1000, 1200.”
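      For readers who want to experiment with unstructured inputs of this kind, the following is a minimal sketch of a homogeneous Poisson spike train generator. It is not the authors' code: the function name, the time step, and the seed are hypothetical, and the sketch is parameterized directly by firing rate in Hz because the mapping between the manuscript's 𝜆 values and the 48 to 64 Hz range is not given here.

```python
import numpy as np

def poisson_spike_train(rate_hz, duration_s, dt_s=1e-4, seed=0):
    """Return spike times (in seconds) of a homogeneous Poisson process.

    Each time bin of width dt_s independently contains a spike with
    probability rate_hz * dt_s (valid when rate_hz * dt_s << 1).
    All parameter values here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_bins = int(round(duration_s / dt_s))
    has_spike = rng.random(n_bins) < rate_hz * dt_s
    return np.nonzero(has_spike)[0] * dt_s

if __name__ == "__main__":
    # The two endpoints of the frequency range quoted for Fig. S4.
    for rate in (48.0, 64.0):
        spikes = poisson_spike_train(rate, duration_s=50.0)
        print(f"target {rate} Hz, empirical {len(spikes) / 50.0:.1f} Hz")
```

      Because the bins are drawn independently, the train carries no temporal structure beyond its mean rate, which is the property the reply relies on.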

      Finally, to address the comment about the need for CS and US overlapping in time to instantiate fear association, we added the following text in the Results section “Assumptions and predictions of the model”:

      “Finally, our model requires the effect of the CS and US inputs on the BLA neuron activity to overlap in time in order to instantiate fear learning. Although paradigms involving both overlapping (delay conditioning, where US co-terminates with CS (e.g., (Lindquist et al., 2004)), or immediately follows CS (e.g., Krabbe et al., 2019)) and non-overlapping (trace conditioning) CS/US inputs exist, we hypothesized that concomitant activity in CS- and US-encoding neuron activity should be crucial in both cases. This may be mediated by the memory effect due to metabotropic effects (Whittington et al., Nature, 1995) as suggested above, or by the contribution from other brain regions (see section “Involvement of other brain structures” in the Discussion). The fact that plasticity occurs with US memory trace is a consequence of our larger hypothesis that fear learning uses spike-timing-dependent plasticity; such a hypothesis about plasticity is common in the modeling literature.”

      (3) As best as I could tell, only a single training trial was used in this study. Fair enough, especially given that fear learning can occur with a single trial. However, most studies of amygdala fear conditioning have multiple trials (~5 or more). How does the model perform when multiple trials are given?  

      The association between CS and fear acquired after one trial, i.e., through a potentiated ECS to F connection, is preserved in the presence of multiple trials.  Indeed, the association would be weakened or erased (through depression of the ECS to F connection) only if ECS and F did not display good fine timing, i.e., F does not fire right after ECS most of the time. However, the implemented circuit supports the role of interneurons in providing the correct fine timing, thus preventing the association acquired from being erased.  

      In the second paragraph of the Results section “With the depression-dominated plasticity rule, all interneuron types are needed to provide potentiation during fear learning”, we made the above point by adding the following text:

      “We note that once the association between CS and fear is acquired, subsequent presentations of CS and US do not weaken or erase it: the interneurons ensure the correct timing and pauses in ECS and F activity, which are conducive for potentiation.”
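      The timing argument can be illustrated with a generic pairwise STDP rule. This is only a sketch under stated assumptions: the exponential window, the amplitudes, and the time constants below are hypothetical and are not taken from the manuscript. The rule is "depression-dominated" in the sense that the area of the depression lobe exceeds that of the potentiation lobe, so net potentiation requires the postsynaptic cell (F) to fire reliably just after the presynaptic cell (ECS).

```python
import math

def stdp_dw(dt_ms, a_plus=1.0, a_minus=1.5, tau_plus=10.0, tau_minus=20.0):
    """Weight change for one spike pair, with dt_ms = t_post - t_pre.

    Hypothetical parameters: a_minus * tau_minus > a_plus * tau_plus,
    so depression dominates for uncorrelated spiking.
    """
    if dt_ms > 0:   # pre before post: potentiation
        return a_plus * math.exp(-dt_ms / tau_plus)
    if dt_ms < 0:   # post before pre: depression
        return -a_minus * math.exp(dt_ms / tau_minus)
    return 0.0

def total_dw(pre_times_ms, post_times_ms, **params):
    """Sum the pairwise rule over all pre/post spike pairs."""
    return sum(
        stdp_dw(t_post - t_pre, **params)
        for t_pre in pre_times_ms
        for t_post in post_times_ms
    )

if __name__ == "__main__":
    ecs = [0.0, 100.0, 200.0]   # presynaptic spikes separated by long pauses
    f = [2.0, 102.0, 202.0]     # postsynaptic spikes 2 ms after each
    print("F follows ECS:", total_dw(ecs, f))   # net potentiation
    print("F leads ECS:  ", total_dw(f, ecs))   # net depression
```

      In this toy setting the long pauses between spike pairs keep the cross-pair depression terms small, which mirrors the role the reply assigns to interneuron-imposed pauses.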

      (4) The LFP calculations are problematic. First, it is unclear how they were done. Did the authors just take the transmembrane currents they included and sum them, or were they scaled by distance from the 'electrode' and extracellular conductivity (as one would derive from the Laplace equation)? Presumably, the spatial arrangement of model neurons was neglected so distance was not a factor. 

      Second, if this is the case, then the argument for excluding GABAergic conductances seems flawed. If the spatial arrangement of neurons is relevant to whether to include or exclude GABAergic conductances, then wouldn't a simulation without any spatial structure not be subject to the concern of laminar vs. nuclear arrangement? 

      Moreover, to the best I can tell, the literature the authors use to justify the exclusion of GABAergic currents does not make the case for a lack of GABAergic contribution in non-laminar structures. Instead, those studies only argue that in a non-laminar structure, AMPA currents are detectable, not that GABA cannot be detected. Thus, the authors should either include the GABAergic currents when calculating their simulated LFP, or provide a substantially better argument or citation for their exclusion.

      We thank the Reviewer for pointing this out; this comment helped us rethink how to model the LFP. The origin of the LFP signal in BLA has not been fully determined, but factors thought to be important include differences in the spatial extension of the arborization in excitatory and inhibitory neurons, in the number of synaptic boutons, and spatial distributions of somata and synapses (Lindén et al 2011; Łęski 2013; Mazzoni et al. 2015). In the first version of the manuscript, we excluded the GABAergic currents because it is typically assumed that they add very little to the extracellular field as the inhibitory reversal potential is close to the resting membrane potential. For the revision, we re-ran the simulations during pre and post fear conditioning and we modeled the LFP as the sum of the AMPA, GABA and NaP-/H-/D- currents. With this new version of the LFP, we added a new Fig. 6 showing that there is a significant increase in the low theta power, but not in the high theta power, with fear learning (Fig. 6 C, D, E). This increase in the low theta power was mainly due to the AMPA currents created by the newly established connection from ECS to F, which allowed F to be active after fear conditioning in response to CS. 

      However, as the Reviewer mentioned, our network has no spatial extent: neurons are modeled as point cells. Thus, our current model does not include the features necessary to model some central aspects of the LFP. Despite that, our model does clearly demonstrate how rhythmic activity in the spike timing of neurons within the network changes due to fear learning (Fig. 6B). The spiking outputs of the network are key components of the inputs to the LFP, and thus we expect the rhythms in the spiking to be reflected in more complex descriptions of the LFP. But we also discovered that different LFP proxies provide different changes in rhythmic activity comparing pre- and post-fear learning; although we have no principled way to choose a LFP proxy, we believe that the rhythmic firing is the essential finding of the model.

      We have added the following to the manuscript:

      (1) In the new version of Fig. 6, we present the power spectra of the network spiking activity (panel B), along with the power spectra of the LFP proxy that includes the GABA, AMPA, and NaP-/H-/D- currents (panels C, D, E). 

      (2) We modified the conclusion of the Results section entitled “Increased low-theta frequency is a biomarker of fear learning” by saying:

      “In this section, we explore how plasticity in the fear circuit affects the network dynamics, comparing after fear conditioning to before. We first show that fear conditioning leads to an increase in low theta frequency power of the network spiking activity compared to the pre-conditioned level (Fig. 6 A,B); there is no change in the high theta power. We also show that the LFP, modeled as the linear sum of all the AMPA, GABA, NaP-, D-, and H- currents in the network, similarly reveals a low theta power increase and no significant variation in the high theta power (Fig. 6 C,D,E). These results reproduce the experimental findings in (Davis et al., 2017), and Fig. 6 F,G show that the low theta increase is due to added excitation provided by the new learned pathway. The additional unresponsive ECS and F cells in the network were included to ensure we had not biased the LFP towards excitation. Nevertheless, although both the AMPA and GABA currents contribute to the power increase in the low theta frequency range (Fig. 6F), the AMPA currents show a dramatic power increase relative to the baseline (the average power ratio of AMPA and GABA post- vs pre-conditioning across 20 network realizations is 3×10³ and 4.6, respectively). This points to the AMPA currents as the major contributor to the low theta power increase. Specifically, the newly potentiated AMPA synapse from ECS to F ensures F is active after fear conditioning, thus generating strong currents in the PV cells to which it has strong connections (Fig. 6G). Finally, the increase in power is in the low theta range because ECS and F are allowed to spike only during the active phase of the low theta spiking VIP neurons. We have also explored another proxy for the LFP (see Supplementary Information and Fig. S6).”

      In the Supplementary Information, we included a figure and some text in the new section entitled “A higher low theta power increase emerges in LFP approximated with the sum of the absolute values of the currents compared to their linear sum”:

      “Given that our BLA network comprises a few neurons described as single-compartment cells with no spatial extension and location, the LFP cannot be computed directly from our model’s read-outs. In the main text, we choose as an LFP proxy the linear sum of the AMPA, GABA, and NaP-/H-/D-currents. We note that if the LFP is modeled as the sum of the absolute value of the currents, as suggested by (Mazzoni et al. 2008; Mazzoni et al. 2015), an even higher low theta power increase arises after fear conditioning compared to the linear sum. Differences in the power spectra also arise if other LFP proxies (e.g., only AMPA currents, only GABA currents) are considered. A principled description of an LFP proxy would require modeling the three-dimensional BLA anatomy, including that of the interneurons VIP and SOM; this is outside the scope of the current paper. (See (Feng et al. 2019) for a related project in the BLA.)”

      (3) We updated the Materials and Methods section “Local field potentials and spectral analysis” to explain how we compute the LFP in the revised manuscript: 

      “We considered as an LFP proxy the linear sum of all the AMPA, GABA, NaP, D, and H currents in the network. The D-current is in the VIP interneurons, and NaP-current and H-current are in SOM interneurons.”
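      As an illustration of the two proxies discussed in this reply (linear sum versus sum of absolute values), the sketch below computes both from a set of current traces and estimates band power with a plain periodogram. It is a sketch, not the authors' analysis pipeline: the function names, the sampling rate, the band edges (2 to 6 Hz and 6 to 12 Hz), and the synthetic sinusoidal "currents" are all assumptions made for the example.

```python
import numpy as np

def lfp_proxies(currents):
    """Return (linear_sum, absolute_sum) over a dict of equal-length traces."""
    stacked = np.vstack(list(currents.values()))
    return stacked.sum(axis=0), np.abs(stacked).sum(axis=0)

def band_power(signal, fs_hz, f_lo, f_hi):
    """Periodogram power of the mean-subtracted signal within [f_lo, f_hi)."""
    x = signal - signal.mean()
    spectrum = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs_hz)
    mask = (freqs >= f_lo) & (freqs < f_hi)
    return spectrum[mask].sum()

if __name__ == "__main__":
    fs = 1000.0                          # assumed sampling rate (Hz)
    t = np.arange(0, 10.0, 1.0 / fs)
    # Synthetic stand-ins for summed synaptic currents, not model output.
    currents = {
        "AMPA": -2.0 * np.sin(2 * np.pi * 4.0 * t),   # 4 Hz component
        "GABA": 0.5 * np.sin(2 * np.pi * 8.0 * t),    # 8 Hz component
    }
    linear, absolute = lfp_proxies(currents)
    for name, proxy in (("linear", linear), ("absolute", absolute)):
        low = band_power(proxy, fs, 2.0, 6.0)
        high = band_power(proxy, fs, 6.0, 12.0)
        print(f"{name}: low-band {low:.1f}, high-band {high:.1f}")
```

      Rectifying the currents before summing changes the spectrum (a rectified sinusoid carries power at twice its frequency), which is one simple reason different proxies can give different band-power changes.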

      Although it is beyond the scope of the current work, an exploration of the most accurate proxy of the LFP in the amygdala is warranted. Such a study could be accomplished by adopting an approach similar to that of (Mazzoni et al., 2015), where several LFP proxies based on a point-neuron leaky integrate-and-fire network were compared with a “ground-truth” LFP obtained in an analogous realistic three-dimensional network model.

      To explicitly mention this issue in the paper, we add a paragraph in the “Limitations and caveats” section in the Discussion, which reads as follows:

      “LFPs recorded in the experiments are thought to be mainly created by transmembrane currents in neurons located around the electrode and depend on several factors, including the morphology of the arborization of contributing neurons and the location of AMPA and GABA boutons (Katzner et al. 2009; Lindén et al 2011; Łęski 2013; Mazzoni et al. 2015). Since our model has no spatial extension, we used an LFP proxy; this proxy was shown to reflect the rhythmic output of the network, which we believe to be the essential result (for more details see Results “Increased low-theta frequency is a biomarker of fear learning”, and Supplementary Information “A higher low theta power increase emerges in LFP approximated with the sum of the absolute values of the currents compared to their linear sum”).”

      (4) We have removed the section “Plasticity between fear neuron and VIP slows down overall potentiation” in Results and sections “Plasticity between the fear neuron (F) and VIP slows down overall potentiation” and “Plastic F to VIP connections further increase low-theta frequency power after fear conditioning” in the Supplementary Information. This material is extraneous since we are using a new proxy for LFP.

      Minor points: 

      (1) In Figure 3C, the y-axis tick label for 0.037 is written as "0.37."

      We thank the reviewer for finding this typo; we fixed it.

      (2) Figure 5B is unclear. It seems to suggest that the added ECS and F neurons did not respond to either the CS or UCS. Is this true? If so, why include them in the model? How would their inclusion change the model behavior? 

      It is correct that the added ECS and F neurons did not respond to the CS or US (UCS); they are constructed to be firing at 11 Hz in the absence of any connections from other cells. These cells were included to be part of our computation of the LFP. Specifically, adding in those cells would make the LFP take inhibition into account more, and we wanted to make sure that we were not biasing our computation away from the effects of inhibition. As shown in the paper (Fig. 6B), even with inhibition onto these non-responsive cells, the LFP has the properties claimed in the paper concerning the changes in the low-theta and high-theta power, because the LFP is dominated by new excitation rather than the inhibition.

      First, in the Results section “Network with multiple heterogeneous neurons can establish the association between CS and fear”, we commented on the added ECS and F neurons that do not respond to either CS or US by saying the following:

      “The ECS cells not receiving CS are inhibited by ongoing PV activity during the disinhibition window (Fig. 5B); they are constructed to be firing at 11 Hz in the absence of any connections from other cells. The lack of activity in those cells during fear conditioning implies that there is no plasticity from those ECS cells to the active F. Those cells are included for the calculation of the LFP (see below in “Increased low-theta frequency is a biomarker of fear learning”.)”

      Furthermore, we add the following sentence in the Results section “Increased low-theta frequency is a biomarker of fear learning”: 

      “The additional unresponsive ECS and F cells in the network were included to ensure we had not biased the LFP towards excitation.”

      (3) Applied currents are given as current densities, but these are difficult to compare with current levels observed from whole-cell patch clamp recordings. Can the currents be given as absolute levels, in pA/nA?

      In principle, it is possible to connect current densities with absolute levels, as requested. However, we note that the number of cells in models is orders of magnitude smaller than the number being modeled. It is common in modeling to adjust physiological parameters to achieve the qualitative properties that are important to the model, rather than trying to exactly match particular recordings.

      We added to the Methods description why we choose units per unit area, rather than absolute units. 

      “All the currents are expressed in units per area, rather than absolute units, to avoid making assumptions about the size of the neuron surface.”
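      For readers who nonetheless want an order-of-magnitude comparison with patch-clamp values, the conversion is a multiplication by an assumed membrane area. The sketch below is hypothetical throughout: it assumes the densities are in µA/cm² (a common convention in conductance-based models, not confirmed here), and the surface area is an illustrative value, not a parameter of the model.

```python
def density_to_pA(j_uA_per_cm2, area_um2):
    """Convert a current density (µA/cm²) to an absolute current (pA).

    area_um2 is an assumed membrane surface area in µm²; the model itself
    does not specify one.
    """
    area_cm2 = area_um2 * 1e-8          # 1 µm² = 1e-8 cm²
    current_uA = j_uA_per_cm2 * area_cm2
    return current_uA * 1e6             # 1 µA = 1e6 pA

if __name__ == "__main__":
    # Illustrative only: 1 µA/cm² over an assumed 10,000 µm² surface.
    print(density_to_pA(1.0, 1e4), "pA")
```

      The result scales linearly with the assumed area, which is why expressing currents per unit area avoids committing to a particular cell size.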

      (4) Regarding: "We note that the presence of SOM cells is crucial for plasticity in our model since they help to produce the necessary pauses in the excitatory projection cell activity. However, the high theta rhythm they produce is not crucial to the plasticity: in our model, high theta or higher frequency rhythms in SOM cells are all conducive to associative fear learning. This opens the possibility that the high theta rhythm in the BLA mostly originates in the prefrontal cortex and/or the hippocampus (Stujenske et al., 2014, 2022)." The chain of reasoning in the above statement is unclear. The second sentence seems to be saying contradictory things. 

      We agree that the sentence was confusing; thank you for pointing it out. We have revised the paragraph to make our point clearer. The central points are: 1) having the SOM cells in the BLA is critical to the plasticity in the model, and 2) these cells may or may not be the source of the high theta observed in the BLA during fear learning.

      We deleted from the discussion the text reported by the Reviewer, and we added the following one to make this point clearer:

      “We note that the presence of SOM cells is crucial for plasticity in our model since they help to produce the necessary pauses in the excitatory projection cell activity. The BLA SOM cells do not necessarily have to be the only source of the high theta observed in the BLA during fear learning; the high theta detected in the LFP of the BLA also originates from the prefrontal cortex and/or the hippocampus (Stujenske et al., 2014, 2022).”

      (5) Regarding: "This suggests low theta power change is not just an epiphenomenon but rather a biomarker of successful fear conditioning." Not sure this is the right framing for the above statement. The power of the theta signal in the LFP reflects the strengthening of connections, but it itself does not have an impact on network activity. Moreover, whether something is epiphenomenal is not relevant to the question of whether it can serve as a successful biomarker. A biomarker just needs to be indicative, not causal. 

      We intended to say why the low theta power change is a biomarker in the sense of the Reviewer. That is: experiments have shown that, with learning, the low theta power increases. The modeling shows in addition that, when learning does not take place, the low theta power does not increase. That means that the low theta power increases if and only if there is learning, i.e., the change in low theta power is a biomarker. To make our meaning clearer, we have changed the quoted sentences to read:

      “This suggests that the low theta power change is a biomarker of successful fear conditioning: it occurs when there is learning and does not occur when there is no learning.”

      Reviewer #2 (Public Comments): 

      We thank the Reviewer for raising these interesting points. Below are our public replies and the changes we made to the manuscript to address the Reviewer’s objections.

      (1) Gamma oscillations are generated locally; thus, it is appropriate to model in any cortical structure. However, the generation of theta rhythms is based on the interplay of many brain areas therefore local circuits may not be sufficient to model these oscillations.

      Moreover, to generate the classical theta, a laminar structure arrangement is needed (where neurons form layers like in the hippocampus and cortex) (Buzsaki, 2002), which is clearly not present in the BLA. To date, I am not aware of any study which has demonstrated that theta is generated in the BLA. All studies that recorded theta in the BLA performed the recordings referenced to a ground electrode far away from the BLA, an approach that can easily pick up volume conducted theta rhythm generated e.g., in the hippocampus or other layered cortical structure. To clarify whether theta rhythm can be generated locally, one should have conducted recordings referenced to a local channel (see Lalla et al., 2017 eNeuro). In summary, at present, there is no evidence that theta can be generated locally within the BLA. Although there can be BLA neurons whose firing shows theta rhythmicity, e.g., driven by hippocampal afferents at theta rhythm, this does not mean that theta rhythm per se can be generated within the BLA, as the structure of the BLA does not support generation of rhythmic current dipoles. This questions the rationale of using theta as a proxy for BLA network function, which does not necessarily reflect the population activity of local principal neurons, in contrast to that seen in the hippocampus.

      In both modeling and experiments, a laminar structure does not seem to be needed to produce a theta rhythm. A recent experimental paper (Antonoudiou et al. 2022) suggests that the BLA can intrinsically generate theta oscillations (3-12 Hz) detectable by LFP recordings under certain conditions, such as reduced inhibitory tone. The authors draw this conclusion from ex vivo slices from mice. The currents that generate these rhythms are in the BLA, since the hippocampus was removed to eliminate hippocampal volume conduction and other nearby brain structures did not display any oscillatory activity. Also, in the modeling literature, there are multiple examples of the production of theta rhythms in small networks not involving layers; these papers explain the mechanisms producing theta from non-laminated structures (Dudman et al., 2009, Kispersky et al., 2010, Chartove et al. 2020). We are not aware of any model description of the mechanisms of theta that requires layers.

      We added the following text in the introduction of the manuscript to make this point clearer:  “A recent rodent experimental study (Antonoudiou et al. 2022) suggests that BLA can intrinsically generate theta oscillations (3-12 Hz).”

      (2) The authors distinguished low and high theta. This may be misleading, as the low theta they refer to is basically a respiratory-driven rhythm typically present during an attentive state (Karalis and Sirota, 2022; Bagur et al., 2021, etc.). Thus, it would be more appropriate to use breathing-driven oscillations instead of low theta. Again, this rhythm is not generated by the BLA circuits, but by volume conducted into this region. Yet, the firing of BLA neurons can still be entrained by this oscillation. I think it is important to emphasize the difference.

      Many rhythms of the nervous system can be generated in multiple parts of the brain by multiple mechanisms. We do not dispute that low theta appears in the context of respiration; however, this does not mean that other rhythms with the same frequencies are driven by respiration. Indeed, in the response to question 1 above, we showed that theta can appear in the BLA without inputs from other regions. In our paper, the low theta is generated in the BLA by VIP neurons. Using intrinsic currents known to exist in VIP neurons (Porter et al., 1998), modeling has shown that such neurons can intrinsically produce a low theta rhythm. This is also shown in the current paper. This example is part of a substantial literature showing that there are multiple mechanisms for any given frequency band. 

      To elaborate more on this in the manuscript, we added the following new section in the discussion:

      “Where the rhythms originate, and by what mechanisms. A recent experimental paper, (Antonoudiou et al. 2022), suggests that the BLA can intrinsically generate theta oscillations (3-12 Hz) detectable by LFP recordings under certain conditions, such as reduced inhibitory tone. They draw this conclusion in mice by removing the hippocampus, which can volume conduct to BLA, and noticing that other nearby brain structures did not display any oscillatory activity. Our model also supports the idea that intrinsic mechanisms in the BLA can support the generation of the low theta, high theta, and gamma rhythms. 

      Although the BLA can produce these rhythms, this does not rule out that other brain structures also produce the same rhythms through different mechanisms, and these can be transmitted to the BLA. Specifically, it is known that the olfactory bulb produces and transmits the respiratory-related low theta (4 Hz) oscillations to the dorsomedial prefrontal cortex, where it organizes neural activity (Bagur et al., 2021). Thus, the respiratory-related low theta may be captured by BLA LFP because of volume conduction or through the BLA's extensive communication with the prefrontal cortex. Furthermore, high theta oscillations are known to be produced by the hippocampus during various brain functions and behavioral states, including during spatial exploration (Vanderwolf, 1969) and memory formation/retrieval (Raghavachari et al., 2001), which are both involved in fear conditioning. Similarly to the low theta rhythm, the hippocampal high theta can manifest in the BLA. It remains to be understood how these other rhythms may interact with the ones described in our paper.”

      We also note that the presence of D-currents in the BLA VIP interneurons should be confirmed experimentally, and that the ability of VIP interneurons to generate the BLA low theta rhythm constitutes a prediction of our computational model. These points are specified in the first paragraph in the Discussion entitled “Assumptions and predictions of the model”:

      “The interneuron descriptions in the model were constrained by the electrophysiological properties reported in response to hyperpolarizing currents (Sosulina et al., 2010). Specifically, we modeled the three subtypes of VIP, SOM, and PV interneurons displaying bursting behavior, regular spiking with early spike-frequency adaptation, and regular spiking without spike-frequency adaptation, respectively. Focusing on VIP interneurons, we were able to model the bursting behavior by including the D-type potassium current. This current is thought to exist in the VIP interneurons in the cortex (Porter et al., 1998), but whether this current is also found in the VIP interneurons in the BLA is still unknown. Similarly, we endowed the SOM interneurons with NaP- and H-currents, as in the OLM cells in the hippocampus. Due to these currents, the VIP and SOM cells are able to show low- and high-theta oscillations, respectively. The presence of these currents and the neurons’ ability to exhibit oscillations in the theta range during fear conditioning and at baseline in BLA, which are assumptions of our model, should be tested experimentally.”

      (3) The authors implemented three interneuron types in their model, ignoring a large fraction of GABAergic cells present in the BLA (Vereczki et al., 2021). Recently, the microcircuit organization of the BLA has been more thoroughly uncovered, including connectivity details for PV+ interneurons, firing features of neurochemically identified interneurons (instead of mRNA expression-based identification, Sosulina et al., 2010), synaptic properties between distinct interneuron types as well as principal cells and interneurons using paired recordings. These recent findings would be vital to incorporate into the model instead of using results obtained in the hippocampus and neocortex. I am not sure that a realistic model can be achieved by excluding many interneuron types.

      The interneurons and connectivity that we used were inspired by the functional connectivity reported in (Krabbe et al., 2019) (see above answer to Reviewer #1). As reported in (Vereczki et al., 2021), there are multiple categories and subcategories of interneurons; that paper does not report on which ones are essential for fear conditioning. We did use all the highly represented categories of the interneurons, except NPY-containing neurogliaform cells.

      The Reviewer says “I am not sure that a realistic model can be achieved by excluding many interneuron types”. We agree with the Reviewer that omitting other interneuron subtypes and a description of more specific connectivity (soma-, dendrite-, and axon-targeting connections) may limit the ability of our model to describe all the details in the BLA. However, this work represents a first effort towards a biophysically detailed description of the BLA rhythms and their function. As in any modeling approach, assumptions about what to describe and test are determined by the scientific question; details postulated to be less relevant are omitted to obtain clarity. The interneuron subtypes we modeled, especially VIP+ and PV+, have been reported to have a crucial role in fear conditioning (Krabbe et al., 2019). Other interneurons, e.g. cholecystokinin and SOM+, have been suggested as essential in fear extinction. Thus, in the follow-up of this work to explain fear extinction, we will introduce other cell types and connectivity. In the current work, we have achieved our goals of explaining the origin of the experimentally found rhythms and their roles in the production of plasticity underlying fear learning. Of course, a more detailed model may reveal flaws in this explanation, but this is science that has not yet been done.

      We elaborate more on this in a new section in the Discussion entitled “Assumptions and predictions of the model”. The paragraph related to this point reads as follows:

      “Our model, which is a first effort towards a biophysically detailed description of the BLA rhythms and their functions, does not include the neuron morphology, many other cell types, conductances, and connections that are known to exist in the BLA; models such as ours are often called “minimal models” and constitute the majority of biologically detailed models. Such minimal models are used to maximize the insight that can be gained by omitting details whose influence on the answers to the questions addressed in the model are believed not to be qualitatively important. We note that the absence of these omitted features constitutes hypotheses of the model: we hypothesize that the absence of these features does not materially affect the conclusions of the model about the questions we are investigating. Of course, such hypotheses can be refuted by further work showing the importance of some omitted features for these questions and may be critical for other questions. Our results hold when there is some degree of heterogeneity of cells of the same type, showing that homogeneity is not a necessary condition.”

      (4) The authors set the reversal potential of GABA-A receptor-mediated currents to -80 mV. What was the rationale for choosing this value? The reversal potential of IPSCs has been found to be -54 mV in fast-spiking (i.e., parvalbumin) interneurons and around -72 mV in principal cells (Martina et al., 2001, Veres et al., 2017).

      A GABA-A reversal potential around -80 mV is common in the modeling literature (Jensen et al., 2005; Traub et al., 2005; Kumar et al., 2011; Chartove et al., 2020). Other computational models of the amygdala, e.g. (Kim et al., 2016), consider a GABA-A reversal potential of -75 mV based on the cortex (Durstewitz et al., 2000). The papers cited by the reviewer have a GABA-A reversal potential of -72 mV for synapses onto pyramidal cells; this is sufficiently close to our model that it is not likely to make a difference. For synapses onto PV+ cells, the papers cited by the reviewer suggest that the GABA-A reversal potential is -54 mV; such a reversal potential would lead these synapses to be excitatory instead of inhibitory. However, it is known (Krabbe et al., 2019; Supp. Fig. 4b) that such synapses are in fact inhibitory. Thus, we wonder if the measurements of Martina and Veres were made in a condition very different from that of Krabbe. For all these reasons, we consider a GABA-A reversal potential around -80 mV in the amygdala to be a reasonable assumption.
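
      The role of the reversal potential in this argument can be illustrated with the standard conductance-based synaptic current, I = g (V - E). The sketch below is only an illustration and is not taken from the model: the membrane potential of -65 mV and the unit conductance are assumed values, chosen to show how the sign of the driving force changes across the three reversal potentials discussed above.

```python
# Driving force of a conductance-based synapse: I_syn = g * (V - E_rev).
# Sign convention used here: positive I_syn is an outward (hyperpolarizing) current.

def synaptic_current(g_ns, v_mv, e_rev_mv):
    """Return the synaptic current in pA for conductance g (nS) at voltage V (mV)."""
    return g_ns * (v_mv - e_rev_mv)

V_MEMBRANE = -65.0  # assumed membrane potential (mV); illustrative, not from the model
G_SYN = 1.0         # assumed unit conductance (nS); illustrative, not from the model

for e_rev in (-80.0, -72.0, -54.0):
    i_syn = synaptic_current(G_SYN, V_MEMBRANE, e_rev)
    effect = "hyperpolarizing" if i_syn > 0 else "depolarizing"
    print(f"E_rev = {e_rev:6.1f} mV -> I_syn = {i_syn:6.1f} pA ({effect})")
```

      At this assumed membrane potential, reversal potentials of -80 mV and -72 mV both give a hyperpolarizing current, whereas -54 mV gives a depolarizing one. Whether a depolarizing GABA-A current is functionally excitatory also depends on spike threshold and shunting, which this sketch does not address.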

      In section “Network connectivity and synaptic currents” in “Materials and Methods” we provided references to motivate our choice of considering a GABA-A reversal potential around -80 mV:

      “The GABAa current reversal potential (E) is set to −80 mV, as is common in the modeling literature (Jensen et al., 2005; Traub et al., 2005; Kumar et al., 2011; Chartove et al., 2020).”

      (5) Proposing neuropeptide VIP as a key factor for learning is interesting. Though, it is not clear why this peptide is more important in fear learning in comparison to SST and CCK, which are also abundant in the BLA and can effectively regulate the circuit operation in cortical areas.

      Other peptides seem to be important in overall modulation of fear, but VIP is especially important in the first part of fear learning, the subject of our paper. Regarding SST, we hypothesize that SST interneurons are critical in fear extinction and in preventing fear generalization, but not for initial fear learning. The peptide of the CCK neurons, which overlap with VIP cells, has been proposed to promote the switch between fear and safety states after fear extinction (Krabbe et al. 2018). Thus, these other peptides are likely more important for other aspects of fear learning.

      In the Discussion, we have added:

      “We hypothesize that SST peptide is critical in fear extinction and in preventing fear generalization, but not for initial fear learning. Also, the CCK peptide has been proposed to promote the switch between fear and safety states after fear extinction (Krabbe et al. 2018).”

      Reviewer #2 (Recommendations For The Authors): 

      We note that Reviewer #2’s Recommendations For The Authors have the same content as the Public Comments. Thus, the changes to the manuscript we implemented above also address the private critiques listed below.

      (1) As the breathing-driven rhythm is a global phenomenon accompanying fear state, one might restrict the analysis to this oscillation. The rationale behind this restriction is that the 'high' theta in the BLA has an unknown origin (since it can originate from the ventral hippocampus, piriform cortex etc.).

      In response to point 4 made by Reviewer 1 (Recommendations for the Authors) (p. 13), referring to high theta in the BLA, we previously wrote: 1) having the SOM cells in the BLA is critical to the plasticity in the model, and 2) these cells may or may not be the source of the high theta observed in the BLA during fear learning.

      In the Public Critiques, Reviewer 2 relates the respiratory rhythm to the low theta. We answered this point in point 2 of the Reviewer’s Public Comments (at p. 15).

      (2) I would include more interneurons in the network model incorporating recent findings. 

      This point was answered in our response to point 3 of the Reviewer’s Public Comments.

      (3) The reversal potential for GABA-A receptor-mediated currents would be good to set to measured values. In addition, I would use AMPA conductance values that have been measured in the BLA. 

      We addressed this objection in our response to point 4 of the Reviewer’s Public Comments.

      Reviewer #3 (Public comments):

      Weaknesses: 

      (1) The main weakness of the approach is the lack of experimental data from the BLA to constrain the biophysical models. This forces the authors to use models based on other brain regions and leaves open the question of whether the model really faithfully represents the basolateral amygdala circuitry. 

      (2) Furthermore, the authors chose to use model neurons without a representation of the morphology. However, given that PV+ and SOM+ cells are known to preferentially target different parts of pyramidal cells and given that the model relies on a strong inhibition from SOM to silence pyramidal cells, the question arises whether SOM inhibition at the apical dendrite in a model representing pyramidal cell morphology would still be sufficient to provide enough inhibition to silence pyramidal firing.

      (3) Lastly, the fear learning relies on the presentation of the unconditioned stimulus over a long period of time (40 seconds). The authors justify this long-lasting input as reflecting not only the stimulus itself but as a memory of the US that is present over this extended time period. However, the experimental evidence for this presented in the paper is only very weak.

      We are repeating here the answers we gave in response to the public comments, adding further relevant points.

      (1) Our neurons were constrained by electrophysiological properties in response to hyperpolarizing currents in the BLA (Sosulina et al., 2010). We can reproduce these electrophysiological properties by using specific membrane currents known to be present in similar neurons in other brain regions (D-current in VIP interneurons in the cortex, and NaP- and H-currents in OLM/SOM cells in the hippocampus). Also, though a much more detailed description of BLA interneurons was given in (Vereczki et al., 2021), it is not clear that this level of detail is relevant to the questions that we were asking, especially since the experiments described were not done in the context of fear learning.

      (2) It is true that we did not include the morphology, which undoubtedly makes a difference to some aspects of the circuit dynamics. Furthermore, it is correct that the model relies on a strong inhibition from SOM and PV to silence the excitatory projection neurons. We agree that the placement of the SOM inhibition on the pyramidal neurons can make a difference on some aspects of the circuit behavior. We are assuming that the inhibition from the SOM cells can inhibit the pyramidal cells firing, which can be seen as a hypothesis of our model. It is well known that VIP cells disinhibit pyramidal cells through inhibition of SOM and PV cells (Krabbe et al. 2019); hence, this hypothesis is generally believed. This choice of parameters comes from using simplified models: it is standard in modeling to adjust parameters to compensate for simplifications.

      Re points 1) and 2), in a new paragraph (“Assumptions and predictions of the model”) in the Discussion reported in response to Reviewer #2 (public comments)’s point 3, we stated that modeling requires the omission of many details to bring out the significance of other details.

      (3) 40 seconds is the temporal interval we decided to use to present the results. In the Results, we also showed that there is learning over a shorter interval of time (15 seconds) where CS and US/memory of US should both be present. Thus, our model requires 15 seconds over a single or multiple trials for associative learning to be established. We included references to additional experimental papers to support our reasoning in the last paragraph of section “Assumptions and predictions of the model” in the Discussion, also reported in response to Reviewer #1 point 2 (Recommendations for the Authors). We said there that some form of memory or overlap in the activity of the excitatory projection neurons is necessary for spike-timing-dependent plasticity.

      The authors achieved the aim of constructing a biophysically detailed model of the BLA not only capable of fear learning but also showing spectral signatures seen in vivo. The presented results support the conclusions with the exception of a potential alternative circuit mechanism demonstrating fear learning based on a classical Hebbian (i.e. non-depression-dominated) plasticity rule, which would not require the intricate interplay between the inhibitory interneurons. This alternative circuit is mentioned but a more detailed comparison between it and the proposed circuitry is warranted.

      Our model accounts for the multiple rhythms observed in the context of fear learning, as well as the known involvement of multiple kinds of interneurons. We did not say explicitly enough why our complicated model may be functionally important in ways that cannot be fulfilled with a simpler model with the non-depression-dominated Hebbian rule. To explain this, we have added the following in the manuscript discussion:

      “Although fear learning can occur without the depression-dominated rule, we hypothesize that it is necessary for other aspects of fear learning and regulation. That is, in pathological cases, there can be overgeneralization of learning. We hypothesize that the modulation created by the involvement of these interneurons is normally used to prevent such overgeneralization. However, this is beyond the scope of the present paper.”

      We have also written an extra paragraph about generalization in the Discussion “Synaptic plasticity in our model”:

      “With the classical Hebbian plasticity rule, we show that learning can occur without the involvement of the VIP and SOM cells. Although fear learning can occur without the depression-dominated rule, we hypothesize that the latter is necessary for other aspects of fear learning and regulation. Generalization of learning can be pathological, and we hypothesize that the modulation created by the involvement of VIP and SOM interneurons is normally used to prevent such overgeneralization. However, in some circumstances, it may be desirable to account for many possible threats, and then a classical Hebbian plasticity rule could be useful. We note that the involvement or not of the VIP-SOM circuit has been implicated when there are multiple strategies for solving a task (Piet et al., 2024). In our situation, the nature of the task (including reward structure) may determine whether the learning rule is depression-dominated and therefore whether the VIP-SOM circuit plays an important role.”

      Reviewer #3 (Recommendations For The Authors): 

      We thank the Reviewer for all the recommendations. We replied to each of them below.

      In general, there are some inconsistencies in the naming (e.g. sometimes you write PV sometimes PV+,...), please use consistent abbreviations throughout the manuscript. You also introduce some of the abbreviations multiple times. 

      We modified the manuscript to remove all the inconsistencies in the naming. 

      Introduction: 

      - In the last section you speak about one recent study but actually cite two articles. 

      We removed the reference to (Perrenoud and Cardin, 2023), which is a commentary on the Veit et al. article.

      Results: 

      - 'Brain rhythms are thought to be encoded and propagated largely by interneurons' What do you mean by encoded here? 

      We agree with the Reviewer that the verb “to encode” is not accurate. We modified the sentence as follows:

      “Brain rhythms are thought to be generated and propagated largely by interneurons”.

      - The section 'Interneurons interact to modulate fear neuron output' could be clearer. Start with describing the elements of the circuit, then the rhythms in the baseline. 

      We reorganized the section as follows:

      “Interneurons interact to modulate fear neuron output. Our BLA network consists of interneurons, detailed in the previous section, and excitatory projection neurons (Fig. 2A). Both the fear-encoding neuron (F), an excitatory projection neuron, and the VIP interneuron are activated by the noxious stimulus US (Krabbe et al., 2019). As shown in Fig. 2A (top, right), VIP disinhibits F by inhibiting both SOM and PV, as suggested in (Krabbe et al., 2019). We do not include connections from PV to SOM and VIP, nor connections from SOM to PV and VIP, since those connections have been shown to be significantly weaker than the ones included (Krabbe et al., 2019). The simplest network we consider is made of one neuron for each cell type. We introduce a larger network with some heterogeneity in the last two sections of the Results.

      Fig. 2A (bottom) shows a typical dynamic of the network before and after the US input onset, with US modeled as a Poisson spike train at ~50 Hz; the network produces all the rhythms originating from the interneurons alone or through their interactions with the excitatory projection neurons (shown in Fig. 1). Specifically, since VIP is active at low theta during both rest and upon the injection of US, it then modulates F at low theta cycles via SOM and PV. In the baseline condition, the VIP interneuron has short gamma bursts nested in low theta rhythm. With US onset, VIP increases its burst duration and the frequency of low theta rhythm. These longer bursts make the SOM cell silent for long periods of each low theta cycle, providing F with windows of disinhibition and contributing to the abrupt increase in activity right after the US onset. Finally, in Fig. 2A, PV lacks any external input and fires only when excited by F. Thanks to their reciprocal interactions, PV forms a PING rhythm with F, as depicted in Fig. 1C.”
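
      As an illustration of the input described above, a homogeneous Poisson spike train can be generated by drawing exponentially distributed inter-spike intervals. This is a minimal sketch, not the authors' implementation: the function name, the seed, and the use of a constant rate are assumptions; only the ~50 Hz rate and the 40-second presentation come from the text.

```python
import random

def poisson_spike_train(rate_hz, duration_s, seed=None):
    """Return sorted spike times (s) of a homogeneous Poisson process.

    Inter-spike intervals are drawn from an exponential distribution
    with mean 1 / rate_hz until the requested duration is exceeded.
    """
    rng = random.Random(seed)
    spikes = []
    t = 0.0
    while True:
        t += rng.expovariate(rate_hz)  # next inter-spike interval
        if t >= duration_s:
            break
        spikes.append(t)
    return spikes

# Hypothetical usage: a ~50 Hz input lasting 40 s.
us_spikes = poisson_spike_train(rate_hz=50.0, duration_s=40.0, seed=0)
print(f"{len(us_spikes)} spikes, mean rate {len(us_spikes) / 40.0:.1f} Hz")
```

      The realized rate fluctuates around the nominal 50 Hz from one random seed to the next, which is the expected behavior of a Poisson input.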

      - Figure 3C: The lower dashed line has the tick label '0.37' which should read '0.037'. 

      We fixed it.

      - The section describing the network with multiple neurons could be clearer, especially, it is not really clear how these different ECS and F neurons receive their input. 

      We answered the same objection in the reply to Reviewer #1 in point 2 under “minor issues.”

      Discussion: 

      - The paragraph 'It has also been suggested that ventral tegmental area has a role in fear expression (Lesas et al., 2023). Furthermore, it has been reported that the prelimbic cortex (PL) modulates the BLA SOM cells during fear retrieval, and the latter cells are crucial to discriminate non-threatening cues when desynchronized by the PL inputs (Stujenske et al., 2022).' is merely stating facts but I don't see how they relate to the presented work.

      We thank the Reviewer for pointing out that this was confusing. What we meant to emphasize was that later stages of fear conditioning and extinction appear to require more than the BLA. We specifically mention the discrimination of non-threatening cues at the end of the paragraph, which now reads as follows:

      “Other brain structures may be involved in later stages of fear responsiveness, such as fear extinction and prevention of generalization. It has been reported that the prelimbic cortex (PL) modulates the BLA SOM cells during fear retrieval, and the latter cells are crucial to discriminate non-threatening cues when desynchronized by the PL inputs (Stujenske et al., 2022). Brain structures such as the prefrontal cortex and hippocampus have been documented to play a crucial role also in fear extinction, the paradigm following fear conditioning aimed at decrementing the conditioned fearful response through repeated presentations of the CS alone. As reported by several studies, fear extinction suppresses the fear memory through the acquisition of a distinct memory, instead of through the erasure of the fear memory itself (Harris et al., 2000; Bouton, 2002; Trouche et al., 2013; Thompson et al., 2018). Davis et al., 2017 found a high theta rhythm following fear extinction that was associated with the suppression of threat in rodents. Our model can be extended to include structures in the prefrontal cortex and the hippocampus to further investigate the role of rhythms in the context of discrimination of non-threatening cues and extinction. We hypothesize that a different population of PV interneurons plays a crucial role in mediating competition between fearful memories, associated with a low theta rhythm, and safety memories, associated with a high theta rhythm; supporting experimental evidence is in (Lucas et al., 2016; Davis et al., 2017; Chen et al., 2022).”

      - The comparison to other BLA models is quite short and seems a bit superficial. A more in-depth comparison seems warranted.

      We thank the reviewer for suggesting that a more in-depth comparison between our and other models in the literature would improve the manuscript. We rewrote entirely the first paragraph of that section. The new content reads as follows:

      “Comparison with other models. Many computational models that study fear conditioning have been proposed in recent years; the list includes biophysically detailed models (e.g., (Li 2009; Kim et al., 2013a)), firing rate models (e.g., Krasne 2011; Ball 2012; Vlachos 2011), and connectionist models (e.g., Moustafa 2013; Armony 1997; Edeline 1992) (for a review see (Nair et al., 2016)). Both firing rate models and connectionist models use an abstract description of the interacting neurons or regions. The omission of biophysical details prevents such models from addressing questions concerning the roles of dynamics and biophysical details in fear conditioning, which is the aim of our model. There are also biophysically detailed models (Li 2009; Kim 2013; Kim 2016; Feng 2019), which differ from ours in both the physiology included in the model and the description of how plastic changes take place. One main difference in the physiology is that we differentiated among types of interneurons, since the fine timing produced for the latter was key to our use of rhythms to produce spike-timing-dependent plasticity. The origin of the gamma rhythm (but not the other rhythms) was investigated in Feng et al., 2019, but none of these papers connected the rhythms to plasticity.

      The most interesting difference between our work and that in (Li 2009; Kim 2013; Kim 2016) is the modeling of plasticity.  We use spike-time dependent plasticity rules.  The models in (Li 2009; Kim 2013; Kim 2016) were more mechanistic about how the plasticity takes place, starting with the known involvement of calcium with plasticity.  Using a hypothesis about back propagation of spikes, the set of papers together come up with a theory that is consistent with STDP and other instantiations of plasticity (Shouval 2002a; Shouval 2002b).  For the purposes of our paper, this level of detail, though very interesting, was not necessary for our conclusions.  By contrast, in order for the rhythms and the interneurons to have the dynamic roles they play in the model, we needed to restrict our STDP rule to ones that are depression-dominated.  Our reading of (Shouval 2002) suggests to us that such subrules are possible outcomes of the general theory.  Thus, there is no contradiction between the models, just a difference in focus; our focus was on the importance of the much-documented rhythms (Seidenbecher et al., 2003; Courtin et al., 2014b; Stujenske et al., 2014; Davis et al., 2017) in providing the correct spike timing.  We showed in the Supplementary Information (“Classical Hebbian plasticity rule, unlike the depression-dominated one, shows potentiation even with no strict pre and postsynaptic spike timing”) that if the STDP rule was not depression dominated, the rhythms need not be necessary.  We hypothesize that the necessity of strict timing enforced by the depression-dominated rule may foster the most appropriate association with fear at the expense of less relevant associations.”
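
      The distinction drawn above between a depression-dominated rule and a classical Hebbian rule can be sketched with a generic pair-based exponential STDP window. The amplitudes and time constants below are illustrative assumptions and are not the parameters of the model; the sketch only shows that, when depression outweighs potentiation over the whole window, untimed spike pairs produce net depression while strictly timed pre-before-post pairs still potentiate.

```python
import math

def stdp_dw(dt_ms, a_plus, a_minus, tau_plus_ms, tau_minus_ms):
    """Pair-based STDP weight change for dt = t_post - t_pre (ms)."""
    if dt_ms > 0:   # pre before post: potentiation
        return a_plus * math.exp(-dt_ms / tau_plus_ms)
    if dt_ms < 0:   # post before pre: depression
        return -a_minus * math.exp(dt_ms / tau_minus_ms)
    return 0.0

def mean_dw_untimed(params, window_ms=100, step_ms=1):
    """Average weight change when dt is spread evenly over [-window, +window]."""
    dts = range(-window_ms, window_ms + 1, step_ms)
    return sum(stdp_dw(dt, **params) for dt in dts) / len(dts)

# Hypothetical parameter sets (assumptions, not taken from the manuscript).
HEBBIAN = dict(a_plus=1.0, a_minus=0.5, tau_plus_ms=20.0, tau_minus_ms=20.0)
DEPRESSION_DOMINATED = dict(a_plus=1.0, a_minus=1.0, tau_plus_ms=10.0, tau_minus_ms=40.0)

for name, params in (("Hebbian", HEBBIAN), ("depression-dominated", DEPRESSION_DOMINATED)):
    print(f"{name}: untimed mean dw = {mean_dw_untimed(params):+.3f}, "
          f"dw at dt = +5 ms = {stdp_dw(5, **params):+.3f}")
```

      Under these assumed parameters, the Hebbian set yields a positive mean change without any timing constraint, whereas the depression-dominated set yields potentiation only when the pre-post interval is short and positive.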

      - The paragraph 'This could happen among some cells responding to weaker sensory inputs that do not lead to pre-post timing with fear neurons. This timing could be modified by the "tri-conditional rule", as suggested in (Grewe et al., 2017).' is not very clear. What exactly is 'this' in the first sentence referring to? If you mention the 'tri-conditional rule' here, please briefly explain it and how it would solve the issue at hand here.

      We apologize that the sentence reported was not sufficiently clear. “This” refers to “depression”. We meant that, in our model, depression during fear conditioning happens every time there is no pre-post timing between neurons encoding the neutral stimuli and fear cells; poor pre-post timing can characterize the activity of neurons responding to weaker sensory inputs and does not lead to associative learning. We modified that paragraph as follows:

      “The study in (Grewe et al., 2017) suggests that associative learning resulting from fear conditioning induces both potentiation and depression among coactive excitatory neurons; coactivity was determined by calcium signaling and thus did not allow measurements of fine timing between spikes. In our model, we show how potentiation between coactive cells occurs when strict pre-post spike timing and appropriate pauses in the spiking activity arise. Depression happens when one or both of these components are not present. Thus, in our model, depression represents the absence of successful fear association and does not take part in the reshaping of the ensemble encoding the association, as instead suggested in (Grewe et al., 2017). A possible follow-up of our work involves investigating how fear ensembles form and modify through fear conditioning and later stages. This follow-up work may involve using a tri-conditional rule, as suggested in (Grewe et al. 2017), in which the potential role of neuromodulators is taken into account in addition to the pre- and postsynaptic neuron activity; this may lead to both potentiation and depression in establishing an associative memory.”

      - In the limitations and caveats section you mention that the small size of the network implies that they represent a synchronous population. What are the potential implications for the proposed rhythm-dependent mechanism? What are your expectations for larger networks? 

      We apologize if we were not adequately clear. We are guessing that the Reviewer thought we meant the entire population was synchronous, which it is not. We meant that, when we use a single cell to represent a subpopulation of cells of that type, that subpopulation is effectively synchronous. For larger networks in which each subtype is represented by many cells, there can be heterogeneity within each subtype. We have shown in the paper that the basic results still hold under some heterogeneity; however, they may fail if the heterogeneity is too large.

      We now mention this in a new section named “Assumptions and predictions of the model”, written in response to point 3 made by Reviewer #2.

      - The discussion is also missing a section on predictions/new experiments that can be derived from the model. How can the model be confirmed, what experiments/results would break the model? 

      To answer this question, we added a new section in the Discussion entitled “Assumptions and predictions of the model”. The first paragraph of this section is in the reply to Reviewer #2 point 2; the second paragraph is in the reply to Reviewer #2 point 3; the last paragraph is in the reply to Reviewer #1 point c; the rest of the section reads as follows:

      “Our study suggests that all the interneurons are necessary for associative learning provided that the STDP rule is depression-dominated. This prediction could be tested experimentally by selectively silencing each interneuron subtype in the BLA: if the associative learning is hampered by silencing any of the interneuron subtypes, this validates our study. Finally, the model prediction could be tested indirectly by acquiring more information about the plasticity rule involved in the BLA during associative learning. We found that all the interneurons are necessary to establish fear learning only in the case of a depression-dominated rule. This rule ensures that fine timing and pauses are always required for potentiation: interneurons provide both fine timing and pauses to pyramidal cells, making them crucial components of the fear circuit. 

      The modeling of the interneurons assumes the involvement of various intrinsic currents; the inclusion of those currents can be considered hypotheses of the model. Our model predicts that blockade of D-current in VIP interneurons (or silencing VIP interneurons) will both diminish low theta and prevent fear learning. Finally, the model assumes the absence of significantly strong connections from the excitatory projection cells ECS to PV interneurons, unlike the ones from F to PV. Including those synapses would alter the PING rhythm created by the interactions between F and PV, which is crucial for fine timing between ECS and F needed for LTP.”
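      The depression-dominated STDP rule discussed in this section can be illustrated with a minimal pair-based sketch. It is not the authors' implementation: the kernel shape, the amplitudes `a_plus` and `a_minus`, the time constants, and the spike trains below are all assumptions, chosen only so that the integrated depression exceeds the integrated potentiation. Under these assumptions, tight pre-post timing yields net potentiation only when the spike pairs are separated by pauses.

```python
import math

def stdp_kernel(dt_ms, a_plus=0.010, a_minus=0.012, tau_plus=20.0, tau_minus=40.0):
    """Weight change for one spike pair, with dt_ms = t_post - t_pre.

    Hypothetical parameters: a_minus * tau_minus > a_plus * tau_plus,
    so the rule is depression-dominated when integrated over all lags.
    """
    if dt_ms > 0:   # pre before post: potentiation
        return a_plus * math.exp(-dt_ms / tau_plus)
    if dt_ms < 0:   # post before pre: depression
        return -a_minus * math.exp(dt_ms / tau_minus)
    return 0.0

def net_weight_change(pre_spikes, post_spikes):
    """Sum the kernel over all pre/post spike pairs (all-to-all pairing)."""
    return sum(stdp_kernel(t_post - t_pre)
               for t_pre in pre_spikes
               for t_post in post_spikes)

# Pre leads post by 5 ms, with 100 ms pauses between pairs.
paused_pre = [0.0, 100.0, 200.0]
paused_post = [t + 5.0 for t in paused_pre]

# Same 5 ms lead, but both cells fire every 10 ms without pauses.
dense_pre = [10.0 * i for i in range(10)]
dense_post = [t + 5.0 for t in dense_pre]

with_pauses = net_weight_change(paused_pre, paused_post)
without_pauses = net_weight_change(dense_pre, dense_post)
print(f"with pauses: {with_pauses:+.4f}, without pauses: {without_pauses:+.4f}")
```

      In this toy setting the net change is positive for the paused trains and negative for the dense trains, even though the pre-post lead is 5 ms in both cases. The all-to-all pairing scheme is itself a simplifying assumption; other pairing schemes would change the numbers.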

    1. eLife assessment

      This work is an important contribution to the development of a biologically plausible theory of statistical modeling of spiking activity. The authors convincingly implemented the statistical inference of input likelihood in a simple neural circuit, demonstrating the relationship between synaptic homeostasis, neural representations, and computational accuracy. This work will be of interest to neuroscientists, both theoretical and experimental, who are exploring how statistical computation is implemented in neural networks. There are questions about the performance of the methods in the case where other biologically significant parameters, such as firing rate and thresholds, are optimized together with the synaptic weights.

    1. eLife assessment

      This study provides valuable new insights into how multisensory information is processed in the lateral cortex of the inferior colliculus, a poorly understood part of the auditory midbrain. By developing new imaging techniques that provide the first optical access to the lateral cortex in a living animal, the authors provide convincing in vivo evidence that this region contains separate subregions that can be distinguished by their sensory inputs and neurochemical profiles, as suggested by previous anatomical and in vitro studies. This work provides a foundation for future research exploring how this part of the auditory midbrain contributes to multisensory-based behavior.

    2. Reviewer #1 (Public Review):

      In this paper the authors provide a characterisation of auditory responses (tones, noise, and amplitude-modulated sounds) and bimodal (somatosensory-auditory) responses and interactions in the higher-order lateral cortex (LC) of the inferior colliculus (IC) and compare these characteristics with the higher-order dorsal cortex (DC) of the IC - in awake and anaesthetised mice. Dan Llano's group has previously identified GABAergic patches (modules) in the LC distinctly receiving inputs from somatosensory structures, surrounded by matrix regions receiving inputs from auditory cortex. They here use 2P calcium imaging combined with an implanted prism to - for the first time - get functional optical access to these subregions (modules and matrix) in the lateral cortex of IC in vivo, in order to also characterise the functional difference in these subparts of LC. They find that both DC and LC of both awake and anaesthetised mice appear to be more responsive to more complex sounds (amplitude-modulated noise) compared to pure tones and that under anesthesia the matrix of LC is more modulated by specific frequency and temporal content compared to the GABAergic modules in LC. However, while both LC and DC appear to have low-frequency preferences, this preference for low frequencies is more pronounced in DC. Furthermore, in both awake and anesthetized mice somatosensory inputs are capable of driving responses on their own in the modules of LC, but very little in the matrix. The authors now compare bimodal interactions under anaesthesia and awake states and find that effects are different in some cases under awake and anesthesia - particularly related to bimodal suppression and enhancement in the modules.

      The paper provides new information about how subregions with different inputs and neurochemical profiles in the higher order auditory midbrain process auditory and multisensory information, and is useful for the auditory and multisensory circuits neuroscience community.

      The manuscript is improved by the response to reviewers. The authors have addressed my comments by adding new figures and panels, streamlining the analysis between awake and anaesthetised data (which has led to a more nuanced, and better supported conclusion), and adding more examples to better understand the underlying data. In streamlining the analyses between anaesthetised and awake data I would probably have opted for bringing these results into merged figures to avoid repetitiveness and aid comparison, but I acknowledge that that may be a matter of style. The added discussions of differences between awake and anaesthesia in the findings and the discussion of possible reasons why these differences are present help broaden the understanding of what the data looks like and how anaesthesia can affect these circuits.

      As mentioned in my previous review, the strength of this study is in its demonstration of using prism 2p imaging to image the lateral shell of IC to gain access to its neurochemically defined subdivisions, and the authors use this method to provide a basic description of the auditory and multisensory properties of lateral cortex IC subdivisions (and compare it to dorsal cortex of IC). The added analysis, information and figures provide a more convincing foundation for the descriptions and conclusions stated in the paper. The description of the basic functionality of the lateral cortex of the IC is useful for researchers interested in basic multisensory interactions and auditory processing and circuits. The paper provides a technical foundation for future studies (as the authors also mention), exploring how these neurochemically defined subdivisions receiving distinct descending projections from cortex contribute to auditory and multisensory based behaviour.

      Minor comment:
      - The authors have now added statistics and figures to support their claims about tonotopy in DC and LC. This is what I asked for, and I think it allows readers to better understand the tonotopic organisation in these areas. One of the conclusions by the authors is that the quadratic fit is a better fit than a linear fit in DCIC. Given the new plots shown and previous studies this is likely true, though it is worth highlighting that adding parameters to a fitting procedure (as in the case when moving from linear to quadratic fit) will likely lead to a better fit due to the increased flexibility of the fitting procedure.
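      One common way to account for the extra flexibility of a model with more parameters is to penalize the number of fitted parameters, for example with the adjusted R². The sketch below uses the standard formula; the sample size and R² values are made-up numbers for illustration and are not taken from the manuscript.

```python
def adjusted_r_squared(r_squared, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_samples - n_predictors - 1 <= 0:
        raise ValueError("need more samples than fitted parameters")
    return 1.0 - (1.0 - r_squared) * (n_samples - 1) / (n_samples - n_predictors - 1)

# Hypothetical example: 50 cells, a linear fit (1 predictor: position)
# and a quadratic fit (2 predictors: position and position squared).
n_cells = 50
adj_linear = adjusted_r_squared(0.30, n_cells, n_predictors=1)
adj_quadratic = adjusted_r_squared(0.31, n_cells, n_predictors=2)
print(f"linear: {adj_linear:.3f}, quadratic: {adj_quadratic:.3f}")
```

      With these invented numbers the quadratic fit has the higher raw R² but the lower adjusted R², which is the situation the comment warns about. A large gain in raw R² would survive the penalty.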

    3. Reviewer #2 (Public Review):

      Summary:

      The study describes differences in responses to sounds and whisker deflections as well as combinations of these stimuli in different neurochemically defined subsections of the lateral and dorsal cortex of the inferior colliculus in anesthetised and awake mice.

      Strengths:

      A major achievement of the work lies in obtaining the data in the first place as this required establishing and refining a challenging surgical procedure to insert a prism that enabled the authors to visualise the lateral surface of the inferior colliculus. Using this approach, the authors were then able to provide the first functional comparison of neural responses inside and outside of the GABA-rich modules of the lateral cortex. The strongest and most interesting aspects of the results, in my opinion, concern the interactions of auditory and somatosensory stimulation. For instance, the authors find that a) somatosensory-responses are strongest inside the modules and b) somatosensory-auditory suppression is stronger in the matrix than in the modules. This suggests that, while somatosensory inputs preferentially target the GABA-rich modules, they do not exclusively target GABAergic neurons within the modules (given that the authors record exclusively from excitatory neurons we wouldn't expect to see somatosensory responses if they targeted exclusively GABAergic neurons) and that the GABAergic neurons of the modules (consistent with previous work) preferentially impact neurons outside the modules, i.e. via long-range connections.

      Weaknesses:

      While the findings are of interest to the subfield they have only rather limited implications beyond it and the writing is not quite as precise as it could be.

    4. Reviewer #3 (Public Review):

      The lateral cortex of the inferior colliculus (LC) is a region of the auditory midbrain noted for receiving both auditory and somatosensory input. Anatomical studies have established that somatosensory input primarily impinges on "modular" regions of the LC, which are characterized by high densities of GABAergic neurons, while auditory input is more prominent in the "matrix" regions that surround the modules. However, how auditory and somatosensory stimuli shape activity, both individually and when combined, in the modular and matrix regions of the LC has remained unknown.

      The major obstacle to progress has been the location of the LC on the lateral edge of the inferior colliculus where it cannot be accessed in vivo using conventional imaging approaches. The authors overcame this obstacle by developing methods to implant a microprism adjacent to the LC. By redirecting light from the lateral surface of the LC to the dorsal surface of the microprism, the microprism enabled two-photon imaging of the LC via a dorsal approach in anesthetized and awake mice. Then, by crossing GAD-67-GFP mice with Thy1-jRGECO1a mice, the authors showed that they could identify LC modules in vivo using GFP fluorescence while assessing neural responses to auditory, somatosensory, and multimodal stimuli using Ca2+ imaging. Critically, the authors also validated the accuracy of the microprism technique by directly comparing results obtained with a microprism to data collected using conventional imaging of the dorsal-most LC modules, which are directly visible on the dorsal IC surface, finding good correlations between the approaches.

      Through this innovative combination of techniques, the authors found that matrix neurons were more sensitive to auditory stimuli than modular neurons, modular neurons were more sensitive to somatosensory stimuli than matrix neurons, and bimodal, auditory-somatosensory stimuli were more likely to suppress activity in matrix neurons and enhance activity in modular neurons. Interestingly, despite their higher sensitivity to somatosensory stimuli than matrix neurons, modular neurons in the anesthetized prep were overall more responsive to auditory stimuli than somatosensory stimuli (albeit with a tendency to have offset responses to sounds). This suggests that modular neurons should not be thought of as primarily representing somatosensory input, but rather as being more prone to having their auditory responses modified by somatosensory input. However, this trend was different in the awake prep, where modular neurons became more responsive to somatosensory stimuli. Thus, to this reviewer, one of the most intriguing results of the present study is the extent to which neural responses in the LC changed in the awake preparation. While this is not entirely unexpected, the magnitude and stimulus specificity of the changes caused by anesthesia highlight the extent to which higher-level sensory processing is affected by anesthesia and strongly suggests that future studies of LC function should be conducted in awake animals.

      Together, the results of this study expand our understanding of the functional roles of matrix and module neurons by showing that responses in LC subregions are more complicated than might have been expected based on anatomy alone. The development of the microprism technique for imaging the LC will be a boon to the field, finally enabling much-needed studies of LC function in vivo. The experiments were well-designed and well-controlled, the limitations of two-photon imaging for tracking neural activity are acknowledged, and appropriate statistical tests were used.

    5. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study provides important new insights into how multisensory information is processed in the lateral cortex of the inferior colliculus, a poorly understood part of the auditory midbrain. By developing new imaging techniques that provide the first optical access to the lateral cortex in a living animal, the authors provide convincing in vivo evidence that this region contains separate subregions that can be distinguished by their sensory inputs and neurochemical profiles, as suggested by previous anatomical and in vitro studies. Additional information and analyses are needed, however, to allow readers to fully appreciate what was done, and the comparison of multisensory interactions between awake and anesthetized mice would benefit from being explored in more detail.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this paper, the authors provide a characterisation of auditory responses (tones, noise, and amplitude-modulated sounds) and bimodal (somatosensory-auditory) responses and interactions in the higher-order lateral cortex (LC) of the inferior colliculus (IC) and compare these characteristics with the higher-order dorsal cortex (DC) of the IC - in awake and anaesthetised mice. Dan Llano's group has previously identified GABAergic patches (modules) in the LC distinctly receiving inputs from somatosensory structures, surrounded by matrix regions receiving inputs from the auditory cortex. They here use 2P calcium imaging combined with an implanted prism to - for the first time - get functional optical access to these subregions (modules and matrix) in the lateral cortex of IC in vivo, in order to also characterise the functional difference in these subparts of LC. They find that both DC and LC of both awake and anaesthetised mice appear to be more responsive to more complex sounds (amplitude-modulated noise) compared to pure tones and that under anesthesia the matrix of LC is more modulated by specific frequency and temporal content compared to the GABAergic modules in LC. However, while both LC and DC appear to have low-frequency preferences, this preference for low frequencies is more pronounced in DC. Furthermore, in both awake and anesthetized mice, somatosensory inputs are capable of driving responses on their own in the modules of LC, but very little (possibly not at all) in the matrix. However, bimodal interactions may be different under awake and anesthesia in LC, which warrants deeper investigation by the authors: They find, under anesthesia, more bimodal enhancement in modules of LC compared to the matrix of LC and bimodal suppression dominating the matrix of LC. In contrast, under awake conditions bimodal enhancement is almost exclusively found in the matrix of LC, and bimodal suppression dominates both matrix and modules of LC.

      The paper provides new information about how subregions with different inputs and neurochemical profiles in the higher-order auditory midbrain process auditory and multisensory information, and is useful for the auditory and multisensory circuits neuroscience community.

      Strengths:

      The major strength of this study is undoubtedly the fact that the authors for the first time provide optical access to a subcortical region (the lateral cortex of the inferior colliculus, i.e. the higher-order auditory midbrain) which we know (from previous work by the same group) has optically identifiable subdivisions with unique inputs and neurotransmitter release, and which plays a central role in auditory and multisensory processing. A description of basic auditory and multisensory properties of this structure is therefore very useful for understanding auditory processing and multisensory interactions in subcortical circuits.

      Weaknesses:

      I have divided my comments about weaknesses and improvements into major and minor comments. I believe all of them can be addressed by the authors to provide a clearer picture of their characterisation of the higher-order auditory midbrain.

      Major comment:

      (1) The multisensory interactions in LC in anaesthetised and awake preparations appear to be qualitatively different, though the authors claim they are similar (see also minor comment related to figure 10H for further explanation of what I mean). However, the findings in awake and anaesthetised conditions are summarised differently, the plotting of similar findings differs between the awake and anaesthetised figures, and different statistics are used for the same comparisons. This makes it very difficult to assess how multisensory integration in LC differs between awake and anaesthetised conditions. I suggest that the authors plot (and test with similar statistics) the summary plots in Figure 8 (i.e. Figure 8H-K) for awake data in Figure 10, and also make similar plots to Figures 10G-H for anaesthetised data. This will help the readers understand the differences between the effects of bimodal stimulation in awake and anaesthetised preparations, which, in their current form, look very distinct. In general, it is unclear to me why the awake data related to Figures 9 and 10 is presented in a different way for similar comparisons. Please streamline the presentation of results for anaesthetised and awake results to aid the comparison of results in different states, and explicitly state and discuss differences under awake and anaesthetised conditions.

      We thank the reviewer for the valuable suggestion. We only highlighted the similarities between the data obtained from anesthetized and awake preparations to indicate that the technique can be reproduced in awake animals for future assessment. Those similarities were identified by comparing modules vs matrix, or LC vs DC, within each experimental setup (awake or anesthetized). The statistics were therefore chosen separately for each setup based on the number of subjects (n) within each experimental preparation. However, we agree with the reviewer’s comment that there are differences between the anesthetized and awake data. To examine these differences, we ran the same statistics for Figure 5 (tonotopy of LC vs DC, anesthetized animals) and Figure 9 (tonotopy of LC vs DC, awake animals). In addition, we added a new figure after Figure 9 to separate the statistical analysis from the maps. Accordingly, Figures 4 and 5 (maps and analysis, respectively; anesthetized animals) now match Figures 9 and 10 (maps and analysis, respectively; awake animals). We did the same for Figure 7 (microprism imaging of the LC, anesthetized animals), Figure 8 (imaging of the LC from the dorsal surface, anesthetized animals), and Figure 11, formerly Figure 10 (microprism imaging of the LC, awake animals), to address the similarities and differences of the multisensory data between awake and anesthetized animals. We edited the text accordingly in the results and discussion sections.

      (2) The claim about the degree of tonotopy in LC and DC should be aided by summary statistics to understand the degree to which tonotopy is actually present. For example, the authors could demonstrate that it is not possible/or is possible to predict above chance a cell's BF based on the group of other cells in the area. This will help understand to what degree the tonotopy is topographic vs salt and pepper. Also, it would be good to know if the gaba'ergic modules have a higher propensity of particular BFs or tonotopic structure compared to matrix regions in LC, and also if general tuning properties (e.g. tuning width) are different from the matrix cells and the ones in DC.

      Thank you for the reviewer’s suggestion. We have examined the tonotopy of LC and DC using two regression models (linear and quadratic polynomial) between the BFs of the cells and their location on the anatomical axis. Tonotopy is therefore indicated by a significant regression fit with a high R² between the BFs of the cells and their location within each structure. For the DC, there was a significant regression fit between the BFs of the cells and their locations along the rostromedial-to-caudolateral axis. Additionally, the R² of the quadratic polynomial fit was higher than that of the linear fit, which indicates a nonlinear distribution of cells based on their BFs, consistent with the presence of high-low-high tuning over the DC surface. Given that the microprism cannot image the whole area of the LC, and that it images a slightly different area in each animal, it was very difficult to obtain a consistent map for the LC or to draw a solid conclusion about LC tonotopy. However, we have examined the regression fit between the BFs of cells and their location along the four main anatomical axes of the field of view obtained from each animal (dorsal to ventral, rostral to caudal, dorsocaudal to ventrorostral, and dorsorostral to ventrocaudal). Unlike the DC, the LC imaged via microprism showed a lower R² for both linear and quadratic regression, mostly in the dorsoventral axis. We show the fitting curves of these regressions in Figure 4-figure supplement 1 (anesthetized data) and Figure 9-figure supplement 1 (awake data). Despite the inconsistent tonotopy of the LC imaged via microprism, the modules were found to have a higher median BF (10 kHz) than the matrix (7.1 kHz), which was consistent across the anesthetized and awake animals. We have added these results in the corresponding place in the results section (lines 193-197 and 361-364).
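      The comparison of linear and quadratic fits described above can be sketched as follows. This is a self-contained illustration on synthetic data, not the authors' analysis code: the normalized positions, the high-low-high BF profile, and the jitter term are invented, and the function names are hypothetical.

```python
import math

def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [c0, c1, ..., c_degree], lowest power first.
    """
    m = degree + 1
    # Normal equations (X^T X) c = X^T y, with X[i][j] = x_i ** j.
    a = [[sum(xi ** (r + c) for xi in x) for c in range(m)] for r in range(m)]
    b = [sum(yi * xi ** r for xi, yi in zip(x, y)) for r in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for r in reversed(range(m)):
        known = sum(a[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (b[r] - known) / a[r][r]
    return coeffs

def r_squared(x, y, coeffs):
    """Coefficient of determination of the fitted polynomial."""
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    ss_res = sum((yi - sum(c * xi ** p for p, c in enumerate(coeffs))) ** 2
                 for xi, yi in zip(x, y))
    return 1.0 - ss_res / ss_tot

# Synthetic cells: normalized position along one axis and a BF (in octaves,
# arbitrary reference) that is high at both ends and low in the middle.
positions = [i / 40 for i in range(41)]
bfs_octaves = [3.0 * (2.0 * p - 1.0) ** 2 + 0.2 * math.sin(37 * i)
               for i, p in enumerate(positions)]

r2_linear = r_squared(positions, bfs_octaves, fit_polynomial(positions, bfs_octaves, 1))
r2_quadratic = r_squared(positions, bfs_octaves, fit_polynomial(positions, bfs_octaves, 2))
print(f"linear R2: {r2_linear:.3f}, quadratic R2: {r2_quadratic:.3f}")
```

      For a symmetric high-low-high profile such as this one, the linear fit explains almost none of the variance while the quadratic fit explains most of it. Real data would also call for a significance test of each fit, which this sketch omits.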
      We have examined the tuning width using the binarized receptive field sum (RFS) method, in which each neuron is given a value of 1 if it responds to a single frequency (narrow RF), and this value increases if the neuron responds to more neighboring frequencies (wide RF). We did this calculation across all the sound levels. Both DC and LC of the anesthetized animals had a higher RFS mean and median than those of awake animals, given that ketamine is known to broaden the RF. However, in both preparations (anesthetized and awake), the DC had a higher RFS mean than the LC, which could be consistent with the finding that the DC had a relatively lower SMI than the LC. To show these new data, we made a new Figure 10-figure supplement 1, and we edited the text accordingly [lines 372-379 & 527-531].
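      A binarized receptive field sum of the kind described above can be sketched as follows. The response matrices, the fixed threshold used for binarization, and the function name are assumptions for illustration; the exact response criterion used by the authors is not specified in this response.

```python
def receptive_field_sum(responses, threshold):
    """Count frequency/level combinations with a supra-threshold response.

    responses: one row per sound level, one column per tone frequency.
    threshold: hypothetical cutoff standing in for a significance test.
    """
    return sum(1 for row in responses for value in row if value > threshold)

# Invented response amplitudes (rows: 3 sound levels, columns: 5 frequencies).
narrow_cell = [
    [0.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.1, 0.2, 0.0, 0.0],
    [0.0, 0.1, 0.9, 0.1, 0.0],
]
broad_cell = [
    [0.0, 0.1, 0.6, 0.1, 0.0],
    [0.1, 0.7, 0.9, 0.6, 0.0],
    [0.6, 0.8, 1.0, 0.9, 0.7],
]

narrow_rfs = receptive_field_sum(narrow_cell, threshold=0.5)
broad_rfs = receptive_field_sum(broad_cell, threshold=0.5)
print(narrow_rfs, broad_rfs)  # prints "1 9"
```

      A cell that responds to a single frequency at a single level scores 1, and the score grows as the receptive field widens across frequencies and levels.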

      (3) Throughout the paper more information needs to be given about the number of cells, sessions, and animals used in each panel, and what level was used as n in the statistical tests. For example, in Figure 4 I cannot tell if the 4 mice shown for LC imaging are the only 4 mice imaged, and used in the Figure 4E summary, or if these are just examples. In general, throughout the paper, it is currently not possible to assess how many cells, sessions, and animals the data shown come from.

      Thank you for the reviewer’s comment. We apologize for not including this information. We have added the sample size (number of cells or number of animals) for every test outcome. To preserve the flow of the text, we placed the details of the statistical tests in the figure legends.

      (4) Throughout the paper, to better understand the summary maps and plots, it would be helpful to see example responses of the different components investigated. For example, given that module cells appear to have more auditory offset responses, it would be helpful to see what the bimodal, sound-only, and somatosensory responses look like in example cells in LC modules. This also goes for just general examples of what the responses to auditory and somatosensory inputs look like in DC vs LC. In general example plots of what the responses actually look like are needed to better understand what is being summarised.

      Thank you for the reviewer’s comment and suggestion. We modified Figure 6 and the text accordingly to include all the significant examples of cells discussed throughout the work.

      Reviewer #2 (Public Review):

      Summary:

      The study describes differences in responses to sounds and whisker deflections as well as combinations of these stimuli in different neurochemically defined subsections of the lateral and dorsal cortex of the inferior colliculus in anesthetised and awake mice.

      Strengths:

      The main achievement of the work lies in obtaining the data in the first place as this required establishing and refining a challenging surgical procedure to insert a prism that enabled the authors to visualise the lateral surface of the inferior colliculus. Using this approach, the authors were then able to provide the first functional comparison of neural responses inside and outside of the GABA-rich modules of the lateral cortex. The strongest and most interesting aspects of the results, in my opinion, concern the interactions of auditory and somatosensory stimulation. For instance, the authors find that a) somatosensory-responses are strongest inside the modules and b) somatosensory-auditory suppression is stronger in the matrix than in the modules. This suggests that, while somatosensory inputs preferentially target the GABA-rich modules, they do not exclusively target GABAergic neurons within the modules (given that the authors record exclusively from excitatory neurons we wouldn't expect to see somatosensory responses if they targeted exclusively GABAergic neurons), and that the GABAergic neurons of the modules (consistent with previous work) preferentially impact neurons outside the modules, i.e. via long-range connections.

      Weaknesses:

      While the findings are of interest to the subfield they have only rather limited implications beyond it. The writing is not as precise as it could be. Consequently, the manuscript is unclear in some places. For instance, the text is somewhat confusing as to whether there is a difference in the pattern (modules vs matrix) of somatosensory-auditory suppression between anesthetized and awake animals. Furthermore, there are aspects of the results which are potentially very interesting but have not been explored. For example, there is a remarkable degree of clustering of response properties evident in many of the maps included in the paper. Taking Figure 7 for instance, rather than a salt and pepper organization we can see auditory responsive neurons clumped together and non-responsive neurons clumped together and in the panels below we can see off-responsive neurons forming clusters (although it is not easy to make out the magenta dots against the black background). This degree of clustering seems much stronger than expected and deserves further attention.

      Thank you for the reviewer’s comment. We apologize if some areas in the manuscript were imprecisely written. For the anesthetized and awake data, we had emphasized only the similarities between the two setups to show that the microprism can be used in awake animals for future assessment. To highlight the differences between anesthetized and awake animals, we have now run uniform statistics for all the data collected from both setups. Accordingly, we have edited Figures 4 and 5 (tonotopy, anesthetized) to match Figure 9 and new Figure 10 (tonotopy, awake). Also, we edited Figures 7 and 8 (multisensory, anesthetized) to match Figure 11, formerly Figure 10 (multisensory, awake). We edited the text accordingly in the results section and discussed the possible differences between anesthetized and awake data in the discussion section [lines 521-553].

      We agree with the reviewer’s comment that the cells were topographically clustered based on their responses. Some of these clusters include the somatosensory responsive cells, which were located mostly in the modules (Figures 7D and 8E). Also, the auditory responsive cells with offset responses were clustered mostly in the modules (Figures 7C and 8F). Accordingly, we have edited the text to emphasize this finding.

      We also noticed that some cells responsive to the tested stimuli were surrounded by nonresponsive cells. Figures 7 and 11 (old Figure 10) show only the responses of the cells to auditory stimulation (unmodulated broadband noise at 80 dB) and somatosensory stimulation (whisker deflection); by comparing the responses of the cells to different stimuli, we found that some cells that did not respond to these specific stimuli were responsive to pure tones of different frequencies and amplitudes. As an indicator of the cells' viability, we additionally examined the spontaneous activity of the nonresponsive cells across different data sets. We note that spontaneous activity was rare for all cells, even among the cells responsive to sound or somatosensory stimulation. This finding could be related to the possibility that 2P imaging of calcium signals may not be sensitive enough to track spontaneous activity that may originate from single spikes. However, in some data sets, we found that cells that did not respond to any tested stimulus showed spontaneous activity when no stimulation was given, indicating the viability of those cells. We have addressed the activity of the nonresponsive cells in the text, along with a new Figure 11-figure supplement 1.

      We changed the magenta into a green color to be suitable for the dark background. Also, we have completely changed the color palette of all of our images to be suitable for color-blind readers as suggested by reviewer 1.

      Reviewer #3 (Public Review):

      The lateral cortex of the inferior colliculus (LC) is a region of the auditory midbrain noted for receiving both auditory and somatosensory input. Anatomical studies have established that somatosensory input primarily impinges on "modular" regions of the LC, which are characterized by high densities of GABAergic neurons, while auditory input is more prominent in the "matrix" regions that surround the modules. However, how auditory and somatosensory stimuli shape activity, both individually and when combined, in the modular and matrix regions of the LC has remained unknown.

      The major obstacle to progress has been the location of the LC on the lateral edge of the inferior colliculus where it cannot be accessed in vivo using conventional imaging approaches. The authors overcame this obstacle by developing methods to implant a microprism adjacent to the LC. By redirecting light from the lateral surface of the LC to the dorsal surface of the microprism, the microprism enabled two-photon imaging of the LC via a dorsal approach in anesthetized and awake mice. Then, by crossing GAD-67-GFP mice with Thy1-jRGECO1a mice, the authors showed that they could identify LC modules in vivo using GFP fluorescence while assessing neural responses to auditory, somatosensory, and multimodal stimuli using Ca2+ imaging. Critically, the authors also validated the accuracy of the microprism technique by directly comparing results obtained with a microprism to data collected using conventional imaging of the dorsal-most LC modules, which are directly visible on the dorsal IC surface, finding good correlations between the approaches.

      Through this innovative combination of techniques, the authors found that matrix neurons were more sensitive to auditory stimuli than modular neurons, modular neurons were more sensitive to somatosensory stimuli than matrix neurons, and bimodal, auditory-somatosensory stimuli were more likely to suppress activity in matrix neurons and enhance activity in modular neurons. Interestingly, despite their higher sensitivity to somatosensory stimuli than matrix neurons, modular neurons in the anesthetized prep were far more responsive to auditory stimuli than somatosensory stimuli (albeit with a tendency to have offset responses to sounds). This suggests that modular neurons should not be thought of as primarily representing somatosensory input, but rather as being more prone to having their auditory responses modified by somatosensory input. However, this trend was reversed in the awake prep, where modular neurons became more responsive to somatosensory stimuli than auditory stimuli. Thus, to this reviewer, the most intriguing result of the present study is the dramatic extent to which neural responses in the LC changed in the awake preparation. While this is not entirely unexpected, the magnitude and stimulus specificity of the changes caused by anesthesia highlight the extent to which higher-level sensory processing is affected by anesthesia and strongly suggest that future studies of LC function should be conducted in awake animals.

      Together, the results of this study expand our understanding of the functional roles of matrix and module neurons by showing that responses in LC subregions are more complicated than might have been expected based on anatomy alone. The development of the microprism technique for imaging the LC will be a boon to the field, finally enabling much-needed studies of LC function in vivo. The experiments were well-designed and well-controlled, and the limitations of two-photon imaging for tracking neural activity are acknowledged. Appropriate statistical tests were used. There are three main issues the authors should address, but otherwise, this study represents an important advance in the field.

      (1) Please address whether the Thy1 mouse evenly expresses jRGECO1a in all LC neurons. It is known that these mice express jRGECO1a in subsets of neurons in the cerebral cortex, and similar biases in the LC could have biased the results here.

      Thank you for the reviewer’s comment. In the work published by Dana et al., the expression of jRGECO1a in all Thy1 mouse lines was determined by the brightness of jRGECO1a in the soma. Given that some cells do not show a detectable level of jRGECO1a fluorescence until activated, the difference in expression shown in different brain regions could be related to the level of neuronal activity at the time of sample processing and not to the expression level of the indicator itself. To the best of our knowledge, there is no antibody for jRGECO1a that can be used to detect the expression level of the indicator regardless of neuronal activity. To test the hypothesis that DC and LC have different levels of jRGECO1a, we examined the expression levels of jRGECO1a after we perfused the mice with high-potassium saline to elicit a general neuronal depolarization in the whole brain. We then immunostained against NeuN (the neuronal marker) to quantify the percentage of neurons expressing jRGECO1a relative to the total number of neurons (indicated by NeuN). To have a fair comparison, we restricted our analysis to the areas imaged by 2P, as some regions, such as the deep ventral regions of the LC, were not accessible by microprism. There is a similar % of cells expressing jRGECO1a in DC and LC. As expected, the neurons expressing jRGECO1a were only non-GABAergic cells. We addressed these findings in the new Figure 3-figure supplement 1 as well as the corresponding text in the results [lines 178-184] and methods sections [lines 878-892].

      (2) I suggest adding a paragraph or two to the discussion to address the large differences observed between the anesthetized and awake preparations. For example, somatosensory responses in the modules increased dramatically from 14.4% in the anesthetized prep to 63.6% in the awake prep. At the same time, auditory responses decreased from 52.1% to 22%. (Numbers for anesthetized prep include auditory responses and somatosensory + auditory responses.). In addition, the tonotopy of the DC shifted in the awake condition. These are intriguing changes that are not entirely expected from the switch to an awake prep and therefore warrant discussion.

      Thank you for the reviewer’s comment. To determine whether differences exist between anesthetized and awake data, we have now used the same statistics and edited Figures 4, 5, 7, 8, 9, and 10 as well as added a new Figure 11. Accordingly, we have edited the results section and added a paragraph addressing the possible differences between the two preparations in the Discussion section [lines 521-553].

      (3) For somatosensory stimuli, the authors used whisker deflection, but based on the anatomy, this is presumably not the only somatosensory stimulus that affects LC. The authors could help readers place the present results in a broader context by discussing how other somatosensory stimuli might come into play. For example, might a larger percentage of modular neurons be activated by somatosensory stimuli if more diverse stimuli were used?

      We agree with the reviewer’s point. Indeed, the modules are receiving different inputs from different somatosensory sources such as somatosensory cortex and dorsal column nuclei, which could indicate that the activity of the cells in the modular areas could be evoked by different types of somatosensory stimulations, which is an open area for future studies. We have discussed this point in the revised Discussion section [lines 516-520].

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Minor comments:

      (1) Figure 3H: The lateral surface seems quite damaged by the prism. An example slice of the imaging area of each mouse would help the reader better understand the extent of damage the prism leaves in the area of interest.

      Thank you for the reviewer’s comment. We have already included such images in Figures 4A, 7A, and 9A to present the field of view of all prism experiments. However, we need to clarify the point of tissue damage. The insertion of the microprism may be associated with some tissue damage as a result of making the pocket for the microprism to be inserted, but it is not possible to get neuronal signals from a damaged field of view. Therefore, we do not believe that there is tissue damage to the parts of the LC imaged by microprism. However, there may be some areas where the microprism is not in direct contact with the LC surface. These areas are located mostly in the periphery of the field of view, and they are completely black as they are out of focus (e.g., the left side of Figure 3B). The right side of Figure 3B as well as Figure 3A have some black areas, which represent the vasculature, where there are no red signals because of the lack of jRGECO1a expression in those areas.

      (2) In relation to the data shown in Figure 4E it is claimed that LC is tuned to higher frequencies (lines 195-196). However, the majority of cells appear to be tuned to frequencies below 14kHz (with a median of 7.5 kHz), which is quite low for the mouse. I assume that the authors mean frequencies that are relatively higher than the DC, but it is worth mentioning in the text that the BFs found in the LC are quite low-frequency responses for the mouse.

      Thank you for the reviewer’s comment, which we agree with. We edited this part by acknowledging that around 50% of the LC cells had a low-frequency bias to 5 and 7.1 kHz. Then we mentioned that most of the LC cells are tuned to relatively higher frequencies than those of the DC [lines 215-218].

      (3) Figure 5A-C: Is it the tone-responsive cells plus an additional ~22% of cells that respond to AM, or are there also cells that respond to tones that do not respond to AM. Please break down to which degree the tone and AM responsive cells are overlapping.

      Thank you for the reviewer’s comment and suggestion. We broke down the responsive cells into cells responsive only to pure tone (tone selective cells or Tone-sel) or to only AM-noise (noise selective cells or Noise-sel) as well as cells responding to both sounds (nonselective cells or Non-sel). We examined the fractions of these categories of cells in both LC and DC within all responsive neurons. Accordingly, we have edited Figure 5A-C as well as the text [lines 229-243].

      (4) Figure 5D. It is unclear to me how a cell is classified as SMI or TMI responsive after computing the SMI or TMI for each cell. What statistic was used to determine if the cell was responsive or not?

      Thank you for the reviewer’s comment. We do apologize for the confusion caused by Figures 5D and E. These figures do not show the values of SMI or TMI, respectively. Rather, the figures show the percentage of the spectrally or temporally modulated cells, respectively. At each sound level, the cells were categorized into two main types. The spectrally modulated cells are those responsive to pure tones or unmodulated noise, so they can detect the spectral features of the sound (old Figure 5D or new Figure 5E). The temporally modulated cells are those responsive to AM-noise, so they can detect the temporal features of the sound of complex spectra like the broadband noise (old Figure 5E or new Figure 5F). To clear this confusion, we removed the words SMI and TMI from the figures, and then we renamed the x-axis label into “% of spectrally modulated cells” and “% of temporally modulated cells” for Figures 5D (new 5E) and E (new 5F), respectively.

      (5) Figure 5 D, E: Is the decrease in SMI and TMI modulated cells in the modules a result of simply lower sensitivity to sounds (i.e. higher response thresholds)? If a cell responds to neither tone, AM, or noise it will have a low SMI and TMI index. If this is the case that affects the interpretation, as it is then not a decrease in sensitivity to spectral or temporal modulation, but instead a difference in overall sound sensitivity.

      Thank you for the reviewer’s comment. We apologize for the confusion about Figures 5D and E, which did not show the SMI and TMI values. Rather, they show the percentage of spectrally or temporally modulated cells, respectively, as explained in our previous response. Therefore, Figure 5D shows the percentage of cells that can detect the spectral features of sound, while Figure 5E shows the percentage of cells that can detect the temporal features of sounds of complex spectra like broadband noise. Accordingly, Figures 5D and E show the sensitivity to different features of sound and not the overall sound sensitivity.

      (6) Figure 7 and 8: What is the false positive rate expected of the responsive cells using the correlation cell flagging criteria? Especially given that the fraction of cells responsive to somatosensory stimulation in LC (matrix) is 0.88% and 1.3% in DC, it is important to know what the expected false positive rate is in order to be able to state that there are actually somatosensory responses there or if this is what you would expect from false positives given the inclusion test used. Please provide an estimate of the false positive rate given your inclusion test and show that the rate found is statistically significantly above that level - and show this rate with a line in Figure 7 H, I.

      Thank you for the reviewer’s comment. To test the efficiency of the correlation method in determining the responsive cells, we initially ran an ROC curve comparing the automated method to a blinded human interpretation. The AUC of the ROC curve was 0.88. This high AUC value indicates that the correlation method tends to rank a randomly chosen responsive cell above a randomly chosen nonresponsive cell. At the correlation coefficient of 0.4, which was the cutoff value to determine the responsive cells for somatosensory stimulation, the specificity was 87%, the sensitivity was 72%, the positive predictive value was 73%, and the negative predictive value was 86%. Although the above percentages indicate the efficiency of the correlation method, we excluded all the falsely flagged responsive cells from the analysis. Therefore, the fractions of cells in the graphs are the true responsive cells with no contamination from non-responsive cells. We also modified Figures 7H and I to match the other data sets obtained from awake animals. Therefore, Figures 7H and I no longer show the average of the responsive cells. Instead, they show the % of different fractions of responsive cells within each cellular motif (modules and matrix). Accordingly, we believe that there is no need to include a rate line on the graph. We added the section describing the validation to the methods section [lines 808-815].
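
      The validation figures quoted above (sensitivity, specificity, positive and negative predictive value) all follow from a 2x2 comparison between the automated flag and the blinded human label. The following is a minimal sketch of that arithmetic, not the authors' analysis code: the function names, the `>=` comparison at the cutoff, and the example scores and labels are all hypothetical and serve only as illustration.

```python
def confusion_counts(scores, labels, cutoff=0.4):
    """Count TP, FP, TN, FN when cells with score >= cutoff are flagged responsive.

    scores: correlation coefficient per cell (hypothetical values).
    labels: True if the human rater judged the cell responsive.
    The >= comparison is an assumption; the source only names 0.4 as the cutoff.
    """
    tp = fp = tn = fn = 0
    for score, is_responsive in zip(scores, labels):
        flagged = score >= cutoff
        if flagged and is_responsive:
            tp += 1
        elif flagged and not is_responsive:
            fp += 1
        elif not flagged and is_responsive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classifier_metrics(tp, fp, tn, fn):
    """Standard definitions of the metrics reported in the response."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "false_positive_rate": fp / (fp + tn),
    }


# Made-up example data, not taken from the study.
example_scores = [0.9, 0.7, 0.5, 0.3, 0.45, 0.2, 0.1, 0.35]
example_labels = [True, True, True, True, False, False, False, False]

counts = confusion_counts(example_scores, example_labels)
metrics = classifier_metrics(*counts)
```

      By definition the false positive rate equals 1 minus the specificity, so it can be read off the same table without a separate calculation.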

      (7) Figure 7: Please clarify what is meant by a cell responding to 'both responding to somatosensory and auditory stimulation'. Does it mean that the cell has responses to both auditory and somatosensory stimulation when presented individually or if it responds to both presented together? If it is the former, I don't understand how the number to both can be higher than the number of somatosensory alone (as both requires it also to respond to somatosensory alone). If it is the latter (combined auditory and somatosensory) then it seems that somatosensory inputs remove the responsiveness of most cells that were otherwise responsive to auditory alone (e.g. in the module while 42% respond to sound alone, combined stimulation would leave only 10% of cells responsive). Please clarify what exactly the authors are plotting and stating here.

      Thank you for the reviewer’s comment. The responsive cells in Figure 7 are divided into three categories. Each category has a completely different group of cells. The first category is for the cells responding only to auditory stimulation (auditory-selective cells or Aud-sel). The second category is for the cells that respond only to somatosensory stimulation (somatosensory selective cells or Som-sel). The third category is for the cells that respond to both auditory and somatosensory stimulations when both stimulations are presented individually (auditory/somatosensory nonselective cells or Aud/Som-nonsel). Accordingly, the number of cells may be different across all these categories. We have clarified this part in the text [lines 299-303]. We have modified Figures 7, 8, and 11 (old Figure 10) to match the data from anesthetized and awake animals, so Figures 7H and I now show the collective % of the cells from all animals within modules vs matrix.
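
      Because the three categories are mutually exclusive, the count of Aud/Som-nonsel cells is not bounded by the count of Som-sel cells. A minimal sketch of the assignment rule described above, with hypothetical function names and made-up per-cell flags, could look like this:

```python
from collections import Counter


def classify_cell(responds_auditory, responds_somatosensory):
    """Assign a cell to exactly one category from its two unimodal responses.

    Both flags refer to stimuli presented individually, as in the response above.
    """
    if responds_auditory and responds_somatosensory:
        return "Aud/Som-nonsel"
    if responds_auditory:
        return "Aud-sel"
    if responds_somatosensory:
        return "Som-sel"
    return "nonresponsive"


# Made-up flags (auditory, somatosensory) for five illustrative cells.
example_cells = [
    (True, False),
    (True, True),
    (False, True),
    (False, False),
    (True, True),
]

category_counts = Counter(classify_cell(a, s) for a, s in example_cells)
```

      In this toy example two cells fall into Aud/Som-nonsel and only one into Som-sel, which is possible because a cell counted in one group is never counted in another.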

      (8) Why are the inferential statistics used in Figure 9F (chi-square test) and Figure 5A-C (t-test) when it tests the same thing (the only difference is one is anaesthetised data and the other awake)? Indeed, all Figure 9 and 10 (awake data figures) plots use chi-square tests to test differences in percentages instead of t-tests used in earlier (anaesthetised data figures) plots to test differences in percentages between groups. Please clarify the reason for this change in statistics used for similar comparisons.

      Thank you for the reviewer’s comment. Imaging the LC via microprism from awake animals confirmed the ability to run this technique with no interference to the ambulatory functions of the animals. Therefore, the main goal was to highlight the similarities between the data obtained from awake and anesthetized setups by highlighting the comparison between the LC and DC or between modules and matrix within each preparation (anesthetized vs awake). Accordingly, the statistics used to run these comparisons were chosen based on the number of the tested animals at each setup (7 anesthetized animals and 3 awake animals for prism insertion). The low number of animals used for awake data made us use the number of cells collectively from all animals instead of the number of animals, so we used the Chi-square test to examine the differences in percentages.
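
      Pooling cells across animals and comparing percentages between two groups amounts to a chi-square test on a 2x2 table of cell counts. The sketch below shows the Pearson form of that test using only the standard library. It is an illustration under stated assumptions: the source does not say whether a continuity correction was applied (none is used here), and the counts are invented.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Rows are groups (e.g. modules, matrix); columns are responsive and
    nonresponsive cell counts pooled across animals.
    Returns the statistic and the p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts: 30 of 100 module cells vs 50 of 100 matrix cells responsive.
chi2_stat, p_val = chi_square_2x2(30, 70, 50, 50)
```

      Treating each cell as an independent observation is the trade-off of this approach: it gives usable sample sizes with few animals, at the cost of ignoring between-animal variability.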

      (9) Figure 10H: The main text describes the results shown here as similar to what was seen in anaesthetised animals. But it looks to me like the results in awake animals are qualitatively different from the multisensory interaction seen in anaesthetised animals. In anaesthetised animals the authors find that there is a higher chance of auditory responses being enhanced by somatosensory inputs when cells are in the modules compared to in the matrix. However, in awake data, this relationship is flipped, with more bimodal enhancement found in the matrix compared to the modules. Furthermore, almost all cells in the modules are suppressed by combined somatosensory input which looks like it is different from what is found in anaesthestised mice and what is described in the discussion: 'we observed that combined auditory-somatosensory stimulation generally suppressed neural responses to auditory stimuli and that this suppression was most prominent in the LC matrix'.

      Thank you for the reviewer’s comment. Our statement was meant to show how the data obtained from awake and anesthetized animals were generally similar. However, we agree that the statement may not be suitable due to the possible differences between awake and anesthetized animals. To address a fair comparison between the anesthetized and awake preparations, we ran similar statistics and graphs for Figures 7, 8, and 11 (old Figure 10). Given that the areas occupied by modules and matrix are different across animals due to the irregular shape of the modules, we chose to run a chi-square test for all the data to quantify the collective % of responding cells within modules vs matrix from all tested animals for each experimental setup (anesthetized vs awake). The anesthetized and awake animals similarly showed that modules and matrix had higher fractions of auditory responsive cells. However, matrix had more cells responding to auditory stimulations than modules, while modules had more cells responding to somatosensory stimulation than matrix. In contrast, while the anesthetized animals showed higher fractions of offset auditory-responsive cells, which were mostly clustered in the modules, the offset auditory-responsive cells were very rare in awake animals (6 cells/one animal).

      Based on the fractions of cells with suppressed or enhanced auditory response induced by bimodal stimulation, the data obtained from anesthetized and awake animals showed that the auditory response in the matrix was suppressed more than enhanced by bimodal stimulation. In contrast, modules had different profiles across the experimental setups and locations. For instance, the modules imaged via microprism in the anesthetized and awake animals showed suppressed more than enhanced auditory responses, but modules imaged from the dorsal surface in anesthetized animals showed enhanced more than suppressed auditory responses. Additionally, modules had less suppressed and more enhanced auditory responses compared to matrix in the anesthetized animals regardless of the location of the modules (microprism or dorsal surface). Yet, modules from awake animals had more suppressed and less enhanced auditory responses compared to matrix. We have addressed these differences in the results and discussion section.

      Additional minor comments that I think the authors could use to aid their manuscript clarity:

      (1) The figure colour selection - especially in Figures 7 and 8 - is really hard to tell apart. Please choose more distinct colours, and a colour scheme that is appropriate for colour blind readers.

      Thank you for the reviewer’s suggestion. We have noticed that the magenta color assigned for the cells with offset responses was very difficult to distinguish from the black background. We have changed the magenta color to green to be different from the color of other cells. Using Photoshop, we chose a color scheme that is suitable for color-blind readers in all our maps.

      (2) The sentence in lines 331-334 should be rephrased for clarity.

      Thank you for the reviewer’s suggestion. We have rephrased the statement for clarity [lines 364-371].

      Reviewer #2 (Recommendations For The Authors):

      As mentioned in the public review the strong clustering evident in some of the maps (some of which may be related to module/matrix differences but certainly not all of it) seems worth scrutinizing further. Would we expect such a strong spatial segregation of auditory responsive and non-responsive neurons? Would we expect response properties (e.g. off-responsiveness) other than frequency tuning to show evidence of a topographic arrangement in the IC? In addressing this it would, of course, be important to rule out that this clustering is not down to some trivial experimental variables and truly reflects functional organization. For instance, are the patches of non-responsive neurons found in parts of the field of view with poor visibility, poor labelling, etc which may explain why it is difficult to pick up responses there? Are the neurons in non-responsive areas otherwise active (i.e. do they show spontaneous activity) or could they be 'dead'? Could the way neuropil signals are dealt with play a role here (it is weighted by 0.4 which strikes me as quite low)? In relation to this, I am also wondering to what extent the extreme overrepresentation (Figure 4) of neurons with a BF of 5kHz (some of this is, of course, down to the fact that the lower end of the frequency range was 5kHz and that the step size was 0.5 octaves), especially in the DC, is to be interpreted.

      Thank you for the reviewer’s comment. Before analysis, the ROIs of all cells were set around the cell bodies using the jRGECO1a signals as a reference, so all cells (responsive and nonresponsive) were collected from areas of good visibility of jRGECO1a signals. In other words, no cells were collected from regions having poor jRGECO1a signals. In Figures 7, 8, and 11 (old Figure 10), the cells responded either only to unmodulated broadband noise at 80 dB as an auditory stimulus or to whisker deflection with specific speed and power as a somatosensory stimulus. Given that the two stimuli above had specific parameters, the remaining nonresponsive cells may respond to auditory or somatosensory stimulations with other features. For instance, some cells that did not respond to the unmodulated broadband noise responded to pure tones with different amplitudes and frequencies or to AM-noise with different amplitudes and modulation frequencies. Also, these nonresponsive cells may not respond to any of our tested stimuli and may respond to other sensory stimulations. Some of the nonresponsive cells showed spontaneous activity when no stimulations were presented. However, we cannot rule out the possibility that some of these nonresponsive cells may not be viable. We have addressed the clustering properties in the revised version of the manuscript in the corresponding parts of the results and discussion sections. We have added a new supplementary figure (Figure 11-figure supplement 1) to show how the cells that did not respond to the unmodulated noise may respond to other types of sound and to show the spontaneous activity of some nonresponsive cells.

      For the neuropil, previous reports used a contamination factor (r) in the range of 0.3-0.7 (we referenced these studies in the methods section [line 776]), depending on the tissue or cells imaged, the vasculature, and the objective used for imaging. Therefore, we optimized the contamination factor (r) to be 0.4 through a preliminary analysis based on the tissue we imaged (LC) and the objective used (16x with NA = 0.8 and a working distance of 3 mm).
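
      For readers unfamiliar with the contamination factor, the sketch below shows the subtraction form commonly used in two-photon calcium imaging, in which the neuropil trace is scaled by r and removed from the ROI trace. That the authors used exactly this form is an assumption; the response only states that the neuropil was weighted by r = 0.4. The function name and the traces are hypothetical.

```python
def subtract_neuropil(roi_trace, neuropil_trace, r=0.4):
    """Return roi - r * neuropil, sample by sample.

    roi_trace: raw fluorescence of the cell body ROI.
    neuropil_trace: fluorescence of the surrounding neuropil region.
    r: contamination factor (0.4 is the value quoted in the response).
    """
    if len(roi_trace) != len(neuropil_trace):
        raise ValueError("traces must have the same number of samples")
    return [f - r * n for f, n in zip(roi_trace, neuropil_trace)]


# Made-up fluorescence values in arbitrary units.
example_roi = [10.0, 12.0, 20.0]
example_neuropil = [5.0, 5.0, 10.0]
corrected = subtract_neuropil(example_roi, example_neuropil)
```

      A larger r removes more of the surrounding signal, so a value at the low end of the 0.3-0.7 range, as the reviewer notes, leaves more of any shared neuropil signal in each cell's trace.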

      We agree that there is an overrepresentation of 5 kHz as the best tuning frequency for DC cells. The previous report (A. B. Wong & Borst, 2019) showed a large zone of the DC where cells were tuned to (2-8 kHz). Given that 5kHz was the lowest tested frequency in our experiment, we think that the low-frequency bias of the DC surface is consistent between studies. This finding also could be supported by the electrophysiology data obtained by spanning the recording electrodes through the IC tissue along the dorsoventral axis. In those experiments, the cells were tuned to lower frequencies at the dorsal surface of the IC.

      We have changed the magenta-colored cells to green, so it will be easier to identify the cells. As requested by another reviewer, we changed the color palettes of some images and cellular maps to be suitable for color-blind readers.

      The manuscript would benefit from more precise language in a number of places, especially in the results section.

      Line 220/221, for instance: "... a significant fraction of cells that did not respond to pure tones did respond to AM-noise" Strictly speaking, this sentence suggests that you considered here only the subset of neurons that did not respond to pure tones and then ran a test on that subset. The test that was done seems to suggest though that the authors tested whether the percentage of responsive cells was greater for pure tones or for AM noise.

      Thank you for the reviewer’s comment. We do apologize for the confusion. In the revised manuscript, we categorized the cells according to their response into cells responding to pure tone only (tone-selective cells or Tone-sel), to AM-noise only (noise-selective cells or Noise-sel), and to both pure tone and AM-noise (nonselective cells or Non-sel). We have modified Figure 5 accordingly. We did the same for the data obtained from awake animals and showed that in a new figure to easily match the analysis done for the anesthetized animals.

      Please refer to the figure panels in the text in consecutive order. 2B, for instance, is mentioned after 2H.

      Thank you for the reviewer’s comment. Throughout the paper, we kept the figure panels in consecutive order within each figure so that they flow smoothly with the text. Figure 2 is the only exception, for a good reason. Figure 2 is a complex figure that includes many panels to show a parallel comparison between LC imaged via microprism and DC through single-photon images, two-photon images, validating laser lesioning, and histology. Accordingly, we moved between many panels of the figure to efficiently highlight the aspects of this comparison. We prefer to keep Figure 2 as one figure in its current format to show this parallel comparison between LC and DC.

      The legend for Figure 2 could be clearer. For instance, there are two descriptions for panel D. Line 1009: "(C-E)" [i.e. C, D, E] and line 1010: "(D and F)".

      Thank you for the reviewer’s comment. It should be C and E, not C-E. We have fixed the mistake [line 1224].

      Line 275: What does 'with no preference' mean?

      Thank you for the reviewer’s comment. We do apologize for the confusion. There are three categories of cells. Some cells respond only to auditory stimulation, while others respond to only somatosensory stimulation. However, there is another group of cells that respond nonselectively to auditory and somatosensory stimulations or Aud/Som-nonsel cells. We edited the sentence to be clearer [lines 303-304].

      Line 281 (and other places): What does 'normalized against modules' mean?

      Thank you for the reviewer’s comment. This normalization was done by dividing the number of responsive cells of the same response type in the matrix by that in the modules. Therefore, the value taken by modules was always 1 and the value taken by the matrix is something around 1. Accordingly, the value for matrix could be > 1 if matrix had more cells than modules. In contrast, the value of matrix would be < 1 if matrix had fewer cells than modules. In the revised version, we used this normalization method to make the revised Figures 5C and 10C to describe the cell fractions responding to pure tone only, AM-noise only, or to both stimuli in the matrix vs modules. 
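
      The normalization described above is a single division per response type. A minimal sketch, with a hypothetical function name and invented cell counts, is:

```python
def normalize_against_modules(matrix_count, modules_count):
    """Express matrix and modules counts relative to the modules count.

    Modules always map to 1.0; the matrix value is above 1 when the matrix
    has more cells of that response type than the modules, and below 1
    when it has fewer.
    """
    if modules_count == 0:
        raise ValueError("modules_count must be nonzero to normalize against it")
    return {"modules": 1.0, "matrix": matrix_count / modules_count}


# Invented counts for one response type.
more_in_matrix = normalize_against_modules(matrix_count=30, modules_count=20)
fewer_in_matrix = normalize_against_modules(matrix_count=10, modules_count=20)
```

      Here the matrix value is 1.5 in the first case and 0.5 in the second, matching the "> 1" and "< 1" readings given in the response.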

      Sentence starting on line 288. I don't find that point to be as obvious from the figures as the sentences seem to suggest. Are we to compare magenta points (auditory off cells) from 7C with green points in 7F?

      Thank you for the reviewer’s comment. We came to this conclusion based on our visual comparison of the magenta points (now green in the revised version to increase visibility) representing the auditory offset cells in Figure 7C and the green points in Figure 7F representing the cells responding to both somatosensory and auditory stimulations. In the revised manuscript, we statistically examined whether the percentages of onset and offset auditory responses differ among the cells responsive to both somatosensory and auditory stimulations in the modules vs matrix. We found that most of the cells responding to both somatosensory and auditory stimulations inside the modules had offset auditory responses, which could indicate a level of multisensory integration between somatosensory input and the offset auditory responses in these cells. We have added the statistical results to the revised manuscript to address this effect [lines 312-317].

      Lines 300-302: "These data suggest that the module/matrix system permits preservation of distinct multimodal response properties in the face of massive integration of inputs in the LC". First, I'm not quite sure what that sentence means. Second, it would be more appropriate for the discussion. Third, the fact that we are more likely to find response enhancement in the modules than in the matrix is nicely consistent with the idea (supported by work from the senior author's lab and others) that excitatory somatosensory input predominantly targets neurons in the modules (which is why we see mostly response enhancement in the modules) and that this input targets GABAergic neurons which then project to and inhibit neurons both outside and inside of their module. Therefore, I would recommend that the authors replace the aforementioned sentence with one that interprets these results in light of what we know about this somatosensory-auditory circuitry.

      Thank you for the reviewer’s comment. Despite the massive multimodal inputs that the LC receives from auditory and nonauditory regions, the module/matrix system is a platform for distinct multimodal responses, as indicated by more somatosensory-responsive cells in the modules versus more auditory-responsive cells in the matrix, which matches the anatomical differences reported previously. We edited the sentence in light of the comparison between the data obtained from awake and anesthetized animals and moved it to the discussion section [lines 503-506].

      The term 'LC imaged via microprism' is used dozens of times throughout the manuscript. Replacing it with a suitable acronym or initialism could improve the flow of the text and would make some of the sentences less cumbersome.

      Thank you for the reviewer’s suggestion. We changed the term “LC imaged via microprism” into LC (microprism) throughout the revised manuscript.

      5A-C: It is unclear what is being compared here. What are the Ns? Different animals?

      Thank you for the reviewer’s comment. We do apologize for this missing information. We have added the number of subjects used in every statistical test in each corresponding figure legend.

      5G: minus symbol missing on the y-axis.

      Thank you for the reviewer’s comment. We have gladly fixed that.

      Figure 6: Are these examples or population averages?

      Thank you for the reviewer’s question. Every figure panel of the old Figure 6 represents a single trace of an example cell. However, we modified Figure 6 to include more examples of cells showing different responses complying with another reviewer’s suggestion. Each panel of the new Figure 6 represents the average response of 5 stimulations of the corresponding stimulus type. We preferred to show the average signal because it was the one used for the subsequent analysis.

      How are module borders defined?

      Thank you for the reviewer’s question. The modules were defined based on the intensity of the green channel, which shows the expression of the GFP signal. Module boundaries were drawn where the GFP signal changes from high to low. This step was done before data analysis to avoid any bias.

      7JKL: How are these to be interpreted? Does panel 7K, for instance, indicate that the fraction of neurons showing 'on' responses was roughly twice as large in the matrix than in the modules and that the fraction of neurons showing 'off' responses was roughly 10 times larger in the modules than in the matrix (the mean seems to be at about 1/10).

      Thank you for the reviewer’s comment. The data in Figures 7J-L show the number of cells of a given response type in the matrix normalized to the number of cells of the same response type in the modules. This normalization was done per animal, and the matrix values were then plotted against the normalization line at 1, which represents the modules. The matrix is considered to have more cells than the modules if the median of the matrix values is > 1. In contrast, the matrix is considered to have fewer cells than the modules if the median of the matrix values is < 1. Finally, a median of matrix values equal to 1 means there is no difference between matrix and modules. However, to match the data obtained from anesthetized animals (Figures 7 and 8) with those obtained from awake animals (Figure 11 or old Figure 10), we ran all data through the Chi-square test in the revised manuscript. Therefore, the format of Figures 7K-L was changed in the revised manuscript, so they became new Figures 7I-K.
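
      The per-animal normalization described in this response can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the function name and the cell counts are hypothetical, and it assumes every animal has at least one cell of the given response type in the modules (otherwise the ratio is undefined).

```python
from statistics import median

def matrix_to_module_ratios(counts_per_animal):
    """Return one matrix/modules ratio per animal.

    counts_per_animal: list of (modules_count, matrix_count) pairs,
    one pair per animal, for a single response type.
    """
    return [matrix / modules for modules, matrix in counts_per_animal]

# Hypothetical counts for one response type; not data from the study.
counts = [(10, 20), (8, 12), (5, 15)]
ratios = matrix_to_module_ratios(counts)
med = median(ratios)

# The modules sit on the normalization line at 1.
if med > 1:
    verdict = "more cells in matrix"
elif med < 1:
    verdict = "fewer cells in matrix"
else:
    verdict = "no difference"
```

      The median alone only gives the direction of the difference; whether it is statistically meaningful still requires a test such as the Chi-square test that the authors adopted in the revision.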

      10A suggests that significantly more than half the neurons shown here are not auditory responsive. Perhaps I am misinterpreting something here but isn't that in contrast to what is shown in panel 9F?

      Thank you for the reviewer’s comment. The data shown in Figure 10A (or revised Figure 11A) represents the cellular response to only one stimulus (broadband noise at 80 dB with no modulation frequency), while Figure 9F (revised 10B) represents the cells responding to varieties of auditory stimulations of different combinations of frequencies and amplitudes (pure tones) as well as to AM-noise of different amplitudes and modulation frequencies. Accordingly, the old Figure 9F or revised Figure 10B shows different cell types based on their responses. For instance, some cells respond only to pure tone. Others respond only to AM-noise or to both pure tones and AM-noise. This may also support that the nonresponsive cells in Figure 10A (revised 11A) can respond to other types of sound features.

      The way I understood panels 7L and 8K there were more suppressed neurons in the matrix than in the modules (line 296: "cells in the modules had a higher odds of having an enhancement response to bimodal stimulation than matrix, while cells in the matrix had a higher odds of having a suppressive response to bimodal stimulation"). Now, panel 10F indicates that in awake mice there is a greater proportion of suppressed neurons in the modules than in the matrix. I may very well have overlooked or misread something but I may not be the only reader confused by this so please clarify.

      Thank you for the reviewer’s comment. We do apologize for this confusion. The ambiguity between Figures 7 and 8 (anesthetized animals) as well as Figure 10 (awake animals) comes from the fact that different statistics have been used for each preparation. In the revised version, we have fixed that by running the same statistics for all the data, and we accordingly revised Figures 7, 8, and 10 (new Figure 11). In brief, the matrix preserves a higher percentage of cells with suppressed auditory responses than those with enhanced auditory responses induced by bimodal stimulation in all conditions (anesthetized vs awake). In contrast, modules act differently across all tested conditions. While modules had more cells with enhanced auditory responses induced by bimodal interaction in anesthetized animals, they had more cells with suppressed response in awake animals indicating that modules could be sensitive to the effect of anesthesia compared to matrix. We addressed this effect in the discussion of the revised manuscript [lines 521-553].

      Line 438: ...as early AS...

      Thank you for the reviewer’s comment. We gladly fixed that [line 512].  

      Reviewer #3 (Recommendations For The Authors):

      My minor recommendations for the authors are as follows:

      (1) The text can be a bit difficult to follow in places. This is partly due to the convoluted nature of the results, but I suggest a careful read-through to look for opportunities to improve the prose. In particular, there is a tendency to use long sentences and long paragraphs. For example, the third paragraph of the introduction runs for almost fifty lines.

      Thank you for the reviewer’s comment and suggestion. We have fixed that.

      (2) This might be due to journal compression, but some of the bar graphs in the figures are difficult to read. For example, the individual data points, especially when filled with striped background colors get lost. Axes can become invisible, like the y-axis in 7L, and portions of bars, like in 7F, are not always rendered correctly. Error bars are sometimes hidden behind data points, as in 5C. Increasing line thickness and shifting individual data points away from error bars might help with this.

      Thank you for the reviewer’s comment and suggestion. We made all the data points with black color and filled circles to make the data points visible. We put all the data points behind the main columns, so they don’t block the error bars. We have fixed figures 7 and 5.

      (3) Throughout the manuscript, the authors use a higher SMI to indicate a preference of cells for auditory stimuli with "greater spectral... complexity" (e.g., lines 219 and 372). I find this interpretation a bit challenging since SMI compares a neuron's preference for tones over noise, and to me, tones seem like the least spectrally complex of all auditory stimuli. Perhaps some clarification of what the authors mean by this would help. For example, is the assumption that a neuron that prefers tones over noise is, either directly or indirectly, receiving input sculpted by inhibitory processes?

      Thank you for the reviewer’s comment. In general, higher SMI values indicate an increase in the preference of the cells to respond to pure tones than noise with no modulation (less spectral complexity). We will clarify this statement throughout the manuscript. However, the SMI value was not mentioned in lines 219 and 372. The statement mentioned in line 219 describes the revised figure 5C (old 5B), where more cells in matrix specifically respond to AM-noise compared to modules, which indicates the preference of the matrix to respond to sounds of greater spectral and temporal complexity. The statement in 372 in the discussion section refers to the finding in revised figures 5E and F (old 5D and E). In the revised figure 5E or old 5D, the data show that matrix has more cells responding to pure tones or noise with no modulation than modules, so matrix has a lower threshold to detect the spectral features of sound (revised figure 5E or old 5D). In the revised figure 5F or old 5E, the data show that matrix has more cells responding to AM-noise than modules, which indicates that matrix functions more to process the temporal features of sound. As explained above, all findings were related to the percentage of cells responding to specific sound stimuli and not the exact SMI values. We have revised the figures accordingly by removing the terms SMI and TMI from the figures, and we have clarified that in the text.

      (4) Lines 250-253: How does a decrease in SMI correspond to "an increase in pure tone responsiveness?" Doesn't a decrease suggest the opposite?

      Thank you for the reviewer’s comment, which we agree with. We do apologize for that. We have fixed this statement [lines 275-277] and any related findings accordingly.

      (5) Line 304: Add "imaged via microprism" or similar after "response profiles with the LC.".

      Thank you for the reviewer’s suggestion. We have fixed that. However, we changed the term “LC imaged via microprism” into “LC(microprism)” for simplicity as suggested by another reviewer [line 330].

      (6) Figure 5A and C: Both plots show that more neurons responded to AM-noise than tones, but it would be interesting to know how much the tone-responsive and AM-noise responsive populations overlapped. Were all tone-responsive neurons also responsive to AM-noise?

      Thank you for the reviewer’s comment. We have categorized the cells based on their response to pure tone only, AM-only, and both pure tone and AM-noise when each stimulus is presented individually. We have modified Figures 5A and C, and they are now Figures 5B and D.

      (7) Figure 5G: Missing negative sign before "0.5.".

      Thank you for the reviewer’s suggestion. We have gladly fixed that. Note that old Figure 5G is now revised Figure 5H.

      (8) Figure 7 legend, Line 1102: Missing period after "(C and E)".

      Thank you for the reviewer’s suggestion. We think that the period should be placed before (C and E) at the end of “respectively”. The parentheses refer to the statements after them. We gladly fixed that. [line 1394]

    1. Author response:

      The following is the authors’ response to the current reviews.

      Joint Public Review:

      Xie et al. propose that the asymmetric segregation of the NuRD complex is regulated in a V-ATPase-dependent manner, and plays a crucial role in determining the differential expression of the apoptosis activator egl-1 and thus critical for the life/death fate decision.

      Remaining concerns are the following:

      The authors should provide the point-by-point response to the following issues. In particular, authors should provide clear reasoning as to why they did not address some of the following comments in the previous revisions. The next response should be directly answering to the following concerns.

      (1) Discussion should be added regarding the criticism that NuRD asymmetric segregation is simply a result of daughter cell size asymmetry. It is perfectly fine that the NuRD asymmetry is due to the daughter cell size difference (still the nucleus within the bigger daughter would have more NuRD, which can determine the fate of daughter cells). Once the authors add this clarification, some criticisms about 'control' may become irrelevant.

      We thank the reviewer for this suggestion. We will add the following text in the revised discussion on page 14, line 26:

      “…We cannot rule out the possibility that NuRD asymmetric segregation results from daughter cell size asymmetry. According to this perspective, the nucleus in the larger daughter cell could possess more NuRD, potentially influencing the fate of the daughter cells. However, it is important to note that the nuclear protein histone or the MYST family histone acetyltransferase is equally segregated in daughter cells of different sizes.….”

      (2) ZEN-4 is a kinesin that predominantly associates with the midzone microtubules and a midbody during mitosis. Given that midbodies can be asymmetrically inherited during cell division, ZEN-4 is not a good control for monitoring the inheritance of cytoplasmic proteins during asymmetric cell division. Other control proteins, such as a transcriptional factor that predominantly localizes in the cytoplasm during mitosis and enters into nucleus during interphase, are needed to clarify the concern.

      We clarified the issue of ZEN-4 below:

      The critique assumes that "midbodies can be asymmetrically inherited during cell division." However, this assumption does not apply to our study of Q cell asymmetric divisions. In our earlier research, we demonstrated that midbodies in Q cells are released post-division and subsequently engulfed by surrounding epithelial cells (Chai et al., Journal of Cell Biology, 2012). Moreover, we have shown that midbodies from the first cell division in C. elegans embryos are also released and engulfed by the P1 cell (Ou et al., Cell Research, 2013). Therefore, the notion of midbody asymmetric inheritance is irrelevant to this manuscript. Additionally, our manuscript already presents the example of the MYST family histone acetyltransferase, illustrating a nuclear protein that predominantly localizes in the cytoplasm during mitosis and symmetrically enters the nucleus during interphase.

      As for pHluorin experiments, symmetric inheritance of GFP and mCherry is not appropriate evidence to estimate the level of pHluorin during asymmetric Q cell division. This issue remains unsolved.

      We acknowledge the limitation of pHluorin in measuring the pH level in a living cell. Future studies could be performed to measure the dynamics of pH levels when advanced tools are available.

      (3) Q-Q plot (quantile-quantile plot) in Figure S10 can be used for visually checking normality of the data, but it does not guarantee that the distribution of each sample is normal and has the standard deviation compared with the other samples. I recommend the authors to show the actual statistical comparison P-values for each case. The authors also need to show the number of replicate experiments for each figure panel.

      We thank the reviewer for pointing this out. We will provide P-values for each case and the number of replicate experiments in the revised Figure 5-figure supplement 1 (corresponding to Figure S10) and the figure legend.

      The authors left inappropriate graphs in the revised manuscript. In Figure 3E, some error bars are disconnected and others are stuck in the bars. In Figure S4C, the LIN-53 in QR.a/p graph shows lines disconnected from error bars.

      We thank the reviewer for pointing this out. We will correct these error bars.

      I am a bit confused by the error bars in Figure 2B. Each dot represents a fluorescence intensity ratio of either HDA-1 or LIN-53 between the two daughter cells in a single animal. Plots are shown with mean and SEM, but several samples (for example, the left end) exhibit an SEM error bar very close to the range between min and max. I might misunderstand this graph but am concerned that Figure 2B may contain some errors in representing these data sets. I would like to ask the authors to provide all values in a table format so that the reviewers can verify the statistical tests and graph representation.

      We thank the reviewer for pointing this out. We apologize for the typo in Figure 2B figure legend. We will correct SEM to SD.
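
      The reviewer's reasoning rests on the relation SEM = SD / sqrt(n): for any sample with more than one value, SEM bars are shorter than SD bars, so bars that span nearly the whole data range are more plausibly SD. A minimal sketch with hypothetical ratios (not data from Figure 2B):

```python
from math import sqrt
from statistics import mean, stdev

def summarize(values):
    """Return (mean, SD, SEM) for a list of measurements."""
    n = len(values)
    sd = stdev(values)   # sample standard deviation (n - 1 denominator)
    sem = sd / sqrt(n)   # standard error of the mean
    return mean(values), sd, sem

# Hypothetical intensity ratios, one per animal.
ratios = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
m, sd, sem = summarize(ratios)
```

      With nine values the SEM is one third of the SD, which is why mislabeling SD as SEM changes how the spread in a plot should be read.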

      (4) The authors still do not provide evidence that the increase in sAnxV::GFP and Pegl-1gfp or the increase in H3K27ac at the egl-1 gene in hda-1(RNAi) and lin-53(RNAi) animals is not a consequence of global effects on development. Indeed, the images provided in Figure S7B demonstrate that there are global effects in these animals. No causal interactions have been demonstrated.

      We cannot exclude the global effects and have discussed this issue in our previous manuscript on page 9, line 26:

      “...Considering the pleiotropic phenotypes caused by loss of HDA-1, we cannot exclude the possibility that ectopic cell death might result from global changes in development, even though HDA-1 may directly contribute to the life-versus-death fate determination.”

      (5) Figure 4: Due to the lack of appropriate controls for the co-IP experiment (Fig. 4), I remain unconvinced of the claim that the NuRD complex and V-ATPase specifically interact. Concerning the co-IP, the authors now mention that the co-IP was performed three times: "Assay was performed using three biological replicates. Three independent biological replicates of the experiment were conducted with similar results." However, the authors did not use ACT-4::GFP or GFP alone as controls for their co-IP as previously suggested. This is critical considering that the evidence for a specific HDA-1::GFP - V-ATPase interaction is rather weak (compare interactions between HDA-1::GFP and V-ATPase subunits in Fig 4B with those of HDA-1::GFP and subunits of NuRD in Fig S8B).

      We conducted GFP pull-down experiments and mass spectrometric analysis for HDA-1::GFP and ACT-4::GFP using identical protocols, yielding consistent results. We agree with the reviewer that, in our Western blot, including ACT-4::GFP is a more effective negative control than empty beads.

      (6) Based on Fig 5E, it appears that Bafilomycin treatment causes pleiotropic effects on animals (see differences in HDA-1::GFP signal in the three rows). The authors now state: "Although BafA1-mediated disruption of lysosomal pH homeostasis is recognized to elicit a wide array of intracellular abnormalities, we found no evidence of such pleiotropic effects at the organismal level with the dosage and duration of treatment employed in this study". However, the 'evidence' mentioned is not shown. It is critical that the authors provide this evidence.

      We thank the Reviewer for pointing out this issue. We checked only the viability of the L1 larvae and the morphology of animals at the organismal level at the BafA1 dosage and duration of treatment used, and did not notice any death of the animals or apparent abnormality in morphology (N > 20 for each treatment). However, as the reviewer pointed out, there can be some abnormalities at the cellular level. We thus revised the above description as follows, on page 11, line 27:

      “…Although BafA1-mediated disruption of lysosomal pH homeostasis is recognized to elicit a wide array of intracellular abnormalities, we did not observe any larval deaths and apparent abnormality in morphology at the organismal level (N > 20 for each treatment) at the dose and duration of treatment employed in this study...”


      The following is the authors’ response to the previous reviews.

      eLife assessment

      The authors propose that the asymmetric segregation of the NuRD complex in C. elegans is regulated in a V-ATPase-dependent manner, that this plays a crucial role in determining the differential expression of the apoptosis activator egl-1, and that it is therefore critical for the life/death fate decision in this species. If proven, the proposed model of the V-ATPase-NuRD-EGL-1-Apoptosis cascade would shed light onto the mechanisms underlying the regulation of apoptosis fate during asymmetric cell division, and stimulate further investigation into the intricate interplay between V-ATPase, NuRD, and epigenetic modifications. However, the strength of evidence for this is currently incomplete.

      Public Review:

      Xie et al. propose that the asymmetric segregation of the NuRD complex is regulated in a V-ATPase-dependent manner, and plays a crucial role in determining the differential expression of the apoptosis activator egl-1 and thus critical for the life/death fate decision.

      While the model is very intriguing, the reviewers raised concerns regarding the rigor of the method. One issue is with statistics (either insufficient information or inadequate use of statistics), and second is the concern that the asymmetry observed may be caused by one cell dying (resulting in protein degradation, RNA degradation etc). We recommend that the authors address these issues.

      We extend our sincere thanks to the Editors and Reviewers for their insightful comments on this study.

      Major #1:

      There are still many misleading statements/conclusions that are not rigorously tested or that are logically flawed. These issues must be thoroughly addressed for this manuscript to be solid.

      (1) Asymmetry detected by scRNA seq vs. imaging may not represent the same phenomenon, thus should not be discussed as two supporting pieces of evidence for the authors' model, and importantly each method has its own flaw. First, for scRNA seq, when cells become already egl-1 positive, those cells may be already dying, and thus NuRD complex's transcripts' asymmetry may not have any significance. The data presented in FigS1D, E show that there are lots of genes (6487 out of 8624) that are decreased in dying cells. Thus, it is not convincing to claim that NuRD asymmetry is regulated by differential RNA amount.

      We agree with the reviewer's comment. Indeed, scRNA-seq reveals phenomena different from those observed in protein imaging, and NuRD asymmetry may not be regulated by differential RNA levels. Seven years ago, when we started this project, NuRD asymmetry during asymmetric neuroblast division was unknown. We first found NuRD mRNA asymmetry using scRNA-seq and then NuRD protein asymmetry using fluorescence imaging. We have documented the whole process of discovering NuRD asymmetry, although the asymmetry of NuRD complex transcripts does not necessarily imply protein asymmetry. We have revised statements related to "NuRD asymmetry being regulated by differential RNA amounts" and discussed this issue in the revised manuscript on page 14, line 2:

      " The transcript asymmetry detected by scRNA-seq may not correspond to the protein asymmetry detected by microscopic imaging. Our scRNA-seq data shows that 6487 out of 8624 genes were not detected in egl-1-positive cells, the putative apoptotic cells. Cells that are egl-1 positive may be undergoing apoptosis, rendering the asymmetry of NuRD complex transcripts insignificant in inferring protein asymmetry. Thus, the observed transcript asymmetry of the NuRD subunits between live and dead cells may be coincidental with NuRD protein asymmetry during asymmetric neuroblast division, rather than serving as a regulatory mechanism."

      (2) Regarding NuRD protein's asymmetry, there are still multiple issues. The most likely explanation of their asymmetry is purely daughter size asymmetry. Because one cell is much bigger than the other (3 times larger), NuRD components, which are not chromatin associated, would be inherited by the bigger cell at 3 times the amount inherited by the smaller daughter. Then, upon nuclear envelope reformation, NuRD components will enter the nucleus, and there will be 3 times more NuRD components in the bigger daughter cell. It is possible that this is actually the underlying mechanism to regulate gene expression differentially, but this possibility is not properly acknowledged. Currently, the authors use a chromatin-associated protein (Mys-1) as 'symmetric control', but this is not necessarily a fair comparison. For NuRD asymmetry to be meaningful, an example protein is needed that is non-chromatin associated in mitosis, is distributed to daughter cells in proportion to daughter cell size, and re-enters the nucleus after nuclear envelope formation to show symmetric distribution. And if daughter size asymmetry is the cause of NuRD asymmetry, other lineages that do not undergo apoptosis but exhibit daughter size asymmetry would also show NuRD asymmetry. The authors should comment on this (if such examples exist, it is fine in that in those cell types, NuRD asymmetry may be used for differential gene expression, not necessarily to induce cell death, but such a comparison provides the explanation for NuRD asymmetry, and puts the authors' finding in a better context).

      For more than one decade, we have meticulously explored the relationship between protein asymmetry and cell size asymmetry during ACDs of Q cells. A notable example of even protein distribution is the cytokinetic kinesin ZEN-4, as documented in our 2012 publication in the Journal of Cell Biology (Chai et al., JCB, 2012). This study, primarily focusing on the fate of the midbody post-cell division, also showcased the dynamics of GFP-tagged ZEN-4 during ACDs of QR.a cells in movie S1. Intriguingly, beyond its role in the cytokinetic ring, we observed a uniform dispersal of ZEN-4 throughout the cytoplasm. Remarkably, following cell division, ZEN-4 transitions evenly into the nuclei of the daughter cells, a phenomenon with implications yet to be fully understood. One hypothesis is that ZEN-4's nuclear localization may prevent the formation of ectopic microtubule bundles in the cytosol during interphase. Below, we present a snapshot from our original movie, clearly showing the symmetrical distribution of ZEN-4 into the nuclei of the two daughter cells.

      (3) For the analysis of protein asymmetry between two daughters in Fig S4C, the method of calibration is unclear, making it difficult to interpret the results.

      In Figure S4C, we quantified the relative total fluorescence of the Q cell, with the quantification method illustrated in Figure S4A. To further clarify our quantification approach, we have updated Figure S4A and the "Live-Cell Imaging and Quantification" section in the Materials and Methods:

      “…To determine the ratios of fluorescence intensities in the posterior to anterior half (P/A) of Q.a lineages or A/P of Q.p lineages, the cell in the mean intensity projection was divided into posterior and anterior halves. ImageJ software was used to measure the mean fluorescence intensities of two halves with background subtraction. The slide background's mean fluorescence intensity was measured in a region devoid of worm bodies. The background-subtracted mean fluorescence intensities of the two halves were divided to calculate the ratio. The same procedure was used to determine the fluorescence intensity ratios between two daughter cells. Total fluorescence intensity was the sum of the posterior and anterior fluorescence intensities or the sum of fluorescence intensities from two daughter cells (Figure S4A). …”
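
      The background-subtracted ratio described in this Methods excerpt can be sketched as below. This is an illustration under stated assumptions, not the authors' ImageJ workflow: the function name and intensity values are hypothetical, and the total is assumed to be the sum of the background-subtracted intensities.

```python
def intensity_ratio(numerator_mean, denominator_mean, background_mean):
    """Ratio of two mean intensities after subtracting the slide background."""
    num = numerator_mean - background_mean
    den = denominator_mean - background_mean
    if den <= 0:
        # A region at or below background cannot serve as the denominator.
        raise ValueError("denominator intensity does not exceed background")
    return num / den

# Hypothetical mean intensities (arbitrary units) for one dividing cell.
posterior, anterior, background = 250.0, 150.0, 50.0

# P/A ratio, as used for Q.a lineages; swap the arguments for A/P in Q.p lineages.
pa_ratio = intensity_ratio(posterior, anterior, background)

# Assumed definition of the total: sum of background-subtracted halves.
total = (posterior - background) + (anterior - background)
```

      Subtracting the background before dividing matters: the raw ratio 250/150 is about 1.67, whereas the background-corrected ratio is 2.0.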

      (4) As for pHluorin experiments, the authors were asked to test whether the changes in fluorescence observed are due to changes in pH or changes in the amount of pHluorin protein. They need to add a ratiometric method to this manuscript. A brief mention on page 12, line 12 is insufficient to clarify this issue.

      We appreciate the concerns about potential changes in pH or pHluorin protein levels. While we cannot completely dismiss the impact of changes in the amount of pHluorin protein, it appears improbable that the asymmetry of pHluorin fluorescence is attributed to an asymmetric amount of pHluorin protein. This inference is supported by the observation that other fluorescent proteins, such as GFP or mCherry, did not exhibit any asymmetry during ACDs of Q cells. An example of GFP alone during the ACD of QL.p is illustrated in figure 5A from Ou and Vale, JCB, 2009. The fluorescence intensities in the large QL.pa cell and the small QL.aa are indistinguishable.

      Major #2:

      Some issues surrounding statistics must be resolved.

      (1) Fig. 1FG, 2D, 3BDEG, 5BD and 6B used either one-sample t-test or unpaired two-tailed parametric t-test for statistical comparison. These t-tests require a verification of each sample fitting to a normal distribution. The authors need to describe a statistical test used to verify a normal distribution of each sample.

      (2) Fig. 2D, 3D, and 3G have very small sample sizes (N=3-4, N=6, N=3, respectively), so it is possible that a normal distribution cannot be verified. How can the authors justify the use of a one-sample t-test and an unpaired parametric t-test?

      (3) Statistical comparison in Fig. 2D and Fig. 6B should be re-assessed. For Fig. 2D, the authors need to compare the intensity ratio of HDA-1/LIN53 between sister cells dying within 35 min and those over 400 min. For Fig. 6B, they need to compare the intensity ratio of VHA-17 between DMSO- and BafA1- treated cells at the same time point after anaphase.

      We appreciate the reviewer's advice on the statistical analysis of our data. In response, we performed normality tests on the datasets presented in Figures 1F, 1G, 3B, 5B, 5D, and 6B, all of which passed the tests (as demonstrated in Figure S10). We also acknowledge the reviewer's comment on the inadequate sample sizes in Figures 2D, 3D, 3E, and 3G for fitting a normal distribution. Therefore, we have revised our statistical analysis methods for these figures and updated both the figures and their legends. The revised statistical results support the primary conclusions of this study.

      In response to the reviewer's observation regarding the small sample size in Figure 2D, which precluded normality verification, and the suggestion to compare sister cells that die within 35 minutes to those surviving over 400 minutes, we adapted our approach. We implemented the Kruskal-Wallis test to evaluate the differences among the groups. To assess the specific differences between each group and the 400 min MSpppaap group, we conducted Dunn’s multiple comparisons test. The revised Figure 2D illustrates the updated statistical significance.

      For Figure 3D, due to the small sample size precluding normality verification, we applied the Wilcoxon test with 1 as the theoretical median. The revised Figure 3D illustrates the updated statistical significance.
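
      A one-sample Wilcoxon signed-rank test against a theoretical median of 1 can be sketched in plain Python as below. This is a hedged illustration, not the authors' procedure: the function name and the example ratios are hypothetical, the p-value is computed exactly by enumerating all sign assignments (practical only for small samples), and statistics packages may treat ties and zero differences differently.

```python
from itertools import product

def wilcoxon_one_sample_exact(values, theoretical_median=1.0):
    """Exact two-sided Wilcoxon signed-rank test; returns (W+, p-value)."""
    # Values equal to the theoretical median carry no sign and are dropped.
    diffs = [v - theoretical_median for v in values if v != theoretical_median]
    n = len(diffs)

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    expected = sum(ranks) / 2
    observed_dev = abs(w_plus - expected)

    # Under the null hypothesis each rank is positive with probability 1/2,
    # so every one of the 2**n sign assignments is equally likely.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - expected) >= observed_dev - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** n

# Hypothetical intensity ratios for N = 6 animals, all above 1.
w_plus, p_value = wilcoxon_one_sample_exact([1.2, 1.5, 1.1, 1.8, 1.3, 1.6])
```

      With six untied values that all lie on the same side of 1, the exact two-sided p-value is 2/64 = 0.03125, the smallest this test can return for N = 6. With N = 3 the smallest attainable value is 2/8 = 0.25, which is why a rank-based test cannot reach significance at 0.05 for three data points per group.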

      For Figure 3E, where the sample size also hindered normality verification, we conducted the Kruskal-Wallis test to evaluate the overall effect. Additionally, Dunn’s multiple comparisons test was utilized to examine the differences between groups. The revised Figure 3E illustrates the updated statistical significance.

      For Figure 3G, the reviewer pointed out the small sample size and the limited statistical power due to having only three data points per group. To address this, we revised the figure to visually present each data point, aiming to more clearly illustrate the variation trends.

      For Figure 6B, following the reviewer's suggestion, we compared the DMSO group directly with the Baf A1 group, updating Figure 6B to reflect this comparison as advised.

      These adjustments have been made to ensure the statistical analyses are robust and appropriate given the sample sizes and to align with the reviewer's recommendations, enhancing the clarity and accuracy of our findings.

      Recommendations for the authors:

      We recommend using grey scale (instead of 'heatmap' representation) to show the protein distribution of interest. Heatmap does not help at all, because 'total protein amount per cell' (instead of signal intensity on each pixel) is what matters in the context of this paper. Heatmap presentation does not allow readers to integrate signal intensity with their eyes.

      We thank the editor for pointing this out. We have changed heatmaps to inverted fluorescence images in grey scale.

    2. eLife assessment

      The authors make the intriguing proposal that the NuRD complex in C. elegans, which has been linked to regulation of the cell death protein EGL-1 before, becomes asymmetrically distributed after cell division and that this asymmetry relies on V-ATPase activity. Whereas some disagreement remained between the reviewers' and the authors' interpretation, the final version incorporated alternative possibilities in the text, and with careful interpretation, the current manuscript's model is supported by solid data, and represents a valuable contribution to the field.

    1. eLife assessment

      The study presents a valuable tool for searching molecular dynamics simulation data, making such datasets accessible for open science. The authors provide convincing evidence that it is possible to identify noteworthy molecular dynamics simulation datasets and that their analysis can produce information of value to the community.

    2. Reviewer #1 (Public Review):

      Summary:

      Tiemann et al. have undertaken an original study on the availability of molecular dynamics (MD) simulation datasets across the Internet. There is a widespread belief that extensive, well-curated MD datasets would enable the development of novel classes of AI models for structural biology. However, currently, there is no standard for sharing MD datasets. As generating MD datasets is energy-intensive, it is also important to facilitate the reuse of MD datasets to minimize energy consumption. Developing a universally accepted standard for depositing and curating MD datasets is a huge undertaking. The study by Tiemann et al. will be very valuable in informing policy developments toward this goal.

      Strengths:

      The study presents an original approach to addressing a growing concern in the field. It is clear that adopting a more collaborative approach could significantly enhance the impact of MD simulations in modern molecular sciences.

      The timing of the work is appropriate, given the current interest in developing AI models for describing biomolecular dynamics.

      Weaknesses:

      The study primarily focuses on one major MD engine (GROMACS), although this limitation is not significant considering the proof-of-concept nature of the study.

    3. Reviewer #2 (Public Review):

      Summary:

      Molecular dynamics (MD) data is deposited in public, non-specialist repositories. This work starts from the premise that these data are a valuable resource as they could be used by other researchers to extract additional insights from these simulations; it could also potentially be used as training data for ML/AI approaches. The problem is that mining these data is difficult because they are not easy to find and work with. The primary goal of the authors was to discover and index these difficult-to-find MD datasets, which they call the "dark matter of the MD universe" (in contrast to data sets held in specialist databases).

      The authors developed a search strategy that avoided the use of ill-defined metadata but instead relied on the knowledge of the restricted set of file formats used in MD simulations as a true marker for the data they were looking for. Detection of MD data marked a data set as relevant with a follow-up indexing strategy of all associated content. This "explore-and-expand" strategy allowed the authors for the first time to provide a realistic census of the MD data in non-specialist repositories.
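
      The "explore-and-expand" idea summarized above can be sketched in a few lines. This is only an illustration under stated assumptions: the repository is a toy in-memory dictionary, the dataset identifiers and file names are invented, and the extension set is a small illustrative subset rather than the list used in the study.

```python
from pathlib import PurePosixPath

# Illustrative subset of MD-specific extensions (an assumption, not the study's list).
MD_EXTENSIONS = {".gro", ".mdp", ".xtc", ".trr"}


def extension(filename):
    """Return the lower-cased file extension, or '' if there is none."""
    return PurePosixPath(filename).suffix.lower()


def explore(repository, md_extensions=MD_EXTENSIONS):
    """Step 1: keep datasets that contain at least one MD-specific file."""
    return sorted(
        dataset_id
        for dataset_id, files in repository.items()
        if any(extension(name) in md_extensions for name in files)
    )


def expand(repository, dataset_ids):
    """Step 2: index every file of the retained datasets, MD-specific or not."""
    return [
        {"dataset": dataset_id, "file": name, "type": extension(name)}
        for dataset_id in dataset_ids
        for name in repository[dataset_id]
    ]


# Toy stand-in for a repository listing (hypothetical identifiers and names).
repository = {
    "ds-001": ["md.xtc", "conf.gro", "README.md", "analysis.py"],
    "ds-002": ["survey.csv", "figures.zip"],
    "ds-003": ["run/grompp.MDP", "model.pdb"],
}
hits = explore(repository)
index = expand(repository, hits)
```

      Here the file extension, rather than free-text metadata, decides whether a dataset is relevant, while the expansion step also captures accompanying files such as `README.md` or `model.pdb` that would not be found by extension alone.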

      As a proof of principle, they analyzed a subset of the data (primarily related to simulations with the popular Gromacs MD package) to summarize the types of simulated systems (primarily biomolecular systems) and commonly used simulation settings.

      Based on their experience they propose best practices for metadata provision to make MD data FAIR (findable, accessible, interoperable, reusable).

      A prototype search engine that works on the indexed datasets is made publicly available. All data and code are made freely available as open source/open data.

      Strengths:

      - The novel search strategy is based on relevant data to identify full datasets instead of relying on metadata and thus is likely to have many true positives and few false positives.

      - The paper provides a first glimpse at the potential hidden treasures of MD simulations and force field parametrizations of molecules.

      - Analysis of parameter settings of MD simulations from how researchers *actually* run simulations can provide valuable feedback to MD code developers for how to document/educate users. This approach is much better than analyzing what authors write in the Methods sections.

      - The authors make a prototype search engine available.

      - The guidelines for FAIR MD data are based on experience gained from trying to make sense of the data.

      Weaknesses:

      - So far the work is a proof-of-concept that focuses on MD data produced by Gromacs (which was prevalent under all indexed and identified packages).

      As discussed in the manuscript, some types of biomolecules are likely underrepresented because different communities have different preferences for force fields/MD codes (for example: carbohydrates with AMBER/GLYCAM using AMBER MD instead of Gromacs).

      - Materials sciences seem to be severely under-represented - commonly used codes in this area such as LAMMPS are not even detected, and only very few examples could be identified. As it is, the paper primarily provides an insight into the *biomolecular* MD simulation world.

      The authors succeed in providing a first realistic view on what MD data is available in public repositories. In particular, their explore-expand approach has the potential to be customized for all kinds of specialist simulation data, whereby specific artifacts are used as fiducial markers instead of metadata. The more detailed analysis is limited to Gromacs simulations and primarily biomolecular simulations (even though MD is also widely used in other fields such as the materials sciences). This restricted view may simply be correlated with the user community of Gromacs and hopefully, follow-up studies from this work will shed more light on this shortcoming.

      The study quantified the number of trajectories currently held in structured databases as ~10k vs ~30k in generalist repositories. To go beyond the proof-of-principle analysis it would be interesting to analyze the data in specialist repositories in the same way as those in the generalist ones, especially as there are now efforts underway to create a database for MD simulations (Grant 'Molecular dynamics simulation for biology and chemistry research' to establish MDDB, DOI 10.3030/101094651). One should note that structured databases do not invalidate the approach pioneered in this work; if anything they are orthogonal to each other and both will likely play an important role in growing the usefulness of MD simulations in the future.

    4. Reviewer #3 (Public Review):

      Molecular dynamics (MD) simulations nowadays are an essential element of structural biology investigations, complementing experiments and aiding their interpretation by revealing transient processes or details (such as the effects of glycosylation on the SARS-CoV-2 spike protein, for example (Casalino et al. ACS Cent. Sci. 2020; 6, 10, 1722-1734 https://doi.org/10.1021/acscentsci.0c01056)) that cannot be observed directly. MD simulations can allow for the calculation of thermodynamic, kinetic, and other properties and the prediction of biological or chemical activity. MD simulations can now serve as "computational assays" (Huggins et al. WIREs Comput Mol Sci. 2019; 9:e1393. https://doi.org/10.1002/wcms.1393). Conceptually, MD simulations have played a crucial role in developing the understanding that the dynamics and conformational behaviour of biological macromolecules are essential to their function, and are shaped by evolution. Atomistic simulations range up to the billion atom scale with exascale resources (e.g. simulations of SARS-CoV-2 in a respiratory aerosol. Dommer et al. The International Journal of High Performance Computing Applications. 2023; 37:28-44. doi:10.1177/10943420221128233), while coarse-grained models allow simulations on even larger length- and timescales. Simulations with combined quantum mechanics/molecular mechanics (QM/MM) methods can investigate biochemical reactivity, and overcome limitations of empirical forcefields (Cui et al. J. Phys. Chem. B 2021; 125, 689 https://doi.org/10.1021/acs.jpcb.0c09898).

      MD simulations generate large amounts of data (e.g. structures along the MD trajectory) and increasingly, e.g. because of funder mandates for open science, these data are deposited in publicly accessible repositories. There is real potential to learn from these data en masse, not only to understand biomolecular dynamics but also to explore methodological issues. Deposition of data is haphazard and lags far behind experimental structural biology, however, and it is also hard to answer the apparently simple question of "what is out there?". This is the question that Tiemann et al explore in this nice and important work, focusing on simulations run with the widely used GROMACS package. They develop a search strategy and identify almost 2,000 datasets from Zenodo, Figshare and Open Science Framework. This provides a very useful resource. For these datasets, they analyse features of the simulations (e.g. atomistic or coarse-grained), which provides a useful snapshot of current simulation approaches. The analysis is presented clearly and discussed insightfully. They also present a search engine to explore MD data, the MDverse data explorer, which promises to be a very useful tool.

      As the authors state: "Eventually, front-end solutions such as the MDverse data explorer tool can evolve being more user-friendly by interfacing the structures and dynamics with interactive 3D molecular viewers". This will make MD simulations accessible to non-specialists and researchers in other areas. I would envisage that this will also include approaches using interactive virtual reality for an immersive exploration of structure and dynamics, and virtual collaboration (e.g. O'Connor et al., Sci. Adv.4, eaat2731 (2018). DOI:10.1126/sciadv.aat2731)

      The need to share data effectively, and to compare simulations and test models, was illustrated clearly in the COVID-19 pandemic, which also demonstrated a willingness and commitment to data sharing across the international community (e.g. Amaro and Mulholland, J. Chem. Inf. Model. 2020, 60, 6, 2653-2656 https://doi.org/10.1021/acs.jcim.0c00319; Computing in Science & Engineering 2020, 22, 30-36 doi: 10.1109/MCSE.2020.3024155). There are important lessons to learn here, for simulations to be reproducible and reliable, for rapid testing, for exploiting data with machine learning, and for linking to data from other approaches. Tiemann et al. discuss how to develop these links, providing good perspectives and suggestions.

      I agree completely with the statement of the authors that "Even if MD data represents only 1 % of the total volume of data stored in Zenodo, we believe it is our responsibility, as a community, to develop a better sharing and reuse of MD simulation files - and it will neither have to be particularly cumbersome nor expensive. To this end, we are proposing two solutions. First, improve practices for sharing and depositing MD data in data repositories. Second, improve the FAIRness of already available MD data notably by improving the quality of the current metadata."

      This nicely states the challenge to the biomolecular simulation community. There is a clear need for standards for MD data and associated metadata. This will also help with the development of standards of best practice in simulations. The authors provide useful and detailed recommendations for MD metadata. These recommendations should contribute to discussions on the development of standards by researchers, funders, and publishers. Community organizations (such as CCP-BioSim and HECBioSim in the UK, BioExcel, CECAM, MolSSI, learned societies etc) have an important part to play in these developments, which are vital for the future of biomolecular simulation.

    5. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The study presents a valuable tool for searching molecular dynamics simulation data, making such data sets accessible for open science. The authors provide convincing evidence that it is possible to identify useful molecular dynamics simulation data sets and their analysis can produce valuable information.

      Public Reviews

      Reviewer #1 (Public Review):

      Summary:

      Tiemann et al. have undertaken an original study on the availability of molecular dynamics (MD) simulation datasets across the Internet. There is a widespread belief that extensive, well-curated MD datasets would enable the development of novel classes of AI models for structural biology. However, currently, there is no standard for sharing MD datasets. As generating MD datasets is energy-intensive, it is also important to facilitate the reuse of MD datasets to minimize energy consumption. Developing a universally accepted standard for depositing and curating MD datasets is a huge undertaking. The study by Tiemann et al. will be very valuable in informing policy developments toward this goal.

      Strengths:

      The study presents an original approach to addressing a growing concern in the field. It is clear that adopting a more collaborative approach could significantly enhance the impact of MD simulations in modern molecular sciences.

      The timing of the work is appropriate, given the current interest in developing AI models for describing biomolecular dynamics.

      Weaknesses:

      The study primarily focuses on one major MD engine (GROMACS), although this limitation is not significant considering the proof-of-concept nature of the study.

      We thank the reviewer for his/her comments. Moving forward, our plan includes expanding this research to encompass other MD engines used in biomolecular simulations and materials sciences, such as NAMD, Charmm, Amber, LAMMPS, etc. However, this requires parsing associated files to supplement the sparse metadata generally available for the related datasets.

      Reviewer #2 (Public Review):

      Summary:

      Molecular dynamics (MD) data is deposited in public, non-specialist repositories. This work starts from the premise that these data are a valuable resource as they could be used by other researchers to extract additional insights from these simulations; it could also potentially be used as training data for ML/AI approaches. The problem is that mining these data is difficult because they are not easy to find and work with. The primary goal of the authors was to discover and index these difficult-to-find MD datasets, which they call the "dark matter of the MD universe" (in contrast to data sets held in specialist databases).

      The authors developed a search strategy that avoided the use of ill-defined metadata but instead relied on the knowledge of the restricted set of file formats used in MD simulations as a true marker for the data they were looking for. Detection of MD data marked a data set as relevant with a follow-up indexing strategy of all associated content. This "explore-and-expand" strategy allowed the authors for the first time to provide a realistic census of the MD data in non-specialist repositories.

      As a proof of principle, they analyzed a subset of the data (primarily related to simulations with the popular Gromacs MD package) to summarize the types of simulated systems (primarily biomolecular systems) and commonly used simulation settings.

      Based on their experience they propose best practices for metadata provision to make MD data FAIR (findable, accessible, interoperable, reusable).

      A prototype search engine that works on the indexed datasets is made publicly available. All data and code are made freely available as open source/open data.

      Strengths:

      The novel search strategy is based on relevant data to identify full datasets instead of relying on metadata and thus is likely to have many true positives and few false positives.

      The paper provides a first glimpse at the potential hidden treasures of MD simulations and force field parametrizations of molecules.

      Analysis of parameter settings of MD simulations from how researchers *actually* run simulations can provide valuable feedback to MD code developers for how to document/educate users. This approach is much better than analyzing what authors write in the Methods sections.

      The authors make a prototype search engine available.

      The guidelines for FAIR MD data are based on experience gained from trying to make sense of the data.

      Weaknesses:

      So far the work is a proof-of-concept that focuses on MD data produced by Gromacs (which was prevalent under all indexed and identified packages).

      As discussed in the manuscript, some types of biomolecules are likely underrepresented because different communities have different preferences for force fields/MD codes (for example: carbohydrates with AMBER/GLYCAM using AMBER MD instead of Gromacs).

      Materials sciences seem to be severely under-represented --- commonly used codes in this area such as LAMMPS are not even detected, and only very few examples could be identified. As it is, the paper primarily provides an insight into the *biomolecular* MD simulation world.

      The authors succeed in providing a first realistic view on what MD data is available in public repositories. In particular, their explore-expand approach has the potential to be customized for all kinds of specialist simulation data, whereby specific artifacts are used as fiducial markers instead of metadata. The more detailed analysis is limited to Gromacs simulations and primarily biomolecular simulations (even though MD is also widely used in other fields such as the materials sciences). This restricted view may simply be correlated with the user community of Gromacs and hopefully, follow-up studies from this work will shed more light on this shortcoming.

      The study quantified the number of trajectories currently held in structured databases as ~10k vs ~30k in generalist repositories. To go beyond the proof-of-principle analysis it would be interesting to analyze the data in specialist repositories in the same way as those in the generalist ones, especially as there are now efforts underway to create a database for MD simulations (Grant 'Molecular dynamics simulation for biology and chemistry research' to establish MDDB, DOI 10.3030/101094651). One should note that structured databases do not invalidate the approach pioneered in this work; if anything they are orthogonal to each other and both will likely play an important role in growing the usefulness of MD simulations in the future.

      We thank the reviewer for his/her comments. As mentioned to Reviewer 1, we intend to extend this work to other MD engines in the near future to go beyond Gromacs and even biomolecular simulations. Furthermore, as the reviewer noted the value of accessing and indexing specialized MD databases such as MDDB, MemprotMD, GPCRmd, NMRLipids, ATLAS, and others, one of our next steps is indeed to continue expanding the MDverse catalog of MD data. This indexing may also increase the visibility and widespread adoption of these specific databases.

      Reviewer #3 (Public Review):

      Molecular dynamics (MD) simulations nowadays are an essential element of structural biology investigations, complementing experiments and aiding their interpretation by revealing transient processes or details (such as the effects of glycosylation on the SARS-CoV-2 spike protein, for example (Casalino et al. ACS Cent. Sci. 2020; 6, 10, 1722-1734 https://doi.org/10.1021/acscentsci.0c01056)) that cannot be observed directly. MD simulations can allow for the calculation of thermodynamic, kinetic, and other properties and the prediction of biological or chemical activity. MD simulations can now serve as "computational assays" (Huggins et al. WIREs Comput Mol Sci. 2019; 9:e1393. https://doi.org/10.1002/wcms.1393). Conceptually, MD simulations have played a crucial role in developing the understanding that the dynamics and conformational behaviour of biological macromolecules are essential to their function, and are shaped by evolution. Atomistic simulations range up to the billion atom scale with exascale resources (e.g. simulations of SARS-CoV-2 in a respiratory aerosol. Dommer et al. The International Journal of High Performance Computing Applications. 2023; 37:28-44. doi:10.1177/10943420221128233), while coarse-grained models allow simulations on even larger length- and timescales. Simulations with combined quantum mechanics/molecular mechanics (QM/MM) methods can investigate biochemical reactivity, and overcome limitations of empirical forcefields (Cui et al. J. Phys. Chem. B 2021; 125, 689 https://doi.org/10.1021/acs.jpcb.0c09898).

      MD simulations generate large amounts of data (e.g. structures along the MD trajectory) and increasingly, e.g. because of funder mandates for open science, these data are deposited in publicly accessible repositories. There is real potential to learn from these data en masse, not only to understand biomolecular dynamics but also to explore methodological issues. Deposition of data is haphazard and lags far behind experimental structural biology, however, and it is also hard to answer the apparently simple question of "what is out there?". This is the question that Tiemann et al explore in this nice and important work, focusing on simulations run with the widely used GROMACS package. They develop a search strategy and identify almost 2,000 datasets from Zenodo, Figshare and Open Science Framework. This provides a very useful resource. For these datasets, they analyse features of the simulations (e.g. atomistic or coarse-grained), which provides a useful snapshot of current simulation approaches. The analysis is presented clearly and discussed insightfully. They also present a search engine to explore MD data, the MDverse data explorer, which promises to be a very useful tool.

      As the authors state: "Eventually, front-end solutions such as the MDverse data explorer tool can evolve being more user-friendly by interfacing the structures and dynamics with interactive 3D molecular viewers". This will make MD simulations accessible to non-specialists and researchers in other areas. I would envisage that this will also include approaches using interactive virtual reality for an immersive exploration of structure and dynamics, and virtual collaboration (e.g. O'Connor et al., Sci. Adv.4, eaat2731 (2018). DOI:10.1126/sciadv.aat2731)

      The need to share data effectively, and to compare simulations and test models, was illustrated clearly in the COVID-19 pandemic, which also demonstrated a willingness and commitment to data sharing across the international community (e.g. Amaro and Mulholland, J. Chem. Inf. Model. 2020, 60, 6, 2653-2656 https://doi.org/10.1021/acs.jcim.0c00319; Computing in Science & Engineering 2020, 22, 30-36 doi: 10.1109/MCSE.2020.3024155). There are important lessons to learn here, for simulations to be reproducible and reliable, for rapid testing, for exploiting data with machine learning, and for linking to data from other approaches. Tiemann et al. discuss how to develop these links, providing good perspectives and suggestions.

      I agree completely with the statement of the authors that "Even if MD data represents only 1 % of the total volume of data stored in Zenodo, we believe it is our responsibility, as a community, to develop a better sharing and reuse of MD simulation files - and it will neither have to be particularly cumbersome nor expensive. To this end, we are proposing two solutions. First, improve practices for sharing and depositing MD data in data repositories. Second, improve the FAIRness of already available MD data notably by improving the quality of the current metadata."

      This nicely states the challenge to the biomolecular simulation community. There is a clear need for standards for MD data and associated metadata. This will also help with the development of standards of best practice in simulations. The authors provide useful and detailed recommendations for MD metadata. These recommendations should contribute to discussions on the development of standards by researchers, funders, and publishers. Community organizations (such as CCP-BioSim and HECBioSim in the UK, BioExcel, CECAM, MolSSI, learned societies etc) have an important part to play in these developments, which are vital for the future of biomolecular simulation.

      We thank the reviewer for his/her comments. Beyond the points mentioned to Reviewers 1 and 2, as the reviewer suggested, it would be of great interest to combine innovative and immersive approaches to visualize and possibly interact with the data collected. This is increasingly feasible thanks to technologies such as WebGL and programs such as Mol*, or even, as also pointed out by the reviewer, through virtual reality, for example with the mentioned Narupa framework or with the UnityMol software. For a comprehensive overview of MD trajectory visualization and associated challenges, we refer to our recent review article https://doi.org/10.3389/fbinf.2024.1356659.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Some minor text editing would improve the readability of the manuscript.

      It would be very useful if the authors could share their perspectives on the best and most efficient approach to sharing datasets and code associated with a publication. My concern lies in the fact that Github, which is currently the dominant platform for sharing code, is not well-suited for hosting large MD datasets. As a result, researchers often need to adopt a workflow where code is shared on Github and datasets are stored elsewhere (e.g., Zenodo). While this is feasible, it adds extra work. Ideally, a transparent process could be developed to seamlessly share code and datasets linked to a study through a unified interface.

      We thank the reviewer for this excellent suggestion. To our knowledge, there is as yet no easy framework to jointly store and share code and data linked to their scientific publication. Of course, code can be submitted to “generic” databases along with the data, but at present these do not provide useful features such as collaborative work and change tracking to the extent that GitHub does.

      Although GitHub is indeed a suitable platform to deposit code, we strongly advise researchers to archive their code in Software Heritage. In addition to preserving source code, Software Heritage provides a unique identifier called SWHID that unambiguously makes reference to a specific version of the source code.
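
      To make the notion of a SWHID more concrete, the sketch below validates the core syntax of an identifier and computes the identifier of a single file's content, which uses the same hash as a Git blob. The helper names are hypothetical. A specific version of a whole code base would be referenced by a directory, revision, or release identifier, which is normally obtained from the Software Heritage archive rather than computed by hand; optional qualifiers that may follow the core identifier are not handled here.

```python
import hashlib
import re

# Core SWHID syntax: scheme, version 1, object type, 40 hex digits.
SWHID_PATTERN = re.compile(r"^swh:1:(cnt|dir|rev|rel|snp):[0-9a-f]{40}$")


def is_core_swhid(identifier):
    """Check the core syntax only; qualifiers such as ';origin=' are not handled."""
    return SWHID_PATTERN.match(identifier) is not None


def content_swhid(data):
    """SWHID of a file's content: SHA-1 over a Git blob header plus the bytes."""
    header = b"blob %d\x00" % len(data)
    return "swh:1:cnt:" + hashlib.sha1(header + data).hexdigest()


empty_file_id = content_swhid(b"")
print(empty_file_id)  # prints: swh:1:cnt:e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

      Because the identifier is derived from the content itself, two researchers holding the same file obtain the same SWHID, which is what allows a paper to reference code unambiguously.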

      So far, it is the responsibility of the scientific publication authors to link datasets and source codes (whether in GitHub or Software Heritage) in their paper, but also to make the reverse link from the data and code sharing platforms to the paper after publication.

      As mentioned by the reviewer, a unified interface that could ease this process would significantly contribute to FAIR-ness in MD.

      Reviewer #2 (Recommendations For The Authors):

      L180: I am not aware that TRR files contain energy terms as stated here, my understanding was that EDR files primarily served that purpose.

      “…available in one dataset. Interestingly, we found 1,406 .trr files, which contain trajectory but also additional information such as velocities, energy of the system, etc. While the file is especially useful in terms of reusability, the large size (can go up to several 100GB) limits its deposition in most…”

      Indeed, our formulation was ambiguous. The EDR files contain the detailed information on energies, whereas TRR files contain numerous values from the trajectory such as coordinates, velocities, forces and to some extent also energies (https://manual.gromacs.org/current/reference-manual/file-formats.html#trr).

      L207: The text states that the total time was not available from XTC files, only the number of frames. However, XTC files record time stamps in addition to frame numbers. As long as these times are in the Gromacs standard of picoseconds, the simulation time ought to be available from XTCs.

      “…systems and the number of frames available in the files (Fig. 3-B). Of note, the frames do not directly translate to the simulation runtime - more information deposited in other files (e.g. .mdp files) is needed to determine the complete runtime of the simulation. The system was up…”.

      Thank you for the useful comment, we removed this sentence. We now mention that studying the simulation time would be of interest in the future, especially when we will perform an exhaustive analysis of XTC files.

      “Of note, as .xtc files also contain time stamps, it would be interesting to study the relationship between the time and the number of frames to get useful information about the sampling. Nevertheless, this analysis would be possible only for unbiased MD simulations. So, we would need to decipher if the .xtc file is coming from biased or unbiased simulations, which may not be trivial.”
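
      The relationship between time stamps and frame counts mentioned above can be illustrated with a small sketch. It assumes the frame times (in picoseconds) have already been read from a trajectory by some other tool; the function name and the example values are hypothetical.

```python
def summarize_frame_times(times_ps, tolerance=1e-6):
    """Summarize frame time stamps given in picoseconds."""
    if len(times_ps) < 2:
        raise ValueError("need at least two frames to derive an interval")
    intervals = [later - earlier for earlier, later in zip(times_ps, times_ps[1:])]
    first = intervals[0]
    return {
        "n_frames": len(times_ps),
        "interval_ps": first,
        # True when all frames are evenly spaced in time.
        "regular": all(abs(step - first) <= tolerance for step in intervals),
        # Time covered between the first and the last frame, in nanoseconds.
        "span_ns": (times_ps[-1] - times_ps[0]) / 1000.0,
    }


# Two made-up trajectories with the same number of frames.
sparse = summarize_frame_times([0.0, 100.0, 200.0, 300.0])
dense = summarize_frame_times([0.0, 10.0, 20.0, 30.0])
```

      Both toy trajectories hold four frames, yet one covers ten times more simulated time than the other, which is why the frame count alone says little about sampling.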

      Analysis of MDP files: Were these standard equilibrium MD or can you distinguish biased MD or free energy calculations?

      Currently we do not distinguish between biased and unbiased MD, but in the future we may attempt to do so, e.g. by correlating it with standard equilibration force-fields/parameters, timesteps or similar. Nevertheless, a true distinction will remain challenging.
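
      One possible first-pass heuristic, sketched below, is to parse the .mdp parameters and check whether the GROMACS switches for pulling (`pull`), free-energy calculations (`free-energy`), or the accelerated weight histogram method (`awh`) are enabled. This is an assumption about how such a classification could start, not a method used in the study, and the helper names are hypothetical. A bias applied through an external tool would leave no trace in the .mdp file, so a negative result does not prove that a simulation is unbiased.

```python
# GROMACS .mdp switches treated here as hints of a biased or alchemical run.
BIAS_SWITCHES = ("pull", "free-energy", "awh")


def parse_mdp(text):
    """Parse 'key = value' lines; ';' starts a comment in .mdp files."""
    params = {}
    for raw_line in text.splitlines():
        line = raw_line.split(";", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        # GROMACS ignores the difference between '-' and '_' in option names;
        # keys are also lower-cased here for convenience.
        params[key.strip().lower().replace("_", "-")] = value.strip()
    return params


def enabled_bias_switches(params):
    """Return the switches set to anything other than 'no' (or left empty)."""
    return [
        name
        for name in BIAS_SWITCHES
        if params.get(name, "no").lower() not in ("", "no")
    ]


example_mdp = """
; production run (made-up example)
integrator  = md
dt          = 0.002
nsteps      = 500000   ; 1 ns
pull        = yes
free_energy = no
"""
params = parse_mdp(example_mdp)
flags = enabled_bias_switches(params)
print(flags)  # prints: ['pull']
```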

      L336: typo: pikes -> spikes (or peaks?)

      “…simulations of Lennard-Jones models (Jeon et al., 2016). Interestingly, we noticed the appearance of several pikes at 400K, 600K and 800K, which were not present before the end of the year 2022. These peaks correspond to the same study related to the stability of hydrated crystals (Dybeck et al., 2023). Overall, this analysis revealed that a wide range of temperatures have been explored,…”

      Thank you. We have corrected this typo.

      Make clear how multiple versions of data sets are handled, e.g., if v1, v2, and v3 of a dataset are provided in Zenodo then which one is counted or are all counted?

      We collected only the latest version of each dataset, as exposed by default by the Zenodo API. To reflect this, we added the following sentence to the Methods and Materials section, Initial data collection sub-section:

      “By default, the last version of the datasets was collected.”
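
      For a harvester that receives every version of a dataset rather than only the latest one, the same rule could be applied after collection. The sketch below is a hypothetical illustration: the record fields `concept_id` and `version_index` are invented for this example and are not taken from the Zenodo API.

```python
def latest_versions(records):
    """Keep one record per dataset: the one with the highest version index."""
    latest = {}
    for record in records:
        current = latest.get(record["concept_id"])
        if current is None or record["version_index"] > current["version_index"]:
            latest[record["concept_id"]] = record
    return sorted(latest.values(), key=lambda r: r["concept_id"])


# Made-up records: dataset "A" has three versions, dataset "B" has one.
records = [
    {"concept_id": "A", "version_index": 1, "n_files": 10},
    {"concept_id": "A", "version_index": 3, "n_files": 14},
    {"concept_id": "B", "version_index": 1, "n_files": 2},
    {"concept_id": "A", "version_index": 2, "n_files": 12},
]
kept = latest_versions(records)
```

      Counting only the retained records avoids counting the files of a multi-version dataset several times.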

      L248 Analysis of GRO files seems fairly narrow because PDB files are very often used for exactly the same purpose, even in the context of Gromacs simulations, not the least because it is familiar to structural biologists that may be interested in representative MD snapshots. Despite all the shortcomings of abusing the PDB format for MD, it is an attempt at increased interoperability. Perhaps the authors can make sure that readers understand that choosing GRO for analysis may give a somewhat skewed picture, even within Gromacs simulations.

      Thanks for this comment. We collected about 12,000 PDB files that could indeed be output from Gromacs simulations and easily be shared due to the universality of this format, but that could as well come from different sources (like other MD packages or the PDB database itself). We purposely decided to limit our study to files strictly associated with the Gromacs package, like MDP and XTC file types. However, we will extend our survey to all other structure-like formats and especially the PDB file type. We reflected this purpose in the following sentence (after line 281)

      “Beyond .gro files, we would like to analyze the ensemble of the ~12,000 .pdb files extracted in this study (see Figure 2-B) to better characterize the types of molecular structures deposited.”

      A simple template metadata file would be welcome (e.g., served from a GitHub/GitLab repository so that it can be improved with community input).

      Thank you for this suggestion, which we fundamentally agree with. However, generating such a file is a major task, and we believe that creating a metadata file template requires far-reaching considerations; it is therefore beyond the scope of this paper and should not be decided by a small group of researchers. Indeed, this topic requires a large consensus of different stakeholders, from users, to MD program developers, and journal editors. It would be especially useful to organize dedicated workshops with representatives of all these communities to tackle this specific issue, as mentioned by Reviewer 3 in their public review. As a basis for this discussion, we humbly proposed at the end of this manuscript a few non-constraining guidelines based on our experience retrieving the data.

      To emphasize this statement, we added the following sentence at the end of the “Guidelines for better sharing of MD simulation data” section (line 420):

      “Converging on a set of metadata and format requires a large consensus of different stakeholders from users, to MD program developers, and journal editors. It would be especially useful to organize specific workshops with representatives of all these communities to collectively tackle this specific issue.”

      In "Data and code availability" it would be good to specify licenses in addition to stating "open source". Thank you for pointing out that GitLab/GitHub are not archives and that everyone should be strongly encouraged to submit data to stable archival repositories.

      We added the corresponding licenses for code and data in the “Data and code availability” section.

      Reviewer #3 (Recommendations For The Authors)

      The paper is well written, with very few typographical or other minor errors.

      Minor points:

      Line 468-9 "can evolve being more user-friendly" should be "can evolve to being more user-friendly", I think.

      Thank you, we have changed the wording accordingly.

    1. eLife assessment

      The authors propose that the asymmetric segregation of the NuRD complex in C. elegans is regulated in a V-ATPase-dependent manner, that this plays a crucial role in determining the differential expression of the apoptosis activator egl-1, and that it is therefore critical for the life/death fate decision in this species. If proven, the proposed model of the V-ATPase-NuRD-EGL-1-Apoptosis cascade would shed light onto the mechanisms underlying the regulation of apoptosis fate during asymmetric cell division, and stimulate further investigation into the intricate interplay between V-ATPase, NuRD, and epigenetic modifications. However, the strength of evidence for this is currently incomplete.

    1. eLife assessment

      The manuscript describes a careful, quantitative analysis of Myosin 10 molecules in U2OS cells, a widely used model for studying filopodia, and how many are present in the cytosol versus filopodia. This important study provides key parameters that are required for building a biophysical model of filopodia which is required to gain a complete understanding of these major actin-based structures. The evidence for the conclusions is compelling, but there are also certain areas of the manuscript that require clarification.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript proposes SDS-PAGE calibration of Halo-Myo10 signals as an alternative method to quantify myosin molecules at specific subcellular locations, in this specific case filopodia, in epifluorescence datasets, in place of the more laborious and troublesome single molecule approaches. Based on these preliminary estimates, the authors further developed their analysis and discussed different scenarios regarding myosin 10 working models to explain intracellular diffusion and targeting to filopodia.

      Strengths:

      I confirm my previous assessment. Overall, the paper is elegantly written and the data analysis is appropriately presented. Moreover, the novel experimental approach offers advantages to labs with limited access to high-end microscopy setups (super-resolution and/or EM in particular), and the authors proved its applicability to both fixed and live samples.

      Weaknesses:

      The other two reviewers and I pointed to the same weakness, the use of protein overexpression in U2OS. The authors claim that Myosin10 is not expressed by U2OS, based on Western blot analysis. Does this completely rule out the possibility that what they observed (the polarity of filopodia and the bulge accumulation of Myo10) could be an artefact of overexpression? I am afraid this still remains the main weakness of the paper, despite being properly acknowledged in the Limitations.

      I consider all the remaining issues I expressed during the first revision solved.

    3. Reviewer #2 (Public Review):

      Summary:

      The paper sought to determine the number of myosin 10 molecules per cell and localized to filopodia, where they are known to be involved in formation, transport within, and dynamics of these important actin-based protrusions. The authors used a novel method to determine the number of molecules per cell. First, they expressed HALO tagged Myo10 in U2OS cells and generated cell lysates of a certain number of cells and detected Myo10 after SDS-PAGE, with fluorescence and a stain-free method. They used a purified HALO tagged standard protein to generate a standard curve which allowed for determining Myo10 concentration in cell lysates and thus an estimate of the number of Myo10 molecules per cell. They also examined the fluorescence intensity in fixed cell images to determine the average fluorescence intensity per Myo10 molecule, which allowed the number of Myo10 molecules per region of the cell to be determined. They found a relatively small fraction of Myo10 (6%) localizes to filopodia. There are hundreds of Myo10 in each filopodium, which suggests some filopodia have more Myo10 than actin binding sites. Thus, there may be crowding of Myo10 at the tips, which could impact transport, the morphology at the tips, and dynamics of the protrusions themselves. Overall, the study forms the basis for a novel technique to estimate the number of molecules per cell and their localization to actin-based structures. The implications are broad also for being able to understand the role of myosins in actin protrusions, which is important for cancer metastasis and wound healing.

      Strengths:

      The paper addresses an important fundamental biological question about how many molecular motors are localized to a specific cellular compartment and how that may relate to other aspects of the compartment such as the actin cytoskeleton and the membrane. The paper demonstrates a method of estimating the number of myosin molecules per cell using the fluorescently labeled HALO tag and SDS-PAGE analysis. There are several important conclusions from this work in that it estimates the number of Myo10 molecules localized to different regions of the filopodia and the minimum number required for filopodia formation. The authors also establish a correlation between the number of filopodia-localized Myo10 molecules and the number of filopodia in the cell. Only a small percentage of Myo10 is tip-localized relative to the total amount in the cell, suggesting Myo10 has to be activated to enter the filopodia compartment. The localization of Myo10 is log-normal, which suggests that clustering of Myo10 is a feature of this motor.

      One of the main critiques of the manuscript was that the results were derived from experiments with overexpressed Myo10 and therefore are hard to extrapolate to physiological conditions. The authors counter this critique with the argument that their results provide insight into a system in which Myo10 is a limiting factor for controlling filopodia formation. They demonstrate that U2OS cells do not express detectable levels of Myo10 (supplementary Figure 1E) and thus introducing Myo10 expression demonstrates how triggering Myo10 expression impacts filopodia. An example is given of how melanoma cells often heavily upregulate Myo10.

      In addition, the revised manuscript addresses the concerns about the method to quantitate the number of Myo10 molecules per cell and therefore puncta in the cell. The authors have now made a good faith effort to correct for incomplete labeling of the HALO tag (Figure 2A-C, supplementary Figure 2D-E). The authors also address the concerns about variability in transfection efficiency (Figure 1D-E).

      A very interesting addition to the revised manuscript was the quantitation of the number of Myo10 molecules present during an initiation event when a newly formed filopodium just starts to elongate from the plasma membrane. They conclude that 100s of Myo10 molecules are present during an initiation event. They also examined other live cell imaging events in which growth occurs from a stable filopodium tip and correlated with elongation rates.

      Weaknesses:

      The authors acknowledge that a limitation of the study is that all of the experiments were performed with overexpressed Myo10. They address this limitation in the discussion but also provide important comparisons for how their work relates to physiological conditions, such as melanoma cells that only express large amounts of Myo10 when they are metastatic. Also, the speculation about how fascin can outcompete Myo10 should include a mechanism for how the physiological levels of fascin can compete with the overabundance of Myo10 (page 10, lines 401-408).

    4. Reviewer #3 (Public Review):

      Summary

      The work represents progress in quantifying the number of Myo10 molecules present in the filopodia tip. It reveals that, in cells overexpressing fluorescently labeled Myo10, the tip can accommodate a wide range of Myo10 motors, up to hundreds of molecules per tip.

      The revised, expanded manuscript addresses all of this reviewer's original comments. The new data, analysis and writing strengthen the paper. Given the importance of filopodia in many cellular/developmental processes and the pivotal, as yet not fully understood role of Myo10 in their formation and extension, this work provides a new look at the nature of the filopodial tip and its ability to accommodate a large number of Myo10 motor proteins through interactions with the actin core and surrounding membrane.

      Specific comments -

      (1) One of the comments on the original work was that the analysis here is done using cells ectopically expressing HaloTag-Myo10. The authors' response is that cells express a range of Myo10 levels and some metastatic cancer cells, such as breast cancer, have significantly increased levels of Myo10 compared to non-transformed cell lines. It is not really clear how much excess Myo10 is present in those cells compared to what is seen here for ectopic expression in U2OS cells, making a direct correspondence difficult.

      In response to comments about the bulbous nature of many filopodia tips the authors point out that similar-looking tips are seen when cells are immunostained for Myo10, citing Berg & Cheney (2002). In looking at those images as well as images from papers examining Myo10 immunostaining in metastatic cancer cells (Arjonen et al, 2014, JCI; Summerbell et al, 2020, Sci Adv) the majority of the filopodia tips appear almost uniformly dot-like or circular. There is not too much evidence of the elongated, bulbous filopodial tips seen here.

      However, in reconsidering the approach and results, it is the case that the findings here do establish the plasticity of filopodia tips that can accommodate a surprisingly (shockingly) large number of motors. The authors discuss that their results show that targeting molecules to the filopodia tip is a relatively permissive process (lines 262 - 274). That could be an important property that cells might be able to use to their advantage in certain contexts.

      (2) The method for arriving at the intensity of an individual filopodium puncta (starting on line 532 and provided in the Response), and how this is corrected for transfection efficiency and the cell-to-cell variation in expression level is still not clear to this reviewer. The first part of the description makes sense - the authors obtain total molecules/cell based on the estimation on SDS-PAGE using the signal from bound Halo ligand. It then seems that the total fluorescence intensity of each expressing cell analyzed is measured, then summed to get the average intensity/cell. The 'total pool' is then arrived at by multiplying the number of molecules/cell (from SDS-PAGE) by the total number of cells analyzed. After that, then: 'to get the number of molecules within a Myo10 filopodium, the filopodium intensity was divided by the bioreplicate signal intensity and multiplied by 'total pool.' ' The meaning of this may seem simple or straightforward to the authors, but it's a bit confusing to understand what the 'bioreplicate signal intensity' is and then why it would be multiplied by the 'total pool'. This part is rather puzzling at first read.

      Since the approach described here leads the authors to their numerical estimates every effort should be made to have it be readily understood by all readers. A flow chart or diagram might be helpful.
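
      As a possible aid to reading the steps paraphrased in comment (2), the arithmetic can be sketched as follows. This is only one interpretation of the reviewer's description, not the authors' confirmed procedure: it assumes that the 'bioreplicate signal intensity' is the summed fluorescence of all analyzed cells and that the 'total pool' is the SDS-PAGE estimate of molecules per cell multiplied by the number of analyzed cells. The function name and all numbers are hypothetical.

```python
def molecules_in_region(region_intensity, cell_intensities, molecules_per_cell):
    """Convert a region's fluorescence into a molecule count.

    Assumed reading of the described method (not confirmed by the source):
      bioreplicate_signal = summed intensity of every analyzed cell
      total_pool          = molecules per cell (SDS-PAGE) * number of analyzed cells
      molecules           = region_intensity / bioreplicate_signal * total_pool
    """
    bioreplicate_signal = sum(cell_intensities)
    total_pool = molecules_per_cell * len(cell_intensities)
    return region_intensity / bioreplicate_signal * total_pool


# Hypothetical values, chosen only to make the arithmetic easy to follow.
cell_intensities = [2.0e6, 4.0e6, 6.0e6]   # integrated intensity per cell (a.u.)
molecules_per_cell = 1.0e6                 # mean molecules per cell from the gel
punctum_intensity = 1200.0                 # integrated intensity of one punctum (a.u.)

punctum_molecules = molecules_in_region(
    punctum_intensity, cell_intensities, molecules_per_cell
)
print(round(punctum_molecules))
```

      Under this reading, the ratio of 'total pool' to 'bioreplicate signal intensity' is simply the number of molecules per unit of fluorescence, which would explain why the two quantities are combined.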

      (3) The distribution of Myo10 punctae around the cell are analyzed (Fig 2E, F) and the authors state that they detect 'periodic stretches of higher Myo10 density along the plasma membrane' (line 123) and also that there is correlation and anti-correlation of molecules and punctae at opposite ends of the cells.

      In the first case, it is hard to know what the authors really mean by the phrase 'periodic stretches'. It's not easy to see a periodicity in the distribution of the punctae in the many cells shown in Supp Fig 3. Also, the correlation/anti-correlation is not so easily seen in the quantification shown in Fig 2F. Can the authors provide some support or clarification for what they are stating?

      (4) The authors are no doubt aware that a paper from the Tyska lab, which employs a completely different method of counting molecules and arrives at a much lower number of Myo10 molecules at the filopodial tip than is reported here, was just posted (Fitz & Tyska, 2024, bioRxiv, DOI: 10.1101/2024.05.14.593924).

      While it is not absolutely necessary for the authors to provide a detailed discussion of this new work given the timing, they may wish to consider adding a note briefly addressing it.

    5. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This valuable study reports on the packing of molecules in cellular compartments, such as actin-based protrusions. The study provides solid evidence for parameters that enable the building of a biophysical model of filopodia, which is required to gain a complete understanding of these important actin-based structures. Some areas of the manuscript require further clarification.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript proposes SDS-PAGE calibration of Halo-Myo10 signals as an alternative method to quantify myosin molecules at specific subcellular locations, in this specific case filopodia, in epifluorescence datasets, in place of the more laborious and troublesome single molecule approaches. Based on these preliminary estimates, the authors further developed their analysis and discussed different scenarios regarding myosin 10 working models to explain intracellular diffusion and targeting to filopodia.

      Strengths:

      Overall, the paper is elegantly written and the data analysis is appropriately presented.

      Weaknesses:

      While the methodology is intriguing in its descriptive potential and could be the beginning of an interesting story, a good portion of the paper is dedicated to the discussion of hypothetical working mechanisms to explain myosin diffusion, localization, and decoration of filopodial actin that is not accompanied by the mandatory gain/loss of function studies required to sustain these claims.

      To be fair, the detailed mechanisms that we raise related to diffusion, localization, and decoration are based on extensive work by others. Many prior papers use domain deletions of Myo10 and fall in the category of gain/loss-of-function studies. It is true that we have not repeated those extensive studies, but it seems appropriate to connect with and cite their work where appropriate.

      Reviewer #2 (Public Review):

      Summary:

      The paper sought to determine the number of myosin 10 molecules per cell and localized to filopodia, where they are known to be involved in formation, transport within, and dynamics of these important actin-based protrusions. The authors used a novel method to determine the number of molecules per cell. First, they expressed HALO tagged Myo10 in U2OS cells and generated cell lysates of a certain number of cells and detected Myo10 after SDS-PAGE, with fluorescence and a stain-free method. They used a purified HALO tagged standard protein to generate a standard curve which allowed for determining Myo10 concentration in cell lysates and thus an estimate of the number of Myo10 molecules per cell. They also examined the fluorescence intensity in fixed cell images to determine the average fluorescence intensity per Myo10 molecule, which allowed the number of Myo10 molecules per region of the cell to be determined. They found a relatively small fraction of Myo10 (6%) localizes to filopodia. There are hundreds of Myo10 in each filopodium, which suggests some filopodia have more Myo10 than actin binding sites. Thus, there may be crowding of Myo10 at the tips, which could impact transport, the morphology at the tips, and dynamics of the protrusions themselves. Overall, the study forms the basis for a novel technique to estimate the number of molecules per cell and their localization to actin-based structures. The implications are broad also for being able to understand the role of myosins in actin protrusions, which is important for cancer metastasis and wound healing.

      Strengths:

      The paper addresses an important fundamental biological question about how many molecular motors are localized to a specific cellular compartment and how that may relate to other aspects of the compartment such as the actin cytoskeleton and the membrane. The paper demonstrates a method of estimating the number of myosin molecules per cell using the fluorescently labeled HALO tag and SDS-PAGE analysis. There are several important conclusions from this work in that it estimates the number of Myo10 molecules localized to different regions of the filopodia and the minimum number required for filopodia formation. The authors also establish a correlation between the number of filopodia-localized Myo10 molecules and the number of filopodia in the cell. Only a small percentage of Myo10 is tip-localized relative to the total amount in the cell, suggesting Myo10 has to be activated to enter the filopodia compartment. The localization of Myo10 is log-normal, which suggests that clustering of Myo10 is a feature of this motor.

      Weaknesses:

      One main critique of this work is that the Myo10 was overexpressed. Thus, the amount in the cell body compared to the filopodia is difficult to compare to physiological conditions. The amount in the filopodia was relatively small - 100s of molecules per filopodia so this result is still interesting regardless of the overexpression. However, the overexpression should be addressed in the limitations.

      This is a reasonable perspective and we now note this caveat in the Limitations section so that readers will take note. Our goal here was to understand a system in which Myo10 is the limiting reagent for filopodia, rather than a native system that expresses high Myo10 on its own. Because U2OS cells do not express detectable levels of Myo10 (see below), the natural perturbation here is overexpressing Myo10 to stimulate filopodial growth.

      The authors have not addressed the potential for variability in transfection efficiency. The authors could examine the average fluorescence intensity per cell and if similar this may address this concern.

      Indeed, cells are heterogeneous and will naturally express different levels of Myo10 not only due to transfection efficiency, but also due to their state (cell cycle stage, motile behavior, and more). In fact, we measure the transfection efficiency of each bioreplicate and account for it in our calibration procedure. We also measure the fluorescence intensity per cell, which lets us calculate the total Myo10s per cell and the cell-to-cell variability. These Myo10 distributions across cells are shown in Fig. 1D-E.

      We note here an error that we made in applying this transfection efficiency correction in the first submission. When we obtain the total Myo10 molecules by SDS-PAGE, we should divide by the total number of transfected cells. However, due to an operator precedence error, the transfection efficiency appeared in the numerator rather than the denominator. We have now corrected this error, which has the effect of increasing the number of molecules in all of our measurements. The effect of this correction has strengthened one of the paper’s main conclusions, that Myo10 is frequently overloaded at filopodial tips.
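
      For readers unfamiliar with this class of bug, the following minimal sketch shows one way an operator precedence slip can move the transfection efficiency from the denominator to the numerator. It is an illustration only: the exact expression in the authors' analysis code is not given in the response, and the function names and numbers are hypothetical.

```python
def molecules_per_transfected_cell(total_molecules, n_cells, transfection_efficiency):
    """Divide the lysate total by the number of cells that express the construct."""
    n_transfected = n_cells * transfection_efficiency
    return total_molecules / n_transfected


def molecules_per_cell_buggy(total_molecules, n_cells, transfection_efficiency):
    # Division and multiplication share precedence and associate left to right,
    # so this evaluates as (total_molecules / n_cells) * transfection_efficiency:
    # the efficiency ends up in the numerator.
    return total_molecules / n_cells * transfection_efficiency


# Hypothetical inputs, not taken from the manuscript.
total = 1.0e11        # molecules in the lysate, from the SDS-PAGE standard curve
cells = 1.0e5         # cells that went into the lysate
efficiency = 0.25     # fraction of cells that were transfected

correct = molecules_per_transfected_cell(total, cells, efficiency)
buggy = molecules_per_cell_buggy(total, cells, efficiency)
print(correct, buggy, correct / buggy)
```

      Because the efficiency is at most 1, the corrected value in this sketch is always at least as large as the buggy one (by a factor of 1/efficiency squared), which is consistent with the statement that the correction increased the number of molecules.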

      The SDS PAGE method of estimating the number of molecules is quite interesting. I really like this idea. However, I feel there are a few more things to consider. The fraction of HALO tag standard and Myo10 labeled with the HALO tagged ligand is not determined directly. It is suggested that since excess HALO tagged ligand was added we can assume nearly 100% labeling. If the HALO tag standard protein is purified it should be feasible to determine the fraction of HALO tagged standard that is labeled by examining the absorbance of the protein at 280 and fluorophore at its appropriate wavelength.

      This is a fair point raised by the reviewer, and we have now measured a labeling efficiency of 90% in Supplementary Figure 2A-C. We have adjusted all values according to this labeling efficiency.
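
      The absorbance-based check suggested by the reviewer is commonly computed as a degree of labeling. The sketch below shows that standard calculation under stated assumptions: a 1 cm path length, a dye that contributes a fixed fraction of its peak absorbance at 280 nm, and placeholder extinction coefficients and absorbance readings. None of the numbers are taken from the manuscript, and the function name is hypothetical.

```python
def degree_of_labeling(a280, a_dye, eps_protein, eps_dye, dye_correction_280):
    """Moles of dye per mole of protein from two absorbance readings.

    Assumes a 1 cm path length. dye_correction_280 is the fraction of the
    dye's peak absorbance that it also contributes at 280 nm.
    """
    protein_molar = (a280 - dye_correction_280 * a_dye) / eps_protein
    dye_molar = a_dye / eps_dye
    return dye_molar / protein_molar


# Placeholder values for illustration only.
labeling = degree_of_labeling(
    a280=0.37,                # absorbance at 280 nm
    a_dye=0.45,               # absorbance at the dye's peak wavelength
    eps_protein=50_000.0,     # M^-1 cm^-1, hypothetical protein extinction coefficient
    eps_dye=90_000.0,         # M^-1 cm^-1, hypothetical dye extinction coefficient
    dye_correction_280=0.2,   # hypothetical correction factor
)
print(f"degree of labeling: {labeling:.2f}")
```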

      The fraction of HALO tagged Myo10 labeled may be more challenging to determine, since it is in a cell lysate, but there may be some potential approaches (e.g. mass spec, HPLC).

      As noted, this value is considerably more challenging. Instead, we determined conditions under which labeling in cells is saturated. We have now stained with a concentration range for both fixed and live cell samples. Saturation occurs with ~0.5 μM HaloTag ligand-TMR in fixed/permeabilized cells and in live cells (Supplementary Figure 2D-E). This comparison of live cells vs. permeabilized cells allows us to say that the intact plasma membrane is not limiting labeling under these conditions.

      In Figure 1B, the stain free gel bands look relatively clean. The Myo10 is from cell lysates so it is surprising that there are not more bands. I am not surprised that the bands in the TMR fluorescence gel are clean, and I agree the fluorescence is the best way to quantitate.

      Figure 1B shows the focused view at high MW, and there is not much above Myo10. The full gel lanes shown in Supp. Fig. 1C show the expected number of bands from a cell lysate.

      In Figure 3C, the number of Myo10 molecules needed to initiate a filopodium was estimated. I wonder if the authors could have looked at live cell movies to determine that these events started with a puncta of Myo10 at the edge of the cell, and then went on to form a filopodia that elongated from the cell. How was the number of Myo10 molecules that were involved in the initiation determined? Please clarify the assumptions in making this conclusion.

      We thank the reviewer (and the other reviewers) for this excellent suggestion. We have now carried out these live cell experiments. These experiments were quite challenging, because we needed to collect snapshots of ~50 cells to measure the mean fluorescence intensity of transfected cells and then acquire movies of several cells for analysis. The U2OS cells were also highly temperature-sensitive and would retract their filopodia without objective heating.

      We have now analyzed filopodial initiation events and measured considerably more Myo10 at the first signs of accumulation, in the hundreds of molecules. The dimmer spots that we measured in the first draft were likely unrelated to filopodial initiation, and we have corrected the discussion on this point.

      We now also track further growth from a stable filopodial tip (the phased-elongation mechanism from Ikebe and coworkers) and find approximately 500 molecules bud off in those events. We also track filopodial elongation rates as a function of Myo10 numbers. We have added additional live cell imaging sections that include these results.

      It is stated in the discussion that the amount of Myo10 in the filopodia exceeds the number of actin binding sites. However, since Myo10 contains membrane binding motifs and has been shown to interact with the membrane, it should be pointed out that the excess Myo10 at the tips may be interacting with the membrane and not actin, which may prevent traffic jams.

      This is also an excellent point to consider, and we have expanded the relevant discussion along these lines. We agree that the Myo10 at the filopodial tip is likely membrane-bound. We now estimate the 2D membrane area occupied by Myo10, and find that it reaches nearly full packing in many cases (under a number of assumptions that we spell out more fully in the manuscript).

      Reviewer #3 (Public Review):

      Summary:

      The unconventional myosin Myo10 (aka myosin X) is essential for filopodia formation in a number of mammalian cells. There is a good deal of interest in its role in filopodia formation and function. The manuscript describes a careful, quantitative analysis of Myo10 molecules in U2OS cells, a widely used model for studying filopodia, how many are present in the cytosol versus filopodia and the distribution of filopodia and molecules along the cell edge. Rigorous quantification of Myo10 protein amounts in a cell and cellular compartment is critical for ultimately deciphering the cellular mechanism of Myo10 action as well as understanding the molecular composition of a Myo10-generated filopodium.

      Consistent with what is seen in images of Myo10 localization in many papers, the vast majority of Myo10 is in the cell body with only a small percentage (approx. 5%) present in filopodia puncta. Interestingly, Myo10 is not uniformly distributed along the cell edge, but rather it is unevenly localized along the cell edge with one region preferentially extending filopodia, presumably via localized activation of Myo10 motors. Calculation of total molecules present in puncta based on measurement of puncta size and measured Halo-Myo10 signal intensity shows that the concentration of motor present can vary from 3 to 225 μM. Based on an estimation of available actin binding sites, it is possible that Myo10 can be present in excess over these binding sites.
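
      The conversion from a molecule count in a punctum to a molar concentration can be sketched as below. This is a generic unit conversion, not the authors' code: the punctum volume and molecule count are hypothetical inputs, and how the volume is derived from the measured puncta size is left as an assumption.

```python
AVOGADRO = 6.02214076e23          # molecules per mole
LITERS_PER_CUBIC_MICRON = 1e-15   # 1 cubic micrometer equals 1 femtoliter


def micromolar_concentration(n_molecules, volume_cubic_microns):
    """Concentration in micromolar for n_molecules confined to the given volume."""
    volume_liters = volume_cubic_microns * LITERS_PER_CUBIC_MICRON
    molar = n_molecules / (AVOGADRO * volume_liters)
    return molar * 1e6


# Hypothetical punctum: 500 molecules in 0.01 cubic micrometers.
concentration = micromolar_concentration(500, 0.01)
print(f"{concentration:.1f} uM")
```

      With these placeholder inputs the result falls inside the range quoted in the review, which illustrates how a few hundred molecules in a sub-femtoliter volume reach tens of micromolar.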

      Strengths:

      The work represents an important first step towards defining the molecular stoichiometry of filopodial tip proteins. The observed range of Myo10 molecules at the tip suggests that it can accommodate a fairly wide range of Myo10 motors. There is great value in studies such as this and the approach taken by the authors gives one good confidence that the numbers obtained are in the right range.

      Weaknesses:

      One caveat (see below) is that these numbers are obtained for overexpressing cells and the relevance to native levels of Myo10 in a cell is unclear.

      A similar concern was raised by Reviewer 2; please see above.

      An interesting aspect of the work is quantification of the fraction of Myo10 molecules in the cytosol versus in filopodia tips showing that the vast majority of motors are inactive in the cytosol, as is seen in images of cells. This has implications for thinking about how cells maintain this large population in the off-state and what is the mechanism of motor activation. One question raised by this work is the distinction between cytosolic Myo10 and the population found at the ‘cell edge’ and the filopodia tip. The cortical population of Myo10 is partially activated, so to speak, as it is targeted to the cortex/membrane and presumably ready to go. Providing quantification of this population of motors, that one might think of as being in a waiting room, could provide additional insight into a potential step-by-step pathway where recruitment or binding to the cortical region/plasma membrane is not by itself sufficient for activation.

      As mentioned in our response to Reviewer 2, we have now carried out quantitation in live cells to capture Myo10 transitions from cell body into filopodial movement. We attempted to identify this membrane-bound population of motors in our new live cell experiments but were unable to make convincing measurements. Notably, we see no noticeable enrichment of Myo10 at the cortex relative to the cytosol. Although we believe there is a membrane-bound waiting room (akin to the 3D-2D-1D mechanism of Molloy and Peckham), we suspect that the 2D population is diffusing too rapidly to be detected under our imaging conditions.

      Specific comments:

      (1) It is not obvious whether the analysis of numbers of Myo10 molecules in a cell that is ectopically overexpressing Myo10 is relevant for wild type cells. It would appear to be a significant excess based on the total protein stained blot shown in Fig S1E where a prominent band the size of tagged Myo10 seen in the transfected sample is almost absent in the WT control lane.

      Even “wildtype” cells vary considerably in their Myo10 expression levels. For example, melanoma cells often heavily upregulate Myo10, while these U2OS cells produce nearly none (Supplementary Figure 1E). Thus, there is no single, widely acceptable target for Myo10 expression in wildtype cells.

      Please note that the new Supplementary Figure 1E is a Myo10 Western blot, not total protein staining as before.

      Ideally, and ultimately an important approach, would be to work with a cell line expressing endogenously tagged Myo10 via genome engineering. This can be complicated in transformed cells that often have chromosomal duplications.

      Indeed, we chose U2OS cells for this work because they do not express detectable levels of Myo10, and thus we can avoid all of these complications. Here we can examine how Myo10 levels control filopodial production through ectopic expression.

      However, even though there is an excess of Myo10 it would appear that activation is still under some type of control as the cytosolic pool is quite large and its localization to the cell edge is not uniform. But it is difficult to gauge whether the number of molecules in the filopodium is the same as would be seen in untransfected cells. Myo10 can readily walk up a filopodium and if excess numbers of this motor are activated they would accumulate in the tip in large numbers, possibly creating a bulge, and indeed it does appear that some tips are unusually large. Then how would that relate to the normal condition?

      As noted above, the normal condition depends on the cellular system. However, endogenous Myo10 also accumulates in bulges at filopodial tips, so this is not a phenotype unique to Myo10 overexpression. For example, the images from Figure 1 of the Berg and Cheney (2002) citation show bulges from endogenous Myo10 in endothelial cells.

      (2) Measurements of the localization of Myo10 focus in large part on ‘Myo10 punctae’. While it seems reasonable to presume that these are filopodia tips, the authors should provide readers with a clear definition of a punctum. Is it only filopodia tips, which seems to be the case? Does it include initiation sites at the cell membrane that often appear as punctae?

      We define puncta as any clusters/spots of Myo10 signal detected by segmentation, not limited to any location within the surface-attached filopodia. We exclude puncta that appear in the cell interior (~5 of which appear in Fig. 1A). These are likely dorsal filopodia, but there are few of these compared to the surface attached filopodia of U2OS cells. In Figure 2, “puncta” includes all Myo10 clusters along the filopodia shaft, though a majority happen to be tip-localized (please see Supplementary Figure 4B). We have edited the main text for clarification.

      Along those lines, the position of dim punctae along the length of a filopodium is measured (Fig 3D). The findings suggest that a given filopodium can have more than one punctum, which seems at odds with a punctum being a filopodium tip. How frequently is a filopodium with two puncta seen? It would be helpful if the authors provided an example image showing the dim puncta that are not present at the tip.

      We have now provided an example image of dim puncta along filopodia in Supplementary Figure 4C.

      (3) The concentration of actin available to Myo10 is calculated based on the deduction from Nagy et al (2010) that only 4/13 of the actin monomers in a helical turn are accessible to the Myo10 motor (discussion on pg 9; Fig S4). Subsequent work (Ropars et al, 2016) has shown that the heads of the antiparallel Myo10 dimer are flattened, but the neck is rather flexible, meaning that the motor can have a variable reach (36 - 52 nm). Wouldn’t this mean that more actin could be accessible to the Myo10 motor than is calculated here?

      Although we see why the reviewer might believe otherwise, the 4/13 fraction of accessible actin holds. This fraction is obtained from consideration of the fascin-actin bundle structure alone, independent of the reach of any particular myosin motor. Every repeating layer of 13 actin subunits (or 36 nm) has 4 accessible myosin binding-sites. The remaining 9 sites are rejected because a single myosin motor domain will have a steric clash with a neighboring actin filament in the bundle. A myosin with an exceptionally long reach might reach the next 13 subunit layer, but that layer also has only 4 binding sites. Thus, we can calculate the number of binding sites per unit length along the filopodium. This number would hold for a dimeric myosin with any reach, including myosin-5 or myosin-2.
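      The arithmetic behind this argument can be written out as a minimal sketch. It takes only the two numbers stated above (4 accessible sites per 13-subunit, 36 nm repeat) and scales them by length. The function name and the `n_filaments` parameter are hypothetical, and treating every filament in the bundle as equally accessible is an assumption made purely for illustration, not a claim about the bundle geometry used in the manuscript.

```python
# Illustrative sketch only: names and the n_filaments parameter are hypothetical.
ACCESSIBLE_SITES_PER_REPEAT = 4   # accessible myosin-binding sites per repeat (from the text)
SUBUNITS_PER_REPEAT = 13          # actin subunits per repeat (from the text)
REPEAT_LENGTH_NM = 36.0           # length of one repeat in nm (from the text)


def accessible_fraction():
    """Fraction of actin subunits that a motor domain can bind."""
    return ACCESSIBLE_SITES_PER_REPEAT / SUBUNITS_PER_REPEAT


def accessible_sites(filament_length_nm, n_filaments=1):
    """Accessible binding sites along a given length of filament.

    The count depends only on length and the per-repeat site number,
    not on the reach of the myosin, which is the point of the argument.
    Assumes (for illustration) that each filament contributes equally.
    """
    repeats = filament_length_nm / REPEAT_LENGTH_NM
    return ACCESSIBLE_SITES_PER_REPEAT * repeats * n_filaments
```

      A motor with a longer reach changes which repeat it lands on, but not the number of sites per repeat, so the value returned for a given length is unchanged.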

      (4) Quantification of numbers of Myo10 molecules in filopodial puncta (Fig 3C) leads the authors to conclude that ‘only ten or fewer Myo10 molecules are necessary for filopodia initiation’ (pg 7, top). While this is reasonable based on the assumption that the formation of a puncta ultimately results from an initiation event, little is known about initiation events and without direct observation of coalescence of Myo10 at the cell edge that leads to formation of a filopodium, this seems rather speculative.

      As noted above, we have now performed the necessary live cell imaging of filopodial nucleation events and have updated our conclusions accordingly.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I have made a series of comments that might help the authors improve their manuscript:

      - A full calibration of the methodology would require testing a wider range of protein amounts, to exhaustively detect the dynamic range of the technique. The authors acknowledge in the discussion that “Furthermore, our estimates of molecules are predicated on the calibration curve of the Halo Standard Protein on the SDS-PAGE gels, which is likely the highest source of error on our molecule counts”. A good way of convincing a nasty reviewer is to provide a calibration with more than 3 reference points. At least this will help exclude from the analysis cells where Myo10 estimates are not in the linear regime of detection.

      We completely agree with the reviewer’s suggestion to build a robust calibration curve. The SDS gel shown in Figure 1C originally contained 4 reference points, but the highest HaloTag standard protein point oversaturated the detector at the set exposure in the TMR channel and was omitted. We have now re-run the SDS gel to include a HaloTag standard protein curve comprising 5 points, alongside all three bioreplicates from the fixed cell experiments and all three bioreplicates from the live cell experiments (updated in Figure 1B-C). We had saved frozen lysates from the original fixed cell work, so we were able to reanalyze our data with the new set of standards. The Myo10 quantities are consistent, but with much tighter CIs from the standard curve.
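      For readers unfamiliar with standard-curve quantitation, the following is a minimal sketch of the general idea, assuming an ordinary least-squares line through the standard points. It is not the authors' analysis code; the function names and any numbers passed to them are hypothetical. The range check reflects the reviewer's point about excluding measurements that fall outside the calibrated regime.

```python
# Illustrative sketch: ordinary least-squares standard curve (hypothetical names).

def fit_line(amounts, signals):
    """Fit signal = slope * amount + intercept by least squares."""
    n = len(amounts)
    mean_x = sum(amounts) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_signal(signal, slope, intercept):
    """Invert the standard curve to estimate the amount in a sample band."""
    return (signal - intercept) / slope


def within_calibrated_range(signal, standard_signals):
    """True if a sample signal lies between the lowest and highest standards."""
    return min(standard_signals) <= signal <= max(standard_signals)
```

      A saturated standard, like the omitted fourth point in the original gel, would be dropped before fitting, and sample bands outside the calibrated range would be flagged rather than extrapolated.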

      - As already said, this methodology is intriguing; however, a correlative validation with a conventional SMLM approach to establish the bona fides of the method would be ideal.

      Unfortunately, single molecule approaches for validation are impractical for us. Due to the relatively high magnification of our TIRF microscope and the large spread area of the U2OS cells, single cells typically extend beyond the field of view. We acknowledge the benefits of SMLM quantitative techniques and other approaches cited in the introduction section. To avoid use of special tools/instruments, we offer our methodology, based on the Pollard group’s quantitative Western blotting of GFP, as a simpler alternative accessible to anyone.

      - TMR is a small ligand that likely also interacts with Halo in its denatured state. However, to clear any doubts, a parallel Native-PAGE investigation should be included or, if one exists, a specific reference should be provided.

      Perhaps there is a misunderstanding here. One of the key advantages of the HaloTag labeling system is that the engineered dehalogenase is covalently modified by the ligand (the TMR-ligand is a suicide substrate). This means that the TMR remains bound even under denaturing conditions, which allows its detection in SDS-PAGE. Native gels are unnecessary here.

      - Moreover, SDS-PAGE is run at alkaline pH. Did the authors consider this point when designing the methodology? Fluorescence images were taken in PBS, which has a different pH. Could the authors, or the literature, exclude these aspects as potential pitfalls in the methodology? Temperature also affects fluorescence emission, but it is easier to control within a certain tolerance in the room-temperature regime.

      Our method does not compare fluorescence values that cross the experimental systems (SDS-PAGE vs. microscopy). Cellular proteins and HaloTag protein standards are compared in a single setting of SDS-PAGE to obtain the average number of Myo10s per transfected cell. Likewise, all measurements on intact (live or fixed) cells are conducted in that single setting to obtain average fluorescence per cell. Thus, there is no issue with the different buffers or temperatures affecting fluorescence emission.

      - The authors should test their approach also with truncation variants of Myosin10 (for instance lacking the PH or motor domain). This is a classical approach that might prove the potential of the technique when altering the capacity of the protein to interact with a main binding partner. Also, treatments that induced filopodia formation might prove useful (i.e., hypotonic media induce filopodia formation in some fibroblast cell lines in our hands).

      The reviewer raises interesting suggestions that we aim to address in future experiments, but truncation variants and environmental perturbations are beyond the focus of the current manuscript. Here, we report on the otherwise unperturbed state when we add exogenous full-length Myo10 to the U2OS cells. But indeed, experiments with Myo10 domain truncations, PI3K and PTEN inhibition, and cargo protein / activating cofactor knock-downs (among others) are on our drawing board.

      - Most of the mechanisms hypothesized in the discussion are sound and plausible. However, the authors have chosen an experimental model where transient transfection of exogenous Myo10 in U2OS is performed. This approach poses two main and fundamental questions that are not resolved by the data provided:

      A) how do different expression levels affect the Myo10 counting?

      Our counting procedure does not assume uniform expression across a population of cells– quite the opposite, in fact. We directly measure Myo10 expression levels on a cell-by-cell basis with microscopy, once we know the number of molecules in our total pool (see the Methods for details). As an example of the final output, Figs. 1D and 1E show the total number of Myo10 molecules per cell for fixed and live cells, respectively.

      B) how does endogenous, unlabeled Myo10 affect the reliability of the counts? The authors claimed “U2OS cells express low levels of Myo10, so there is a small population of unlabeled endogenous Myo10 unaddressed by this paper”. As presented, the low levels of endogenous Myo10 sound like an arbitrary parameter, and no data are presented that limit, if not exclude, this bias in the analysis. Producing data in a genetically modified cell line with a HaloTag on the endogenous protein would represent a much cleaner system. Alternatively, the authors should look for Myo10 KO cell lines where they can back-transfect their Halo-Tagged Myo10 construct in a more consistent framework, focusing on cells with low-to-mid levels of expression.

      We agree, this is an important point to nail down (and is often neglected in the literature). We have now measured the endogenous Myo10 levels in U2OS cells by Western blotting and found that it is undetectable compared to our HaloTagged construct expression. Please see Supp. Fig 1E. Thus, for all intents and purposes, every Myo10 molecule in these experiments came from our expression plasmid. Accordingly, we have removed this caveat from the paper.

      Minor points

      - Figure 1B. To help the reader SDS-PAGE gels annotations should be clearer already from the figure.

      We have updated the annotations for clarity.

      - Methods should be organized in sections. As it stands, it is hard for the reader to look for technical details.

      We have expanded and added subsections to the Methods as requested.

      - The good practice of indicating the gene and transcript entry numbers and the primer used to amplify and clone into the backbone vectors is getting lost in many papers. I would strongly encourage the authors to add this information to the methods.

      We have included the gene entries in the Methods and will include a full FASTA file of the coding sequence as supplementary information to avoid any ambiguity here.

      The authors write “It is unclear how myosins navigate to the right place at the right time, but our results support an important interplay between Myo10 and the actin network.” It is a bit scholastic to say that Myo10 and actin have an important interplay, they are major binding partners. What is the new knowledge contained in this sentence?

      Agreed– we have deleted the sentence in question.

      Reviewer #2 (Recommendations For The Authors):

      The authors should address all the weaknesses indicated in the public review.

      There were a few other places that require clarification.

      On page 4, the last paragraph. It is stated that the targeting of Myo10 was reported/proposed based on previous work (ref 31). The next few sentences are not referenced and thus likely refer to ref 31. The authors did not measure the parameters discussed in these sentences, so it is important to clarify that they are referring to previous work and not the current study.

      Indeed, the next few sentences still refer to old reference 31, so we have now edited the paragraph for clarity.

      On page 7, the reference to Figure 3A indicates a trend of higher Myo10 correlating with more filopodia. However, the reference to Figure 3B indicates that total intracellular Myo10 weakly correlates with more filopodia, yet the x-axis of Figure 3B is filopodial molecules, not intracellular Myo10. Please clarify.

      We appreciate the reviewer for catching our mistake. Those plots are now in Fig. 2 and have been edited accordingly.

      Reviewer #3 (Recommendations For The Authors):

      The Discussion of results at the end of each section is rather brief and could be expanded on a bit more.

      Previously, we were operating under the constraints of an eLife Short Report. We have now expanded the discussion for a full article.

      The authors mention that actin filaments at the tips of filopodia could be frayed, citing Medalia et al, 2007 (ref 40). That paper describes an early cryoEM analysis of filopodia from the amoeba Dictyostelium. EM images of mammalian filopodia tips, e.g. Svitkina et al, 2003, JCB, do not show quite the same organization of actin as seen in the Dictyostelium filopodia tips. However, recent work from the Bershadsky lab, Li et al, 2023, presents a few cryoEM images of tips of left-bent filopodia that are tightly adhered to a substrate and there it looks like actin filaments become disorganized in tips, along with membrane bulging. The authors should consider expanding discussion of the filopodia tips to take into account what is known for mammalian filopodia.

      We thank the reviewer for bringing these enlightening papers to our attention. We have now included these citations in the discussion.

      Fig 1D - The x-axis is a bit odd, it goes from 0 then to 2.5e+06 with no indication of the bin size. Can this be re-labelled or the scale displayed a bit differently?

      We have double-checked the axis breaks, which are large because the underlying values are large. We have also provided the bin size as requested for all histograms.

      Fig 4A - What is the bin size for the histogram?

      As above, we have now updated the figure legends (now in Fig. 3) to include the bin size.

      Methods -

      - Please provide an accession number for the Myo10 nucleotide sequence used for this work as there are at least two known isoforms.

      Thank you for noting this. We are using the full-length, not the headless isoform. We have now updated the Methods accordingly.

      - No mention is made of the SDS sample buffer used, was that also added to the sample?

      We have now updated the Methods accordingly.

      - How are samples boiled at 70 deg C? Do the authors actually mean ‘heated’?

      Indeed. We have now corrected “boiled” to “heated.”

      - Could the authors please briefly explain the connected component analysis used to identify filopodia?

      We have now updated the Methods accordingly.

      - The intensity of filopodia was determined by dividing tip intensity by the total bioreplicate sum of intensities then multiplying it by the total pool, if this reviewer understands correctly. It sounds like intensities are being averaged across a whole cell population instead of cell-by-cell. Is that correct? If so, can the authors please provide the underlying rationale for this? If not, then please better describe what was actually done.

      We apologize for the confusion. Intensities are being averaged (summed) across a whole cell population, but importantly that step is only used to obtain a scale factor that converts the fluorescence signal at the microscope to the number of molecules. We then use that scale factor for all cells imaged in the bioreplicate, to both 1) find the total Myo10 in that cell, and 2) find the total amount of that Myo10 in any given location within that cell.

      To further clarify, each bioreplicate has a known total number of Myo10 molecules associated with the number of cells loaded onto the SDS gel. From the SDS gel, we have an average number of Myo10 molecules per positively transfected cell. If 50 cell images are analyzed, then there is a Myo10 ‘total pool’ of (50 cells) * (average Myo10 molecules/cell). The fluorescence signal intensities in microscopy were summed for all cells within the bioreplicate (50 cells in this example). However, due to variation in expression, not every cell has the same signal intensity when imaged under the same conditions. It would be inaccurate to assume each cell contains the average Myo10 molecules/cell. Therefore, to get the number of molecules within a given Myo10 cell (or punctum), the summed cell (punctum) intensity was divided by the bioreplicate fluorescence signal intensity sum and multiplied by ‘total pool.’
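      The conversion described above can be sketched in a few lines. This is a minimal illustration of the stated arithmetic, not the in-house scripts themselves; the function and variable names are hypothetical, and the intensities are assumed to be background-corrected sums in arbitrary units.

```python
# Illustrative sketch of the intensity-to-molecules conversion (hypothetical names).

def scale_factor(cell_intensities, avg_molecules_per_cell):
    """Molecules per unit of fluorescence intensity for one bioreplicate.

    total pool = (number of imaged cells) * (average molecules per cell from the gel)
    scale      = total pool / (sum of all imaged cell intensities)
    """
    total_pool = len(cell_intensities) * avg_molecules_per_cell
    return total_pool / sum(cell_intensities)


def molecules_per_cell(cell_intensities, avg_molecules_per_cell):
    """Molecule count for each cell, preserving cell-to-cell variation."""
    scale = scale_factor(cell_intensities, avg_molecules_per_cell)
    return [intensity * scale for intensity in cell_intensities]


def molecules_in_region(region_intensity, cell_intensities, avg_molecules_per_cell):
    """Molecule count for any sub-region (for example, one punctum)."""
    return region_intensity * scale_factor(cell_intensities, avg_molecules_per_cell)
```

      Because the same scale factor is applied to every cell in the bioreplicate, the per-cell counts sum to the total pool while still reflecting each cell's own expression level.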

      - The authors quantify Myo10 protein amounts by western blotting using Halo tag fluorescence, a method that should provide good accuracy. The results depend on the transfection efficiency and it is rarely the case that it is 100%. The authors state that they use a ‘value correction for positively transfected cells’ (pg 11). It is likely that there was a range of expression levels in the cells, how was a cut-off for classifying a cell as non-expressing determined or set?

      As described in the Methods, “microscopy was used to count the percentage of transfected cells from ~105-190 randomly surveyed cells per bioreplicate.” Cells were labeled and located with DAPI. If no TMR signal could be visually detected by microscopy, then the cell was deemed to be non-Myo10 expressing. We did not set a cutoff fluorescence value, as untransfected cells have no detectable signal. Please see Supplementary Figure 1F for examples.
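      One way to read the correction for positively transfected cells is sketched below, under the assumption that the gel yields a total molecule count for a known number of loaded cells and that only the transfected fraction contributes signal. The function name and arguments are hypothetical; the exact form of the correction is given in the authors' Methods, not here.

```python
# Illustrative sketch of a transfection-efficiency correction (hypothetical names).

def avg_molecules_per_transfected_cell(total_molecules, cells_loaded,
                                       n_transfected, n_surveyed):
    """Average molecules per expressing cell.

    total_molecules: molecules in the gel lane, from the standard curve
    cells_loaded:    number of cells whose lysate was loaded in that lane
    n_transfected:   surveyed cells with visible TMR signal
    n_surveyed:      all surveyed (DAPI-located) cells
    """
    if n_surveyed <= 0 or n_transfected <= 0:
        raise ValueError("need at least one surveyed and one transfected cell")
    fraction_transfected = n_transfected / n_surveyed
    return total_molecules / (cells_loaded * fraction_transfected)
```

      With a lower transfected fraction, the same gel signal is attributed to fewer cells, so the estimated average per expressing cell rises accordingly.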

      - “In-house Python scripts” are used for image analysis. Will these be made publicly available?

      Yes, we will package these up on GitHub.

    1. eLife assessment

      Chang et al. have investigated the catalytic mechanism of I-PpoI nuclease, a one-metal-ion dependent nuclease, by time-resolved X-ray crystallography using soaking of crystals with metal ions under different pH conditions. This convincing study revealed that I-PpoI catalyzes the reaction process through a single divalent cation. The study uncovers important details of the roles of the metal ion and the active site histidine in catalysis.

    2. Reviewer #1 (Public Review):

      This study is convincing because the authors performed time-resolved X-ray crystallography under different pH conditions using active/inactive metal ions and PpoI mutants, as with the activity measurements in solution in conventional enzymatic studies. Although the reaction mechanism is simple and may be a little predictable, the strength of this study is that they were able to validate that PpoI catalyzes DNA hydrolysis through "a single divalent cation" because time-resolved X-ray studies often observe transient metal ions that are important for catalysis but could not be predicted from previous studies of static structures such as enzyme-substrate analog-metal ion complexes. The discussion of this study is well supported by their data. This study visualized the catalytic process and mutational effects on catalysis, providing new insight into the catalytic mechanism of I-PpoI through a single divalent cation. The authors found that His98, a candidate proton acceptor in the previous experiments, also affects the Mg2+ binding for catalysis without the direct interaction between His98 and the Mg2+ ion, suggesting that "Without a proper proton acceptor, the metal ion may be prone for dissociation without the reaction proceeding, and thus stable Mg2+ binding was not observed in crystallo without His98". In future, this interesting feature observed in I-PpoI should be investigated by biochemical, structural, and computational analyses using other metal-ion dependent nucleases.

    3. Reviewer #2 (Public Review):

      Summary:

      Most polymerases and nucleases use two or three divalent metal ions in their catalytic functions. The family of His-Me nucleases, however, uses only one divalent metal ion, along with a conserved histidine, to catalyze DNA hydrolysis. The mechanism has been studied previously but, according to the authors, it remained unclear. By use of time-resolved X-ray crystallography, this work convincingly demonstrated that only one M2+ ion is involved in the catalysis of the His-Me I-PpoI nuclease, and proposed concerted functions of the metal and the histidine.

      Strengths:

      This work performs mechanistic studies, including the number and roles of metal ions, pH dependence, and activation mechanism, all by structural analyses, coupled with some kinetics and mutagenesis. Overall, it is a highly rigorous work. This approach was first developed in Science (2016) for a DNA polymerase, in which Yang Gao was the first author. It has subsequently been applied to just 5 to 10 enzymes by different labs, mainly to clarify two versus three metal ion mechanisms. The present study is the first one to demonstrate a single metal ion mechanism by this approach.

      Furthermore, on the basis of the quantitative correlation between the fraction of metal ion binding and the formation of product, as well as the pH dependence, and the data from site-specific mutants, the authors concluded that the functions of Mg2+ and His are a concerted process. A detailed mechanism is proposed in Figure 6.

      Even though there are no major surprises in the results and conclusions, the time-resolved structural approach and the overall quality of the results represent a significant step forward for the His-Me family of nucleases. In addition, since the mechanism is unique among different classes of nucleases and polymerases, the work should be of interest to readers in DNA enzymology, or even mechanistic enzymology in general.

      Weaknesses:

      Two relatively minor issues are raised here for consideration:

      p. 4, last para, lines 1-2: "we next visualized the entire reaction process by soaking I-PpoI crystals in buffer....". This is a little over-stated. The structures being observed are not reaction intermediates. They are mixtures of substrates and products in the enzyme-bound state. The progress of the reaction is limited by the progress of the soaking of the metal ion. Crystallography has just been used as a tool to monitor the reaction (and provide structural information about the product). It would be more accurate to say that "we next monitored the reaction progress by soaking....".

      p. 5, the beginning of the section. The authors on one hand emphasized the quantitative correlation between Mg ion density and the product density. On the other hand, they raised the uncertainty in the quantitation of Mg2+ density versus Na+ density, thus they repeated the study with Mn2+ which has distinct anomalous signals. This is a very good approach. However, there is still no metal ion density shown in the key Figure 2A. It will be clearer to show the progress of metal ion density in a figure (in addition to just plots), whether it is Mg or Mn.

    4. Author response:

      Public Reviews: 

      Reviewer #1 (Public Review): 

      This study is convincing because the authors performed time-resolved X-ray crystallography under different pH conditions using active/inactive metal ions and PpoI mutants, as with the activity measurements in solution in conventional enzymatic studies. Although the reaction mechanism is simple and may be a little predictable, the strength of this study is that they were able to validate that PpoI catalyzes DNA hydrolysis through "a single divalent cation" because time-resolved X-ray studies often observe transient metal ions that are important for catalysis but could not be predicted from previous studies of static structures such as enzyme-substrate analog-metal ion complexes. The discussion of this study is well supported by their data. This study visualized the catalytic process and mutational effects on catalysis, providing new insight into the catalytic mechanism of I-PpoI through a single divalent cation. The authors found that His98, a candidate proton acceptor in the previous experiments, also affects the Mg2+ binding for catalysis without the direct interaction between His98 and the Mg2+ ion, suggesting that "Without a proper proton acceptor, the metal ion may be prone for dissociation without the reaction proceeding, and thus stable Mg2+ binding was not observed in crystallo without His98". In future, this interesting feature observed in I-PpoI should be investigated by biochemical, structural, and computational analyses using other metal-ion dependent nucleases.

      We appreciate the reviewer for the positive assessment as well as all the comments and suggestions.

      Reviewer #2 (Public Review): 

      Summary: 

      Most polymerases and nucleases use two or three divalent metal ions in their catalytic functions. The family of His-Me nucleases, however, uses only one divalent metal ion, along with a conserved histidine, to catalyze DNA hydrolysis. The mechanism has been studied previously but, according to the authors, it remained unclear. By use of time-resolved X-ray crystallography, this work convincingly demonstrated that only one M2+ ion is involved in the catalysis of the His-Me I-PpoI nuclease, and proposed concerted functions of the metal and the histidine.

      Strengths: 

      This work performs mechanistic studies, including the number and roles of metal ions, pH dependence, and activation mechanism, all by structural analyses, coupled with some kinetics and mutagenesis. Overall, it is a highly rigorous work. This approach was first developed in Science (2016) for a DNA polymerase, in which Yang Gao was the first author. It has subsequently been applied to just 5 to 10 enzymes by different labs, mainly to clarify two versus three metal ion mechanisms. The present study is the first one to demonstrate a single metal ion mechanism by this approach.

      Furthermore, on the basis of the quantitative correlation between the fraction of metal ion binding and the formation of product, as well as the pH dependence, and the data from site-specific mutants, the authors concluded that the functions of Mg2+ and His are a concerted process. A detailed mechanism is proposed in Figure 6. 

      Even though there are no major surprises in the results and conclusions, the time-resolved structural approach and the overall quality of the results represent a significant step forward for the His-Me family of nucleases. In addition, since the mechanism is unique among different classes of nucleases and polymerases, the work should be of interest to readers in DNA enzymology, or even mechanistic enzymology in general.

      Thank you very much for your comments and suggestions.

      Weaknesses: 

      Two relatively minor issues are raised here for consideration: 

      p. 4, last para, lines 1-2: "we next visualized the entire reaction process by soaking I-PpoI crystals in buffer....". This is a little over-stated. The structures being observed are not reaction intermediates. They are mixtures of substrates and products in the enzyme-bound state. The progress of the reaction is limited by the progress of the soaking of the metal ion. Crystallography has just been used as a tool to monitor the reaction (and provide structural information about the product). It would be more accurate to say that "we next monitored the reaction progress by soaking....". 

      We appreciate the clarification regarding the description of our experimental approach. We agree that our structures do not represent reaction intermediates but rather mixtures of substrate and product states within the enzyme-bound environment. We will revise the text accordingly to more accurately reflect our methodology.

      p. 5, the beginning of the section. The authors on one hand emphasized the quantitative correlation between Mg ion density and the product density. On the other hand, they raised the uncertainty in the quantitation of Mg2+ density versus Na+ density, thus they repeated the study with Mn2+ which has distinct anomalous signals. This is a very good approach. However, there is still no metal ion density shown in the key Figure 2A. It will be clearer to show the progress of metal ion density in a figure (in addition to just plots), whether it is Mg or Mn. 

      Thank you for your insightful comments. We recognize the importance of visualizing metal ion density alongside product density data. As you commented, distinguishing between Mg2+ and Na+ is challenging, and in Fig 2A, no distinguishable density was observed at 20s. Mn2+, with its higher electron density, is detectable even at low occupancy. To address this, we will include figure panels in Figure 3 or supplementary figures to present Mn2+ and product densities concurrently.

    1. Reviewer #3 (Public Review):

      Summary:

      Baek and colleagues present important follow-up work on the role of serum glucose in the management of neonatal sepsis. The authors previously showed high glucose administration exacerbated neonatal sepsis, while strict glucose control improved outcomes but caused hypoglycemia. In the current report they examined the effect of a more tailored glucose management approach on outcomes and examined hepatic gene expression, plasma metabolome/proteome, blood transcriptome, as well as the therapeutic impact of hIAIP. The authors leverage multiple powerful approaches to provide robust descriptive accounts of the physiologic changes that occur with this model of sepsis in these various conditions.

      Strengths:

      (1) Use of preterm piglet model.

      (2) Robust, multi-pronged approach to address both hepatic and systemic implications of sepsis and glucose management.

      (3) Trial of therapeutic intervention - glucose management (Figure 6), hIAIP (Figure 7).

      Weaknesses:

      (1) The translational role of the model is in question. CONS is rarely if ever a cause of EOS in preterm neonates. The model uses preterm pigs exposed at 2 hours of age. This model most likely replicates EOS.

      (2) Throughout the manuscript it is difficult to tell from which animals the data are derived. Given the ~90% mortality in the experimental CONS group, and 25% mortality in the intervention group, how are the data from animals "at euthanasia" considered? Meaning - are data from survivors and those euthanized grouped together? This should be clarified as biologically these may be very different populations (ie, natural survivor vs death).

      (3) With limited time points (at euthanasia) for hepatic transcriptomics (Figure 2), plasma metabolite (Figure 3), blood transcriptome (Figure 4), and plasma proteome (Figure 5), it is difficult to make conclusions regarding mechanisms preceding euthanasia. Per methods, animals were euthanized with acidosis or clinical decompensation. Are the reported findings demonstrative of end-organ failure and deterioration leading to death, or reflective of events prior?

      (4) Data are descriptive without corresponding "omics" from interventions (glucose management and/or hIAIP) or at least targeted assessment of key differences.

    2. eLife assessment

      This interesting and important study follows up on the authors' observations that lower glucose parenteral nutrition leads to lower rates of sepsis from Staphylococcus epidermidis in a preterm pig model. Sepsis in early life, particularly in premature infants, has significant morbidity and mortality and the authors present convincing evidence that glycemic state affects hepatic metabolism-dependent immune function and improved clearance of coagulase-negative staphylococcal infection. The authors provide a robust multi-omic dataset for the use of the scientific community. However, there are also several concerns that will limit the impact of the work, including that the model does not reflect early onset sepsis that is observed in premature infants, and the question of whether low glucose parenteral nutrition (PN) is protective versus high glucose PN is harmful as the levels of glucose in the high PN were incredibly high.

    3. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors follow up on their published observation that providing a lower glucose parenteral nutrition (PN) reduces sepsis from a common pathogen [Staphylococcus epidermidis (SE)] in preterm piglets. Here they found that a higher dose of glucose could thread the needle and get the protective effects of low glucose without incurring significant hypoglycemia. They then investigate whether the change to low glucose PN impacts metabolism to confer this benefit. The finding that lower glucose reduces sepsis is important, as sepsis is a major cause of morbidity and mortality in preterm infants, and adjusting PN composition is a feasible intervention.

      Strengths:

      (1) They address a highly significant problem of neonatal sepsis in preterm infants using a preterm piglet model.

      (2) They have compelling data in this paper (and in a previous publication, ref 27) that low glucose PN confers a survival advantage. A downside of the low glucose PN is hypoglycemia, which they mitigate in this paper by using a slightly higher amount of glucose in the PN.

      (3) The experiment where they change PN from high to low glucose after infection is very important to determine if this approach might be used clinically. Unfortunately, this did not show an ability to reduce sepsis risk with this approach. Perhaps this is due to the much lower mortality in the high glucose group (~20% vs 87% in the first figure).

      (4) They produce an impressive multiomics data set from this model of preterm piglet sepsis which is likely to provide additional insights into the pathogenesis of preterm neonatal sepsis.

      Weaknesses:

      (1) The high glucose control gives very high blood glucose levels (Figure 1C). Is this the best control for typical PN and glucose control in preterm neonates? Is the finding that low glucose is protective or high glucose is a risk factor for sepsis?

      (2) In Figure 1B, preterm piglets provided the high glucose PN have 13% survival while preterm piglets on the same nutrition in Figure 6B have ~80% survival. Were the conditions indeed the same? If so, this indicates a large amount of variation in the outcome of this model from experiment to experiment.

      (3) Piglets on the low glucose PN had consistently lower density of SE (~1 log) across all time points. This may be due to changes in immune response leading to better clearance or it could be due to slower growth in a lower glucose environment.

      (4) Many differences in the different omics (transcriptomics, metabolomics, proteomics) were identified in the SE-LOW vs SE-HIGH comparison. Since the bacterial load is very different between these conditions, could the changes be due to bacterial load rather than metabolic reprogramming from the low glucose PN?

    4. Reviewer #2 (Public Review):

      Summary:

      The authors demonstrate that a low parenteral glucose regimen can lead to improved bacterial clearance and survival from Staph epi sepsis in newborn pigs without inducing hypoglycemia, as compared to a high glucose regimen. Using RNA-seq, metabolomic, and proteomic data, the authors conclude that this is primarily mediated by altered hepatic metabolism.

      Strengths:

      Well-defined controls for every time point, with multiple time points and biological replicates.

      The authors used different experimental strategies to arrive at the same conclusion, which lends credibility to their findings.

      The authors have published the negative findings associated with their study, including the inability to reverse sepsis-related mortality after switching from SE-high to SE-low at 3h or 6h and after administration of hIAIP.

      Weaknesses:

      (1) The authors mention, and it is well-known, that Staph epi is primarily involved in late-onset sepsis. The model of S. epi sepsis used in this study clearly replicates early-onset sepsis, but S. epi is extremely rare in this time period. How do the authors justify the clinical relevance of this model?

      (2) The authors find that the neutrophil subset of the leukocyte population is diminished significantly in the SE-low and SE-high populations. However, they conclude on page 10 that "modulations of hepatic, but not circulating immune cell metabolism, by reduced glucose supply..." and this is possible because the authors have looked at the entire leukocyte transcriptome. I am curious about why the authors did not sequence the neutrophil-specific transcriptome.

      (3) The authors use high (30 g/kg/d) and low (7.2 g/kg/d) glucose regimens. These translate into a glucose infusion rate (GIR) of 21 and 5 mg/kg/min, respectively. A normal GIR for a preterm infant is usually 5-8, and sometimes up to 10. Do the authors have a "safe GIR" or a threshold they think we cannot cross? Maybe a point where the metabolic switch takes place? They do not comment on this, especially as GIR and glucose levels are continuous variables and not categorical.

      (4) In Figures 2B and C the authors show that SE-high and SE-low animals have differences in the oxphos, TCA, and glycolytic pathways. The authors themselves comment in Supplementary Table S1B, E-F that these same metabolic pathways are also different in the Con-Low and Con-high animals; it is just the inflammatory pathways that are not different in the non-infected animals. How can they then justify that it is these metabolic pathways specifically which lead to altered inflammatory pathways, and not just the presence of infection along with some other unfound mechanism?

      (5) The authors mention in Figure 1F that SE-low animals had lower bacterial burdens than SE-high animals, but then go on to infer that the inflammatory cytokine differences are attributed to a rewiring of the immune response. However, they have not normalized the cytokine levels to the bacterial loads, as the differences in the cytokines might be attributed purely to a difference in bacterial proliferation/clearing.

      (6) The authors mention that switching from SE-high to SE-low at the 3 or 6 h time points does not reduce mortality. Have the authors considered the reverse? Does hyperglycemia following initial euglycemia worsen mortality? That would really establish that some metabolic reprogramming happens at the very onset of sepsis and that it is a lost battle after that.
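
      The dose-to-GIR conversion quoted in weakness (3) can be checked directly: a daily glucose dose in g/kg/d is multiplied by 1000 mg/g and divided by 1440 min/d to give the infusion rate in mg/kg/min. A minimal sketch follows; the function and constant names are illustrative and do not come from the manuscript.

```python
MG_PER_G = 1000          # milligrams per gram
MIN_PER_DAY = 24 * 60    # minutes per day


def glucose_infusion_rate(dose_g_per_kg_per_day: float) -> float:
    """Convert a glucose dose in g/kg/d to a GIR in mg/kg/min."""
    return dose_g_per_kg_per_day * MG_PER_G / MIN_PER_DAY


# The two regimens discussed in the review.
for label, dose in (("high", 30.0), ("low", 7.2)):
    print(f"{label}: {glucose_infusion_rate(dose):.1f} mg/kg/min")
# high: 20.8 mg/kg/min
# low: 5.0 mg/kg/min
```

      Under this conversion the high regimen sits at roughly two to four times the 5-8 mg/kg/min range the reviewer cites as typical for preterm infants, while the low regimen sits at its lower bound.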

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public Review): 

      In the article by Dearlove et al., the authors present evidence in strong support of nucleotide ubiquitylation by DTX3L, suggesting it is a promiscuous E3 ligase with capacity to ubiquitylate ADP ribose and nucleotides. The authors include data to identify the likely site of attachment and the requirements for nucleotide modification. 

      While this discovery potentially reveals a whole new mechanism by which nucleotide function can be regulated in cells, there are some weaknesses that should be considered. Is there any evidence of nucleotide ubiquitylation occurring in cells? It seems possible, but evidence in support of this would strengthen the manuscript. The NMR data could also be strengthened, as the binding interface is not reported or mapped onto the structure/model; this seems of considerable interest given that highly related proteins do have the same activity.

      The paper is for the most part well written and is potentially highly significant, but it could be strengthened as follows:

      (1) The authors start out by showing DTX3L binding to nucleotides and ubiquitylation of ssRNA/DNA. While ubiquitylation is subsequently dissected and ascribed to the RD domains, the binding data is not followed up. Does the RD protein alone bind to the nucleotides? Further analysis of nucleotide binding is also relevant to the Discussion where the role of the KH domains is considered, but the binding properties of these alone have not been analysed. 

      We thank the reviewer for the suggestion. We have tested DTX3L RD for ssDNA binding using NMR (see Figure 4A and Figure S2), which showed that DTX3L RD binds ssDNA. We also tested the DTX3L KH domains for RNA/ssDNA binding using an FP experiment. However, the FP experiment did not show significant changes upon titrating RNA/ssDNA. It seems that the KH domains alone are not sufficient to bind RNA/ssDNA and that both the KH and RD domains are required for binding. How DTX3L binds RNA/ssDNA is the subject of ongoing research in the lab. We will revise the Discussion on the KH domains.

      (2) With regard to the E3 ligase activity, can the authors account for the apparent decreased ubiquitylation activity of the 232-C protein in Figure 1/S1 compared to FL and RD? 

      We will address this question in the revision.

      (3) Was it possible to positively identify the link between Ub and ssDNA/RNA using mass spectrometry? This would overcome issues associated with labels blocking binding rather than modification. 

      We have tried to use mass spectrometry to detect the linkage between Ub and ssDNA/RNA, but were unable to do so. We suspect that the oxyester linkage might be labile, posing a challenge for mass spectrometry techniques. Similarly, a recent preprint from the Ahel lab, which utilises LC-MS, detects the Ub-NMP product rather than the linkage (https://www.biorxiv.org/content/10.1101/2024.04.19.590267v1.full.pdf).

      (4) Furthermore, can a targeted MS approach be used to show that nucleotides are ubiquitylated in cells? 

      This will require future development and improvement of the MS approach, specifically the isolation of labile oxyester-linked products from cells and the optimisation of the MS detection method.

      (5) Do the authors have the assignments (even partial?) for DTX3L RD? In Figure 4 it would be helpful to identify the peaks that correspond to the residues at the proposed binding site. Also do the shifts map to a defined surface or do they suggest an extended site, particularly for the ssDNA.

      We only collected HSQC spectra, which were insufficient for assignments. We have performed a competition experiment using ADPr and labelled ssDNA, showing that ADPr competes against the ubiquitination of ssDNA (Figure 4D). We will provide an additional experiment showing that ssDNA with a blocked 3’-OH can compete against ubiquitination of ADPr. These data, together with our NMR analysis, will further strengthen the evidence that ssDNA and ADPr compete for the same binding pocket in DTX3L RD. How DTX3L RD binds ssDNA/RNA is the subject of ongoing research in the lab.

      (6) Does sequence analysis help explain the specificity of activity for the family of proteins? 

      We will perform a sequence alignment of the RD domains of the DTX proteins and discuss this point in the revision.

      (7) While including a summary mechanism (Figure 5I) is helpful, the schematic included does not necessarily make it easier for the reader to appreciate the key findings of the manuscript or to account for the specificity of activity observed. While this figure could be modified, it might also be helpful to highlight the range of substrates that DTX3L can modify - nucleotide, ADPr, ADPr on nucleotides etc. 

      We will modify this Figure as suggested.

      Reviewer #2 (Public Review): 

      Summary: 

      The manuscript by Dearlove et al. entitled "DTX3L ubiquitin ligase ubiquitinates single-stranded nucleic acids" reports a novel activity of a DELTEX E3 ligase family member, DTX3L, which can conjugate ubiquitin to the 3' hydroxyl of single-stranded oligonucleotides via an ester linkage. The findings that unmodified oligonucleotides can act as substrates for direct ubiquitylation and the identification of DTX3 as the enzyme capable of performing such oligonucleotide modification are novel, intriguing, and impactful because they represent a significant expansion of our view of the ubiquitin biology. The authors perform a detailed and diligent biochemical characterization of this novel activity, and key claims made in the article are well supported by experimental data. However, the studies leave room for some healthy skepticism about the physiological significance of the unique activity of DTX3 and DTX3L described by the authors because DTX3/DTX3L can also robustly attach ubiquitin to the ADP ribose moiety of NAD or ADP-ribosylated substrates. The study could be strengthened by a more direct and quantitative comparison between ubiquitylation of unmodified oligonucleotides by DTX3/DTX3L with the ubiquitylation of ADP-ribose, the activity that DTX3 and DTX3L share with the other members of the DELTEX family. 

      Strengths: 

      The manuscript reports a novel and exciting observation that ubiquitin can be directly attached to the 3' hydroxyl of unmodified, single-stranded oligonucleotides by DTX3L. The study builds on the extensive expertise and the impactful previous studies by the Huang laboratory of the DELTEX family of E3 ubiquitin ligases. The authors perform a detailed and diligent biochemical characterization of this novel activity, and all claims made in the article are well supported by experimental data. The manuscript is clearly written and easy to read, which further elevates the overall quality of the submitted work. The findings are impactful and will help illuminate multiple avenues for future follow-up investigations that may help establish how this novel biochemical activity observed in vitro may contribute to the biological function of DTX3L. The authors demonstrate that the activity is unique to the DTX3/DTX3L members of the DELTEX family and show that the enzyme requires at least two single-stranded nucleotides at the 3' end of the oligonucleotide substrate and that the adenine nucleotide is preferred in the 3' position. Most notably, the authors describe a chimeric construct containing the RING domain of DTX3L fused to the DTC domain of DTX2, which displays robust NAD ubiquitylation, but lacks the ability to ubiquitylate unmodified oligonucleotides. This construct will be invaluable in future cell-based studies of DTX3L biology that may help establish the physiological relevance of 3' ubiquitylation of nucleic acids.

      Weaknesses: 

      The main weakness of the study is in the lack of direct evidence that the ubiquitylation of unmodified oligonucleotides reported by the authors plays any role in the biological function of DTX3L. The study leaves plenty of room for natural skepticism regarding the physiological relevance of the reported activity, because, akin to other DELTEX family members, DTX3 and DTX3L can also catalyze attachment of ubiquitin to NAD, ADP ribose and ADP-ribosylated substrates. Unfortunately, the study does not offer any quantitative comparison of the two distinct activities of the enzyme, which leaves plenty of room for doubt. One is left wondering, whether ubiquitylation of unmodified oligonucleotides is just a minor and artifactual side activity owing to the high concentration of the oligonucleotide substrates and E2~Ub conjugates present in the in-vitro conditions and the somewhat lower specificity of the DTX3 and DTX3L DTC domains (compared to DTX2 and other DELTEX family members) for ADP ribose over other adenine-containing substrates such as unmodified oligonucleotides, ADP/ATP/dADP/dATP, etc. The intriguing coincidence that DTX3L, which is the only DTX protein capable of ubiquitylating unmodified oligonucleotides, is also the only family member that contains nucleic acid interacting domains in the N-terminus, is suggestive but not compelling. A recently published DTX3L study by a competing laboratory (PMID: 38000390), which is not cited in the manuscript, suggests that ADP-ribose-modified nucleic acids could be the physiologically relevant substrates of DTX3L. That competing hypothesis appears more convincing than ubiquitylation of unmodified oligonucleotides because experiments in that study demonstrate that ubiquitylation of ADP-ribosylated oligos is quite robust in comparison to ubiquitylation of unmodified oligos, which is undetectable. 
      It is possible that the unmodified oligonucleotides in the competing study did not have adenine in the 3' position, which may explain the apparent discrepancy between the two studies. In summary, a quantitative comparison of ubiquitylation of ADP ribose vs. unmodified oligonucleotides could strengthen the study.

      We thank the reviewer for the constructive feedback. We agree that evidence for the biological function is lacking. While we have tried to detect Ub-ssDNA/RNA from cells, we found that isolating and detecting labile oxyester-linked Ub-ssDNA/RNA products remains challenging due to (1) low levels of Ub-ssDNA/RNA products, (2) the presence of DUBs and nucleases that rapidly remove the products during the experiments, and (3) our lack of a suitable MS approach to detect the product. For these reasons, we feel that discovering the biological function will require future effort and expertise and is beyond the scope of our current manuscript.

      In the manuscript (PMID: 38000390), the authors used PARP10 to catalyse ADP-ribosylation onto 5’-phosphorylated ssDNA/RNA. They used the following sequences, which lack a 3’-adenosine; this could explain the lack of ubiquitination.

      E15_5′P_RNA [Phos]GUGGCGCGGAGACUU

      E15_5′P_DNA [Phos]GTGGCGCGGAGACTT

      We will perform the experiment using these sequences to verify this. We have cited this manuscript but, for some reason, PubMed has updated its publication date from mid-2023 to Jan 2024. We will update the EndNote entry in the revised manuscript.

      We agree that it is crucial to compare ubiquitination of oligonucleotides and ADPr by DTX3L to find its preferred substrate. We have challenged oligonucleotide ubiquitination by adding excess ADPr and found that ADPr efficiently competes with oligonucleotide (Figure 4D). We will perform more thorough competition experiments by titrating with increasing molar excess of either ADPr or ssDNA to examine the effect on the ubiquitination of ssDNA and ADPr, respectively.

    1. eLife assessment

      This study provides an important advance in the molecular understanding of the lipopolysaccharide export mechanism and machinery in bacteria. By using advanced spectroscopy approaches, the experiments provide solid biophysical support for the dynamic behavior of the multisubunit Lpt transport system. This work has implications for understanding bacterial cell envelope biogenesis and may contribute to the development of drugs that target Gram-negative pathogens.

    2. Reviewer #2 (Public Review):

      Lipopolysaccharide (LPS) is a major component of the outer membrane of Gram-negative bacteria and plays a critical role in bacterial virulence. The LPS export mechanism is a potential target for new antibiotics. Inhibiting this process can render bacteria more susceptible to the host immune system or other antibacterial agents. Given the rise of antibiotic-resistant bacteria, novel targets are urgently needed. The seven LPS transport (Lpt) proteins, A-G, move LPS from the inner to the outer membrane. This study investigated the conformational changes in the LptB2FG-LptC complex using site-directed spin labeling (SDSL) electron paramagnetic resonance (EPR) spectroscopy, revealing how ATP binding and hydrolysis affect the LptF β-jellyroll domain and lateral gates. The findings highlight the role of LptC in regulating LPS entry, ensuring efficient and unidirectional transport across the periplasm.

      The β-jellyrolls are not fully resolved in the vanadate-trapped structure of LptB2FG and LptB2FGC. Therefore, the current study provides valuable information on the functional dynamics of these periplasmic domains, their interactions, and their roles in the unidirectional transport of LPS. Additionally, the dynamic perspective of the lateral gates in LptFG in the presence and absence of LptC is another strength of this study. Moreover, at least in detergent samples, more comprehensive intermediates of the ATP turnover cycle are studied than in the available structures, providing crucial missing mechanistic details.

      Other major strengths of the study include high-quality DEER distance measurements in both detergent and proteoliposomes, the latter providing valuable dynamics information in the lipid environment. However, lipid composition is not mentioned. The proteoliposome study is crucial since the previous structural study (Li, Orlando & Liao 2019) was done in rather small-diameter nanodiscs, which might affect the overall dynamics of the complex. It would have been beneficial if the investigators had reconstituted the complex in lipid nanodiscs with the same composition as proteoliposomes. The mixed lipid/detergent micelles provide an alternative. It seems the ATPase activity of the protein complex is much lower in detergent compared with lipid nanodiscs (Li, Orlando & Liao 2019). In the current study, ATPase activity in proteoliposomes is not provided. Also, the reviewer assumes cysteine-less (CL) constructs of the complex components were utilized. The ATPase assay on CL complex is not presented.

      Additionally, from previous structural studies and the mass spectrometry data presented here, LPS co-purifies and is already bound to the complex, thus the Apo state may represent the LPS-bound state without nucleotides.

      The selection of sites to probe lateral gate 2, which forms the main LPS entry site, may pose an issue. Although the authors provide justification based on the available structures, one site (position 325 in LptF) is located on a flexible loop, and position 52 in LptG is on the neighboring transmembrane helix, separated by a potentially flexible loop from the gating TM1. These labeling sites could exhibit significant local dynamics, resulting in a broader distribution of distances and potentially masking the gating-related conformational changes.

    3. Reviewer #1 (Public Review):

      Summary:

      The current manuscript uses electron spin resonance spectroscopy to understand how the dynamic behavior and conformational heterogeneity of the LPS transport system change during substrate transport and in response to the membrane, bound nucleotide (or transition state analog), and accessory subunits. The study builds on prior structural studies to expand our molecular understanding of this highly significant bacterial transport system.

      Strengths:

      This series of well-designed and well-executed experiments provides new mechanistic insights into the dynamic behavior of the LPS transport system. Notable new insights provided by this study include its indication of the spatial organization of the LptC domain, which was poorly resolved in structures, and how the LptC domain modulates the dynamic behavior of the gate through which lipids access the binding site. In addition, a mass spectrometry approach designed to examine LPS binding at different stages in the nucleotide-dependent conformational cycle provides insight into the order of operations of LPS binding and transport.

    4. Reviewer #3 (Public Review):

      Summary:

      The manuscript by Dajka and co-workers reports the application of a biophysical approach to analyse the dynamics of the LptB2FG-C ABC transporter, involved in LPS transport across the cell envelope in Escherichia coli. LptB2FG-C belongs to a new class of ABC transporters (type VI) and is essential and conserved in several Gram-negative pathogens. Since LPS is the major component of the outer membrane of the Gram-negative cell and is responsible for the low permeability of this membrane to several antibiotics, a deep understanding of the mechanism and function of the LptB2FG-C transporter is crucial for the development of new drugs targeting Gram-negative pathogens.

      Several structural studies have been published so far on the LptB2FG-C transporter, disclosing important aspects of the transport mechanism; nevertheless, lack of resolution of some regions of the individual proteins as well as the dynamic nature of the transport mechanism per se (e.g. the insertion and removal of the TM helix of LptC from the TMDs of the transporter during the LPS transport cycle) has greatly limited the understanding of the mechanism that couples ATP binding and hydrolysis with LPS transport. This knowledge gap could be filled by applying an approach that allows the analysis of dynamic processes. The DEER/PELDOR technique applied in this work fits well with this requirement.

      Strengths:

      In this study, the authors provide some new pieces of information on the LptB2FG-C function and the role of LptC in the transporter. Notably, they show that:

      - there is high heterogeneity in the conformational states of the entry gate of LPS in the transporter (gate-2) that is reduced by the insertion of LptC, and the heterogeneity observed is not altered by ATP binding or hydrolysis (as expected since LPS entry is ATP-independent).

      - ATP binding induces an allosteric opening of the LptF β-jellyroll domain that allows for LPS passage to the β-jellyroll of LptC, which is stably associated with the β-jellyroll of LptF throughout the cycle.

      - the β-jellyroll of LptG is highly flexible, indicating an involvement in the LPS transport cycle.

      The manuscript is timely and overall clear.

      Weaknesses:

      I list my concerns below and provide suggestions that, in my opinion, should be addressed to reinforce the findings of this study.

      (1) Protein complex controls: the authors assess the ATPase activity of the spin-labelled variants of their protein complexes to rule out the possibility that engineering the proteins to enable spin labelling could affect their functionality (Figure S4). It has been reported that the association of LptC with the LptB2FG complex inhibits its ATPase activity. However, in the ATPase assay data shown in Figure S4, the inhibitory effect of the LptC TM is not visible (please compare LptB2FG F-A45C G-I335C and F-L325C G-A52C with and without LptC). This raises the suspicion that the regulatory function of LptC is missing in the LptC-containing complexes used in this work. I suggest the authors include wt LptB2FGC in the assay to compare the ATPase activity of this complex with that of wt LptB2FG. The published inhibitory effect of TM LptC has been observed in proteoliposomes. Since it is not clear from the paper if the ATPase assay in Figure 4 has been conducted in DDM or proteoliposomes, the lack of inhibitory effect could be due to the assay conditions. A comparative test could answer this question.

      (2) Figure 2: NBD closure upon ATP binding to LptB2FG is convincingly demonstrated both in DDM micelles and proteoliposomes, validating the experimental system. However, since under physiological conditions ATP binding should take place before the displacement of the TM of LptC (Wilson and Ruiz, Mol Microbiol 2022), I suggest the authors carry out the experiments with LptC-containing complexes to investigate conformational changes (if any) that are triggered when ATP binding occurs before the TM displacement.

      (3) Proteoliposomes: in the experiments shown in Figures 3 and 4, unlike those in Figure 2, measurements in proteoliposomes give different results from the experiments in DDM, showing higher heterogeneity. Could this be related to the presence (or absence) of LPS in liposomes? It is not mentioned in the materials and methods section whether LPS is present. Could the authors please discuss this?

      (4) The authors show large conformational heterogeneity in gate-2 (using the spin-labelled pair F-L325R1-G-A52R1) and suggest that deviation from the corresponding simulations could be due to the need for enhanced dynamics to allow for gate interaction with LPS or LptC. The effect of LptC is probed in the experiments shown in Figure 6, but I suggest the authors add LPS to the complexes to evaluate the possible stabilizing effect of LPS on the conformations shown in Figure 4.

      (5) Figure 6: the measurement of lateral gate 1 and 2 dynamics in the LptC-containing complexes clearly supports the hypothesis, proposed based on the available structures, that TM LptC dissociates from LptB2FG upon ATP binding. However, direct evidence of this movement is still missing. Would it be possible to monitor the dynamics of the TM LptC by directly labelling this protein domain? This would give a conclusive demonstration of the displacement during the ATPase cycle.

      (6) LPS release assay: Figure 6 panels H-I-J show the MS spectra relative to LPS-bound and free proteins obtained from wt LptB2FG upon ATP binding and ATP hydrolysis conditions. From these spectra the authors conclude that LPS is completely released only upon ATP hydrolysis. However, the current model predicts that LPS release into the Lpt bridge made by LptC-A-D is triggered by ATP binding. For this reason, I suggest the authors assess LPS release also from the LptB2FGC complex where, in the absence of LptA, LPS would be expected to be mostly retained by the complex under the same conditions.

    1. Reviewer #1 (Public Review):

      Summary:

      An online database called MRAD has been developed to identify the risk or protective factors for AD.

      Strengths:

      This is a very intriguing study of great clinical and scientific significance that provides a thorough and comprehensive evaluation of risk and protective factors for AD. It also provides physicians and scientists with a convenient, free, and user-friendly tool for further scientific investigation.

      Weaknesses:

      (1) The paper mentions that the MRAD database currently contains data only from European populations, with no mention of data from other populations or ethnicities. Given potential differences in Alzheimer's Disease (AD) across different populations, the limitations of the data should be emphasized in the discussion, along with plans to expand the database to include data from more racial and geographic regions.

      (2) Sufficient information should be provided to clarify the data sources, sample selection, and quality control methods used in the MRAD database. Readers may expect more detailed information about the data to ensure data reliability, representativeness, and research applicability.

      (3) While the authors mention that the MRAD database offers interactive visualization interfaces, the paper lacks detailed information on how to interpret and understand these visual results. Guidelines on effectively using these visualization tools to help researchers better comprehend the data are essential.

      (4) In the conclusion section of the paper, it is advisable to explicitly emphasize the practical applications and potential clinical significance of the MRAD database. The paper should articulate how MRAD can contribute to the early identification, diagnosis, prevention, and treatment of AD and its potential societal and clinical value more clearly.

      (5) Grammar and Spelling Errors: There are several spelling and grammar errors in the paper. Referring to a scientific editing service is recommended.

    2. eLife assessment

      This study introduces the MRAD database, which provides a useful tool for evaluating risk and protective factors for Alzheimer's disease through Mendelian randomization analysis. While the findings are supported by solid evidence, the study's value could be enhanced by addressing methodological concerns and ensuring rigorous validation of significant associations. The MRAD database has the potential to aid researchers and clinicians, but the current analysis appears incomplete without these refinements.

    3. Reviewer #2 (Public Review):

      Summary:

      This MR study by Zhao et al. provides a comprehensive hypothesis-free approach to identifying risk and protective factors causal to Alzheimer's Disease (AD).

      Strengths:

      The study employs a comprehensive, hypothesis-free approach, which is novel over traditional hypothesis-driven studies. Also, causal associations between risk/protective factors and AD were addressed using genetic instruments and analysis.

      Major comments:

      (1) The authors used the inverse-variance weighted (IVW) model as the primary method and other MR methods (MR-Egger, weighted mean, etc.) for sensitivity analysis. However, each method has its own assumption, and IVW is only robust when pleiotropy and heterogeneity are not severe. Rather than using IVW imprudently across all associations, it would be more appropriate to choose the best MR method for each association based on heterogeneity/Egger intercept tests. This customized approach, based on tests of MR assumption violations, yields more stable and reliable results. For reference, please follow up on work by Milad et al. (EHJ - "Plasma lipids and risk of aortic valve stenosis: a Mendelian randomization study"). This study selected the best MR model for each association based on pleiotropy and heterogeneity tests. Given the large number of tests in this work, I suggest initially screening significant signals using IVW, as done, and then validating the results using multiple MR methods for those signals. It is common for MR estimates from different methods to vary significantly (with some being statistically significant and others not), and in such cases, the MR estimates from the best-fitted model should be trusted and highlighted.

      (2) Lines 157-160 mentioned "But to date, AD has been reported as hypothesis-driven MR study based on a single factor, ignoring the potential role of a huge number of other risk factors. Also, due to the high degree of heterogeneity present in AD subtypes, which have different biological and genetic characteristics. Thus, the previous studies cannot offer a systematic and complete viewpoint.". This statement overlooks a similar study published in Molecular Psychiatry ("A Phenome-wide Association and Mendelian Randomization Study for Alzheimer's Disease: A Prospective Cohort Study of 502,493"), which rigorously assessed the effects of 4171 factors spanning 10 different categories on AD using observational analysis and MR. The authors should revise their statement on the novelty of their study type throughout the manuscript and discuss how their work differs from and potentially strengthens previous studies.

      (3) Given the large number of tests, the multiple testing issue is concerning. To mitigate potential false positives, I recommend employing the Bonferroni threshold or FDR. The authors should only interpret exposures that are significant at the Bonferroni threshold.

      (4) In the discussion, the authors should interpret or highlight exposures that remain significant after multiple testing corrections.
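
      To make comments (1) and (3) concrete, the following is a minimal sketch of the quantities involved, not the authors' pipeline. It assumes summary-level inputs (per-variant exposure effects `bx`, outcome effects `by`, and outcome standard errors `se_y`, with variants already oriented so that `bx` is positive); all function names are hypothetical. It computes the IVW estimate, Cochran's Q for heterogeneity, and the MR-Egger intercept, and then applies the Bonferroni and Benjamini-Hochberg (FDR) rules to a list of p-values.

```python
def ivw_and_egger(bx, by, se_y):
    """IVW slope, Cochran's Q, and MR-Egger intercept/slope from summary data.

    Assumes variants are oriented so that every bx is positive.
    """
    w = [1.0 / s ** 2 for s in se_y]  # inverse-variance weights

    # IVW: weighted regression of by on bx, forced through the origin.
    sxx = sum(wi * x * x for wi, x in zip(w, bx))
    sxy = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    beta_ivw = sxy / sxx

    # Cochran's Q: weighted squared residuals around the IVW fit.
    q = sum(wi * (y - beta_ivw * x) ** 2 for wi, x, y in zip(w, bx, by))

    # MR-Egger: the same weighted regression, but with a free intercept.
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, bx)) / sw
    my = sum(wi * y for wi, y in zip(w, by)) / sw
    slope = (sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, bx, by))
             / sum(wi * (x - mx) ** 2 for wi, x in zip(w, bx)))
    intercept = my - slope * mx
    return {"ivw": beta_ivw, "q": q,
            "egger_intercept": intercept, "egger_slope": slope}


def bonferroni(pvals, alpha=0.05):
    """Reject each hypothesis whose p-value is at most alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def benjamini_hochberg(pvals, alpha=0.05):
    """Reject the k smallest p-values, k = largest rank with p_(k) <= k/m * alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            rejected[i] = True
    return rejected
```

      A large Q (relative to a chi-squared distribution with one fewer degree of freedom than the number of variants) or an Egger intercept clearly different from zero would argue against relying on IVW alone for that association. How to turn these diagnostics into a model-selection rule is a judgment call that this sketch does not settle.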

    1. eLife assessment

      The study presents a valuable finding in advancing our understanding of the cellular and molecular mechanisms that regulate the switching of the migration mode from parallel to radial in cerebellar granule cell development. The evidence supporting the claims of the authors is solid and supports the main conclusion; the highlight was the imaging system's visualization of the cell-recognition event associated with neuronal migration, which established a new standard for the field. This study would be of interest to cell biologists and neurodevelopmental biologists working on cell-cell interaction and neuronal migration.

    2. Reviewer #1 (Public Review):

      This study by Hallada et al. reported the detailed characterization of cis- and trans-binding of JAM-C in mediating the developmental migration of CGNs, combining ex vivo cultures, time-lapse imaging, and mathematical analyses. Overall, the study was comprehensively carried out, and the conclusion is important in our understanding of the signaling mechanism of cerebellar development.

      Weaknesses:

      Several technical concerns need to be clarified.

      (1) The efficiency of shRNA knockdown of endogenous JAM-C. The entire study was based on the assumption that the endogenous wild-type JAM-C was depleted to the extent that it would not influence the observed phenotypes. However, this point requires verification, particularly in the ex vivo cultures.

      (2) The expression levels of mutant JAM-C proteins. It is unclear whether the exogenous expression of mutant JAM-C proteins would be comparable to that of the endogenous JAM-C. In addition, the levels of exogenously expressed JAM-C may likely alter over the time course of experiments, e.g., in some experiments over 48 hours.

      (3) The resolution of imaging methods. Different imaging methods were utilized in the study, and it is essential to clearly state the resolution of each imaging dataset (e.g., 0.2 x 0.2 um per pixel). This information is crucial to assess the reliability of observed phenotypes, which in some cases were relatively unimpressive.

    3. Reviewer #2 (Public Review):

      Summary:

      Lamination is a layered neuronal arrangement that provides a basic frame to establish functional connectivity in the brain. The formation of a layered structure requires a highly coordinated interaction between migrating neurons and the developing microenvironment. Earlier studies revealed that to reach specific locations, migrating neurons typically follow various morphogen gradients. Here, Hallada et al. showed that cerebellar granule neurons (CGNs) could navigate via adhesive interaction with Junctional Adhesion Molecule C (JAM-C) followed by recruitment and distribution of intracellular partners (Pard3 and drebrin) at the contact sites. These results show that neuronal migration could be structured by specific interactions with adhesion molecules and spatial re-arrangements of downstream effectors.

      Strengths:

      The authors concluded that cis/trans binding sites of JAM-C on CGNs are crucial for contact formation with cerebellar glial cells (Bergmann glial cells, BGs) and recruitment of Pard3 and drebrin to contact sites. This conclusion was based on the data obtained utilizing several advanced tools and technical approaches, such as cutting-edge microscopy, detailed visualization of cell-cell recognition, and a new correlation analysis.

      Weaknesses:

      (1) Despite multiple advanced methodologies, the study has weaknesses related primarily to the lack of specific evidence in support of findings and data interpretation issues. For example, it is unclear how JAM-C-mediated adhesion facilitates the entry of CGNs into the cerebellar molecular layer (ML). The authors described that CGN-CGN JAM recognition recruits more Pard3 and drebrin compared to CGN-BG recognition, which could increase the dwelling time of CGNs before moving to ML. However, such a mechanism does not explain what would initiate the entry of CGNs into ML. Perhaps the authors could provide a detailed explanation of this phenomenon in the Discussion (but certainly not in the Abstract). Also, the authors could consider revising the content of the Abstract, emphasizing their findings, and leaving out the speculations.

      (2) To allow for comparison, it would be very helpful to indicate specific numerical values for each data point throughout the manuscript. For example, the authors stated that a change in instantaneous migration angle due to JAM-C silencing negatively affects CGN movement to the ML (Figure 2) and that the spatial distribution of negative JAM-Drebrin correlation is altered at CGN-CGN contacts (Figure 7). However, without specific values, it remains unclear what the magnitude of the discussed changes is or whether they were actually significant. It was certainly not straightforward to draw specific conclusions based on graphical presentation alone.

    4. Reviewer #3 (Public Review):

      Summary:

      This study elucidated the mechanism controlling the switch from parallel migration to radial migration during the development of cerebellar granule cells by analyzing the behavior of cell-type-specific JAM-mediated adhesion and the downstream factors that promote migration. The research represents a detailed analysis, employing probes to capture cell recognition events between different cell types, a co-culture system (monolayer culture and slice imaging), and imaging techniques, building upon the authors' prior research on JAM-Pard3 interactions. As a result, the authors found that:

      (1) JAM-C-mediated interactions between granule cells (GCNs) are formed earlier and are more robust than JAM-C/JAM-B interactions between GCNs and glia;

      (2) Recruitment of migration-promoting factors Pard3/Drebrin by JAM interactions is more efficient in GCN-GCN (JAM-C/JAM-C) interactions; and

      (3) The distribution pattern of Pard3/Drebrin differs between GCN-GCN and GCN-Glia interactions, as revealed by detailed imaging analysis.

      Consequently, the authors discovered that these differences contribute to a time lag between parallel and radial migration, which serves as a temporal checkpoint sorting mature cerebellar granule cells.

      Strengths:

      Cell migration is a commonly observed phenomenon in neural development. It is crucial for sorting specific cell populations and positioning them appropriately to develop proper neural circuits. While the regulation of these migrations is known to be mediated by secreted guidance factors, this study demonstrates that combinations of cell adhesion molecules (JAM) mediate cell type-specific interactions that contribute to the timing control of cell migration. This finding significantly advances our understanding of the mechanisms governing cell migration in neural development.

      Weaknesses:

      The authors' hypothesis has been validated using in vitro systems. While in vitro systems allow for a more detailed design of experimental parameters, validation in vivo would still be necessary to demonstrate whether the temporal checkpoint of migration mediated by cell-cell interactions works. For example, knockout of JAM-C in cerebellar granule cells could be considered for such validation. Furthermore, the behavioral analysis of these mutant mice would be interesting.

      Additionally, the authors' observation that recruitment patterns of Pard3 and Drebrin at adhesive sites vary between interacting cell pairs is intriguing and suggests exciting implications. It would be highly informative if the relationship between these differences and ML entry timing could be demonstrated.

    1. eLife assessment

      Wittkamp et al. investigated the spatiotemporal dynamics of expectation of pain using an original fMRI-EEG approach. The methods are solid and the evidence for a substantially different neural representation between the anticipatory and the actual pain period is convincing. These important findings would benefit from a general framework to encompass their research questions, hypotheses, and interpretation of results. Furthermore, a more in-depth discussion about the choice of conditions would be desirable, specifically whether the definitions of nocebo and placebo in the study are comparable with traditional paradigms, and whether the control condition can be considered as a situation with no expectation or no prediction.

    2. Reviewer #1 (Public Review):

      Summary:

      In this important paper, the authors investigate the temporal dynamics of expectation of pain using a combined fMRI-EEG approach. More specifically, by modifying the expectations of higher or lower pain on a trial-to-trial basis, they report that expectations largely share the same set of activations before the administration of the painful stimulus, and that the coding of the valence of the stimulus is observed only after the nociceptive input has been presented. fMRI-informed EEG analysis suggested that the temporal sequence of information processing involved the dorsolateral prefrontal cortex (DLPFC), the anterior insula, and the anterior cingulate cortex. The strength of evidence is convincing, and the methods are solid, but a few alternative interpretations about the findings related to the control group, as well as a more in-depth discussion on the correlations between the BOLD and EEG signals, would strengthen the manuscript.

      Strengths:

      In line with open science principles, the article presents the data and the results in a complete and transparent fashion.

      From a theoretical standpoint, the authors make a step forward in our understanding of how expectations modulate pain by introducing a combination of spatial and temporal investigation. It is becoming increasingly clear that our appraisal of the world is dynamic, guided by previous experiences, and mapped on a combination of what we expect and what we get. New research methods, questions, and analyses are needed to capture these evolving processes.

      Weaknesses:

      The control condition is not so straightforward. Across the manuscript it is defined as "no expectation", and in the legend of Figure 1 it is mentioned that the third state would be "no prediction". However, it is difficult to conceive that participants would not have any expectations or predictions. Indeed, in the description of the task it is mentioned that participants were instructed that they would receive stimuli during "intermediate sensitive states". The results of the pain scores and expectations might support the idea that the control condition is situated in between the placebo and nocebo conditions. However, since this control condition was not part of the initial conditioning, and participants had no reference to previous stimuli, one might expect that some ratings might have simply "regressed to the mean" for a lack of previous experience.

      General considerations and reflections:

      Inducing expectations in the desired direction is not a straightforward task, and results might depend on the exact experimental conditions and the comparison group. In this sense, the authors' choice of having 3 groups of positive, negative, and "neutral" expectations is to be praised. On the other hand, control groups also form their own expectations, and this can constitute a confounder in every experiment using expectation manipulation if it is not appropriately investigated.

      In addition, although fMRI is still (probably) the best available tool we have to understand the spatial representation of cortical processing, limitations about not only the temporal but even the spatial resolution should be acknowledged. Given the anatomical and physiological complexity of the cortical connections, as we know from the animal world, it is still well possible that subcircuits are activated also for positive and negative expectations, but cannot be observed due to the limitation of our techniques. Indeed, on an empirical/evolutionary basis it would remain unclear why we should have a system that waits for the valence of a stimulus to show differential responses.

      Also, moving in a dimension of network and graph theory, one would not expect single areas to be responsible for distinct processes, but rather that they would integrate information in a shared way, potentially with different feedback and feedforward communications. As such, it becomes more difficult to assume the insula is a center for coding potential pain, perhaps more of a node in a system that signals potential dangers for the integrity of the body.

      The authors analyze the EEG signal from 0.5 to 128 Hz, finding significant results in the correlation between single-trial BOLD and EEG activity in the higher gamma range (see Figure 6 panel C). It would be interesting to understand the rationale for including such high frequencies in the signal, and the interpretation of the significant correlation in the high gamma range.

    3. Reviewer #2 (Public Review):

      I think this is a very promising paper. The combination of EEG and fMRI is unique and original. However, I also have some suggestions that I think could help improve the manuscript.

      This manuscript reports the findings of an EEG-fMRI study (n = 50) on the effects of expectations on pain. The combination of EEG with fMRI is extremely original and well-suited to study the transition from expectation to perception. However, I think that the current treatment of the data, as well as the way that the manuscript is currently written, does not fully capitalize on the potential of this unique dataset. Several findings are presented but there is currently no clear message coming out of this manuscript.

      First, one positive point is that the experimental manipulation clearly worked. However, it should be noted that the instructions used are not typical of studies on placebo/nocebo. Participants were not told that the stimulations would be of higher/lower intensity. Rather, they were told that objective intensities were held constant, but that EEG recordings could be used to predict whether they would perceive the stimulus as more or less intense. I think that this is an interesting way to manipulate expectations, but there could have been more justification in the introduction for why the authors have chosen this unusual procedure.

      Also, the introduction mentions that little is known about potential cerebral differences between expectations of high vs. low pain. I think the fear conditioning literature could be cited here. Activations in ACC, SMA, Ins, parahippocampal gyrus, PAG, etc. are often associated with upcoming threat, whereas activations in vmPFC/default mode network are associated with safety.

      The fact that the authors didn't observe a clearer distinction between high and low expectations here could be related to their specific instructions that imply that the stimulus is the same and that it is the subjective perception that is expected to change. In any case, this is a relatively minor issue that is easy to address.

      Towards the end of the introduction, the authors present the aims of the study in mainly exploratory terms:

      (1) What are the differences between anticipation and perception?

      (2) What regions display a difference between high and low expectations (high > low or low < high) vs. an effect of expectation regardless of the direction (high and low different than neutral)?

      I think these are good questions, but the authors should provide more justification, or framework, for these questions. More specifically, what will they be able to conclude based on their observations?

      For instance (note that this is just an example to illustrate my point; I encourage the authors to come up with their own framework/predictions):

      (1) Possibility #1: A certain region encodes expectations in a directed fashion (high > low) and that same region also responds to perception in the same direction (high > low). This region would therefore modulate pain by assimilating perception towards expectations.

      (2) Possibility #2: Different regions are involved in expectation and perception. Perhaps this could mean that certain regions influence pain processing through descending facilitation for instance...

      Regarding analyses, I think that examining the transition from expectations to perception is a strong angle of the manuscript given the EEG-fMRI nature of the study. However, I feel that more could have been done here. One problem is that the sequence of analyses starts by identifying an fMRI signal of interest and then attempts to find its EEG correlates. The problem is that the low temporal resolution of fMRI makes it difficult to differentiate expectation from perception, which doesn't make this analysis a good starting point in my opinion. Why not start by identifying an EEG signal that differentiates perception vs expectation, and then look for its fMRI correlates?

      Finally, I found the hypotheses on "valenced" vs. "absolute" effects a little bit more difficult to follow. This is because "neutral" is not really neutral: it falls in between low and high. If I follow correctly, participants know that the temperature is always the same. Therefore, if they are told that the machine cannot predict whether their perception is going to be low or high, then it must be because it is likely to be in between. Ratings of expectation and pain ratings confirm that. The neutral condition is not "devoid" of expectations as the authors suggest. Therefore, it would make sense to look at regions with the following pattern low > neutral > high, or vice-versa, low < neutral < high. Low & high being different than neutral is more difficult to interpret. I don't think that you can say that it reflects "absolute" expectations because neutral is also the expectation of a medium temperature. Perhaps it reflects "certainty/uncertainty" or something like that, but it is not clear that it reflects "expectations".

    1. eLife assessment

      This interesting study explores whether tumor cells can manipulate their Hydra hosts and has useful findings on the consequences for the fitness of the host Hydra. However, the evidence supporting these findings was incomplete and would benefit from the addition of several control experiments. The work will be of broad interest to many fields including developmental biology, evolutionary biology and tumor biology.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Boutry et al. examined a cnidarian Hydra model system where spontaneous tumors manifest in laboratory settings, and lineages featuring vertically transmitted neoplastic cells (via host budding) have been sustained for over 15 years. They observed that hydras harboring long-term transmissible tumors exhibit an unexpected augmentation in tentacle count. In addition, the presence of extra tentacles, enhancing the host's foraging efficiency, correlated with an elevated budding rate, thereby promoting tumor transmission vertically. This study provided evidence that tumors, akin to parasitic entities, can also exert control over their hosts.

      Strengths:

      The manuscript is well-written, and the phenotype is intriguing.

      Weaknesses:

      The quality of this manuscript could be improved if more evidence were to be provided regarding the beneficial versus detrimental effects of the tumors.

    3. Reviewer #2 (Public Review):

      Background and Summary:

      This study addresses the intriguing question of whether and how tumors can develop in the freshwater polyp hydra and how they influence the fitness of the animals. Hydra is notable for its significant morphogenetic plasticity and nearly unlimited capacity for regeneration. While its growth through asexual reproduction (budding) and the associated processes of pattern formation have been extensively studied at the cellular level, the occurrence of tumors was only recently described in two strains of Hydra oligactis (Domazet-Lošo et al, 2014). In that research, an arrest in the differentiation of female germ cells led to an accumulation of germline cells that failed to develop into eggs. In hydra, fertile egg cells typically incorporate nurse cells, which originate from large interstitial stem cells (ISCs) restricted to the germline, through apoptosis. However, this increase in apoptosis activity is absent in "germline tumors," and germline ISCs instead form slowly growing patches that do not compromise tissue integrity. Despite the upregulation of certain genes associated with mammalian neoplasms (such as tpt1 and p23) in this tissue, determining whether this differentiation arrest and the resulting egg patches truly constitute neoplasms remains a challenge.

      The authors have recently published two papers on the ecological and evolutionary aspects of hydra tumor formation (Boutry et al 2022, 2023), which is also the focus of this manuscript. They transplanted tissues derived from animals with germline tumors to wildtype animals and analyzed their growth patterns, specifically the number of tentacles in the host tissue. They observed that such tissues induced the growth of additional tentacles compared to tissues without germline tumors. The authors conclude that this growth pattern (increased number of tentacles) is correlated with "reducing the burden on the host by (over-)compensating for the reproductive costs of tumors" and claim that "transmissible tumors in hydra have evolved strategies to manipulate the phenotype of their host". While it might be stimulating to add a fresh view from other disciplines (here, ecological and evolutionary aspects), the authors completely ignore the current knowledge of the underlying cell biology of the processes they analyze.

      Strengths:

      The study focuses on intriguing questions: whether and how tumors can develop in the freshwater polyp hydra, and how they influence the fitness of the animals.

      Weaknesses:

      Concept of germline tumors.

      The conceptual foundation of their experiments on germline tumors was the study of Domazet-Lošo et al (2014) introducing the concept of germline tumors in hydra (see above). While this is an intriguing hypothesis, there has been little advancement in comprehending the molecular mechanisms underlying tumor formation in hydra beyond this initial investigation. Germline tumors in hydra do not fully meet the typical criteria for neoplasms observed in mammalian tissues. More importantly, a similar phenotype was already reported by the work of Paul Brien and described as "crise gametique" (Brien, 1966, Biologie de la reproduction animale - Blastogenèse, Gamétogenèse, Sexualisation, ed. Masson & Cie, Paris). This phenomenon of gametic crisis is unique to Hydra oligactis, a stenotherm, cold-adapted cosmopolitan species. In this species, gametogenesis severely impacts the vitality of the polyps, often leading to complete exhaustion and death (Tardent, 1974). Animals can only be rescued during the initial phase of the cold-induced sexual period (see also the research of Littlefield, 1984, 1985, 1986, 1991). The observed differentiation arrest in germline tumors might represent an epigenetically established consequence of surviving gametogenesis. Regrettably, this important work was not mentioned by the authors or by Domazet-Lošo et al. (2014), highlighting a notable gap in the recognition of basic research in this area that might challenge the hydra tumor hypothesis.

      "Supernumerary" tentacles in graft experiments.

      The authors describe that after grafting tissue from animals with germline tumors to wild-type animals, the number of tentacles in the host tissue increased when the donor tissue had germline tumors. A maximum effect of four additional tentacles was found with donor strain H. oligactis robusta and three additional tentacles with donor strain H. oligactis St Petersburg. In general, H. oligactis wild-type host strains had fewer tentacles than H. oligactis St Petersburg strains. This is consistent with the results of Domazet-Lošo et al (2014) who showed that the number of tentacles increased in the strains with germline tumors. What conclusions can be drawn from these experiments? The authors might want to conclude that transmissible tumors in Hydra have developed strategies to manipulate the phenotype of their host. But there is no evidence for this, as essential controls are missing. It is known that the size of hydra polyps is proportion-regulated, i.e. the number of tentacles varies with the size and number of (epithelial) cells. Such controls are missing in the experiments. There is also a lack of controls from wild-type animals in gametogenesis: it is very likely that grafts with wild-type animals with egg spots of comparable size as the germline tumors (see above) will result in similar numbers of tentacles in host tissue.

    1. eLife assessment

      This is an important and timely study that advances our understanding of the role of lateral hypothalamic orexin/hypocretin neurons in appetitive approach and consummatory behaviors. Specifically, using fiber photometry, the authors provide solid and convincing evidence that orexin neurons are primarily active during approach and not consummatory behavior, in a manner that is dependent on metabolic state. Further, using optogenetics and cell type-specific electrophysiology, they show that inputs from the ventral pallidum and lateral nucleus accumbens shell to orexin/hypocretin neurons in the lateral hypothalamus are predominantly inhibitory.

    2. Reviewer #1 (Public Review):

      Summary:

      Using fiber photometry, Mitchell et al. report that the calcium activity of lateral hypothalamic orexin neurons increases during the approach to a food pellet in a manner that depends on the metabolic state and begins to return to baseline prior to and during food consumption. This activity is also enhanced during the approach to palatable food relative to a standard chow pellet. They also present ex vivo electrophysiological evidence that GABAergic neurons in the ventral pallidum and lateral nucleus accumbens shell, but not medial nucleus accumbens shell, provide predominantly inhibitory, monosynaptic input onto lateral hypothalamic neurons. Overall, most claims are well supported by the data, though the electrophysiology analysis is somewhat limited and some information that could inform interpretation of the data is lacking.

      Strengths:

      (1) The fiber photometry recordings make use of an isosbestic control, and the signals were aligned using linear regression after baseline correction and calculation of robust z-scores.

      (2) The fiber photometry analyses are based on animal averages rather than trial-based averages; the latter can result in Type 1 errors without appropriate measures to account for the influence of the subject.

      (3) Monosynaptic currents from GABAergic inputs from the ventral pallidal and lateral shell are identified by the remaining current in the presence of tetrodotoxin (TTX) and 4-aminopyridine (4-AP).
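
      For readers unfamiliar with the preprocessing referred to in strength (1), the following is a minimal sketch of one common ordering of the steps, and is an assumption rather than the manuscript's exact pipeline: the isosbestic channel is fitted to the calcium-dependent channel by least-squares regression, the fitted trace is subtracted, and the result is expressed as a robust z-score (median and median absolute deviation of a baseline window). All function names are hypothetical.

```python
from statistics import mean, median


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def correct_with_isosbestic(signal, isosbestic):
    """Subtract the isosbestic trace after scaling it onto the signal channel."""
    slope, intercept = fit_line(isosbestic, signal)
    fitted = [slope * v + intercept for v in isosbestic]
    return [s - f for s, f in zip(signal, fitted)]


def robust_z(values, baseline):
    """Robust z-score: (value - baseline median) / baseline MAD.

    No scaling constant is applied to the MAD here; some pipelines
    multiply it by 1.4826. A baseline with zero MAD is not handled.
    """
    med = median(baseline)
    mad = median([abs(v - med) for v in baseline])
    return [(v - med) / mad for v in values]
```

      Because the median and MAD are less sensitive to large transients than the mean and standard deviation, a few strong events in the baseline window distort the normalization less.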

      Weaknesses:

      (1) The data are not discussed in the context of the prior literature on ventral pallidal GABAergic inputs to the lateral hypothalamus (such as Prasad et al. 2020, JNeurosci) and it is not clear whether these patterns of monosynaptic inhibitory inputs are specific to orexin neurons.

      (2) The paper does not address whether there are synaptic inputs from non-GABAergic ventral pallidum neurons, though very recent work suggests that ventral pallidal projections to the lateral hypothalamus may be enriched with glutamatergic RNA markers relative to other projections (Bernet et al. 2024, JNeurosci). Some statements in the manuscript refer to ventral pallidal inputs in general, despite the use of cell-type specific expression in VGAT-cre mice.

      (3) The statistical analysis of the electrophysiology data is limited and does not appear to account for the lack of independence for cells recorded from the same mouse.
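
      One simple way to respect the non-independence raised in weakness (3) is to collapse cell-level measurements to one value per mouse before testing, so that the mouse, not the cell, is the unit of analysis. The sketch below illustrates only that aggregation step; the record layout and function name are hypothetical, and a mixed-effects model with mouse as a random effect would be an alternative that retains cell-level information.

```python
from collections import defaultdict
from statistics import mean


def per_mouse_means(records):
    """Average cell-level values within each mouse.

    `records` is an iterable of (mouse_id, value) pairs, one per recorded cell.
    Returns a dict mapping mouse_id to the mean value across that mouse's cells.
    """
    by_mouse = defaultdict(list)
    for mouse_id, value in records:
        by_mouse[mouse_id].append(value)
    return {mouse_id: mean(values) for mouse_id, values in by_mouse.items()}
```

      Group comparisons would then be run on the per-mouse values, at the cost of reduced sample size.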

    3. Reviewer #2 (Public Review):

      Summary:

      Mitchell & Mohammadkhani et al. used an Orexin-Cre mouse line with a Cre-dependent GCaMP virus to perform lateral hypothalamic (LH) Ca2+ fiber photometry recordings in mice during the approach to food under various metabolic and saliency conditions. They also used a Vgat-Cre mouse line with Cre-dependent ChR2 in various regions of the ventral striatopallidal (VSP) complex in combination with an Orexin promoter-driven reporter virus labeling Orx-LH neurons to assess electrophysiological connectivity of inhibitory/excitatory inputs from VSP to Orx-LH. Overall, the authors note that Orx-LH Ca2+ activation occurs during approach to food (but not consumption of food), and that VSP->Orx-LH connectivity is primarily monosynaptic and inhibitory, although this varies across subregions, with some monosynaptic excitatory input as well. While their methods and analyses are technically sound and the manuscript is clearly written and presented, the further knowledge gained over previous work is rather incremental and does not produce a substantial shift in the existing framework.

      Strengths:

      Cell type specificity of OX/HT recordings is confirmed by post-hoc immunostaining, both for fiber photometry and electrophysiological connectivity. This is an important strength given the contentious history of cell specificity in various transgenic OX/HT mouse lines.

      Clearly implicating metabolic state and food saliency as factors impacting OX/HT activity dynamics is a strength, and linking the influence of ghrelin receptor signaling is relatively novel.

      Weaknesses:

      In fiber photometry traces, OX/HT activity begins increasing 2-3 seconds prior to the food approach (Figures 1F and 1G), requiring an explanation. One possibility is that mice may be detecting odorant cues indicative of food prior to the physical approach.

      Figure 1F - the authors' interpretation that OX/HT activity doesn't actually decrease during consumption, but simply "trends toward baseline" is complicated by the fact that the authors shaded 20s-30s intervals labeled "eating". Mice do not typically consume food for 20-30s nonstop. Mice typically consume for ~1-5 seconds, then they take a break, then they resume.

      The authors state in the Discussion "... the reduction in OX/HT cell activity was more closely correlated with the termination of approach behavior" (rather than with eating per se). However, in many cases, mice begin consuming food immediately after approaching it, so it is puzzling that there is an activity reduction following the approach, but not an activity reduction upon consumption. In other words, the cessation of approach and the beginning of consumption are often tightly linked together in rapid sequence.

      Figure 2E - the single polysynaptic oIPSC appears to have the same/similar latency as many of the Monosynaptic oIPSCs. Close proximity of consecutive oIPSCs may affect the analysis of amplitude and latency. For example, in representative traces of Figure 2C, it is unlikely to get an accurate measure of the second oIPSC.

      The comparison of apparent connectivity differences between VP vs. mNAcSh vs. lNAcSh is limited by the lack of appropriate anatomical quantification and demonstration. When using a Vgat-Cre mouse line and targeting the VSP, there is the potential for massive viral spread across the entire Nucleus accumbens/VP/SI/BNST area.

      How do the electrophysiological properties of OX/HT neurons (and VSP inputs) change across metabolic/saliency states? For example, under High Fat Diet, chronic Food Restriction, and chronic Ghrelin. This seems to be the fundamental question that the authors are working toward, but it is not resolved with the current data set.

      Potential Ephys Pitfall: a high Chloride internal solution means that oEPSCs might actually be GABAergic after all. Low Chloride solution, so Cl reversal potential is closer to RMP (or put more Chloride in pipette so it has more depolarized potential than resting- to reverse current mediated by Chloride ions). However, the internal solution used for oEPSCs was calculated to have a Cl reversal potential at ~ -20mV; thus, the Cl-mediated PSCs would be depolarizing when cells were held at -65mV. Did the authors apply any blockers in the bath to confirm that recorded oEPSCs were glutamatergic?

    4. Reviewer #3 (Public Review):

      Summary:

      Orexin/hypocretin (OX/HT) neurons are implicated in food intake and there is evidence supporting OX/HT neurons' role in reward consumption potentially influenced by animal's metabolic state. Here, Mitchell, Mohammadkhani, et al. use fiber photometry to dissociate OX/HT neurons' role in reward-seeking by contrasting their role in reward consumption. Mice were given normal chow or palatable food in a fed or fasted state. The authors recorded GCAMP signals from OX/HT neurons during food approach and consumption. They observed heightened OX/HT GCAMP signals during the food approach; in contrast, they saw the signals decline during arrival at the food source and during food consumption. In a second set of experiments, the authors investigate upstream circuits that could potentially gate OX/HT neurons. They use optogenetics to directly stimulate inhibitory inputs arriving from either the ventral pallidum, the medial, or the lateral nucleus accumbens shell to OX/HT neurons. They investigated if these circuits impinge monosynaptically or polysynaptically onto OX/HT neurons to assess their functional role in inhibiting these neurons. The authors found that the ventral pallidum and the lateral but not medial nucleus accumbens shell exert inhibitory control over OX/HT neurons.

      Strengths:

      The manuscript is well-written, employs suitable statistical analyses, and the conclusions are generally supported by the results.

      Weaknesses:

      Larger group sizes in some instances and causal manipulation of the inhibitory circuits during reward approach vs consumption would enable the authors to make stronger assertions about these circuits' role in gating OX/HT neurons in these behaviors.

    1. eLife assessment

      This important study substantially advances our understanding of energy landscapes and their link to animal ontogeny. The evidence supporting the conclusions is compelling, with high-throughput telemetry data and advanced track segmentation methods used to develop and map energy landscapes. The work will be of broad interest to animal ecologists.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors propose that the energy landscape of animals can be thought of in the same way as the fundamental versus realized niche concept in ecology. Namely, animals will use a subset of the fundamental energy landscape due to a variety of factors. The authors then show that the realized energy landscape of eagles increases with age as the animals are better able to use the energy landscape.

      Strengths:

      This is a very interesting idea that adds significantly to the energy landscape framework. They provide convincing evidence that the available regions used by birds increase with size.

      Weaknesses:

      Some of the measures used in the manuscript are difficult to follow and there is no mention of the morphometrics of birds or how these change with age (other than that they don't change which seems odd as surely they grow). Also, there may need to be more discussion of other ontogenetic changes such as foraging strategies, home range size etc.

    3. Reviewer #2 (Public Review):

      Summary:

      With this work, the authors tried to expand and integrate the concept of realized niche in the context of movement ecology by using fine-scale GPS data of 55 juvenile Golden eagles in the Alps. Authors found that ontogenic changes influence the percentage of area flyable to the eagles as individuals exploit better geographic uplifts that allow them to reduce the cost of transport.

      Strengths:

      Authors made insightful work linking changes in ontogeny and energy landscapes in large soaring birds. It may not only advance the understanding of how changes in the life cycle affect the exploitability of aerial space but also offer valuable tools for the management and conservation of large soaring species in the changing world.

      Weaknesses:

      Future research may test the applicability of the present work by including more individuals and/or other species from other study areas.

    1. eLife assessment

      This important work significantly advances the field of computational modelling of genome organisation through the development of OpenNucleome. The evidence supporting the tool's effectiveness is compelling, as the authors compare their predictions with experimental data. It is anticipated that OpenNucleome will attract significant interest from the biophysics and genomics communities.

    2. Reviewer #1 (Public Review):

      Summary:

      In this paper, the authors develop a comprehensive program to investigate the organization of chromosome structures at 100 kb resolution. It is extremely well executed. The authors have thought through all aspects of the problem. The resulting software will be most useful to the community. Interestingly, they capture many experimental observations accurately. I have very few complaints.

      Strengths:

      A lot of details are provided. The success of the method is well illustrated. Software is easily available.

      Weaknesses:

      The number of parameters in the energy function is very large. Any justification? Could they simplify the functions?

      What would the modification be if the resolution is increased?

      They should state that the extracted physical values are scale dependent. Example, viscosity.

    3. Reviewer #2 (Public Review):

      Summary:

      In this work, Lao et al. develop an open-source software (OpenNucleome) for GPU-accelerated molecular dynamics simulation of the human nucleus accounting for chromatin, nucleoli, nuclear speckles, etc. Using this, the authors investigate the steady-state organization and dynamics of many of the nuclear components.

      Strengths:

      This is a comprehensive open-source tool to study several aspects of the nucleus, including chromatin organization, interactions with lamins and organization, and interactions with nuclear speckles and nucleoli. The model is built carefully, accounting for several important factors and optimizing the parameters iteratively to achieve experimentally known results. Authors have simulated the entire genome at 100kb resolution (which is a very good resolution to simulate and study the entire diploid genome) and predict several static quantities such as the radius of gyration and radial positions of all chromosomes, and time-dependent quantities like the mean-square displacement of important genomic regions.

      Weaknesses:

      One weakness of the model is that it has several parameters. Some of them are constrained by the experiments. However, the role of every parameter is not clear in the manuscript.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors present OpenNucleome, a computational tool for simulating the structure and dynamics of the human nucleus. The software models nuclear components, including chromosomes and nuclear bodies, and incorporates GPU acceleration for potential performance gains. The authors aim to advance the understanding of nuclear organization by providing a tool that aligns with experimental data and is accessible to the genome architecture research community.

      Strengths:

      OpenNucleome provides a model of the nucleus, contributing to the advancement of computational biology. Utilizing GPU acceleration with OpenMM may offer potential performance improvements.

      Weaknesses:

      It could still benefit from clearer explanations regarding the generation and usage of input and output files and compatibility with other tools.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      Comment 0: In this paper, the authors develop a comprehensive program to investigate the organization of chromosome structures at 100 kb resolution. It is extremely well executed. The authors have thought through all aspects of the problem. The resulting software will be most useful to the community. Interestingly they capture many experimental observations accurately.

      I have very few complaints.

      We appreciate the reviewer’s strong assessment of the paper’s significance, novelty, and broad interest, and we thank them for the detailed suggestions and comments.

      Comment 1: The number of parameters in the energy function is very large. Is there any justification for this? Could they simplify the functions?

      We extend our gratitude to the reviewer for their insightful remarks. The parameters within our model can be categorized into two groups: those governing chromosome-chromosome interactions and those governing chromosome-nuclear landmark interactions.

      In terms of chromosome-chromosome interactions, the parameter count is relatively modest compared to the vast amount of Hi-C data available. For instance, while the whole-genome Hi-C matrix at the 100KB resolution encompasses approximately $30321^2$ contacts, our model comprises merely six parameters for interactions among different compartments, along with 1000 parameters for the ideal potential. As outlined in the supporting information, the ideal potential is contingent upon sequence separation, with 1000 chosen to encompass bead separations of up to 100MB. While it is theoretically plausible to reduce the number of parameters by assuming interactions cease beyond a certain sequence separation, determining this scale a priori presents a challenge.

      During the parameterization process, we observed that interchromosomal contacts predicted solely based on compartmental interactions inadequately mirrored Hi-C data. Consequently, we introduced 231 additional parameters to more accurately capture interactions between distinct pairs of autosomes. These interactions may stem from factors such as non-coding RNA or proteins not explicable by simple, non-specific compartmental interactions.

      Regarding parameters concerning chromosome-nuclear landmark interactions, we have 30321 parameters for speckles and 30321 for the nuclear lamina. To streamline the model, we opted to assign a unique parameter to each chromatin bead. However, it is conceivable that many chromatin beads share a similar mechanism for interacting with nuclear lamina or speckles, potentially allowing for a common parameter assignment. Nonetheless, implementing such simplification necessitates a deeper mechanistic understanding of chromosome-nuclear landmark interactions, an aspect currently lacking.

      As our comprehension of nuclear organization progresses, the interpretability of parameter counts may improve, facilitating their reduction.
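
      The parameter counts quoted in this response can be tallied with a few lines of arithmetic. The sketch below only restates the numbers given above; the variable names are illustrative and are not taken from the OpenNucleome code. It also checks that 231 equals the number of unordered pairs among 22 autosomes and that 1000 separations at 100 kb per bead span 100 Mb.

```python
from math import comb

# Numbers quoted in the author response (variable names are illustrative).
n_bins = 30321            # 100 kb bins; one speckle and one lamina parameter per bin
n_compartment = 6         # compartment-compartment interaction parameters
n_ideal = 1000            # ideal potential: one parameter per sequence separation
n_interchrom = 231        # pairwise autosome-autosome parameters

# 231 is the number of unordered pairs among 22 autosomes.
assert comb(22, 2) == n_interchrom

# 1000 bead separations at 100 kb per bead cover 100 Mb.
max_separation_mb = n_ideal * 100 / 1000

n_chr_chr = n_compartment + n_ideal + n_interchrom   # chromosome-chromosome terms
n_landmark = 2 * n_bins                              # speckle + lamina terms
n_total = n_chr_chr + n_landmark
n_hic_entries = n_bins ** 2                          # size of the Hi-C matrix

print(f"chromosome-chromosome parameters: {n_chr_chr}")
print(f"chromosome-landmark parameters:   {n_landmark}")
print(f"total parameters:                 {n_total}")
print(f"Hi-C matrix entries:              {n_hic_entries}")
print(f"ideal potential range (Mb):       {max_separation_mb}")
```

      Under these numbers, most of the parameters sit in the per-bead landmark terms, which is where the authors suggest a shared-parameter simplification might eventually apply.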

      Comment 2: What would the modification be if the resolution is increased?

      To increase the resolution of chromatin, we can in principle keep the same energy function as defined in Eq. S6. In this case, we only need to carry out further parameter optimization.

      However, transitioning to higher resolutions may unveil additional features not readily apparent at 100kb. Notably, chromatin loops with an average size of 200kb or smaller have been identified in high-resolution Hi-C data [1]. To effectively capture these loops, new terms in the energy function must be incorporated. For instance, Qi and Zhang [2] employed additional contact potentials between CTCF sites to account for loop formation. Alternatively, an explicit loop-extrusion process could be introduced to model loop formation more accurately.

      Comment 3: They should state that the extracted physical values are scale-dependent. For example, viscosity.

      We thank the reviewer for the comment and would like to clarify that our model does not predict the viscosity. The nucleoplasmic viscosity was set as 1 Pa·s to produce a diffusion coefficient that reproduces the experimental value. The exact value for the nucleoplasmic viscosity is still rather controversial, and our selected value falls in the range of reported experimental values from $10^{-1}$ Pa·s to $10^{2}$ Pa·s.

      We have modified the main text to clarify the calculation of the diffusion coefficient.

      “The exponent and the diffusion coefficient $D_\alpha = (27 \pm 11) \times 10^{-4}\,\mu\mathrm{m}^2 \cdot \mathrm{s}^{-\alpha}$ both match well with the experimental values [cite], upon setting the nucleoplasmic viscosity as 1 Pa·s (see Supporting Information Section: Mapping the reduced time unit to real time for more details).”

      Reviewer 2:

      Comment 0: In this work, Lao et al. develop an open-source software (OpenNucleome) for GPU-accelerated molecular dynamics simulation of the human nucleus accounting for chromatin, nucleoli, nuclear speckles, etc. Using this, the authors investigate the steady-state organization and dynamics of many of the nuclear components.

      We thank the reviewer for the summary of our work.

      Comment 1: The authors could introduce a table having every parameter and the optimal parameter value used. This would greatly help the reader.

      We would like to point out that model parameters are indeed provided in Tables S1, S2, S3, S4, and Fig. S7. In these tables, we further provided details on how the parameters were determined.

      Given the large number of parameters for the ideal potential (1000), we opted to plot it rather than listing out all the numbers. We added three new figures to plot the interaction parameters between chromosomes, between chromosomes and speckles, and between chromosomes and the nuclear lamina. Numerical values can be found online in the GitHub repository (parameters).

      Comment 2: How many total beads are simulated? Do all beads have the same size?

      The total number of coarse-grained beads is 70542, including 60642 chromatin beads, 300 nucleolus beads, 1600 speckle beads, and 8000 nuclear lamina beads. The radius of the chromatin, nucleolus, and speckle beads is 0.25, while that of the lamina beads is 0.5. More information on the size and number of the beads is provided in the Section: Components of the whole nucleus model.

      Comment 3: In Equation S17, what do the 3rd and 4th powers mean? What necessitates them?

      The potential defined in Equation S17 follows the definition of the class2 bond in the LAMMPS package (LAMMPS docs). Compared to a typical harmonic potential, the presence of higher-order terms produces a sharper increase in the energy at large distances (Author response image 1). This essentially reduces the fluctuation of bond length in simulations.

      Author response image 1.

      Comparison between the Class2 potential (defined in Eq. S17) and the Harmonic potential ($K(r - r_0)^2$, with $K = 20$ and $r_0 = 0.5$).
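
      Since the response image is not reproduced here, the comparison can be sketched numerically. The LAMMPS class2 bond style has the form $E = K_2 (r - r_0)^2 + K_3 (r - r_0)^3 + K_4 (r - r_0)^4$. In the minimal sketch below, the harmonic values $K = 20$ and $r_0 = 0.5$ come from the caption above, while the class2 coefficients `k3` and `k4` are hypothetical placeholders, because the values used in Eq. S17 are not quoted in this response.

```python
def harmonic(r, k=20.0, r0=0.5):
    """Harmonic bond energy K (r - r0)^2, with K and r0 from the caption."""
    return k * (r - r0) ** 2


def class2(r, k2=20.0, k3=20.0, k4=20.0, r0=0.5):
    """Class2-style bond energy with quadratic, cubic, and quartic terms.

    k3 and k4 are hypothetical values chosen for illustration only.
    """
    dr = r - r0
    return k2 * dr**2 + k3 * dr**3 + k4 * dr**4


# Tabulate both potentials for increasingly stretched bonds.
for r in (0.5, 0.75, 1.0, 1.5):
    print(f"r = {r:.2f}   harmonic = {harmonic(r):7.3f}   class2 = {class2(r):7.3f}")
```

      With positive higher-order coefficients, the class2 energy grows faster than the harmonic one as the bond stretches, which is the behavior the authors describe as limiting bond-length fluctuations.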

      Comment 4: What do the X-axis and Y-axis numbers in Figure 5A and 5B mean? What are their units?

      We apologize for the lack of clarity in our original figure. In Fig. 5A, the X and Y axes depict the simulated and experimental radius of gyration (Rg) for individual chromosomes, as indicated in the title of the figure. Similarly, in Fig. 5B, the X and Y axes depict the simulated and experimental radial position of individual chromosomes.

      We have converted the chromosome Rg values into reduced units and labeled the corresponding axes in the updated figure (Fig. 5). The normalized radial position is unitless and its detailed definition is included in the supporting information Section: Computing simulated normalized chromosome radial positions. We updated the figure caption to provide an explicit reference to the SI text.

      Reviewer 3:

      Comment 0: In this work, the authors present the development of OpenNucleome, a software for simulating the structure and dynamics of the human nucleus. It provides a detailed model of nuclear components such as chromosomes and nuclear bodies, and uses GPU acceleration for better performance based on the OpenMM package. The work also shows the model’s accuracy in comparisons with experimental data and highlights the utility in the understanding of nuclear organization. While I consider this work a good tool for the genome architecture scientific community, I have some comments and questions that could further clarify the usage of this tool and help potential users. I also have a few questions that would help to clarify the technique and results and some suggestions for references.

      We appreciate the reviewer’s strong assessment of the paper’s significance, novelty, and broad interest, and we thank them for the detailed suggestions and comments.

      Comment 1: Could the authors elaborate on what they consider to be ’well-established and easily adoptable modeling tools’?

      By well established, we meant models that have been extensively validated and verified, and are highly regarded by the community.

      By easily adoptable, we meant tools that are well documented and can be learned relatively easily by new groups without help from the developers.

      We have revised the text to clarify our meaning.

      “Despite the progress made in computational modeling, the absence of well-documented software with easy-to-follow tutorials poses a challenge.”

      Comment 2: Recognizing the value of a diverse range of tools in the community, the Open-MiChroM tool is also an open-source platform built on top of OpenMM. The documentation shows various modeling approaches and many tutorials that contain different approaches besides the MiChroM energy function. How does OpenNucleome compare in terms of facilitating cross-validation and user accessibility? The two tools seem to be complementary, which is a gain to the field. I recommend adding one or two sentences on the matter. Also, while navigating the OpenNucleome GitHub, I have not found the tutorials mentioned in the text. I also consider the process of generating the necessary input files to be a barrier. I would suggest expanding the tutorials and documentation to help potential users.

      We thank the reviewer for the excellent comments. We agree that while many of the tutorials were included in the original package, they were not clearly documented. We have revised them extensively and now present:

      • A tutorial for optimizing chromosome-chromosome interactions.

      • A tutorial for optimizing chromosome-nuclear landmark interactions.

      • A tutorial for building initial configurations.

      • A tutorial for relaxing the initial configurations.

      • A tutorial for selecting the initial configurations.

      • A tutorial for setting up and performing Langevin dynamics simulations.

      • A tutorial for setting up and performing Brownian dynamics simulations.

      • A tutorial for setting up and performing simulations with a deformed nucleus.

      • A tutorial for analyzing simulation trajectories.

      • A tutorial for introducing new features to the model.

      These tutorials and our well-documented, open-source code (https://zhanggroup-mitchemistry.github.io/OpenNucleome) should significantly promote user accessibility. Our inclusion of Python scripts for analyzing simulation trajectories should allow users to compute various quantities for evaluating and comparing model quality.

      We added a new paragraph in the Section: Conclusions and Discussion of the main text to compare OpenNucleome with existing software for genome modeling.

      “Our software enhances the capabilities of existing genome simulation tools [cite]. Specifically, OpenNucleome aligns with the design principles of Open-MiChroM [cite], prioritizing open-source accessibility while expanding simulation capabilities to the entire nucleus. Similar to software from the Alber lab [cite], OpenNucleome offers high-resolution genome organization that faithfully reproduces a diverse range of experimental data. Furthermore, beyond static structures, OpenNucleome facilitates dynamic simulations with explicit representations of various nuclear condensates, akin to the model developed by [citet].”

      Comment 3: Lastly, I would appreciate it if the authors could expand their definition of ’standardized practices’.

      We apologize for any confusion caused. By “standardized practices,” we refer to the fact that different groups often employ unique procedures for structural modeling. These procedures differ in the representation of chromosomes, the nucleus environment, and the algorithms for parameter optimization. This absence of a consensus on the optimal practices for genome modeling can be daunting for newcomers to the field.

      We have revised the text to the following to avoid confusion:

      “Many research groups develop their own independent software, which complicates cross-validation and hinders the establishment of best practices for genome modeling [3–5].”

      Comment 4: On page 7, the authors refer to the SI Section: Components of the whole nucleus model for further details. Could the authors provide more information on the simulated density of nuclear bodies? Is there experimental data available that details the ratio of chromatin to other nuclear components, which was used as a reference in the simulation?

      We thank the reviewer for the comment. Imaging studies have provided quantitative measures of the size and number of various nuclear bodies. For example, there are 2 to 5 nucleoli per nucleus, with a typical size $R_{\mathrm{No}} \approx 0.5\,\mu\mathrm{m}$ [6–10]. In the review by Spector and Lamond [11], the authors showed that there are 20 to 50 speckles, with a typical size $R_{\mathrm{Sp}} \approx 0.3\,\mu\mathrm{m}$. We used these numbers to guide our simulation of nuclear bodies. This information was mentioned in the Section: Chromosomes as beads on the string polymers of the supporting information.

      The chromatin density is fixed by the average size of chromatin bead and the nucleus size. We chose the size of chromatin based on imaging studies as detailed in the Subsection: Mapping chromatin bead size to real unit of the supporting information. Upon fixing the bead size, the chromatin volume is determined.

      Comment 5: In the statement, ’the ideal potential is only applied for beads from the same chromosome to approximate the effect of loop extrusion by Cohesin molecules for chromosome compaction and territory formation,’ it would be helpful if the authors could clarify the scope of this potential. Specifically, the code indicates that the variable ’dend ideal’ is set at 1000, suggesting an interaction along a 100Mb polymer chain at a resolution of 100Kb per bead. Could the authors elaborate on their motivation for the Cohesin complex’s activity having a significant effect over such long distances within the polymer chain?

      We thank the reviewer for the insightful comment. They are correct that the ideal potential was introduced to capture chromosome folding beyond the interactions between compartments, including loop extrusion. Practically, we parameterized the ideal potential such that the simulated average contact probabilities as a function of sequence separation match the experimental values. The reviewer is correct that beyond a specific value of sequence separation, one would expect the impact of loop extrusion on chromosome folding to be negligible, due to Cohesin dissociation. Correspondingly, the interaction potential should be zero at large sequence separations.

      However, it is important to note that the precise separation scale cannot be known a priori. We chose 100Mb as a conservative estimate. As we can see from Fig. S7, our parameterization scheme indeed produced interaction parameters that are mainly zero at large sequence separations. Interestingly, the scale at which the potential approaches 0 (∼ 500KB) agrees with the estimated length traveled by Cohesin molecules before dissociation [12].

      Comment 6: On pages 8 and 9, the authors discuss the optimization process. However, in reviewing the code and documentation available on the GitHub page, I could not find specific sections related to the optimization procedure described in the paper. In this context, I have a few questions: Could the authors provide more details or direct me to the parts of the documentation and the text/SI that address the optimization procedure used in their study? Additional clarification on the cost/objective function employed during the optimization process would be highly beneficial, as this was not readily apparent in the text.

      We thank the reviewer for the comment. We revised the SI to include the definition of the cost function for the Adam optimizer.

      “During the optimization process, our aim was to minimize the disparity between experimental findings and simulated data. To achieve this, we defined the cost function as follows:

      where the index i iterates over all the constraints defined in Eq. S28.”

      The detailed optimization procedure was included in the SI as quoted below

      “The details of the algorithm for parameter optimization are as follows

      (1) Starting with a set of values for and we performed 50 independent 3-million-step long MD simulations to obtain an ensemble of nuclear configurations. The first 500K steps of each trajectory are discarded as equilibration. We collected the configurations at every 2000 simulation steps from the rest of the simulation trajectories to compute the ensemble averages defined on the left-hand side of Eq. S13.

      (2) Check the convergence of the optimization by calculating the percentage of error defined as . The summation over i includes all the average contact probabilities defined in Eq. S28.

      (3) If the error is less than a tolerance value etol, the optimization has converged, and we stop the simulations. Otherwise, we update the parameters, α, using the Adam optimizer [13]. With the new parameter values, we return to step one and restart the iteration.”
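
      The sampling schedule in step (1) fixes how many configurations enter each ensemble average. The short calculation below uses only the numbers quoted above and assumes the discarded 500K steps are the initial segment of each trajectory; the exact count may differ by one frame per trajectory depending on whether the boundary frames are saved.

```python
# Sampling schedule quoted in step (1) of the optimization procedure.
n_trajectories = 50
steps_per_trajectory = 3_000_000
equilibration_steps = 500_000   # assumed to be the initial segment of each run
save_interval = 2_000           # one configuration saved every 2000 steps

production_steps = steps_per_trajectory - equilibration_steps
frames_per_trajectory = production_steps // save_interval
total_frames = n_trajectories * frames_per_trajectory

print(f"frames per trajectory: {frames_per_trajectory}")
print(f"frames per optimization iteration: {total_frames}")
```

      Each iteration of the optimization therefore averages over a finite set of configurations, which is why the estimated ensemble averages carry sampling noise.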

      Previously, the optimization code was included as part of the analysis folder. To avoid confusion and improve readability, a separate folder named optimization has been created. This folder provides the Adam optimization of chromosome-chromosome interactions (chr-chr optimization) and chromosome-nuclear landmarks interactions (chr-NL optimization).

      Comment 7: What was the motivation for choosing the Adam algorithm for optimization? Adam is designed for training on stochastic objective functions. Could the authors elucidate on the ’stochastic’ aspect of their function to be optimized? Why the Adam algorithm was considered the most appropriate choice for this application?

      We thank the reviewer for the comment. As defined in Eq. R1, the cost function measures the difference between the simulated constraints and the corresponding experimental values. The estimation of simulation values, by averaging over an ensemble of chromosome configurations, is inherently noisy and stochastic. Exact ensemble averages can only be achieved with unlimited samples obtained from infinitely long simulations.

      In the past, we have used Newton’s method for parameterization, and the detailed algorithm can be found in the SI of Ref. 14. However, we found that Adam is more efficient as it is a first-order method. Newton’s method, on the other hand, is a second-order method and requires estimation of the Hessian matrix. When the number of constraints is large, as in our case, the computational cost of estimating the Hessian matrix can be significant. Another advantage of the Adam algorithm lies in its adjustment of the learning rate during the optimization to further speed up convergence.
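
      To illustrate why a first-order method with adaptive step sizes suits a noisy objective, the sketch below applies the standard Adam update to a toy problem. It is not the authors' optimization code: the quadratic cost, the `target` vector standing in for experimental values, and the Gaussian noise standing in for finite-sampling error are all assumptions made for illustration.

```python
import numpy as np


def adam_minimize(noisy_grad, x0, lr=0.01, beta1=0.9, beta2=0.999,
                  eps=1e-8, n_iter=3000):
    """Standard Adam update driven by a noisy gradient estimate."""
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)   # running mean of gradients (first moment)
    v = np.zeros_like(x)   # running mean of squared gradients (second moment)
    for t in range(1, n_iter + 1):
        g = noisy_grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g**2
        m_hat = m / (1 - beta1**t)   # bias correction for the first moment
        v_hat = v / (1 - beta2**t)   # bias correction for the second moment
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return x


rng = np.random.default_rng(0)
target = np.array([1.0, -2.0, 0.5])   # hypothetical stand-in for experimental values


def noisy_grad(x):
    # Gradient of sum((x - target)**2) plus noise mimicking finite sampling.
    return 2.0 * (x - target) + rng.normal(scale=0.5, size=x.shape)


x_opt = adam_minimize(noisy_grad, x0=np.zeros(3))
print("optimized parameters:", np.round(x_opt, 2))
```

      Only gradient evaluations are needed, so the per-iteration cost grows linearly with the number of parameters, whereas a Newton step would also require estimating and inverting a Hessian whose size grows quadratically.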

      Comment 8: The authors mention that examples of setting up simulations, parameter optimization, and introducing new features are provided in the GitHub repository. However, I was unable to locate these examples. Could the authors guide me to these specific resources or consider adding them if they are not currently available?

      We thank the reviewer for the comment. We have improved the GitHub repository and all the tutorials can be found using the links provided in Response to Comment 2.

      Comment 9: Furthermore, the paper states that ’a configuration file that provides the position of individual particles in the PDB file format is needed to initialize the simulations.’ It would be beneficial for new users if the authors could elaborate on how this file is generated. And all other input files in general. Detailing the procedures for a new user to run their system using OpenNucleome would be helpful.

      We thank the reviewer for the comment. The procedure for generating initial configurations was explained in the SI Section: Initial configurations for simulations and quoted below.

      “We first created a total of 1000 configurations for the genome by sequentially generating the conformation of each one of the 46 chromosomes as follows. For a given chromosome, we start by placing the first bead at the center (origin) of the nucleus. The positions of the following beads, i, were determined from the (i − 1)-th bead as $\mathbf{r}_i = \mathbf{r}_{i-1} + 0.5\,\mathbf{v}$. $\mathbf{v}$ is a normalized random vector, and 0.5 was selected as the bond length between neighboring beads. To produce globular chromosome conformations, we rejected vectors, $\mathbf{v}$, that led to bead positions with distance from the center larger than 4σ. Upon creating the conformation of a chromosome i, we shift its center of mass to a value $r_i^{\mathrm{com}}$ determined as follows. We first compute a mean radial distance, with the following equation

      where Di is the average value of Lamin B DamID profile for chromosome i. Dhi and Dlo represent the highest and lowest average DamID values of all chromosomes, and 6σ and 2σ represent the upper and lower bound in radial positions for chromosomes. As shown in Fig. S6, the average Lamin B DamID profiles are highly correlated with normalized chromosome radial positions as reported by DNA MERFISH [cite], supporting their use as a proxy for estimating normalized chromosome radial positions. We then select as a uniformly distributed random variable within the range . Without loss of generality, we randomly chose the directions for shifting all 46 chromosomes.

      We further relaxed the 1000 configurations to build more realistic genome structures. Following an energy minimization process, one-million-step molecular dynamics (MD) simulations were performed starting from each configuration. Simulations were performed with the following energy function

      $$U = U_{\mathrm{Genome}} + U_{\mathrm{G\text{-}La}}$$

      where UGenome is defined as in Eq. S7. UG-La is the excluded volume potential between chromosomes and lamina, i.e., only the second term in Eq. S24. Parameters in UGenome were from a preliminary optimization. The end configurations of the MD simulations were collected to build the final configuration ensemble (FCE).”

      The tutorial for preparing initial configurations can be found at this link.
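
      The chain-growth step quoted above can be sketched as follows. This is an illustrative reimplementation, not the code from the OpenNucleome tutorial: the function names, the bead count, and the radial distance of 3.0 are hypothetical, and lengths are in units of σ.

```python
import numpy as np

def random_unit_vector(rng):
    """Draw a direction uniformly on the unit sphere."""
    v = rng.normal(size=3)
    return v / np.linalg.norm(v)

def grow_chromosome(n_beads, bond=0.5, r_max=4.0, rng=None):
    """Random-walk chain starting at the origin, confined to a sphere of radius r_max."""
    rng = np.random.default_rng() if rng is None else rng
    coords = np.zeros((n_beads, 3))  # the first bead sits at the origin
    for i in range(1, n_beads):
        while True:
            trial = coords[i - 1] + bond * random_unit_vector(rng)
            # Reject steps that leave the sphere, which keeps the chain globular.
            if np.linalg.norm(trial) <= r_max:
                coords[i] = trial
                break
    return coords

def shift_center_of_mass(coords, radial_distance, rng):
    """Move the chain so its center of mass lies at radial_distance along a random direction."""
    direction = random_unit_vector(rng)
    return coords - coords.mean(axis=0) + radial_distance * direction

rng = np.random.default_rng(1)
chain = grow_chromosome(200, rng=rng)
shifted = shift_center_of_mass(chain, radial_distance=3.0, rng=rng)  # 3.0 is an arbitrary example value
```

      In the procedure described above, the radial distance would instead be drawn for each chromosome from its Lamin B DamID-derived range.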

      Comment 10: In the section discussing the correlation between simulated and experimental contact maps, as referenced in Figure 4A and Figure S2, the authors mention a high degree of correlation. Could the authors specify the exact value of this correlation and explain the method used for its computation? Considering that comparing two Hi-C matrices involves a large number of data points, it would be helpful to know if all data points were included in this analysis.

      We have updated Fig 4A and S2 to include Pearson correlation coefficients next to the contact maps. The reviewer is correct in that all the non-redundant data points of the contact maps are included in computing the correlation coefficients.

      For improved clarity, we added a new section in the supporting information to detail the calculations. The section is titled Computing Pearson correlation coefficients between experimental and simulated contact maps, and the relevant text is quoted below.

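      “We computed the Pearson correlation coefficients (PCC) between experimental and simulated contact maps in Fig. 4A and Fig. S2 as

      $$\mathrm{PCC} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

      $x_i$ and $y_i$ represent the experimental and simulated contact probabilities, $\bar{x}$ and $\bar{y}$ are their means, and n is the total number of data points. Only non-redundant data points, i.e., half of the pairwise contacts, are used in the PCC calculation.”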
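
      As a minimal sketch of this calculation, the snippet below computes the PCC over the upper triangle of two symmetric contact maps. Whether the diagonal is included is not stated in the quoted text, so excluding it by default is an assumption, as are the function name and the random test matrices.

```python
import numpy as np

def contact_map_pcc(exp_map, sim_map, include_diagonal=False):
    """Pearson correlation over the non-redundant (upper-triangle) entries of two contact maps."""
    exp_map = np.asarray(exp_map, dtype=float)
    sim_map = np.asarray(sim_map, dtype=float)
    if exp_map.shape != sim_map.shape or exp_map.shape[0] != exp_map.shape[1]:
        raise ValueError("contact maps must be square and of equal shape")
    k = 0 if include_diagonal else 1
    iu = np.triu_indices(exp_map.shape[0], k=k)  # each pair (i, j) is counted once
    x, y = exp_map[iu], sim_map[iu]
    dx, dy = x - x.mean(), y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))

# Synthetic symmetric matrices standing in for Hi-C and simulated maps.
rng = np.random.default_rng(0)
a = rng.random((6, 6))
b = rng.random((6, 6))
exp_map = (a + a.T) / 2
sim_map = exp_map + 0.1 * (b + b.T) / 2
pcc = contact_map_pcc(exp_map, sim_map)
```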

      Comment 11: In addition, the author said: ”Moreover, the simulated and experimental average contact probabilities between pairs of chromosomes agree well, and the Pearson correlation coefficient between the two datasets reaches 0.89.” How does this correlation behave when not accounting for polymer compaction or scaling? An analysis presenting the correlation as a function of genomic distance would be interesting.

      Author response image 2.

      Pearson correlation coefficient between experimental and simulated contact probabilities as a function of the sequence separation within specific chromosomes. For each chromosome, we first gathered a set of experimental contacts alongside a matching set of simulated ones for genomic pairs within a particular separation range. The Pearson correlation coefficient at the corresponding sequence separation was then determined using Equation R4. We limited the calculations to half of the chromosome length to ensure the availability of sufficient data.

      We thank the reviewer for the comment. The correlation as a function of genomic distance (sequence separation) for each chromosome is shown in Figure S12 and is also included in the SI. While the correlation coefficients decrease at larger separations, values around 0.5 are quite reasonable and comparable to results obtained using Open-MiChroM.

      We also computed the correlation of whole genome contact maps after excluding intra-chromosomal contacts. The PCC decreased from 0.89 to 0.4. Again, the correlation coefficient is quite reasonable considering that these contacts are purely predicted by the compartmental interactions and were not directly optimized.
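
      A separation-resolved correlation of this kind can be sketched as below. This is a simplified illustration that correlates one diagonal of the intra-chromosomal map at a time, whereas the analysis above groups pairs into separation ranges; the function name and the synthetic maps are assumptions.

```python
import numpy as np

def pcc_by_separation(exp_map, sim_map, max_sep=None):
    """Pearson correlation between two contact maps, one sequence separation at a time."""
    exp_map = np.asarray(exp_map, dtype=float)
    sim_map = np.asarray(sim_map, dtype=float)
    n = exp_map.shape[0]
    # Stop at half the chromosome length so each separation keeps enough pairs.
    max_sep = n // 2 if max_sep is None else max_sep
    result = {}
    for s in range(1, max_sep + 1):
        x = np.diagonal(exp_map, offset=s)  # all pairs (i, i + s)
        y = np.diagonal(sim_map, offset=s)
        if x.std() == 0 or y.std() == 0:
            continue  # the correlation is undefined for constant data
        result[s] = float(np.corrcoef(x, y)[0, 1])
    return result

rng = np.random.default_rng(0)
a = rng.random((10, 10))
exp_chr = (a + a.T) / 2
sim_chr = exp_chr + 0.05 * rng.random((10, 10))
profile = pcc_by_separation(exp_chr, sim_chr)
```

      Correlating within a fixed separation removes the overall decay of contact probability with genomic distance, which otherwise dominates a whole-map correlation.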

      Comment 12: I recommend using the web-server that is familiar to the authors to benchmark the OpenNucleome tool/model: ”3DGenBench: A Web-Server to Benchmark Computational Models for 3D Genomics.” Nucleic Acids Research, vol. 50, no. W1, July 2022, pp. W4-12.

      We appreciate the reviewer’s suggestion. Unfortunately, the website was no longer active at the time of the revision. However, as detailed in the Response to Comment 11, we used one of the popular metrics to exclude the polymer compaction effect and evaluate the agreement between simulation and experiments.

      Comment 13: Regarding the comparison of simulation results with microscopy data from reference 34. Given their different resolutions and data point/space groupings, how do the authors align these datasets? Could the authors describe how they performed this comparison? How were the radial positions calculated in both the simulations and experiments? Since the data from reference 34 indicates a non-globular shape of the nucleus; how did this factor into the calculation of radial distributions?

      We thank the reviewer for the comment and apologize for the confusion. First, the average properties we examined, including radial positions and interchromosomal contacts, were averaged over all genomic loci. Therefore, they are independent of data resolution.

      Secondly, instead of calculating the absolute radial positions, which are subject to variations in nucleus shape and size, we defined the normalized radial positions. They measure the ratio between the distance from the nucleus center to the chromosome center and the distance from the nucleus center to the lamina. This definition was frequently used in prior imaging studies to measure chromosome radial positions.

      The calculations of the simulated and experimental normalized radial positions are discussed in the Section: Computing simulated normalized chromosome radial positions

      “For a given chromosome i, we first determined its center of mass position denoted as $C_i$. Starting from the center of the nucleus, O, we extend the vector $\mathbf{v}_{OC}$ to identify the intersection point with the nuclear lamina as $P_i$. The normalized radial position of chromosome i is then defined as $\|\mathbf{v}_{OC}\| / \|\mathbf{v}_{OP}\|$, where ||·|| represents the L2 norm.”

      and Section: Computing experimental normalized chromosome radial positions.

      “We followed the same procedure outlined in Section: Computing simulated normalized chromosome radial positions to compute the experimental values. To determine the center of the nucleus using DNA MERFISH data, we used the minimum volume enclosing ellipsoid (MVEE) algorithm [15] to fit an ellipsoid for each genome structure. The optimal ellipsoid, defined as $\{x : (x - c)^{T} A (x - c) \le 1\}$, is obtained by minimizing $\log\det(A^{-1})$ subject to the constraint that $(x_i - c)^{T} A (x_i - c) \le 1$ for all i. The $x_i$ are the chromatin positions determined experimentally.”
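
      The normalized radial position defined in the two quoted sections can be sketched for an ellipsoidal lamina as follows. The sketch assumes the ellipsoid is already available as a center `center` and a positive-definite matrix `A` describing the surface (x − center)ᵀ A (x − center) = 1, for example from an MVEE fit; it does not perform the fit itself, and the function names and semi-axes are hypothetical.

```python
import numpy as np

def lamina_intersection(com, center, A):
    """Point where the ray from the nucleus center through `com` crosses the ellipsoid surface."""
    center = np.asarray(center, dtype=float)
    d = np.asarray(com, dtype=float) - center
    scale = 1.0 / np.sqrt(d @ A @ d)  # solves (scale * d)^T A (scale * d) = 1
    return center + scale * d

def normalized_radial_position(com, center, A):
    """Ratio of the center-to-chromosome distance to the center-to-lamina distance along the same ray."""
    center = np.asarray(center, dtype=float)
    p = lamina_intersection(com, center, A)
    return float(np.linalg.norm(np.asarray(com, dtype=float) - center) / np.linalg.norm(p - center))

# Ellipsoid with semi-axes (10, 5, 5) centered at the origin.
center = np.zeros(3)
A = np.diag([1 / 10.0**2, 1 / 5.0**2, 1 / 5.0**2])
r_long = normalized_radial_position([5.0, 0.0, 0.0], center, A)   # halfway along the long axis
r_short = normalized_radial_position([0.0, 5.0, 0.0], center, A)  # on the surface along a short axis
```

      The same absolute distance of 5 maps to different normalized positions along the two axes, which is why the normalized definition is less sensitive to nuclear shape and size. The function is undefined when the chromosome center coincides with the nucleus center.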

      Comment 14: In the sentence: ”It is evident that telomeres exhibit anomalous subdiffusive motion.” I recommend mentioning the work ”Di Pierro, Michele, et al., ”Anomalous Diffusion, Spatial Coherence, and Viscoelasticity from the Energy Landscape of Human Chromosomes.” Proceedings of the National Academy of Sciences, vol. 115, no. 30, July 2018, pp. 7753-58.”.

      We have revised the sentence to include the citation as follows.

      “In line with previous research [cite], telomeres display anomalous subdiffusive motion. When fitted with the equation $\mathrm{MSD}(t) \propto t^{\alpha}$, these trajectories yield a spectrum of α values, with a peak around 0.59.”
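
      One common way to extract such an exponent is a linear fit of log(MSD) against log(lag time), sketched below. This assumes a power-law form MSD ∝ t^α and a time-averaged MSD; the fitting procedure actually used in the manuscript is not specified here, and the function names and synthetic data are illustrative.

```python
import numpy as np

def mean_squared_displacement(traj, max_lag):
    """Time-averaged MSD of a single trajectory of shape (n_frames, 3)."""
    traj = np.asarray(traj, dtype=float)
    lags = np.arange(1, max_lag + 1)
    msd = np.array([np.mean(np.sum((traj[lag:] - traj[:-lag]) ** 2, axis=1)) for lag in lags])
    return lags, msd

def fit_anomalous_exponent(lags, msd):
    """Fit MSD = prefactor * lag**alpha by linear regression in log-log space."""
    slope, intercept = np.polyfit(np.log(lags), np.log(msd), 1)
    return float(slope), float(np.exp(intercept))

# Check on exact power-law data with a subdiffusive exponent.
lags = np.arange(1, 101)
alpha_sub, prefactor = fit_anomalous_exponent(lags, 0.3 * lags**0.59)

# Ordinary Brownian motion should give an exponent close to 1.
rng = np.random.default_rng(0)
walk = np.cumsum(rng.normal(size=(20000, 3)), axis=0)
alpha_free, _ = fit_anomalous_exponent(*mean_squared_displacement(walk, max_lag=50))
```

      Exponents below 1, such as the reported peak near 0.59, indicate subdiffusion.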

      Comment 15: Regarding the observation that ’chromosomes appear arrested and no significant changes in their radial positions are observed over timescales comparable to the cell cycle,’ could the authors provide more details on the calculations or analyses that led to this conclusion? Specifically, information on the equilibration/relaxation time of chromosome territories relative to rearrangements within a cell cycle would be interesting.

      Our conclusion here was mostly based on the time trace of normalized radial positions shown in Figure 6A of the main text. Over the timescale of an entire cell cycle (24 hours), the radial positions show little to no change, which supports glassy dynamics of chromosomes. We further determined the mean squared displacement (MSD) of the chromosome centers of mass. As shown in the left panel of Fig. S12, the MSDs are much smaller than the average size of chromosomes (see Rg values in Fig. 5A), supporting arrested dynamics.

      We further computed the auto-correlation function of the normalized chromosome radial position as

      $$C(\tau) = \frac{\langle (r(t) - \bar{r})\,(r(t+\tau) - \bar{r}) \rangle_t}{\langle (r(t) - \bar{r})^2 \rangle_t}$$

      where t indexes over the trajectory frames and $\bar{r}$ is the mean position. As shown in Fig. S12, the positions are not completely decorrelated over 10 hours, again supporting slow dynamics. It would be interesting to examine the relaxation timescale more closely in future studies.
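
      An auto-correlation function of this kind can be sketched as below. Normalizing by the variance so that the value at zero lag equals 1 is an assumption, as are the function name and the synthetic time trace.

```python
import numpy as np

def autocorrelation(r, max_lag):
    """Auto-correlation of a time series about its mean, normalized to 1 at zero lag."""
    r = np.asarray(r, dtype=float)
    dr = r - r.mean()
    var = np.mean(dr**2)
    n = len(r)
    # Average the product of fluctuations over all frame pairs separated by `lag`.
    return np.array([np.mean(dr[: n - lag] * dr[lag:]) / var for lag in range(max_lag + 1)])

# Hypothetical slowly drifting trace of a normalized radial position.
rng = np.random.default_rng(0)
r = 0.5 + 0.01 * np.cumsum(rng.normal(size=5000))
acf = autocorrelation(r, max_lag=100)
```

      A curve that stays well above zero at the longest accessible lag indicates that the position has not decorrelated on that timescale.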

      Comment 16: The authors also comment on the SI ”Section: Initial configurations for simulations provides more details on preparing the 1000 initial configurations.” and related to reference 34 mentioning that ”the average Lamin B DamID profiles are highly correlated with chromosome radial positions as reported by DNA MERFISH”. How do the authors account for situations where homologous chromosomes are neighbors or have an interacting interface? Ref. 34 indicates that distinguishing between these scenarios can be challenging, potentially leading to ’invalid distributions’ that are filtered out. Clarification on how such cases were handled in the simulations would be helpful.

      We would like to first clarify that when comparing with experimental data, we averaged over the homologous chromosomes to obtain haploid data. We added the following text in the manuscript to emphasize this point

      “Given that the majority of experimental data were analyzed for the haploid genome, we adopted a similar approach by averaging over paternal and maternal chromosomes to facilitate direct comparison. More details on data analysis can be found in the Supporting Information Section: Details of simulation data analysis.”

      Furthermore, we used the processed DNA MERFISH data from the Zhuang lab, which unambiguously assigns a chromosome ID to each data point. Therefore, the issue mentioned by the reviewer is not present in the processed data. In our simulations, since we keep track of the explicit connection between genomic segments, the trace of individual chromosomes can be determined for any configuration. Therefore, there is no ambiguity in terms of simulation data.

      Comment 17: When discussing the interaction with nuclear lamina and nuclear envelope deformation, I suggest mentioning the following studies: The already cited ref 52 and Contessoto, Vinícius G., et al. “Interphase Chromosomes of the Aedes Aegypti Mosquito Are Liquid Crystalline and Can Sense Mechanical Cues.” Nature Communications, vol. 14, no. 1, Jan. 2023, p. 326.

      We updated the text to include the suggested reference.

      “Numerous studies have highlighted the remarkable influence of nuclear shape on the positioning of chromosomes and the regulation of gene expression [16, 17].”

      Comment 18: The authors state that ’Tutorials in the format of Python Scripts with extensive documentation are provided to facilitate the adoption of the model by the community.’ However, as I mentioned, the documentation appears to be limited, and the available tutorials could benefit from further expansion. I suggest that the authors consider enhancing these resources to better assist users in adopting and understanding the model.

      As detailed in the Response to Comment 2, we have updated the GitHub repository to better document the included Jupyter notebooks and tutorials.

      Comment 19: In the Methods section, the authors discuss using Langevin dynamics for certain simulations and Brownian dynamics for others. Could the authors provide more detailed reasoning behind the choice of these different dynamics for different aspects of the simulation? Furthermore, it would be insightful to know how the results might vary if only one of these dynamics was utilized throughout the study. Such clarification would help in understanding the implications of these methodological choices on the outcomes of the simulations.

      We thank the reviewer for the comment. As detailed in the supporting information Section: Mapping the Reduced Time Unit to Real Time, the Brownian dynamics simulations provide a rigorous mapping to the biological timescale. By choosing a specific value for the nucleoplasmic viscosity, we determined the time unit in simulations as τ = 0.65s. With this time conversion, the simulated diffusion coefficients of telomeres match well with experimental values. Therefore, Brownian dynamics simulations are recommended for computing time-dependent quantities, and the large damping coefficient mimics the complex nuclear environment well.

      On the other hand, the large damping coefficient slows down the configuration relaxation of the system significantly. For computing equilibrium statistical properties, it is useful to use a small coefficient and the Langevin integrator with large time steps to facilitate conformational relaxation.

      References

      [1] Rao, S. S.; Huntley, M. H.; Durand, N. C.; Stamenova, E. K.; Bochkov, I. D.; Robinson, J. T.; Sanborn, A. L.; Machol, I.; Omer, A. D.; Lander, E. S.; et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 2014, 159, 1665–1680.

      [2] Qi, Y.; Zhang, B. Predicting three-dimensional genome organization with chromatin states. PLoS computational biology 2019, 15, e1007024.

      [3] Yildirim, A.; Hua, N.; Boninsegna, L.; Zhan, Y.; Polles, G.; Gong, K.; Hao, S.; Li, W.; Zhou, X. J.; Alber, F. Evaluating the role of the nuclear microenvironment in gene function by population-based modeling. Nature Structural & Molecular Biology 2023, 1–14.

      [4] Junior, A. B. O.; Contessoto, V. G.; Mello, M. F.; Onuchic, J. N. A scalable computational approach for simulating complexes of multiple chromosomes. Journal of molecular biology 2021, 433, 166700.

      [5] Fujishiro, S.; Sasai, M. Generation of dynamic three-dimensional genome structure through phase separation of chromatin. Proceedings of the National Academy of Sciences 2022, 119, e2109838119.

      [6] Caragine, C. M.; Haley, S. C.; Zidovska, A. Nucleolar dynamics and interactions with nucleoplasm in living cells. Elife 2019, 8, e47533.

      [7] Brangwynne, C. P.; Mitchison, T. J.; Hyman, A. A. Active liquid-like behavior of nucleoli determines their size and shape in Xenopus laevis oocytes. Proceedings of the National Academy of Sciences 2011, 108, 4334–4339.

      [8] Farley, K. I.; Surovtseva, Y.; Merkel, J.; Baserga, S. J. Determinants of mammalian nucleolar architecture. Chromosoma 2015, 124, 323–331.

      [9] Qi, Y.; Zhang, B. Chromatin network retards nucleoli coalescence. Nature Communications 2021, 12, 6824.

      [10] Caragine, C. M.; Haley, S. C.; Zidovska, A. Surface fluctuations and coalescence of nucleolar droplets in the human cell nucleus. Physical review letters 2018, 121, 148101.

      [11] Spector, D. L.; Lamond, A. I. Nuclear speckles. Cold Spring Harbor perspectives in biology 2011, 3, a000646.

      [12] Banigan, E. J.; Mirny, L. A. Loop extrusion: theory meets single-molecule experiments. Current opinion in cell biology 2020, 64, 124–138.

      [13] Kingma, D. P.; Ba, J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 2014.

      [14] Zhang, B.; Wolynes, P. G. Topology, structures, and energy landscapes of human chromosomes. Proceedings of the National Academy of Sciences 2015, 112, 6062–6067.

      [15] Moshtagh, N.; et al. Minimum volume enclosing ellipsoid. Convex optimization 2005, 111, 1–9.

      [16] Brahmachari, S.; Contessoto, V. G.; Di Pierro, M.; Onuchic, J. N. Shaping the genome via lengthwise compaction, phase separation, and lamina adhesion. Nucleic Acids Res. 2022, 50, 1–14.

      [17] Contessoto, V. G.; Dudchenko, O.; Aiden, E. L.; Wolynes, P. G.; Onuchic, J. N.; Di Pierro, M. Interphase chromosomes of the Aedes aegypti mosquito are liquid crystalline and can sense mechanical cues. Nature Communications 2023, 14, 326.

    1. eLife assessment

      This important work provides another layer of regulatory mechanism for TGF-beta signaling activity. The evidence supports the involvement of microtubules as a reservoir of Smad2/3, however, additional evidence to convincingly demonstrate the functional involvement of Rudhira in this process is highly appreciated. The work will be of broad interest to developmental biologists in general and molecular biologists in the field of growth factor signaling.

    2. Reviewer #1 (Public Review):

      Summary

      This manuscript aimed to study the role of Rudhira (also known as Breast Carcinoma Amplified Sequence 3), an endothelium-restricted microtubule-associated protein, in regulating TGFβ signaling. The authors demonstrate that Rudhira is a critical modulator of TGFβ signaling that acts by releasing Smad2/3 from cytoskeletal microtubules, and that Rudhira is itself a Smad2/3 target gene. Taken together, the authors provide a model of how Rudhira contributes to TGFβ signaling activity to stabilize the microtubules, which is essential for vascular development.

      Strengths

      The study used different methods and techniques to achieve aims and support conclusions, such as Gene Ontology analysis, functional analysis in culture, immunostaining analysis, and proximity ligation assay. This study provides an unappreciated additional layer of TGFβ signaling activity regulation after ligand-receptor interaction.

      Weaknesses

      (1) It is unclear how current findings provide a better understanding of Rudhira KO mice, which the authors published some years ago.

      (2) Why do they use HEK cells instead of SVEC cells in Figure 2 and 4 experiments?

      (3) A model shown in Figure 5E needs improvement to grasp their findings easily.

    3. Reviewer #2 (Public Review):

      Summary:

      It was first reported in 2000 that Smad2/3/4 are sequestered to microtubules in resting cells and TGF-β stimulation releases Smad2/3/4 from microtubules, allowing activation of the Smad signaling pathway. Although the finding was subsequently confirmed in a few papers, the underlying mechanism has not been explored. In the present study, the authors found that Rudhira/breast carcinoma amplified sequence 3 is involved in the release of Smad2/3 from microtubules in response to TGF-β stimulation. Rudhira is also induced by TGF-β and is probably involved in the stabilization of microtubules in the delayed phase after TGF-β stimulation. Therefore, Rudhira has two important functions downstream of TGF-β in the early as well as delayed phase.

      Strengths:

      This work aimed to address an unsolved question on one of the earliest events after TGF-β stimulation. Based on loss-of-function experiments, the authors identified a novel and potentially important player, Rudhira, in the signal transmission of TGF-β.

      Weaknesses:

      The authors have identified a key player that triggers Smad2/3 release from microtubules after TGF-β stimulation, probably via its association with microtubules. This is an important first step for understanding the regulation of Smad signaling, but underlying mechanisms as well as upstream and downstream events largely remain to be elucidated.

      (1) The process of how Rudhira causes the release of Smad proteins from microtubules remains unclear. The statement that "Rudhira-MT association is essential for the activation and release of Smad2/3 from MTs" (lines 33-34) is not directly supported by experimental data.

      (2) The process of how Rudhira is mobilized to microtubules in response to TGF-β remains unclear.

      (3) After Rudhira releases Smad proteins from microtubules, Rudhira stabilizes microtubules. The process of how cells return to a resting state and recover their responsiveness to TGF-β remains unclear.

      This reviewer is also afraid that some of the biochemical data lack appropriate controls and are not convincing enough.

    4. Author response:

      eLife assessment:

      This important work provides another layer of regulatory mechanism for TGF-beta signaling activity. The evidence supports the involvement of microtubules as a reservoir of Smad2/3, however, additional evidence to convincingly demonstrate the functional involvement of Rudhira in this process is highly appreciated. The work will be of broad interest to developmental biologists in general and molecular biologists in the field of growth factor signaling.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary

      This manuscript aimed to study the role of Rudhira (also known as Breast Carcinoma Amplified Sequence 3), an endothelium-restricted microtubule-associated protein, in regulating TGFβ signaling. The authors demonstrate that Rudhira is a critical modulator of TGFβ signaling that acts by releasing Smad2/3 from cytoskeletal microtubules, and that Rudhira is itself a Smad2/3 target gene. Taken together, the authors provide a model of how Rudhira contributes to TGFβ signaling activity to stabilize the microtubules, which is essential for vascular development.

      Strengths

      The study used different methods and techniques to achieve aims and support conclusions, such as Gene Ontology analysis, functional analysis in culture, immunostaining analysis, and proximity ligation assay. This study provides an unappreciated additional layer of TGFβ signaling activity regulation after ligand-receptor interaction.

      We thank the reviewer for acknowledging the importance of our study and providing a clear summary of our findings.

      Weaknesses

      (1) It is unclear how current findings provide a better understanding of Rudhira KO mice, which the authors published some years ago.

      Our previous study demonstrated that Rudhira KO mice have a predominantly developmental cardiovascular phenotype that phenocopies TGFβ loss of function (Shetty, Joshi et al., 2018). Additionally, we found that at the molecular level, Rudhira regulates cytoskeletal organization (Jain et al., 2012; Joshi and Inamdar, 2019). Our current study builds upon these previous findings, showing an essential role of Rudhira in maintaining TGFβ signaling and controlling the microtubule cytoskeleton during vascular development. On one hand Rudhira regulates TGFβ signaling by promoting the release of Smads from microtubules, while on the other, Rudhira is a TGFβ target essential for stabilizing microtubules. Thus, our current study provides a molecular basis for Rudhira function in cardiovascular development.

      (2) Why do they use HEK cells instead of SVEC cells in Figure 2 and 4 experiments?

      Our earlier studies have characterized the role of Rudhira in detail using both loss and gain of function methods in multiple cell types (Jain et al., 2012; Shetty, Joshi et al., 2018; Joshi and Inamdar, 2019). As endothelial cells are particularly difficult to transfect, and because the function of Rudhira in promoting cell migration is conserved in HEK cells, it was practical and relevant to perform these experiments in HEK cells (Figures 2 and 4E).

      (3) A model shown in Figure 5E needs improvement to grasp their findings easily.

      We have modified Figure 5E for clarity.

      Reviewer #2 (Public Review):

      Summary

      It was first reported in 2000 that Smad2/3/4 are sequestered to microtubules in resting cells and TGF-β stimulation releases Smad2/3/4 from microtubules, allowing activation of the Smad signaling pathway. Although the finding was subsequently confirmed in a few papers, the underlying mechanism has not been explored. In the present study, the authors found that Rudhira/breast carcinoma amplified sequence 3 is involved in the release of Smad2/3 from microtubules in response to TGF-β stimulation. Rudhira is also induced by TGF-β and is probably involved in the stabilization of microtubules in the delayed phase after TGF-β stimulation. Therefore, Rudhira has two important functions downstream of TGF-β in the early as well as delayed phase.

      Strengths:

      This work aimed to address an unsolved question on one of the earliest events after TGF-β stimulation. Based on loss-of-function experiments, the authors identified a novel and potentially important player, Rudhira, in the signal transmission of TGF-β.

      We thank the reviewer for the critical evaluation and appreciation of our findings.

      Weaknesses:

      The authors have identified a key player that triggers Smad2/3 release from microtubules after TGF-β stimulation, probably via its association with microtubules. This is an important first step for understanding the regulation of Smad signaling, but underlying mechanisms as well as upstream and downstream events largely remain to be elucidated.

      We acknowledge that the mechanisms regulating cytoskeletal control of Smad signaling are far from clear, but these are out of scope of this manuscript. This manuscript rather focuses on Rudhira/Bcas3 as a pivot to understand vascular TGFβ signaling and microtubule connections.

      (1) The process of how Rudhira causes the release of Smad proteins from microtubules remains unclear. The statement that "Rudhira-MT association is essential for the activation and release of Smad2/3 from MTs" (lines 33-34) is not directly supported by experimental data.

      We agree with the reviewer’s comment. Although we provide evidence that the loss of Rudhira (and, by deduction, the loss of Rudhira-MT association) prevents release of Smad2/3 from MTs (Fig 3C), this does not confirm that Rudhira-MT association is required for the release. In light of this, we have modified the statement to “Rudhira associates with MTs and is essential for the activation and release of Smad2/3 from MTs”.

      (2) The process of how Rudhira is mobilized to microtubules in response to TGF-β remains unclear.

      Our previous study showed that Rudhira associates with microtubules, and preferentially binds to stable microtubules (Jain et al., 2012; Joshi and Inamdar, 2019). Since TGFβ stimulation is known to stabilize microtubules, we hypothesize that TGFβ stimulation increases Rudhira binding to stable microtubules. We have mentioned this in our revised manuscript.

      (3) After Rudhira releases Smad proteins from microtubules, Rudhira stabilizes microtubules. The process of how cells return to a resting state and recover their responsiveness to TGF-β remains unclear.

      We show that dissociation of Smads from microtubules is an early response and stabilization of microtubules is a late TGFβ response. However, we agree that the sequence of these molecular events has not been characterized in depth in this or any other study, making it difficult to assign causal roles (e.g., whether release of Smads from MTs is a prerequisite for MT stabilization by Rudhira) or reversibility. Nevertheless, the TGFβ pathway is autoregulatory, leading to increased turnover of receptors and Smads and increased expression of inhibitory Smads, which may restore responsiveness to TGFβ. Additionally, the relatively short turnover time of even stable microtubules (several minutes to hours) may also promote a quick return to the resting state.

      We have discussed this in our revised manuscript.

    1. Author response:

      eLife assessment

      This important study provides new insight into the dynamics that underlie the development of therapy resistance in prostate cancer by revealing that divergent tumor evolutionary paths occur in response to different treatment timing and that these converge on common resistance mechanisms. The use of barcoded lineage tracing and characterization of isolated tumor clonal populations provides compelling evidence supporting the importance of clonal dynamics in a tumor ecosystem for treatment resistance. Several open questions remain, however, raising the possibility of alternative interpretations of the data set in its current form. Overall, the findings deepen our understanding of prostate cancer evolution and hold promising implications for how drug resistance can be addressed or prevented.

      We are pleased the reviewers found our work reporting distinct evolutionary paths to resistance based on timing of treatment to be important and supported by compelling evidence.  We also acknowledge the need for additional work to clarify some details, particularly regarding the mechanism of clonal cooperativity as a catalyst of resistance.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Lee, Eugine et al. use in vivo barcoded lineage tracing to investigate the evolutionary paths to androgen receptor signaling inhibition (ARSI) resistance in models of two different prostate cancer clinical scenarios: measurable disease and minimal residual disease. Using two prostate cancer cell lines, LNCaP/AR and CWR22Pc, the authors find that in their minimal residual disease models, the outgrowth of pre-existing resistant clones gives rise to ARSI-resistant tumors. Interestingly, in their measurable disease model, or post-engraftment ARSI setting, these pre-existing resistant clones are depleted; instead, a subset of the clones that give rise to treatment-naïve tumors adapt to ARSI treatment and are enriched in resistant tumors. For the LNCaP/AR cell line, characterization of pre-existing resistant clones in treatment-naïve and ARSI treatment settings reveals increased baseline androgen receptor transcriptional output as well as baseline upregulation of glucocorticoid receptor (GR) as the primary driver of pre-existing resistance. Similarly, the authors found that high GR expression was induced in ARSI-sensitive clones over long-term ARSI treatment, driving adaptive resistance to ARSI. For CWR22Pc cells, HER3/NRG1 signaling was the primary driver of ARSI resistance in both measurable disease and minimal residual disease models. These findings are not only consistent with the authors' previous reports of GR and NRG1/HER3 as the molecular drivers of ARSI resistance in LNCaP/AR and CWR22Pc, respectively, but also demonstrate conserved resistance mechanisms despite pre-existing or adaptive evolutionary paths to resistance. Lastly, the authors show that adaptive ARSI resistance is dependent on interclonal cooperation, where the presence of pre-existing resistant clones or "helper" clones is required to promote adaptive resistance in ARSI-sensitive clones.

      Strengths:

      The authors employ DNA barcoding, a powerful tool already demonstrated by others to track the clonal evolution of tumor populations during resistance development, to study the effects of the timing of therapy as a variable on resistance evolution. The authors use barcoding in two cell line models of prostate cancer in two clinical disease scenarios to demonstrate divergent evolutionary paths converging on common resistance mechanisms. By painstakingly isolating clones with barcodes of interest to generate clonal cell lines from treatment-naïve cell populations, the authors are able to not only characterize pre-existing resistance but also show cooperativity between resistant and drug-sensitive populations for adaptive resistance.

      Weaknesses:

      While the finding that different evolutionary paths result in common molecular drivers of ARSI resistance is novel and unexpected, this work primarily confirms the authors' previously published work identifying the resistance mechanisms in these cell lines. The impact of the work would be greater with additional studies of the specific molecular/genetic mechanisms by which cells become resistant or cooperate within a population to give rise to resistant subclones.

      We agree that additional insights into the mechanism of adaptive resistance and the role of cell-cell cooperativity are clear next steps for this work. We propose to do so through single-cell characterization (RNA-seq, ATAC-seq) of tumor evolution in a time course experiment where we can track each clone using expressed barcodes. This will allow us to explore the dynamics of interaction between the "adaptable" and "helper" clones. Unfortunately, the barcode methodology used in this initial report is DNA-based; therefore, a follow-up study using a transcribable barcode library is needed to address these fascinating questions.

      This study would also benefit from additional explanation or exploration of why the two resistance driver pathways described (GR and NRG1/Her3) are cell line specific and if there are genetic or molecular backgrounds in which specific resistance signaling is more likely to be the predominant driver of resistance.

      In the case of NRG1/HER3 pathway-mediated resistance, we know that this mechanism requires that the PTEN/PIK3CA pathway be wildtype. This is the case for the CWR22Pc model described in the manuscript. Furthermore, we have data showing that PTEN deletion in these cells rescues the phenotype, meaning that CWR22Pc cells with PTEN deletion are no longer dependent on NRG1/HER3 signaling for ARSI resistance.

      In contrast, LNCaP/AR cells are PTEN null at baseline and therefore must evolve alternative mechanisms of ARSI resistance. Since our initial identification of the GR mechanism, we and others have extended the finding to additional models (VCaP, LAPC4) (PMID: 24315100; PMID: 28191869). Another recent insight is the importance of RB1 and TP53 status in maintenance of luminal lineage identity during ARSI therapy, and the recognition of lineage plasticity as a resistance mechanism in cell lines/tumor models that lack these two tumor suppressors. In summary, baseline genetics clearly plays a role in which ARSI resistance pathway is likely to emerge. We will clarify this point in the revision with additional discussion.

      Reviewer #2 (Public Review):

      Summary

      The authors aimed to characterise the evolutionary dynamics that occur during the resistance to androgen receptor signalling inhibition, and how this differs in established tumours vs. residual disease, in prostate cancer. By using a barcoding method, they aimed to both characterise the distribution of clones that support therapy resistance in these settings, while also then being able to isolate said clones from the pre-graft population via single-cell cloning to characterise the mechanisms of resistance and dependency on cooperativity.

      While, interestingly, the timing of combination therapies has been shown to be critical to avoid cross-resistance, the timing of therapy has not been specifically considered as a factor dictating resistance pathways. Additionally, the role of residual disease and dormant populations in driving relapse is of increasing interest, yet a lot remains to be understood of these populations. The question of whether different clinical manifestations of therapy resistance follow similar evolutionary pathways to resistance is therefore interesting and relevant for the field.

      The methods applied are elegant and the body of work is substantial. The proposed divergent evolutionary pathways pose interesting questions, and the findings on cooperativity provide insight. However, whether the model truly reflects minimal residual disease to the extent that the authors suggest may limit the relevance of the findings at this stage. Certain patterns in the DNA barcoding results also call into question whether the results fully support the strong claims of the authors, or whether alternative explanations could exist. While the potential to isolate individual clones in the pre-graft setting is a great strength of the method applied and the isolation of these clones is a huge body of work in itself, the limited number of clones that could be isolated also somewhat limits the validation of the findings.

      Strengths

      Very relevant and interesting question, clear clinical relevance, applying elegant methods that hold the potential to provide a novel understanding of multiple aspects of therapy resistance, through from evolutionary patterns to intracellular and cooperative mechanisms of resistance.

      The text is clearly written, logical, and the structure is easy to follow.

      Weaknesses

      (1) The extent to which the model used truly mimics residual disease

      The main conclusions of the paper are built upon results using a model for minimal residual disease. However, the extent to which this truly recapitulates minimal residual disease, particularly with regard to their focus on the timings of therapy, could be discussed further. If in the clinical setting residual disease occurs following the existence of a tumour and its microenvironment, there might be many aspects of the process that are missed when coinciding treatment with engraftment of a xenograft tumour with pre-castration. If any characterisation of the minimal residual disease was possible (such as histologically or through RNA sequencing), this may help demonstrate in what ways this model recapitulates minimal residual disease.

      We appreciate the reviewer's feedback on this point and acknowledge that the pre-ARSI setting used in our studies is not precisely identical to minimal residual disease (MRD) seen clinically, where a patient typically undergoes primary treatment (radical prostatectomy surgery or local radiotherapy) and then relapses with distant disease from micrometastases that were not initially detectable. Having uncovered a key difference in the path to resistance using our pre-ARSI model, we believe our data provide a strong rationale to invest additional effort in designing newer MRD models that more closely mimic the clinical scenario, perhaps through surgical resection of a primary tumor that could “seed” micrometastases prior to therapy. We will highlight this aspect in our revised manuscript and provide clarity on the limitations and scope of our study.

      (2) Whether the observed enrichment of pre-resistant clones is truly that

      The authors strongly make the case that their barcoding experiments provide evidence for pre-existing resistance in the context of minimal residual disease. However, it seems that the clones enriched in the ARSIR tumours are consistently the most enriched clones in the pregraft. Is it possible that the high selective pressure in the pre-engraftment ARSI condition simply leads to an enrichment of the most populous clones from the pregraft? Whereas in the control setting, the reduced selective pressure at the point of engraftment allows for a wider variety of clones to establish in the tumour?

      The reviewer raises an important point about enrichment of ARSI-resistant clones in the pregraft, but we do not believe that this explains the subsequent in vivo data, for the following reasons:

      (1) The two most enriched clones in the Pre-ARSIR tumors are the second and third most enriched clones in the pre-graft, not the first (Supplementary figure 1E). If clones were enriched in resistant tumors simply based on their abundance in the starting population, we would expect the most abundant pre-graft clone to be the most enriched clone in the tumor.

      (2) By varying the androgen concentration in the pregraft culture media, we could selectively deplete or enrich the same clones enriched in the Pre-ARSIR tumors in vivo, indicating the enrichment of these clones in the resistant tumors is unlikely to be solely based on their relative frequency in the pregraft (Supplementary figure 2).

      We will clarify these points in the revised manuscript.
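
      The abundance argument in points (1) and (2) can be illustrated with a minimal sketch that compares each barcode's rank in the pregraft with its rank in a resistant tumor, together with its log2 fold change in read fraction. All barcode names and read counts below are invented toy values, not data from the study; the helper functions are hypothetical and only show the form of the comparison.

```python
import math

def fractions(counts):
    """Convert raw barcode read counts into fractions of the total."""
    total = sum(counts.values())
    return {bc: n / total for bc, n in counts.items()}

def ranks(values):
    """Rank barcodes from most (1) to least abundant."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {bc: i + 1 for i, bc in enumerate(ordered)}

def log2_enrichment(pre_counts, tumor_counts, pseudo=1e-6):
    """log2(tumor fraction / pregraft fraction) per barcode, with a small pseudocount."""
    pre, tum = fractions(pre_counts), fractions(tumor_counts)
    return {bc: math.log2((tum.get(bc, 0.0) + pseudo) / (pre[bc] + pseudo))
            for bc in pre}

# Toy counts (hypothetical): bc_top is the most abundant pregraft barcode,
# bc_R1 and bc_R2 stand in for the second and third most abundant ones.
pregraft = {"bc_top": 300, "bc_R1": 210, "bc_R2": 170, "bc_x": 120, "bc_y": 100}
tumor = {"bc_top": 40, "bc_R1": 600, "bc_R2": 300, "bc_x": 30, "bc_y": 30}

pre_rank = ranks(fractions(pregraft))
tum_rank = ranks(fractions(tumor))
enrichment = log2_enrichment(pregraft, tumor)

for bc in pregraft:
    print(f"{bc}: pregraft rank {pre_rank[bc]}, tumor rank {tum_rank[bc]}, "
          f"log2 enrichment {enrichment[bc]:+.2f}")
```

      If enrichment simply tracked starting abundance, the two rank columns would agree; in this toy case the top pregraft barcode falls in rank while the second and third rise, which is the pattern the response describes.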

      Additionally, is there the possibility that the clones highly enriched in the pregraft are in fact a heterogeneous group of cells bearing the same barcode due to stochastic events in the process of viral transduction? Addressing these questions would greatly improve the study.

      The barcode library was deep-sequenced to confirm even distribution of the barcodes before it was transferred from Novartis (PMID: 258491301), and we intentionally used a low multiplicity of infection (MOI) to generate barcode lines to ensure single-copy insertion. That said, we cannot entirely rule out the possibility that the second and third most enriched clones in the pregraft originated from the same ancestral clone and subsequently acquired two different barcodes. We will clarify this point in the revised manuscript.
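
      As a rough illustration of why a low MOI favours single-copy insertion, the sketch below assumes that the number of viral integrations per cell follows a Poisson distribution with mean equal to the MOI. This is a common textbook approximation, not something stated in the response, and the MOI values used are illustrative only since the actual MOI is not given here. The function name is hypothetical.

```python
import math

def multi_copy_fraction(moi: float) -> float:
    """Fraction of transduced cells expected to carry more than one integration.

    Assumes integrations per cell are Poisson-distributed with mean `moi`.
    """
    p0 = math.exp(-moi)        # probability of no integration
    p1 = moi * math.exp(-moi)  # probability of exactly one integration
    transduced = 1.0 - p0      # probability of at least one integration
    return (transduced - p1) / transduced

# Illustrative MOI values (assumptions, not taken from the study).
for moi in (0.05, 0.1, 0.3, 1.0):
    print(f"MOI {moi}: {multi_copy_fraction(moi):.1%} of transduced cells "
          f"expected to carry more than one barcode")
```

      Under this assumption the multi-copy fraction shrinks as the MOI decreases. The sketch addresses only copy number per cell; it does not address whether two unrelated cells could receive the same barcode, which depends on library complexity.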

      (3) The robustness of the subsequent work based on 1-2 pre-resistant clones

      While appreciating the volume of work involved in isolating and culturing individual pre-resistant clones, given the previous point, the conclusions would benefit from very robust validations with these single-cell clones. There are only two clones, and the results seem to focus more on one than the other, for which the data is less convincing. For instance, the Enz IC50 data, which in the case for pre-ARSI R2 is restricted to the supplementary, compares the clones A-D. In Figure S8 B, pre-ARSI R2 is compared to clone B, which is, of the four clones shown in the main figure when compared to R1, the one with the lowest Enz IC50. Therefore, while the resistant clones seem to have a significantly higher Enz IC50, comparing both clones to clones A-D may not have achieved this significance. It would also be useful to know how abundant the resistant clones were in the original barcode experiments.

      We acknowledge that studies relying on 1-2 biological samples indeed have limitations. Given our extensive prior work (and that of other labs) on the role of GR in the development of ARSI resistance, we focused on demonstrating that both pre-ARSIR1 and pre-ARSIR2 clones exhibit pre-existing GR expression and are primed to further upregulate GR levels under ARSI conditions, thereby relying on GR function to sustain resistance. Given the redundancy of the resistance mechanisms of the two clones, we made efforts to isolate additional clones enriched in Pre-ARSIR tumors. However, despite our attempts, we were unable to identify further clones. Pre-ARSIR1 and pre-ARSIR2 are the second and third most enriched clones in the pre-graft (2.1% and 1.7%, respectively).

      (4) The logic used in the final section requires further explanation

      In the final section, the authors suggest that a pre-ARSIR clone is able to cooperate with a pre-Intact clone to aid adaptive ARSI resistance. If this is true, then could it not be that rare, pre-resistant clones support adaptive resistance in established tumours? And, therefore, the mechanism underlying resistance could be through pre-existing resistant clones in both settings. The work would benefit from a discussion to clarify this discrepancy in the interpretation of the findings. This is particularly necessary given the strong wording the authors use regarding their findings, such as that they have provided 'conclusive evidence' for acquired resistance.

      We agree that rare, pre-resistant clones could support adaptive resistance (and therefore resistance in this adaptive setting could, technically, be called “pre-existing”), but it is critical to recognize that these rare, pre-resistant “helper” clones are vastly outnumbered by pre-Intact clones that “acquire” resistance through their “help.” We find this to be fascinating biology, and in the resubmission we will clarify this logic and outline future experimental approaches to unravel the mechanism.

    2. eLife assessment

      This important study provides new insight into the dynamics that underlie the development of therapy resistance in prostate cancer by revealing that divergent tumor evolutionary paths occur in response to different treatment timing and that these converge on common resistance mechanisms. The use of barcoded lineage tracing and characterization of isolated tumor clonal populations provides compelling evidence supporting the importance of clonal dynamics in a tumor ecosystem for treatment resistance. Several open questions remain, however, raising the possibility of alternative interpretations of the data set in its current form. Overall, the findings deepen our understanding of prostate cancer evolution and hold promising implications for how drug resistance can be addressed or prevented.

    3. Reviewer #1 (Public Review):

      Summary:

      Lee, Eugine et al. use in vivo barcoded lineage tracing to investigate the evolutionary paths to androgen receptor signaling inhibition (ARSI) resistance in two different prostate cancer clinical scenario models: measurable disease and minimal residual disease. Using two prostate cancer cell lines, LNCaP/AR and CWR22Pc, the authors find that in their minimal residual disease models, the outgrowth of pre-existing resistant clones gives rise to ARSI-resistant tumors. Interestingly, in their measurable disease model or post-engraftment ARSI setting, these pre-existing resistant clones are depleted; instead, a subset of the clones that give rise to treatment-naïve tumors adapts to ARSI treatment and is enriched in resistant tumors. For the LNCaP/AR cell line, characterization of pre-existing resistant clones in treatment-naïve and ARSI treatment settings reveals increased baseline androgen receptor transcriptional output as well as baseline upregulation of glucocorticoid receptor (GR) as the primary driver of pre-existing resistance. Similarly, the authors found induction of high GR expression over long-term ARSI treatment in ARSI-sensitive clones for adaptive resistance to ARSI. For CWR22Pc cells, HER3/NRG1 signaling was the primary driver for ARSI resistance in both measurable disease and minimal residual disease models. These findings are not only consistent with the authors' previous reports of GR and NRG1/HER3 as the molecular drivers of ARSI resistance in LNCaP/AR and CWR22Pc, respectively, but also demonstrate conserved resistance mechanisms despite pre-existing or adaptive evolutionary paths to resistance. Lastly, the authors show adaptive ARSI resistance is dependent on interclonal cooperation, where the presence of pre-existing resistant clones or "helper" clones is required to promote adaptive resistance in ARSI-sensitive clones.

      Strengths:

      The authors employ DNA barcoding, a powerful tool already demonstrated by others to track the clonal evolution of tumor populations during resistance development, to study the effects of the timing of therapy as a variable on resistance evolution. The authors use barcoding in two cell line models of prostate cancer in two clinical disease scenarios to demonstrate divergent evolutionary paths converging on common resistance mechanisms. By painstakingly isolating clones with barcodes of interest to generate clonal cell lines from treatment-naïve cell populations, the authors are able to not only characterize pre-existing resistance but also show cooperativity between resistant and drug-sensitive populations for adaptive resistance.

      Weaknesses:

      While the finding that different evolutionary paths result in common molecular drivers of ARSI resistance is novel and unexpected, this work primarily confirms the authors' previously published work identifying the resistance mechanisms in these cell lines. The impact of the work would be greater with additional studies of the specific molecular/genetic mechanisms by which cells become resistant or cooperate within a population to give rise to resistant subclones.

      This study would also benefit from additional explanation or exploration of why the two resistance driver pathways described (GR and NRG1/Her3) are cell line specific and if there are genetic or molecular backgrounds in which specific resistance signaling is more likely to be the predominant driver of resistance.

    4. Reviewer #2 (Public Review):

      Summary

      The authors aimed to characterise the evolutionary dynamics that occur during the resistance to androgen receptor signalling inhibition, and how this differs in established tumours vs. residual disease, in prostate cancer. By using a barcoding method, they aimed to both characterise the distribution of clones that support therapy resistance in these settings, while also then being able to isolate said clones from the pre-graft population via single-cell cloning to characterise the mechanisms of resistance and dependency on cooperativity.

      While, interestingly, the timing of combination therapies has been shown to be critical to avoid cross-resistance, the timing of therapy has not been specifically considered as a factor dictating resistance pathways. Additionally, the role of residual disease and dormant populations in driving relapse is of increasing interest, yet a lot remains to be understood of these populations. The question of whether different clinical manifestations of therapy resistance follow similar evolutionary pathways to resistance is therefore interesting and relevant for the field.

      The methods applied are elegant and the body of work is substantial. The proposed divergent evolutionary pathways pose interesting questions, and the findings on cooperativity provide insight. However, whether the model truly reflects minimal residual disease to the extent that the authors suggest may limit the relevance of the findings at this stage. Certain patterns in the DNA barcoding results also call into question whether the results fully support the strong claims of the authors, or whether alternative explanations could exist. While the potential to isolate individual clones in the pre-graft setting is a great strength of the method applied and the isolation of these clones is a huge body of work in itself, the limited number of clones that could be isolated also somewhat limits the validation of the findings.

      Strengths

      • Very relevant and interesting question, clear clinical relevance, applying elegant methods that hold the potential to provide a novel understanding of multiple aspects of therapy resistance, through from evolutionary patterns to intracellular and cooperative mechanisms of resistance.

      • The text is clearly written, logical, and the structure is easy to follow.

      Weaknesses

      (1) The extent to which the model used truly mimics residual disease

      The main conclusions of the paper are built upon results using a model for minimal residual disease. However, the extent to which this truly recapitulates minimal residual disease, particularly with regard to their focus on the timings of therapy, could be discussed further. If in the clinical setting residual disease occurs following the existence of a tumour and its microenvironment, there might be many aspects of the process that are missed when coinciding treatment with engraftment of a xenograft tumour with pre-castration. If any characterisation of the minimal residual disease was possible (such as histologically or through RNA sequencing), this may help demonstrate in what ways this model recapitulates minimal residual disease.

      (2) Whether the observed enrichment of pre-resistant clones is truly that

      The authors strongly make the case that their barcoding experiments provide evidence for pre-existing resistance in the context of minimal residual disease. However, it seems that the clones enriched in the ARSIR tumours are consistently the most enriched clones in the pregraft. Is it possible that the high selective pressure in the pre-engraftment ARSI condition simply leads to an enrichment of the most populous clones from the pregraft? Whereas in the control setting, the reduced selective pressure at the point of engraftment allows for a wider variety of clones to establish in the tumour? Additionally, is there the possibility that the clones highly enriched in the pregraft are in fact a heterogeneous group of cells bearing the same barcode due to stochastic events in the process of viral transduction? Addressing these questions would greatly improve the study.

      (3) The robustness of the subsequent work based on 1-2 pre-resistant clones

      While appreciating the volume of work involved in isolating and culturing individual pre-resistant clones, given the previous point, the conclusions would benefit from very robust validations with these single-cell clones. There are only two clones, and the results seem to focus more on one than the other, for which the data is less convincing. For instance, the Enz IC50 data, which in the case for pre-ARSI R2 is restricted to the supplementary, compares the clones A-D. In Figure S8 B, pre-ARSI R2 is compared to clone B, which is, of the four clones shown in the main figure when compared to R1, the one with the lowest Enz IC50. Therefore, while the resistant clones seem to have a significantly higher Enz IC50, comparing both clones to clones A-D may not have achieved this significance. It would also be useful to know how abundant the resistant clones were in the original barcode experiments.

      (4) The logic used in the final section requires further explanation

      In the final section, the authors suggest that a pre-ARSIR clone is able to cooperate with a pre-Intact clone to aid adaptive ARSI resistance. If this is true, then could it not be that rare, pre-resistant clones support adaptive resistance in established tumours? And, therefore, the mechanism underlying resistance could be through pre-existing resistant clones in both settings. The work would benefit from a discussion to clarify this discrepancy in the interpretation of the findings. This is particularly necessary given the strong wording the authors use regarding their findings, such as that they have provided 'conclusive evidence' for acquired resistance.

    1. eLife assessment

      This valuable study demonstrates that genomic insertion of a G4-containing sequence can be sufficient to induce chromosome loops and alter gene expression. The evidence supporting the conclusions is convincing. Effects were shown by Hi-C as well as qPCR for chromatin modifications and expression, and the specificity of the effects was controlled by mutating the G4-containing sequence or treating with LNA probes to abolish G4 structure formation. The work will be of interest to researchers working on chromatin organization and gene regulation.

    2. Reviewer #1 (Public Review):

      In this manuscript, Chowdhury and co-workers provide interesting data to support the role of G4-structures in promoting chromatin looping and long-range DNA interactions. The authors achieve this by artificially inserting a G4-containing sequence in an isolated region of the genome using CRISPR-Cas9 and comparing it to a control sequence that does not contain G4 structures. Based on the data provided, the authors can conclude that G4-insertion promotes long-range interactions (measured by Hi-C) and affects gene expression (measured by qPCR) as well as chromatin remodelling (measured by ChIP of specific histone markers).

      In this revised version of the manuscript, G4 formation of the inserted sequence was validated by ChIP-qPCR, and the same G4-containing sequence was inserted at a second locus, and similar, though not identical, effects on chromatin and gene expression were observed.

      Strengths:

      This is the first attempt to connect genomics datasets of G4s and HiC with gene expression. The use of Cas9 to artificially insert a G4 is also very elegant.

    3. Reviewer #2 (Public Review):

      Roy et al. investigated the role of non-canonical DNA structures called G-quadruplexes (G4s) in long-range chromatin interactions and gene regulation. Introducing a G4 array into chromatin significantly increased the number of long-range interactions, both within the same chromosome (cis) and between different chromosomes (trans). G4s functioned as enhancer elements, recruiting p300 and boosting gene expression even 5 megabases away. The study reveals that G4s directly influence 3D chromatin organization via facilitating communication between regulatory elements and genes.

      Strengths:

      The authors' findings are valuable for understanding the role of G4-DNA in 3D genome organization and gene transcription. The authors provide convincing evidence to support their claims.

    4. Reviewer #3 (Public Review):

      Summary:

      This paper aims to demonstrate the role of G-quadruplex DNA structures in the establishment of chromosome loops. The authors introduced an array of G4s spanning 275 bp, naturally found within a very well characterized region of the hTERT promoter, into an ectopic region devoid of G-quadruplexes and annotated genes. As a negative control, they used a mutant version of the same sequence in which G4 folding is impaired. Due to the complexity of the region (three G4s on the same strand and one on the opposite strand), 12 point mutations were made simultaneously (G to T and C to A). Analysis of the 3D genome organization shows that the WT array establishes more contacts within the TAD and throughout the genome than the control array. Additionally, a slight enrichment of H3K4me1 and p300, both enhancer markers, was observed locally near the insertion site. Based on this observation, the authors tested whether genes located nearby or up to 5 Mb away were up-regulated. They found that four genes were up-regulated 1.5- to 3-fold. Increased interaction of the G4 array, compared to the mutant, was confirmed by the 3C assay. For in-depth analysis of the long-range changes, they also performed Hi-C experiments and showed a genome-wide increase in interactions of the WT array versus the mutated form.

      Strengths:

      The experiments were well-executed and the results indicate a statistical difference between the G4 array inserted cell line and the mutated modified cell line.

      Weaknesses:

      (1) It would have been nice to have an internal control corresponding to a region known to be folded in several cell lines to compare the level of pG4 signal within their construct with a well-characterised control (for example, the KRAS promoter region).
      (2) The mutations introduced into the G4 sequence may also affect Sp1 or other transcription factor binding sites present in this region, and some of the observations may depend on these sites rather than G4 structures. While this is acknowledged in the text, the conclusion in the title of the paper seems an overstatement.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Chowdhury and co-workers provide interesting data to support the role of G4-structures in promoting chromatin looping and long-range DNA interactions. The authors achieve this by artificially inserting a G4-containing sequence in an isolated region of the genome using CRISPR-Cas9 and comparing it to a control sequence that does not contain G4 structures. Based on the data provided, the authors can conclude that G4-insertion promotes long-range interactions (measured by Hi-C) and affects gene expression (measured by qPCR) as well as chromatin remodelling (measured by ChIP of specific histone markers).

      Whilst the data presented is promising and partially supports the authors' conclusion, this reviewer feels that some key controls are missing to fully support the narrative. Specifically, validation of actual G4-formation in chromatin by ChIP-qPCR (at least) is essential to support the association between G4-formation and looping. Moreover, this study is limited to a genomic location and an individual G4-sequence used, so the findings reported cannot yet be considered to reflect a general mechanism/effect of G4-formation in chromatin looping.

      Strengths:

      This is the first attempt to connect genomics datasets of G4s and HiC with gene expression. The use of Cas9 to artificially insert a G4 is also very elegant.

      Weaknesses:

      Lack of controls, especially to validate G4-formation after insertion with Cas9. The work is limited to a single G4-sequence and a single G4-site, which limits the generalisation of the findings.

      In the revised version we validated G4 formation inside cells at the insertion site using the reported G4-selective antibody BG4. Significant BG4 binding (by ChIP-qPCR) was clear in the G4-array insert, and not in the G4-mutated insert, supporting formation of G4s by the inserted G4-array (included as Figure S4).

      To directly address the second point, we inserted the G4-sequence, or the mutated control, at a second relatively isolated locus (at the 10 millionth position on Chr12, denoted as 10M site in text). First, BG4 ChIP was done to confirm intracellular G4 formation by the inserted array. BG4 ChIP-qPCR binding was significant within the inserted region, and not in the negative control region (Figure S8), consistent with the 79M locus. Together these demonstrate intracellular G4 formation by inserted sequences at two different loci.

      We next checked the chromatin state of the G4-array inserted at the 10M locus and of its negative control. The histone marks H3K4me1, H3K27ac, H3K27me3, H3K9me3 and H3K4me3 were tested at the G4-array or the negative control locus. An increase in the enhancer histone marks was evident relative to the control sequence. This was largely similar to the 79M locus, supporting an enhancer-like state. Interestingly, here we further noted the presence of the H3K27me3 histone mark. The presence of the H3K27me3 repressor histone mark, along with the H3K4me1/H3K27ac enhancer histone marks, supports a poised enhancer-like status of the inserted G4 region, as has been observed in earlier studies. Together, although data from the two distinct G4 insertion sites support the enhancer-like state, there are contextual differences, likely due to the sequence/chromatin of the sites adjacent to the inserted sequence.

      Activation of most surrounding genes (10 Mb window) was evident with the G4-array insert at the 10M locus, and not with the G4-mutant insert. This is consistent with the enhancer-like state of the G4-array insert and in line with the 79M G4-array insert.

      These results have been added as the final section in the revised version; the data are shown in Figure 7.

      Reviewer #2 (Public Review):

      Summary:

      Roy et al. investigated the role of non-canonical DNA structures called G-quadruplexes (G4s) in long-range chromatin interactions and gene regulation. Introducing a G4 array into chromatin significantly increased the number of long-range interactions, both within the same chromosome (cis) and between different chromosomes (trans). G4s functioned as enhancer elements, recruiting p300 and boosting gene expression even 5 megabases away. The study proposes a mechanism where G4s directly influence 3D chromatin organization, facilitating communication between regulatory elements and genes.

      Strength:

      The findings are valuable for understanding the role of G4-DNA in 3D genome organization and gene transcription.

      Weaknesses:

      The study would benefit from more robust and comprehensive data, which would add depth and clarity.

      (1) Lack of G4 Structure Confirmation: The absence of direct evidence for G4 formation within cells undermines the study's foundation. Relying solely on in vitro data and successful gene insertion is insufficient.

      Using the reported G4-specific antibody, BG4, we performed BG4 ChIP-qPCR at the 79M locus. In addition, a second G4-insertion site was created and BG4 ChIP-qPCR was used to validate intracellular G4 formation. A brief summary is given below; more details are in the response above.

      In the revised version we validated G4 formation inside cells at the insertion site using the reported G4-selective antibody BG4. Significant BG4 binding (by ChIP-qPCR) was clear in the G4-array insert, and not in the G4-mutated insert, supporting formation of G4s by the inserted G4-array (included as Figure S4).

      Further, we inserted the G4-sequence, or the mutated control, at a second relatively isolated locus (at the 10 millionth position on Chr12, denoted as 10M site in text). First, BG4 ChIP was done to confirm intracellular G4 formation by the inserted array. BG4-ChIP-qPCR was significant within the G4-array inserted region, and not in the negative control region (Figure S8), consistent with the 79M locus. Together these demonstrate intracellular G4 formation by inserted sequences at two different loci. Added in revised text in the second and the final sections of results, data shown in Figures 7, S4 and S8.

      (2) Alternative Explanations: The study does not sufficiently address alternative explanations for the observed results. The inserted sequences may not form G4s or other factors like G4-RNA hybrids may be involved.

      As mentioned in response to the previous comment, we confirmed that the inserted sequence indeed forms G4s inside the cells. RNA-DNA hybrid G4s can form within R-loops with two or more tandem G-tracks (G-rich sequences) on the nascent RNA transcript as well as the non-template DNA strand (Fay et al., 2017, 28554731). A recent study has observed that R-loop-associated G4 formation can enhance chromatin looping by strengthening CTCF binding (Wulfridge et al., 2023, 37552993). As pointed out by the reviewer, the possibility of G4-RNA hybrids remains; we have mentioned this possibility for readers in the second-to-last paragraph of the Discussion.

      (3) Limited Data Depth and Clarity: ChIP-qPCR offers limited scope and considerable variation in some data makes conclusions difficult.

      We noted variation with one of the primers in a few ChIP-qPCR experiments (in Figures 2 and 3D). The changes however were statistically significant across replicates, and consistent with the overall trend of the experiments (Figures 2, 3 and 4). Enhancer function, in addition to ChIP, was also confirmed using complementary assays like 3C and RNA expression.

      (4) Statistical Significance and Interpretation: The study could be more careful in evaluating the statistical significance and magnitude of the effects to avoid overinterpreting the results.

      We reconfirmed our statistical calculations from biological replicate experiments. We carefully looked at potential overinterpretations, and made appropriate changes in the manuscript (details of the changes given below in response to comment to authors).

      Reviewer #3 (Public Review):

      Summary:

      This paper aims to demonstrate the role of G-quadruplex DNA structures in the establishment of chromosome loops. The authors introduced an array of G4s spanning 275 bp, naturally found within a very well-characterized promoter region of the hTERT promoter, in an ectopic region devoid of G-quadruplex and annotated gene. As a negative control, they used a mutant version of the same sequence in which G4 folding is impaired. Due to the complexity of the region, 3 G4s on the same strand and one on the opposite strand, 12 point mutations were made simultaneously (G to T and C to A). Analysis of the 3D genome organization shows that the WT array establishes more contact within the TAD and throughout the genome than the control array. Additionally, a slight enrichment of H3K4me1 and p300, both enhancer markers, was observed locally near the insertion site. The authors tested whether the expression of genes located either nearby or up to 5 Mb away was up-regulated based on this observation. They found that four genes were up-regulated from 1.5 to 3-fold. An increased interaction between the G4 array compared to the mutant was confirmed by the 3C assay. For in-depth analysis of the long-range changes, they also performed Hi-C experiments and showed a genome-wide increase in interactions of the WT array versus the mutated form.

      Strengths:

      The experiments were well-executed and the results indicate a statistical difference between the G4 array inserted cell line and the mutated modified cell line.

      Weaknesses:

      The control non-G4 sequence contains 12 point mutations, making it difficult to draw clear conclusions. These mutations not only alter the formation of G4, but also affect at least three Sp1 binding sites that have been shown to be essential for the function of the hTERT promoter, from which the sequence is derived. The strong intermingling of G4 and Sp1 binding sites makes it impossible to determine whether all the observations made are dependent on G4 or Sp1 binding. As a control, the authors used Locked Nucleic Acid probes to prevent the formation of G4. As for mutations, these probes also interfere with two Sp1 binding sites. Therefore, using this alternative method has the same drawback as point mutations. This major issue should be discussed in the paper. It is also possible that other unidentified transcription factor binding sites are affected in the presented point mutants.

      Since the sequence we used to test the effects of G4 structure formation is highly G-rich, we had to introduce at least 12 mutations to be sure that a stable G4 structure would not form in the mutated control sequence. Sp1 has been reported to bind to G4 structures (Raiber et al., 2012). Therefore, Sp1 binding is likely to be associated with the G4-dependent enhancer functions observed here. We also appreciate that apart from Sp1, other unidentified transcription factor binding sites might be affected by the mutations we introduced. We have discussed these possibilities in the fourth paragraph of the Discussion section in the revised manuscript.

      Reviewer #1 (Recommendations For The Authors):

      Whilst the data presented is promising and partially supports the authors' conclusion, this reviewer feels that some key controls are missing to fully support the narrative used. Below are my main concerns:

      (1) The main thing missing in the current manuscript is to validate the actual formation of G4 in chromatin context for the repeat inserted by CRISPR-Cas. Whilst I appreciate this will form promptly a G4 in vitro, to fully support the conclusions proposed the authors would need to demonstrate actual G4-formation in cells after insertion. This could be done by ChIP-qPCR using the G4-selective antibody BG4 for example. This is an essential piece of evidence to be added to link with confidence G4-formation to chromatin looping.

      To address the concern regarding whether the inserted G4 sequence forms G4s in cells, as suggested, we used the G4-selective antibody BG4. PCR primers in the study were designed with multiple points in mind: primers should not bind to any site of G/C alteration in the mutated control insert; either the forward or the reverse primer should lie in the adjacent region for specificity; the amplicons should cover adjacent regions for studying any effects on chromatin; and the PCRs should be optimized for the repeats within the inserted sequence. Given these, primer pairs R1-R4 were chosen for further work following optimizations (Figure 2, top panel). For BG4 ChIP-qPCR we used primer pair R2, which covered >100 bases of the inserted G4-array, or the G4-mutated control. Significant BG4 binding was clear in the G4-array insert, and not in the G4-mutated insert, demonstrating formation of G4s by the inserted G4-array (Figure S4).
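For readers less familiar with how ChIP-qPCR enrichment of the kind reported for BG4 is typically quantified, the following is a minimal sketch of the common percent-of-input calculation. The response does not state which normalization was used, so the method, the function names, and the example Ct values here are assumptions for illustration only, and 100% primer efficiency is assumed.

```python
import math


def percent_input(ct_ip: float, ct_input: float, input_fraction: float) -> float:
    """Percent-of-input signal for one ChIP-qPCR reaction.

    ct_ip          -- Ct of the immunoprecipitated sample
    ct_input       -- Ct of the input sample
    input_fraction -- fraction of chromatin kept as input (e.g. 0.01 for 1%)
    Assumes perfect doubling per cycle (100% primer efficiency).
    """
    # Shift the input Ct to the value expected if all chromatin had been used.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def fold_over_control(pct_test: float, pct_control: float) -> float:
    """Ratio of percent-input values, e.g. G4-array insert over mutated insert."""
    return pct_test / pct_control


# Hypothetical Ct values, not taken from the study.
g4_array = percent_input(ct_ip=27.0, ct_input=24.0, input_fraction=0.01)
mutated = percent_input(ct_ip=29.0, ct_input=24.0, input_fraction=0.01)
print(g4_array, mutated, fold_over_control(g4_array, mutated))
```

Comparing the G4-array insert with the G4-mutated insert as a ratio of percent-input values keeps the comparison independent of how much chromatin went into each immunoprecipitation.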

      In response to comment #3 below, we inserted the G4-forming sequence (or its mutated control) at a second locus. This insertion was near the 10 millionth position of chromosome 12 (10M insertion locus in text). Here also, BG4 binding was significant within the G4-array inserted region, and not in the negative control region (Figure S8). Together these demonstrate G4 formation by the inserted sequence at two different loci.

      (2) I found the LNA experiment very elegant. However, what would be the effect of LNA treatment on the control sequence that does not form G4s? This control is essential to disentangle the effect of LNA pairing to the sequence itself vs disrupting the G4-structure.

      As per the reviewer’s suggestion, we performed a control experiment where we treated the G4-mutated insert (control) cells with the G4-disrupting LNA probes. The changes in the expression of the surrounding genes in this case were not significant, indicating that the effects observed in the G4-array insert cells were possibly due to disruption of the inserted G4 structures. This data is presented in Figure S5.

      (3) The authors describe their work and present its conclusion as if this were a genome-wide study, whilst the work is focused on a specific genomic location, and the looping, along with the effect on histone acetylation and gene expression, is limited to this. The authors cannot conclude, therefore, that this is a generic effect and the discussion should be more focused on the specific G4s used and the genomic location investigated. Ideally, insertion of a different G4-forming sequence or of the same in a different genomic location is recommended to really claim a generic effect.

      To address this we inserted the G4-array sequence, or the G4-mutated control sequence, at another relatively isolated locus – at the 10 millionth position of chromosome 12 – denoted as 10M. Using BG4 ChIP-qPCR intracellular G4 formation was confirmed. We observed that the enhancer-like features in terms of enhancer histone marks and increase in the expression of surrounding genes were largely reproduced at the 10M locus on G4 insertion (Figure 7). These results are added as the final section under Results.

      Reviewer #2 (Recommendations For The Authors):

      The study proposes a mechanism where G4s directly influence 3D chromatin organization, facilitating communication between regulatory elements and genes.

      While the present manuscript presents an interesting hypothesis, it would benefit from enhanced novelty and more robust data. The study complements existing G4 research (e.g., PMID: 31177910). While the conclusions hold biological relevance, they largely reiterate established knowledge. Furthermore, the presented data appear preliminary and still lack depth and clarity.

      Hou et al., 2019 (PMID: 31177910) showed that the presence of potential G4-forming sequences correlated with TAD boundaries, along with enrichment of architectural proteins and transcription factor binding sites. Other studies also noted enrichment of potential G4-forming sequences at enhancers along with nucleosome depletion and higher transcription factor binding (Hou et al., 2021; Williams et al., 2020). These studies proposed a role of G4s in chromatin/TAD states based on correlative bioinformatics analyses of potential G4-forming sequences. Here we sought to directly test causality. Insertion of a G4 sequence, and formation of intracellular G4s in an isolated, G4-depleted region, resulted in altered characteristics of chromatin; this was not seen with the negative control insertion that does not form G4s. These results, in contrast to earlier studies, directly demonstrate the causal role of G4s as functional elements that impact local and distant chromatin.

      Major concerns:

      (1) Lack of G4 Structure Confirmation: Implement G4-specific antibodies or fluorescent probes to verify G4 structures inside the cells.

      Detailed response given above. Briefly, in the revised version we validated G4 formation inside cells at the insertion site using the reported G4-selective antibody BG4. Significant BG4 binding (by ChIP-qPCR) was clear in the G4-array insert, and not in the G4-mutated insert, supporting formation of G4s by the inserted G4-array (included as Figure S4).

      Further, we inserted the G4-sequence, or the mutated control, at a second relatively isolated locus (at the 10 millionth position on Chr12, denoted as 10M site in text). First, BG4 ChIP was done to confirm intracellular G4 formation by the inserted array. BG4 ChIP-qPCR binding was significant within the G4-array inserted region, and not in the negative control region (Figure S8), consistent with the 79M locus. Together these demonstrate intracellular G4 formation by inserted sequences at two different loci. Added in revised text in the second and the final sections of results, data shown in Figures 7, S4 and S8.

      (2) Alternative Explanations: Explore the possibility that the sequences may not form G4s or that other factors like G4-RNA hybrids are involved.

      Response provided in the public reviews section.

      (3) Limited Data Depth and Clarity: ChIP-qPCR offers limited scope. Consider employing G4 ChIP-seq for genome-wide analysis of G4 association with histone modifications. Address inconsistencies in data like H3K27me3 variation and incomplete H3K9me3 data sets.

      A recent study performed G4 CUT&Tag (Lyu et al., 2022, 34792172) and observed G4 formation at both active promoters and active and poised enhancers. We have discussed this in the sixth paragraph of the Discussion. The H3K27Me3 occupancy at the 79M locus insertions did not show any significant G4-dependent changes; however, at the second insertion site at the 10M locus (introduced in the revised manuscript, Figure 7) there was a significant G4-dependent increase in H3K27Me3 occupancy along with the H3K4Me1 and H3K27Ac enhancer histone marks, indicating formation of a poised enhancer-like element.

      We completed the H3K9me3 data sets for both insertion sites.

      (4) Statistical Significance and Interpretation: Re-evaluate the statistical significance of results and interpret them in the context of relevant biological knowledge. Avoid overstating the impact of minor changes.     

      We revised several lines to avoid overstating results. Some of the revised lines are given below:

      - There was a relatively modest increase in the recruitment of p300 and a substantial increase in the recruitment of the more functionally active acetylated p300/CBP to the G4-array when compared against the mutated control.

      - As expected, although modest, a decrease in the H3K4Me1 and H3K27Ac enhancer histone modifications was evident within the insert upon the LNAs treatment.

      - Moreover, the enhancer marks were relatively reduced, although not markedly, when the inserted G4s were specifically disrupted.

      (5) Unexplored Aspects: Investigate the relationship between G4 DNA and R-loops, and consider the role of CTCF and cohesin proteins in mediating long-range interactions. Integrate existing research to build a more comprehensive framework and draw more robust conclusions.

      As mentioned in response to one of the earlier comments, a recent publication extensively studied the association between G4s, R-loops, and CTCF binding (Wulfridge et al., 2023). While here we focused on the primary features of a potential enhancer, further work will be necessary to establish how G4s influence the coordinated action between cohesin and CTCF and consequent chromatin looping. We have described this for readers in the second-to-last paragraph of the Discussion in the revised version.

      Minor Concern:

      (1) Enhancer Definition: The term "enhancer" requires specific criteria. Modify the section heading or provide evidence demonstrating the G4 sequence fulfills all conditions for being an enhancer, such as position independence and long-range effects.

      Although we checked some of the primary features of a potential enhancer, such as expression of surrounding genes, enhancer histone marks, chromosomal looping interactions, and recruitment of transcriptional coactivators, further aspects may need to be validated. As suggested, in the revised manuscript the section heading has been modified to ‘Enhancer-like features emerged upon insertion of G4s.’

      Reviewer #3 (Recommendations For The Authors):

      In addition to the points in my public review, I would like to mention some less significant points.

      The authors mention that "the array of G4-forming sequences used for insertion was previously reported to form stable G4s in human cells" (Lim et al., 2010; Monsen et al., 2020; Palumbo et al., 2009). However, upon reading the publications, I found that these observations were made in vitro. I may have missed something, but there are now several mappings of folded-G4 in human cells based on different approaches. It would be beneficial to investigate whether the hTERT promoter is a site of G-quadruplex formation in vivo. If confirmed, a similar analysis should be conducted on the 275 bp region inserted into the ectopic region to determine if it also has the ability to form a structured G4.

      We performed BG4 ChIP to confirm in vivo G4 formation by the inserted G4-array as suggested (Figures S4, S8). A detailed response is given above. Briefly, in the revised version we validated G4 formation inside cells at the insertion site using the reported G4-selective antibody BG4. Significant BG4 binding (by ChIP-qPCR) was clear in the G4-array insert, and not in the G4-mutated insert, supporting formation of G4s by the inserted G4-array (included as Figure S4).

      Further, we inserted the G4-sequence, or the mutated control, at a second relatively isolated locus (at the 10 millionth position on Chr12, denoted as 10M site in text). First, BG4 ChIP was done to confirm intracellular G4 formation by the inserted array. BG4-ChIP-qPCR was significant within the inserted region, and not in the negative control region (Figure S8), consistent with the 79M locus. Together these demonstrate intracellular G4 formation by inserted sequences at two different loci. Added in revised text in the second and the final sections of results, data shown in Figures 7, S4 and S8.

      The inserted sequence originates from a well-characterized promoter. The authors suggest that placing it in an ectopic position creates an enhancer-like region, based on the observation of increased levels of H3K27Ac and H3K4me1 on the WT array. To provide a control that it is not a promoter, it would be useful to also analyze a specific mark of promoter activity, such as H3K4me3.

      As suggested by the reviewer, we also analysed the H3K4Me3 promoter activation mark at both the 79M and 10M (introduced in the revised manuscript, Figure 7) insertion loci. We did not observe any significant G4-dependent changes in the recruitment of H3K4Me3 (Figures 2, 7).

      In the discussion, the authors mention "it was proposed that inter-molecular G4 formation between distant stretches of Gs may lead to DNA looping". To investigate this further, it would be worthwhile to examine whether the promoter regions of activated genes (PAWR, PPP1R12A, NAV3, and SLC6A15) contain potentially forming G-quadruplexes (pG4). Additionally, sites that establish more contact with the G4 array described in Figure 6F could be analyzed for enrichment in pG4.

      Thank you for pointing this out. We found that the promoters of the four genes (PAWR, PPP1R12A, NAV3, and SLC6A15) harbour potential G4-forming sequences (pG4s). Also, as suggested, we analysed the contact regions in Figure 6F, along with the whole locus, for pG4s. Relative enrichment in pG4s was seen, particularly within the significantly enhanced interacting regions, and at times it extended beyond the interacting regions. This is shown in the lower panel of Figure 6F in the revised version. We have described this in the Discussion for readers.
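As an illustration of what a pG4 scan of promoter or contact regions can look like, here is a minimal sketch using the widely used canonical motif of four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The response does not say which tool or motif definition was used, so this pattern, the function name `find_pg4`, and the toy sequences are assumptions, not the authors' pipeline.

```python
import re

# Canonical pG4 motif on the given strand: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+.
G_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")
# The same motif on the opposite strand appears as C-runs in the given strand.
C_PATTERN = re.compile(r"C{3,}(?:[ACGT]{1,7}C{3,}){3}")


def find_pg4(seq: str) -> list[tuple[int, int, str]]:
    """Return (start, end, strand) for non-overlapping pG4 motif matches.

    Coordinates are 0-based, end-exclusive, relative to the given sequence.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", G_PATTERN), ("-", C_PATTERN)):
        for match in pattern.finditer(seq):
            hits.append((match.start(), match.end(), strand))
    return sorted(hits)


# Toy sequences (hypothetical, not the hTERT-derived insert).
wild_type = "TTGGGAGGGTGGGAGGGTT"
mutated = "TTGTGAGTGTGTGAGTGTT"  # G-to-T changes break every G-run

print(find_pg4(wild_type))
print(find_pg4(mutated))
```

A regular-expression scan only flags sequence potential; whether a G4 actually folds in cells still requires experimental evidence such as BG4 ChIP.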

    1. eLife assessment

      This important study addresses the idea that defective lysosomal clearance might be causal to renal dysfunction in cystinosis. They observe that restoring expression of vATPase subunits and treatment with Astaxanthin ameliorate mitochondrial function in a model of renal epithelial cells, opening opportunities for translational application to humans. The data are convincing, but the description of methodologies is incomplete.

    2. Reviewer #1 (Public Review):

      Cystinosis is a rare hereditary disease caused by biallelic loss of the CTNS gene, encoding two cystinosin protein isoforms; the main isoform is expressed in lysosomal membranes where it mediates cystine efflux whereas the minor isoform is expressed at the plasma membrane and in other subcellular organelles. Sur et al proceed from the assumption that the pathways driving the cystinosis phenotype in the kidney might be identified by comparing the transcriptome profiles of normal vs CTNS-mutant proximal tubular cell lines. They argue that key transcriptional disturbances in mutant kidney cells might not be present in non-renal cells such as CTNS-mutant fibroblasts.

      Using cluster analysis of the transcriptomes, the authors selected a single vacuolar H+ATPase (ATP6V0A1) for further study, asserting that it was the "most significantly downregulated" vacuolar H+ATPase (about 58% of control) among a group of similarly downregulated H+ATPases. They then showed that exogenous ATP6V0A1 improved CTNS(-/-) RPTEC mitochondrial respiratory chain function and decreased autophagosome LC3-II accumulation, characteristic of cystinosis. The authors then treated mutant RPTECs with 3 "antioxidant" drugs, cysteamine, vitamin E, and astaxanthin (ATX). ATX (but not the other two antioxidant drugs) appeared to improve ATP6V0A1 expression, LC3-II accumulation, and mitochondrial membrane potential. Respiratory chain function was not studied. RPTEC cystine accumulation was not studied.

      The major strengths of this manuscript reside in its two primary findings.

      (1) Plasmid expression of exogenous ATP6V0A1 improves mitochondrial integrity and reduces aberrant autophagosome accumulation.

      (2) Astaxanthin partially restores suboptimal endogenous ATP6V0A1 expression.

      Taken together, these observations suggest that astaxanthin might constitute a novel therapeutic strategy to ameliorate defective mitochondrial function and lysosomal clearance of autophagosomes in the cystinotic kidney. This might act synergistically with the current therapy (oral cysteamine) which facilitates defective cystine efflux from the lysosome.

      There are, however, several weaknesses in the manuscript.

      (1) The reductive approach that led from transcriptional profiling to focus on ATP6V0A1 is not transparent and weakens the argument that potential therapies should focus on correction of this one molecule vs the other H+ ATPase transcripts that were equally reduced - or transcripts among the 1925 belonging to at least 11 pathways disturbed in mutant RPTECs.

      (2) A precise description of primary results is missing -- the Results section is preceded by or mixed with extensive speculation. This makes it difficult to dissect valid conclusions from those derived from less informative experiments (eg data on CDME loading, data on whole-cell pH instead of lysosomal pH, etc).

      (3) Data on experimental approaches that turned out to be uninformative (eg CDME loading, or data on whole-cell pH assessment with BCECF).

      (4) The rationale for the study of ATX is unclear and the mechanism by which it improves mitochondrial integrity and autophagosome accumulation is not explored (but does not appear to depend on its anti-oxidant properties).

      (5) Thoughtful discussion on the lack of effect of ATP6V0A1 correction on cystine efflux from the lysosome is warranted, since this is presumably sensitive to intralysosomal pH.

      (6) Comparisons between RPTECs and fibroblasts cannot take into account the effects of immortalization on cell phenotype (not performed in fibroblasts).

      This work will be of interest to the research community but is self-described as a pilot study. It remains to be clarified whether transient transfection of RPTECs with other H+ATPases could achieve results comparable to ATP6V0A1. Some insight into the mechanism by which ATX exerts its effects on RPTECs is needed to understand its potential for the treatment of cystinosis.

    3. Reviewer #2 (Public Review):

      Sur and colleagues investigate the role of ATP6V0A1 in mitochondrial function in cystinotic proximal tubule cells. They propose that loss of cystinosin downregulates ATP6V0A1 resulting in acidic lysosomal pH loss, and adversely modulates mitochondrial function and lifespan in cystinotic RPTECs. They further investigate the use of a novel therapeutic Astaxanthin (ATX) to upregulate ATP6V0A1 that may improve mitochondrial function in cystinotic proximal tubules.

      The new information regarding the specific proximal tubular injuries in cystinosis identifies potential molecular targets for treatment. As such, the authors are advancing the field in an experimental model for potential translational application to humans.

    4. Author response:

      eLife assessment

      This important study addresses the idea that defective lysosomal clearance might be causal to renal dysfunction in cystinosis. They observe that restoring expression of vATPase subunits and treatment with Astaxanthin ameliorate mitochondrial function in a model of renal epithelial cells, opening opportunities for translational application to humans. The data are convincing, but the description of methodologies is incomplete.

      Public Reviews:

      Reviewer #1 (Public Review):

      Cystinosis is a rare hereditary disease caused by biallelic loss of the CTNS gene, encoding two cystinosin protein isoforms; the main isoform is expressed in lysosomal membranes where it mediates cystine efflux whereas the minor isoform is expressed at the plasma membrane and in other subcellular organelles. Sur et al proceed from the assumption that the pathways driving the cystinosis phenotype in the kidney might be identified by comparing the transcriptome profiles of normal vs CTNS-mutant proximal tubular cell lines. They argue that key transcriptional disturbances in mutant kidney cells might not be present in non-renal cells such as CTNS-mutant fibroblasts.

      Using cluster analysis of the transcriptomes, the authors selected a single vacuolar H+ATPase (ATP6V0A1) for further study, asserting that it was the "most significantly downregulated" vacuolar H+ATPase (about 58% of control) among a group of similarly downregulated H+ATPases. They then showed that exogenous ATP6V0A1 improved CTNS(-/-) RPTEC mitochondrial respiratory chain function and decreased autophagosome LC3-II accumulation, characteristic of cystinosis. The authors then treated mutant RPTECs with 3 "antioxidant" drugs, cysteamine, vitamin E, and astaxanthin (ATX). ATX (but not the other two antioxidant drugs) appeared to improve ATP6V0A1 expression, LC3-II accumulation, and mitochondrial membrane potential. Respiratory chain function was not studied. RPTEC cystine accumulation was not studied.

      In this manuscript, as an initial step, we have studied the first step in respiratory chain function by performing the Seahorse Mito Stress Test to demonstrate that the genetic manipulation (knocking out the CTNS gene and plasmid-mediated expression correction of ATP6V0A1) impacts mitochondrial energetics. We did not investigate the respirometry-based assays that can identify locations of electron transport deficiency, which we plan to address in a follow-up paper.

      We would like to draw attention to Figure 3D, where cystine accumulation has been studied. This figure demonstrates an increased intracellular accumulation of cystine.

      The major strengths of this manuscript reside in its two primary findings.

      (1) Plasmid expression of exogenous ATP6V0A1 improves mitochondrial integrity and reduces aberrant autophagosome accumulation.

      (2) Astaxanthin partially restores suboptimal endogenous ATP6V0A1 expression.

      Taken together, these observations suggest that astaxanthin might constitute a novel therapeutic strategy to ameliorate defective mitochondrial function and lysosomal clearance of autophagosomes in the cystinotic kidney. This might act synergistically with the current therapy (oral cysteamine) which facilitates defective cystine efflux from the lysosome.

      There are, however, several weaknesses in the manuscript.

      (1) The reductive approach that led from transcriptional profiling to focus on ATP6V0A1 is not transparent and weakens the argument that potential therapies should focus on correction of this one molecule vs the other H+ ATPase transcripts that were equally reduced - or transcripts among the 1925 belonging to at least 11 pathways disturbed in mutant RPTECs.

      The transcriptional profiling studies on ATP6V0A1 have been fully discussed and publicly shared. Table 2 lists the v-ATPase transcripts that are significantly downregulated in cystinosis RPTECs. We have also clarified and justified the choice of further studies on ATP6V0A1, where we state the following: "The most significantly perturbed member of the V-ATPase gene family found to be downregulated in cystinosis RPTECs is ATP6V0A1 (Table 2). Therefore, further attention was focused on characterizing the role of this particular gene in a human in vitro model of cystinosis."

      (2) A precise description of primary results is missing -- the Results section is preceded by or mixed with extensive speculation. This makes it difficult to dissect valid conclusions from those derived from less informative experiments (eg data on CDME loading, data on whole-cell pH instead of lysosomal pH, etc).

      We appreciate the reviewer highlighting areas for further improving the manuscript's readership. In our resubmission, we have revised the results section to provide a more precise description of the primary findings and restrict the inferences to the discussion section only.

      (3) Data on experimental approaches that turned out to be uninformative (eg CDME loading, or data on whole-cell pH assessment with BCECF).

      We have provided all data, whether informative or uninformative. Although a lysosome-specific pH measurement would have been valuable, it was not possible in our cells because they were very sick and the assay did not work. Hence we provide pH assessment with BCECF, which reports whole-cell pH, that is, the combined pH of the cytoplasm and organelles.

      (4) The rationale for the study of ATX is unclear and the mechanism by which it improves mitochondrial integrity and autophagosome accumulation is not explored (but does not appear to depend on its anti-oxidant properties).

      We have provided the rationale for the study of ATX in the Introduction and Results sections, where we state the following: “correction of ATP6V0A1 in CTNS-/- RPTECs and treatment with antioxidants specifically, astaxanthin (ATX) increased the production of cellular ATP6V0A1, identified from a custom FDA-drug database generated by our group, partially rescued the nephropathic RPTEC phenotype. ATX is a xanthophyll carotenoid occurring in a wide variety of organisms. ATX is reported to have the highest known antioxidant activity and has proven to have various anti-inflammatory, anti-tumoral, immunomodulatory, anti-cancer, and cytoprotective activities both in vivo and in vitro”.

      We are still investigating the mechanism by which ATX improves mitochondrial integrity and this will be the focus of a follow-on manuscript.

      (5) Thoughtful discussion on the lack of effect of ATP6V0A1 correction on cystine efflux from the lysosome is warranted, since this is presumably sensitive to intralysosomal pH.

      We have provided a thoughtful discussion in the revised manuscript on some possible mechanisms that may result in an effect of ATP6V0A1 correction on cystine efflux from the lysosome.

      (6) Comparisons between RPTECs and fibroblasts cannot take into account the effects of immortalization on cell phenotype (not performed in fibroblasts).

      The purpose of examining different tissue sources of primary cells in nephropathic cystinosis was to assess if any of the changes in these cells were tissue source specific. We used primary cells isolated from patients with nephropathic cystinosis—RPTECs from patients' urine and fibroblasts from patients' skin—these cells are not immortalized and can therefore be compared. This is noted in the results section - “Specific transcriptional signatures are observed in cystinotic skin-fibroblasts and RPTECs obtained from the same individual with cystinosis versus their healthy counterparts”.

      We next utilized the immortalized RPTEC cell line to create CRISPR-mediated CTNS knockout RPTECs as a resource for studying the pathophysiology of cystinosis. These cells were not compared to the primary fibroblasts.

      (7) This work will be of interest to the research community but is self-described as a pilot study. It remains to be clarified whether transient transfection of RPTECs with other H+ATPases could achieve results comparable to ATP6V0A1. Some insight into the mechanism by which ATX exerts its effects on RPTECs is needed to understand its potential for the treatment of cystinosis.

      In future studies we will further investigate the effect of ATX on RPTECs for treatment of cystinosis- this will require the conduct of Phase 1 and Phase 2 clinical studies which are beyond the scope of this current manuscript.

      Reviewer #2 (Public Review):

      Sur and colleagues investigate the role of ATP6V0A1 in mitochondrial function in cystinotic proximal tubule cells. They propose that loss of cystinosin downregulates ATP6V0A1, resulting in loss of acidic lysosomal pH, and adversely modulates mitochondrial function and lifespan in cystinotic RPTECs. They further investigate the use of a novel therapeutic, astaxanthin (ATX), to upregulate ATP6V0A1, which may improve mitochondrial function in cystinotic proximal tubules.

      The new information regarding the specific proximal tubular injuries in cystinosis identifies potential molecular targets for treatment. As such, the authors are advancing the field in an experimental model for potential translational application to humans.

    1. eLife assessment

      This study provides valuable findings that improve our understanding of the evolutionary conservation of the role of DDX6 in mRNA decay. The evidence supporting the authors' conclusions is convincing. This work will be of interest to molecular, cell biologists and biochemists, especially those studying RNA.

    2. Reviewer #1 (Public Review):

      Weber et al. investigated the role of human DDX6 in messenger RNA decay using CRISPR/Cas9-mediated knockout (KO) HEK293T cells. The authors showed that stretches of rare codons or codons known to cause ribosome stalling in reporter mRNAs lead to a DDX6-specific loss of mRNA decay. The authors moved on to show that there is a physical interaction between DDX6 and the ribosome. Using co-immunoprecipitation (co-IP) experiments, the authors determined that the FDF-binding surface of DDX6 is necessary for binding to the ribosome, the same domain which is necessary for binding several decapping factors such as EDC3, LSM14A, and PatL. However, they determine that the interaction between DDX6 and the ribosome is independent of the DDX6 interaction with the NOT1 subunit of the CCR4-NOT complex. Interestingly, the authors were able to determine that all known functional domains, including the ATPase activity of DDX6, are required for its effect on mRNA decay. Using ribosome profiling and RNA-sequencing, the authors were able to identify a group of 260 mRNAs that exhibit increased translational efficiency (TE) in DDX6 knockout cells, suggesting that DDX6 translationally represses certain mRNAs. The authors determined that this group of mRNAs has decreased GC content, which has been previously noted to coincide with low codon optimality; the authors thus conclude that DDX6 may translationally repress transcripts of low codon optimality. Furthermore, the authors identify 35 transcripts that are both upregulated in DDX6 KO cells and exhibit locally increased ribosome footprints (RBFs), suggestive of a ribosome stalling sequence. Lastly, the authors showed that both endogenous DDX6 and tethering of DDX6 to reporter mRNAs with and without these translational stalling sequences lead to a relative increase in ribosome association to a transcript. Overall, this work confirms that the role of DDX6 in mRNA decay shares several conserved features with the yeast homolog Dhh1. Dhh1 is known to bind slow-moving ribosomes and lead to the differential decay of non-optimal mRNA transcripts (Radhakrishnan et al. 2016). The novelty of this work lies primarily in the identification of the physical interaction between DDX6 and the ribosome and the breakdown of which domains of DDX6 are necessary for this interaction. This work provides major insight into the role of the human DDX6 in the process of mRNA decay and emphasizes the evolutionary conservation of this process across Eukaryotes.
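
      The translational efficiency comparison mentioned above can be illustrated with a minimal sketch. It assumes TE is taken as the ratio of ribosome footprint signal to mRNA abundance for a gene, and that inputs are already normalized counts; the function name, argument names, and pseudocount are hypothetical and are not taken from the manuscript.

```python
import math

def log2_te_change(rpf_ko, rna_ko, rpf_wt, rna_wt, pseudocount=1.0):
    """Log2 fold change in translational efficiency (KO versus wild type).

    TE is approximated here as footprint signal divided by mRNA signal.
    The pseudocount is an assumed safeguard against division by zero.
    """
    te_ko = (rpf_ko + pseudocount) / (rna_ko + pseudocount)
    te_wt = (rpf_wt + pseudocount) / (rna_wt + pseudocount)
    return math.log2(te_ko / te_wt)

# Hypothetical gene: footprints rise fourfold in KO while mRNA is unchanged.
change = log2_te_change(rpf_ko=399, rna_ko=99, rpf_wt=99, rna_wt=99)
```

      A positive value indicates higher TE in the knockout, which is the direction reported for the 260 mRNAs; a real analysis would also need replicates and a statistical test.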

      Overall, the work done by Weber et al. is sound, with the proper controls. The authors expand significantly on the knowledge of what we know about DDX6 in the process of mRNA decay in humans, confirming the evolutionary conservation of the role of this factor across eukaryotes. The analysis of the RNA-seq and Ribo-seq data could be more in-depth, however, the authors were able to show with certainty that some transcripts containing known repetitive sequences or polybasic sequences exhibited a DDX6-mRNA decay effect.

    3. Reviewer #2 (Public Review):

      In the manuscript by Weber and colleagues, the authors investigated the role of a DEAD-box helicase DDX6 in regulating mRNA stability upon ribosome slowdown in human cells. The authors knocked out DDX6 in HEK293T cells and showed that the half-life of a reporter containing a rare codon repeat is extended in the absence of DDX6. By analogy to the proposed function of fission yeast Dhh1p (DDX6 homolog) as a sensor for slow ribosomes, the authors demonstrated that recombinant DDX6 interacted with human ribosomes. The interaction with the ribosome was mediated by the FDF motif of DDX6 located in its RecA2 domain, and rescue experiments showed that DDX6 requires the FDF motif as well as its interaction with the CCR4-NOT deadenylase complex and ATPase activity for degrading a reporter mRNA with rare codons. To identify endogenous mRNAs regulated by DDX6, they performed RNA-Seq and ribosome footprint profiling. The authors focused on mRNAs whose stability is increased in DDX6 KO cells with high local ribosome density and validated that such mRNA sequences induced mRNA degradation in a DDX6-dependent manner.

      The experiments were well-performed, and the results clearly demonstrated the requirement of DDX6 in mRNA degradation induced by slowed ribosomes.

      [Editors' note: The authors have addressed the key points from the previous public reviews in their revised manuscript.]

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Weaknesses:

      The authors fail to truly define codon optimality, rare codons, and stalling sequences in their work, all of which are distinct terminologies. They use reporters with rare codon usage but do not mention what metrics they use to determine this, such as cAI, codon usage bias, or tAI. The distinction between the type of codon sequences that DDX6 affects is very important to differentiate and should be done here as certain stretches of codons are known to lead to different quality control RNA decay pathways that are not reliant on canonical mRNA decay factors.

      Thank you for the reviewer’s feedback on our work. Clearly defining codon optimality, rare codons, and stalling sequences is indeed crucial. We will emphasize this distinction more in our revisions to help readers better understand our analysis and findings.
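
      To make concrete one of the metrics the reviewer names, the sketch below computes a codon adaptation index (CAI) in its standard form: each codon gets a weight equal to its usage divided by the usage of the most frequent synonymous codon, and the CAI of a sequence is the geometric mean of those weights. The codon counts and the two synonymous families are made-up toy values, not data from the manuscript, and all function names are hypothetical.

```python
import math

def relative_adaptiveness(codon_counts, synonymous_families):
    """Weight of each codon = its count / count of the most used synonym."""
    weights = {}
    for family in synonymous_families:
        top = max(codon_counts[c] for c in family)
        for codon in family:
            weights[codon] = codon_counts[codon] / top
    return weights

def cai(sequence, weights):
    """Geometric mean of codon weights over an in-frame coding sequence."""
    usable = len(sequence) - len(sequence) % 3
    codons = [sequence[i:i + 3] for i in range(0, usable, 3)]
    # Codons without a weight (e.g. stop codons) are skipped.
    scored = [weights[c] for c in codons if c in weights]
    return math.exp(sum(math.log(w) for w in scored) / len(scored))

# Toy reference usage (assumed, strictly positive counts).
counts = {"CTG": 40, "CTA": 10, "AAG": 30, "AAA": 30}
families = [["CTG", "CTA"], ["AAG", "AAA"]]
w = relative_adaptiveness(counts, families)
```

      With these toy counts, a run of `CTA` codons scores lower than a run of `CTG` codons. A low CAI marks rare codon usage relative to a reference set, which is related to but not the same as low optimality measured by tAI or by mRNA stability.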

      Likewise, the authors sort their Ribo-seq data to determine genes that might exhibit a DDX6 specific mRNA decay effect but fail to go into great depth about common features shared among these genes other than GO term analysis, GC content, and coding sequence (CDS) length. The authors then sort out 35 genes that are both upregulated at the mRNA level and have increased local ribosome footprint along the ORF. They are then able to show that 6 out of 9 of those genes had a DDX6-dependent mRNA decay effect. There was no comment or effort as to why 2 out of those 6 genes tested did not show as strong of a DDX6-dependent decay effect relative to the other targets tested. Thus, the efforts to identify mRNA features at a global level that exhibited DDX6-dependent mRNA decay effects are lacking in this analysis.

      We appreciate the reviewer's insightful comments regarding the need to further characterize the genes influenced by DDX6-mediated mRNA decay. To address this, we carried out additional analyses to identify potential traits of these genes. Our findings revealed that DDX6-regulated coding sequences tend to be longer and exhibit lower predicted mRNA stability scores compared to the average across the transcriptome. This observation indicates a possible connection to codon optimality. It suggests that DDX6 could play a role in regulating a specific subset of mRNAs with inherently lower stability, potentially shedding light on why some genes may exhibit varied decay patterns when DDX6 is depleted.

      Overall, the work done by Weber et al. is sound, with the proper controls. The authors expand significantly on the knowledge of what we know about DDX6 in the process of mRNA decay in humans, confirming the evolutionary conservation of the role of this factor across eukaryotes. The analysis of the RNA-seq and Ribo-seq data could be more in-depth, however, the authors were able to show with certainty that some transcripts containing known repetitive sequences or polybasic sequences exhibited a DDX6-mRNA decay effect.

      We appreciate the reviewer’s acknowledgment of the soundness of our work and the inclusion of proper controls. We are committed to refining our manuscript to meet your expectations and ensure the accuracy and depth of our findings.

      Reviewer #2 (Public Review):

      The experiments were well-performed, and the results clearly demonstrated the requirement of DDX6 in mRNA degradation induced by slowed ribosomes. However, in some cases, the authors interpreted their data in a biased way, possibly influenced by the yeast study, and drew too strong conclusions. In addition, the authors should have cited important studies about codon optimality in mammalian cells. This lack of information hinders placing their important discoveries in a correct context.

      (1) Although the authors concluded that DDX6 acts as a sensor of the slowed ribosome, it is not clear if DDX6 indeed senses the ribosome speed. What the authors showed is a requirement of DDX6 for mRNA decay induced by rare codons, and DDX6 binds to the ribosome to exert this role. For example, DDX6 may bridge the sensor and decay machinery on the ribosome. Without structural or biochemical data on the recognition of the slowed ribosome by DDX6, the role of DDX6 as a sensor remains one of the possible models. It should be described in the discussion section.

      We greatly appreciate the reviewer’s comments and suggestions. We agree that our study does not directly establish that DDX6 senses ribosome speed. We also agree that without structural or biochemical data demonstrating recognition of the slowed ribosome by DDX6, the role of DDX6 as a sensor remains one of the possible models. We will incorporate this point into the discussion section and acknowledge it as an important direction for future research.

      (2) It is not clear if DDX6 directly binds the ribosome. The authors used ribosomes purified by sucrose cushion, but ribosome-associating and FDF motif-interacting factors might remain on ribosomes, even after RNaseI treatment. Without structural or biochemical data of the direct interaction between the ribosome and DDX6, the authors should avoid description as if DDX6 directly binds to the ribosome.

      We agree with the reviewer’s perspective that, even after RNase I treatment, factors associated with the ribosome and interacting with the FDF motif might still remain on the ribosomes that were purified via a sucrose cushion. In the revised manuscript, we will describe the relationship between DDX6 and the ribosome more cautiously, avoiding the depiction of DDX6 directly binding to the ribosome.

      (3) Although the authors performed rigorous reporter assays recapitulating the effect of ribosome-retardation sequences on mRNA stability, this is not the first report showing that codon optimality determines mRNA stability in human cells. The authors did not cite important previous studies, such as Wu et al., 2019 (PMID: 31012849), Hia et al., 2019 (PMID: 31482640), Narula et al., 2019 (PMID: 31527111), and Forrest et al., 2020 (PMID: 32053646). These milestone papers should be cited in the Introduction, Results, and Discussion.

      Thank you for the reviewer’s correction. We apologize for the oversight in our references. In the revised manuscript, we will ensure these key studies are appropriately cited.

      (4) While both DDX6 and deadenylation by the CCR4-NOT were required for mRNA decay by the slowed ribosome, whether DDX6 is required for deadenylation was not investigated. Given that the CCR4-NOT deadenylase complex directly interacts with the empty ribosome E-site in yeast and humans (Buschauer et al., 2020 PMID: 32299921 and Absmeier et al., 2023 PMID: 37653243), whether the loss of DDX6 also affected the action of the CCR4-NOT complex is an important point to investigate, or at least should be discussed in this paper.

      We sincerely appreciate the reviewer's valuable suggestions. This point is indeed crucial, and we have addressed it in the revised version of our manuscript. We have included experimental results confirming that the knockout of DDX6 does not impact the CCR4-NOT complex’s deadenylation function. This addition will contribute to a more comprehensive discussion of the relevant issues and refine our manuscript.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The authors should explain what they use to determine rare codons in their system and distinguish this feature from codon optimality. Codon optimality is a distinct feature from rare codon usage, and both should be defined better in the context of the paper. The authors interchange between the use of codon optimality, rare codon usage, and translation stalling sequences frequently and should explain and clarify these terms or consider only referring to translation stalling sequences for their discussion.

      We appreciate the reviewer's valuable feedback, which has helped us improve the clarity and rigor of the relevant statements in the manuscript. In the revised manuscript, we have provided more explicit and detailed explanations regarding the definition and use of rare codons, and differentiated this from codon optimality, in order to help readers better understand the basis of our analysis and research findings. Furthermore, in the revised manuscript, we now refer exclusively to 'translation stalling sequences' in our discussion, in order to provide greater clarity.

      Reviewer #2 (Recommendations For The Authors):

      Interestingly, the translation efficiency of zinc-finger domain mRNAs was increased in DDX6 KO cells. This finding is consistent with the previous study reporting that mRNAs encoding zinc-finger domains are enriched with non-optimal codons and unstable. (Diez et al., 2022 PMID: 35840631). The authors might want to cite this paper and mention the consistency of the two studies.

      Thank you for noting the relevance of the increased translation efficiency of zinc-finger domain mRNAs in DDX6 KO cells. We will reference the study by Diez et al. (2022) and emphasize the consistency between their findings and ours, which supports the idea that DDX6 is involved in regulating the translation of mRNAs with these characteristics.

      A mutagenesis analysis of the poly-basic residues of BMP2 would further strengthen the authors' claim that this sequence is a primal cause of ribosome slowdown and mRNA decay.

      We greatly appreciate the reviewer’s suggestion to conduct a mutagenesis analysis of the poly-basic residues of BMP2. We agree that such an analysis could potentially strengthen our claim. However, considering the constraints we are currently encountering, and given that our study has already provided substantial evidence to support our findings, we believe that conducting this analysis is not the most immediate priority at this stage of our research. We will consider undertaking a mutagenesis analysis in future studies to further validate our conclusions.

      In the Introduction, RQC is not commonly referred to as "ribosome-based quality control." Please consider the use of "ribosome-associated quality control."

      We appreciate the reviewer providing this suggestion. During the revision process, we corrected the relevant terminology to ensure more precise and appropriate usage.

      In the Introduction, the authors should avoid introducing NMD as a part of RQC. NMD was discovered and defined independently of RQC.

      Thank you for pointing out this important distinction. We recognize that NMD was discovered and defined independently from RQC, and should not be presented as an integral part of the RQC process. In the revised manuscript, we have made sure to avoid introducing nonsense-mediated decay (NMD) as a component of ribosome-associated quality control (RQC).

    1. eLife assessment

      This study presents a useful description of RNA in extracellular vesicles (EV-RNAs) and highlights the potential to develop biomarkers for the early detection of colorectal cancer (CRC) and precancerous adenoma (AA). The data were analysed using overall solid methodology and would benefit from further validation of predicted lncRNAs and biomarker validation at each stage of CRC/AA to evaluate the potential application to early detection of CRC and AA.

    2. Joint Public Review:

      Detection of early-stage colorectal cancer is of great importance. Laboratory scientists and clinicians have reported different exosomal biomarkers to identify colorectal cancer patients. This is a proof-of-principle study of whether exosomal RNAs, and particularly predicted lncRNAs, are potential biomarkers of early-stage colorectal cancer and its precancerous lesions.

      Strengths:

      The study provides a valuable dataset of the whole-transcriptomic profile of circulating sEVs, including miRNA, mRNA, and lncRNA. This approach adds to the understanding of sEV-RNAs' role in CRC carcinogenesis and facilitates the discovery of potential biomarkers.

      The developed 60-gene t-SNE model successfully differentiated T1a stage CRC/AA from normal controls with high specificity and sensitivity, indicating the potential of sEV-RNAs as diagnostic markers for early-stage colorectal lesions.

      The study combines RNA-seq, RT-qPCR, and modelling algorithms to select and validate candidate sEV-RNAs, maximising the performance of the developed RNA signature. The comparison of different algorithms and consideration of other factors enhance the robustness of the findings.

      Weaknesses:

      Validation in larger cohorts would be required to establish these RNAs as biomarkers and to demonstrate whether the predicted lncRNAs implicated in these biomarkers are indeed present and whether they are robustly predictive/prognostic.

      The following points were noted during preprint review:

      (1) Lack of analysis on T1-only patients in the validation cohort: While the study identifies key sEV-RNAs associated with T1a stage CRC and AA, only about half of the patients in the validation cohort are T1 (25 out of 49). It would be better to do an analysis using only the T1 patients in the validation cohort, so the conclusion is not affected by the T2-T3 patients.

      (2) Lack of performance analysis across different demographic and tumor pathology factors listed in Supplementary Table 12. It's important to know if the sEV-RNAs identified in the study work better/worse in different age/sex/tumor size/Yamada subtypes etc.

      (3) The authors tested their models in a medium size population of 124 individuals, which is not enough to obtain an accurate evaluation of the specificity and sensitivity of the biomarkers proposed here. External validation would be required.

      (4) Depicting the full RNA landscape of circulating exosomes is still quite challenging. The authors annotated 58,333 RNA species in exosomes, most of which were lncRNAs, with annotation methods briefly described in Suppl Methods.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1:

      Detection of early-stage colorectal cancer is of great importance. Laboratory scientists and clinicians have reported different exosomal biomarkers to identify colorectal cancer patients. This is a proof-of-principle study of whether exosomal RNAs, and particularly predicted lncRNAs, are potential biomarkers of early-stage colorectal cancer and its precancerous lesions.

      Strengths:

      The study provides a valuable dataset of the whole-transcriptomic profile of circulating sEVs, including miRNA, mRNA, and lncRNA. This approach adds to the understanding of sEV-RNAs' role in CRC carcinogenesis and facilitates the discovery of potential biomarkers.

      The developed 60-gene t-SNE model successfully differentiated T1a stage CRC/AA from normal controls with high specificity and sensitivity, indicating the potential of sEV-RNAs as diagnostic markers for early-stage colorectal lesions.

      The study combines RNA-seq, RT-qPCR, and modelling algorithms to select and validate candidate sEV-RNAs, maximising the performance of the developed RNA signature. The comparison of different algorithms and consideration of other factors enhance the robustness of the findings.

      Weaknesses:

      Validation in larger cohorts would be required to establish these RNAs as biomarkers, and to demonstrate whether the predicted lncRNAs implicated in these biomarkers are indeed present, and whether they are robustly predictive/prognostic.

      Thank you for your careful evaluation and valuable suggestions, which have provided valuable guidance for the improvement of our paper. In response to your feedback, we have implemented the following improvements.

      (1) More detail about how lncRNA and miRNA candidates were defined, and how this compares to previously published miRNA and lncRNA predictions. The Suppl Methods section for lncRNAs does not describe in detail how the "CPC/CNCI/Pfam" "methods" were combined to define lncRNAs here.

      Author response and action taken: Thanks for your comments. In the Supplementary Methods section titled "Selection of Predictive Biomarkers", we have provided a more detailed illustration regarding the screening process for candidate RNA biomarkers. The revised section is as follows: To ensure the predictive performance of the sEV-RNA signature, candidate sEV-RNAs were ultimately selected based on their fold change in colorectal cancer/precancerous advanced adenoma, absolute abundance, and module attribution. In detail, we initially selected the top 10 RNAs from each category (mRNA, miRNA, and lncRNA) with a fold change greater than 4. In cases where fewer than 10 RNAs met this criterion, all RNAs with a fold change greater than 4 were included. Subsequently, we filtered out RNAs with low abundance, and we selected the top-ranked RNAs from each module based on the fold change ranking for inclusion in the final model.
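
      The screening steps described in this response can be sketched roughly as follows. This is an illustration of the stated logic only, not the authors' pipeline: the record fields (`name`, `category`, `fold_change`, `abundance`, `module`), the abundance cutoff, and the choice of a single top RNA per module are assumptions made for the example.

```python
from collections import defaultdict

def select_candidates(rnas, fc_min=4.0, top_n=10, min_abundance=1.0):
    """Pick candidate RNAs from a list of dicts (hypothetical field names)."""
    # Step 1: per category, keep at most top_n RNAs with fold change > fc_min.
    by_category = defaultdict(list)
    for rna in rnas:
        if rna["fold_change"] > fc_min:
            by_category[rna["category"]].append(rna)
    shortlisted = []
    for items in by_category.values():
        items.sort(key=lambda r: r["fold_change"], reverse=True)
        shortlisted.extend(items[:top_n])

    # Step 2: drop low-abundance RNAs (the cutoff value is an assumption).
    shortlisted = [r for r in shortlisted if r["abundance"] >= min_abundance]

    # Step 3: within each module, keep the RNA with the largest fold change.
    best = {}
    for rna in shortlisted:
        current = best.get(rna["module"])
        if current is None or rna["fold_change"] > current["fold_change"]:
            best[rna["module"]] = rna
    return sorted(r["name"] for r in best.values())

# Hypothetical toy input.
toy = [
    {"name": "a", "category": "miRNA", "fold_change": 8, "abundance": 5, "module": "M1"},
    {"name": "b", "category": "miRNA", "fold_change": 6, "abundance": 5, "module": "M1"},
    {"name": "c", "category": "lncRNA", "fold_change": 5, "abundance": 0.1, "module": "M2"},
    {"name": "d", "category": "mRNA", "fold_change": 3, "abundance": 9, "module": "M2"},
    {"name": "e", "category": "mRNA", "fold_change": 4.5, "abundance": 2, "module": "M3"},
]
selected = select_candidates(toy)
```

      In the toy input, `c` is removed for low abundance and `d` for insufficient fold change, so module M2 contributes no candidate.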

      Compared to most previous studies on EV biomarkers, the overall discriminative performance of the biomarker model we constructed is considerable, holding clinical value for practical application. In addition, a further merit of this study lies in uncovering the heterogeneity at the whole transcriptome level among samples of different categories, providing a more comprehensive insight into the dynamic changes of biological states. For instance, we inferred the cell subtypes of EV origins through ssGSEA and correlated them with the tumor microenvironment status. The regulatory relationships among different RNA categories were delineated, and their impacts on biological signaling pathways were analyzed, a feat challenging to accomplish solely through sequencing of a single RNA category.

      In the Supplementary Methods section titled "Identification of mRNAs and lncRNAs", we have provided a more detailed explanation regarding how the "CPC/CNCI/Pfam" methods were combined to define lncRNAs. The revised section is as follows: Three computational approaches including CPC (Coding Potential Calculator)/CNCI (Coding-Non-Coding Index)/Pfam were combined to sort non-protein coding RNA candidates from putative protein-coding RNAs in the unknown transcripts. CPC is a sequence alignment-based tool used to assess protein-coding capacity. By aligning transcripts with known protein databases, CPC evaluates the biological sequence characteristics of each coding frame of the transcript to determine its coding potential and identify non-coding RNAs (1). CNCI analysis is a method used to distinguish between coding and non-coding transcripts based on adjacent nucleotide triplets. This tool does not rely on known annotation files and can effectively predict incomplete transcripts and antisense transcript pairs (2). Pfam divides protein domains into different protein families and establishes statistical models for the amino acid sequences of each family through protein sequence alignment (3). Transcripts that can be aligned are considered to have a certain protein domain, indicating coding potential, while transcripts without alignment results are potential lncRNAs. Putative protein-coding RNAs were filtered out using a minimum length and exon number threshold. Transcripts above 200 nt with more than two exons were selected as lncRNA candidates and further screened by CPC/CNCI/Pfam. We distinguished lncRNAs from protein-coding genes by intersecting the results of the three determination methods mentioned above.
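
      The filtering and intersection logic described here can be sketched as below. This is a minimal illustration under stated assumptions: the per-tool results are represented as plain sets of transcript IDs judged non-coding, the thresholds follow the wording of the response (longer than 200 nt, more than two exons), and none of the names correspond to the actual CPC, CNCI, or Pfam interfaces.

```python
def lncrna_candidates(transcripts, cpc_noncoding, cnci_noncoding,
                      pfam_no_domain, min_length=200, min_exons=2):
    """Return transcript IDs that pass the structural filter and are
    called non-coding by all three methods.

    transcripts: dict mapping ID -> (length_nt, exon_count)  [hypothetical]
    cpc_noncoding, cnci_noncoding, pfam_no_domain: sets of IDs [hypothetical]
    """
    # Structural filter: length and exon count above the stated thresholds.
    structural = {
        tid for tid, (length, exons) in transcripts.items()
        if length > min_length and exons > min_exons
    }
    # Intersection: keep only transcripts on which all three methods agree.
    return structural & cpc_noncoding & cnci_noncoding & pfam_no_domain

# Hypothetical toy data.
toy_transcripts = {"t1": (500, 3), "t2": (150, 4), "t3": (900, 3), "t4": (1200, 2)}
cpc = {"t1", "t2", "t3", "t4"}
cnci = {"t1", "t3", "t4"}
pfam = {"t1", "t2", "t4"}
candidates = lncrna_candidates(toy_transcripts, cpc, cnci, pfam)
```

      Taking the intersection is the conservative choice: a transcript flagged as coding by any one method is excluded, which trades sensitivity for fewer false lncRNA calls.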

      (2) The role and function of many lncRNAs are unknown, and some lncRNA species may simply be the product of pervasive transcription. Although this is an exploratory and descriptive study of potential biomarkers, it would benefit from some discussion of potential mechanisms because the proposed prediction models include lncRNAs. Do the authors have a hypothesis as to why lncRNAs were informative and predictive in this study? Are these lncRNAs well-studied and/or known to be functional? Or are they markers for pervasive transcription, for example?

      Author response and action taken: Thanks for your comments. Whole transcriptome sequencing results facilitate the discussion of regulatory mechanisms between different biomarkers, supplying evidence for future investigations. Among the three lncRNAs involved in this study, lnc-MKRN2-42:1 is involved in the occurrence and development of Parkinson's disease (4). The other two lncRNAs, however, lack relevant reports. Therefore, we cannot confirm that these lncRNAs have specific biological functions. In the Supplementary Methods section titled "Identification of mRNAs and lncRNAs", we acknowledge the limited understanding of sEV-lncRNAs in current research. In contrast, many miRNAs in the model have been proven to participate in the occurrence and development of colorectal cancer, such as miR-3615 (5), miR-425-5p (6), and miR-106b-3p (7). These data provide biological support for the performance of the model, which is particularly valuable for model prediction.

      (3) In the Results section "Cell-specific features of the sEV-RNA profile indicated the different proportion of cells of sEV origin among different groups", the sEV-RNA profiles were correlated with existing transcriptome profiles from specific cell types (ssGSEA) and used to estimate "tumour microenvironment-associated scores". This transcriptomic correlation is a valuable observation, but there is no further evidence provided that the sEV-RNAs profiles truly reflect differential cell types of sEV origin between the sample subgroups.

      Could the authors clarify the strength of evidence for the cells-of-origin estimates, which are based only on sEV-RNA transcriptome profiles? Would sEV-RNA-derived cells-of-origin be expected to correlate with histopath-derived scores (tumour microenvironment; immune infiltrate) for example? Or is this section intended as an exploratory description of sEV-RNAs, perhaps a check on the plausibility of the sEV-RNA profiles, rather than an accurate estimation of cells-of-origin in each subgroup?

      Author response: Thanks for your comments. This section explores the proportional distribution of EVs from different cellular subgroups solely based on transcriptome profiles and algorithms, rather than providing precise estimates of cellular origins within each subgroup.

      (4) Software and R package version numbers should be provided.

      Author response and action taken: Thanks for your comments. We have added version information for relevant R packages at the first mention in the original text (e.g., WGCNA (version 1.61), Rtsne (version 0.15), GSVA (version 1.42.0), ESTIMATE (version 1.0.13), DOSE (version 3.8.0)).

      References

      (1) Kong L, et al. CPC: assess the protein-coding potential of transcripts using sequence features and support vector machine. Nucleic Acids Res. 35, W345-349 (2007).

      (2) Sun L, et al. Utilizing sequence intrinsic composition to classify protein-coding and long non-coding transcripts. Nucleic Acids Res. 41, e166 (2013).

      (3) Finn RD, et al. Pfam: the protein families database. Nucleic Acids Res. 42, D222-230 (2014).

      (4) Wang Q, et al. Integrated analysis of exosomal lncRNA and mRNA expression profiles reveals the involvement of lnc-MKRN2-42:1 in the pathogenesis of Parkinson's disease. CNS Neurosci Ther. 26, 527-537 (2020).

      (5) Zheng G, et al. Identification and validation of reference genes for qPCR detection of serum microRNAs in colorectal adenocarcinoma patients. PLoS One. 8, e83025 (2013).

      (6) Liu D, Zhang H, Cui M, Chen C, Feng Y. Hsa-miR-425-5p promotes tumor growth and metastasis by activating the CTNND1-mediated β-catenin pathway and EMT in colorectal cancer. Cell Cycle. 19, 1917-1927 (2020).

      (7) Liu H, et al. Colorectal cancer-derived exosomal miR-106b-3p promotes metastasis by down-regulating DLC-1 expression. Clin Sci (Lond). 134, 419-434 (2020).

    1. eLife assessment

      In this study, Ger and colleagues present a valuable new technique that uses recurrent neural networks to distinguish between model misspecification and behavioral stochasticity when interpreting cognitive-behavioral model fits. Simulations provide solid evidence for the validity of this technique and broadly support the claims of the paper, although more work is needed to understand its applicability to real behavioral experiments. This technique addresses a long-standing problem that is likely to be of interest to researchers pushing the limits of cognitive computational modeling.

    2. Reviewer #1 (Public Review):

      Summary:

      Ger and colleagues address an issue that often impedes computational modeling: the inherent ambiguity between stochasticity in behavior and structural mismatch between the assumed and true model. They propose a solution to use RNNs to estimate the ceiling on explainable variation within a behavioral dataset. With this information in hand, it is possible to determine the extent to which "worse fits" result from behavioral stochasticity versus failures of the cognitive model to capture nuances in behavior (model misspecification). The authors demonstrate the efficacy of the approach in a synthetic toy problem and then use the method to show that poorer model fits to 2-step data in participants with low IQ are actually due to an increase in inherent stochasticity, rather than systemic mismatch between model and behavior.

      Strengths:

      Overall I found the ideas conveyed in the paper interesting and the paper to be extremely clear. The method itself is clever and intuitive and I believe it could potentially be useful in certain circumstances, particularly ones where the sources of structure in behavioral data are unknown. Support for the method from synthetic data is clear and compelling. The flexibility of the method means that it could potentially be applied to different types of behavioral data - without any hypotheses about the exact behavioral features that might be present in a given task.

      Weaknesses:

      That said, I have some concerns with the manuscript in its current form, largely related to the applicability of the proposed methods for problems of importance in computational cognitive neuroscience. This concern stems from the fact that the toy problem explored in the manuscript is somewhat simple, and the theoretical problem addressed in it could have been identified through other means (for example through use of posterior predictive checking for model validation), and the actual behavioral data analyzed were interpreted as a null result (failure to reject the behavioral stochasticity hypothesis), rather than actual identification of model misspecification. Thus, in my opinion, the jury is still out on whether this method could be used to identify a case of model misspecification that actually affects how individual differences are interpreted in a real cognitive task. Furthermore, the method requires considerable data for pretraining, well beyond what would be collected in a typical behavioral study, raising further questions about its applicability in problems of practical relevance. I expand on these primary concerns and raise several smaller points below.

      A primary concern I have about this work is that it is unclear whether the method described could provide any advantage for real cognitive modeling problems beyond what is typically done to minimize the chance of model misspecification (in particular, posterior predictive checking). The toy problem examined in the manuscript is pretty extreme (two of the three synthetic agents are very far from what a human would do on the task, and the models deviate from one another to a degree that detecting the difference should not be difficult for any method). The issue posed in the toy data would easily be identified by following good modeling practices, which include using posterior predictive checking over summary measures to identify model insufficiencies, which in turn would call for the need for a broader set of models (See Wilson & Collins 2019). In this manuscript descriptive analyses are not performed (which, to me, feels a bit problematic for a paper that aims to improve cognitive modeling practices), however I think it is almost certain that the differences between the toy models would be evident by eye in standard summary measures of two-step task data. The primary question posed in the analysis of the empirical data is as to whether fit differences related to IQ might reflect systematic differences in the model across individuals, but in this case application of the newly developed method provides little evidence for structural (model) differences. Thus, it remains unclear whether the method could identify model misspecification in real world data, and even more so whether it could reveal misspecification in situations where standard posterior predictive checking techniques would fall short. The rebuttal highlighted the better fit of the RNN on the empirical data as providing positive evidence for the ability of the method to identify model insufficiency, but I see this result as having limited epistemological value, given that there is no follow up to explore what the insufficiency actually was, or why accounting for it might be important. The authors list many of the points above as limitations in their discussion section, but in my opinion, they are relatively major ones.

      The manuscript now mentions in the discussion that the newly developed methods should be seen as being just one tool in the larger toolkit of the computational cognitive modeler. However, one practical consideration here is that, since other existing tools such as simulation and descriptive analyses can be combined to 1) identify model insufficiency and 2) motivate specific model changes that can fix the problem, it is not exactly clear what the value added from the proposed method is.

      One final practical limitation of the method is that it requires extensive pretraining (on >500 participants) in the existing study, limiting its applicability for most use cases.

    3. Reviewer #2 (Public Review):

      SUMMARY:

      In this manuscript, Ger and colleagues propose two complementary analytical methods aimed at quantifying the model misspecification and irreducible stochasticity in human choice behavior. The first method involves fitting recurrent neural networks (RNNs) and theoretical models to human choices and interpreting the better performance of RNNs as providing evidence of the misspecifications of theoretical models. The second method involves estimating the number of training iterations for which the fitted RNN achieves the best prediction of human choice behavior in a separate, validation data set, following an approach known as "early stopping". This number is then interpreted as a proxy for the amount of explainable variability in behavior, such that fewer iterations (earlier stopping) correspond to a higher amount of irreducible stochasticity in the data. The authors validate the two methods using simulations of choice behavior in a two-stage task, where the simulated behavior is generated by different known models. Finally, the authors use their approach in a real data set of human choices in the two-stage task, concluding that low-IQ subjects exhibit greater levels of stochasticity than high-IQ subjects.
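
      As a rough illustration of the early-stopping idea summarized above, the sketch below picks the epoch at which a validation-loss curve is lowest. The function name `optimal_epoch` and the loss values are hypothetical and chosen only for illustration; this is not the authors' implementation, which fits an RNN and evaluates it on a separate validation data set.

```python
from typing import Sequence


def optimal_epoch(val_losses: Sequence[float]) -> int:
    """Return the 1-based epoch with the lowest validation loss.

    On ties, the earliest epoch is returned.
    """
    if not val_losses:
        raise ValueError("val_losses must contain at least one epoch")
    best_index = min(range(len(val_losses)), key=lambda i: val_losses[i])
    return best_index + 1


# Hypothetical validation-loss curves (made-up numbers, one value per epoch).
noisy_agent = [0.70, 0.68, 0.69, 0.71, 0.74]  # validation loss rises almost at once
structured_agent = [0.70, 0.60, 0.52, 0.47, 0.45, 0.44, 0.46]  # keeps improving

print(optimal_epoch(noisy_agent))       # 2
print(optimal_epoch(structured_agent))  # 6
```

      Under the interpretation described in the summary, the first curve (earlier optimum) would be read as the noisier agent, subject to the caveats raised in the weaknesses below.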

      STRENGTHS:

      The manuscript explores an extremely important topic to scientists interested in characterizing human decision-making. While it is generally acknowledged that any computational model of behavior will be limited in its ability to describe a particular data set, one should hope to understand whether these limitations arise due to model misspecification or due to irreducible stochasticity in the data. Evidence for the former suggests that better models ought to exist; evidence for the latter suggests they might not.

      To address this important topic, the authors elaborate carefully on the rationale of their proposed approach. They describe a variety of simulations -- for which the ground truth models and the amount of behavioral stochasticity are known -- to validate their approaches. This enables the reader to understand the benefits (and limitations) of these approaches when applied to the two-stage task, a task paradigm commonly used in the field. Through a set of convincing analyses, the authors demonstrate that their approach is capable of identifying situations where an alternative, untested computational model can outperform the set of tested models, before applying these techniques to a realistic data set.

      WEAKNESSES:

      The most significant weakness is that the paper rests on the implicit assumption that the fitted RNNs explain as much variance as possible, an assumption that is likely incorrect and which can result in incorrect conclusions. While in low-dimensional tasks RNNs can predict behavior as well as the data-generating models, this is not always the case, and the paper itself illustrates (in Figure 3) several cases where the fitted RNNs fall short of the ground-truth model. In such cases, we cannot conclude that a subject exhibiting a relatively poor RNN fit necessarily has a relatively high degree of behavioral stochasticity. Instead, it is at least conceivable that this subject's behavior is generated precisely (i.e., with low noise) by an alternative model that is poorly fit by an RNN -- e.g., a model with long-term sequential dependencies, which RNNs are known to have difficulties in capturing.

      These situations could lead to incorrect conclusions for both of the proposed methods. First, the model mis-specification analysis might show equal predictive performance for a particular theoretical model and for the RNN. While a scientist might be inclined to conclude that the theoretical model explains the maximum amount of explainable variance and therefore that no better model should exist, the scenario in the previous paragraph suggests that a superior model might nonetheless exist. Second, in the early-stopping analysis, a particular subject may achieve optimal validation performance with fewer epochs than another, leading the scientist to conclude that this subject exhibits higher behavioral noise. However, as before, this could again result from the fact that this subject's behavior is produced with little noise by a different model. The possibility of such scenarios does not mean that such scenarios are common, and the conclusions drawn in the paper are likely appropriate for the particular examples analyzed. However, it is much less obvious that the RNNs will provide optimal fits in other types of tasks, particularly those with more complex rules and long-term sequential dependencies, and in such scenarios, an ill-advised scientist might end up drawing incorrect conclusions from the application of the proposed approaches. The authors acknowledge this limitation in their discussion, but it remains a significant caveat that readers should be aware of when using the technique proposed.

      In addition to this general limitation, the relationship between the number of optimal epochs and behavioral stochasticity may not hold for every task and every subject. For example, Figure 4 highlights the relationship between the optimal epochs and agent noise. Yet, it is nonetheless possible that the optimal epoch is influenced by model parameters other than inverse temperature (e.g., hyperparameters such as the learning rate). This could again lead to invalid conclusions, such as concluding that low-IQ is associated with optimal epoch when an alternative account might be that low-IQ is associated with low learning rate, which in turn is associated with optimal epoch. Additional factors such as the deep double-descent (Nakkiran et al., ICLR 2020) can also influence the optimal epoch value as computed by the authors. These concerns are partially addressed by the authors in the revised manuscript, where they show that the number of optimal epochs is primarily sensitive to the amount of true underlying noise, assuming the number of trials and network size are constant. The authors also acknowledge, in the discussion section, that many factors can affect the number of optimal epochs, and that inferring behavioral stochasticity from this number should be done with caution.

      APPRAISAL AND DISCUSSION:

      Overall, the authors propose a novel method that aims to solve an important problem, but since the evidence provided refers to a single task and to a single dataset, it is not clear that the method would be appropriate in general settings. In the future, it would be beneficial to test the proposed approach in a broader setting, including simulations of different tasks, different model classes, and different model parameters. Nonetheless, even without such additional work, the proposed methods are likely to be used by cognitive scientists and neuroscientists interested in assessing the quality and limits of their behavioral models.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      In this study, Ger and colleagues present a valuable new technique that uses recurrent neural networks to distinguish between model misspecification and behavioral stochasticity when interpreting cognitive-behavioral model fits. Evidence for the usefulness of this technique, which is currently based primarily on a relatively simple toy problem, is considered incomplete but could be improved via comparisons to existing approaches and/or applications to other problems. This technique addresses a long-standing problem that is likely to be of interest to researchers pushing the limits of cognitive computational modeling.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Ger and colleagues address an issue that often impedes computational modeling: the inherent ambiguity between stochasticity in behavior and structural mismatch between the assumed and true model. They propose a solution to use RNNs to estimate the ceiling on explainable variation within a behavioral dataset. With this information in hand, it is possible to determine the extent to which "worse fits" result from behavioral stochasticity versus failures of the cognitive model to capture nuances in behavior (model misspecification). The authors demonstrate the efficacy of the approach in a synthetic toy problem and then use the method to show that poorer model fits to 2-step data in participants with low IQ are actually due to an increase in inherent stochasticity, rather than systemic mismatch between model and behavior.

      Strengths:

      Overall I found the ideas conveyed in the paper interesting and the paper to be extremely clear and well-written. The method itself is clever and intuitive and I believe it could be useful in certain circumstances, particularly ones where the sources of structure in behavioral data are unknown. In general, the support for the method is clear and compelling. The flexibility of the method also means that it can be applied to different types of behavioral data - without any hypotheses about the exact behavioral features that might be present in a given task.

      Thank you for taking the time to review our work and for the positive remarks regarding the manuscript. Below is a point-by-point response to the concerns raised.

      Weaknesses:

      That said, I have some concerns with the manuscript in its current form, largely related to the applicability of the proposed methods for problems of importance in computational cognitive neuroscience. This concern stems from the fact that the toy problem explored in the manuscript is somewhat simple, and the theoretical problem addressed in it could have been identified through other means (for example through the use of posterior predictive checking for model validation), and the actual behavioral data analyzed were interpreted as a null result (failure to reject the behavioral stochasticity hypothesis), rather than actual identification of model misspecification. I expand on these primary concerns and raise several smaller points below.

      A primary question I have about this work is whether the method described would actually provide any advantage for real cognitive modeling problems beyond what is typically done to minimize the chance of model misspecification (in particular, posterior predictive checking). The toy problem examined in the manuscript is pretty extreme (two of the three synthetic agents are very far from what a human would do on the task, and the models deviate from one another to a degree that detecting the difference should not be difficult for any method). The issue posed in the toy data would easily be identified by following good modeling practices, which include using posterior predictive checking over summary measures to identify model insufficiencies, which in turn would call for the need for a broader set of models (See Wilson & Collins 2019). Thus, I am left wondering whether this method could actually identify model misspecification in real world data, particularly in situations where standard posterior predictive checking would fall short. The conclusions from the main empirical data set rest largely on a null result, and the utility of a method for detecting model misspecification seems like it should depend on its ability to detect its presence, not just its absence, in real data.

      Beyond the question of its advantage above and beyond data- and hypothesis-informed methods for identifying model misspecification, I am also concerned that if the method does identify a model insufficiency, then you still would need to use these other methods in order to understand what aspect of behavior deviated from model predictions in order to design a better model. In general, it seems that the authors should be clear that this is a tool that might be helpful in some situations, but that it will need to be used in combination with other well-described modeling techniques (posterior predictive checking for model validation and guiding cognitive model extensions to capture unexplained features of the data). A general stylistic concern I have with this manuscript is that it presents and characterizes a new tool to help with cognitive computational modeling, but it does not really adhere to best modeling practices (see Collins & Wilson, eLife), which involve looking at data to identify core behavioral features and simulating data from best-fitting models to confirm that these features are reproduced. One could take away from this paper that you would be better off fitting a neural network to your behavioral data rather than carefully comparing the predictions of your cognitive model to your actual data, but I think that would be a highly misleading takeaway since summary measures of behavior would just as easily have diagnosed the model misspecification in the toy problem, and have the added advantage that they provide information about which cognitive processes are missing in such cases.

      As a more minor point, it is also worth noting that this method could not distinguish behavioral stochasticity from the deterministic structure that is not repeated across training/test sets (for example, because a specific sequence is present in the training set but not the test set). This should be included in the discussion of method limitations. It was also not entirely clear to me whether the method could be applied to real behavioral data without extensive pretraining (on >500 participants) which would certainly limit its applicability for standard cases.

      The authors focus on model misspecification, but in reality, all of our models are misspecified to some degree since the true process generating behavior almost certainly deviates from our simple models (i.e., as George Box is frequently quoted, "all models are wrong, but some of them are useful"). It would be useful to have some more nuanced discussion of situations in which misspecification is and is not problematic.

      We thank the reviewer for these comments and have made changes to the manuscript to better describe these limitations. We agree with the reviewer and accept that fitting a neural network is by no means a substitute for careful and dedicated cognitive modeling. Cognitive modeling is aimed at describing the latent processes that are assumed to generate the observed data, and we agree that careful description of the data-generating mechanisms, including posterior predictive checks, is always required. However, even a well-defined cognitive model might still have little predictive accuracy, and it is difficult to know how many resources should be put into trying to test and develop new cognitive models to describe the data. We argue that RNNs can lead to some insights regarding this question, and highlight the following limitations that were mentioned by the reviewer:

      First, we accept that it is important to provide positive evidence for the existence of model misspecification. In that sense, a result where the network shows dramatic improvement over the best-fitting theoretical model is easier to interpret compared to when the network shows no (or very little) improvement in predictive accuracy. This is because there is always an option that the network, for some reason, was not flexible enough to learn the data-generating model, or because the data-generating mechanism has changed from training to test. We have now added this more clearly in the limitation section. However, when it comes to our empirical results, we would like to emphasize that the network did in fact improve the predictive accuracy for all participants. The result shows support in favor of a "null" hypothesis in the sense that we seem to find evidence that the change in predictive accuracy between the theoretical model and RNN is not systematic across levels of IQ. This allows us to quantify evidence (using Bayesian statistics) for no systematic model misspecification as a function of IQ. While it is always possible that a different model might systematically improve the predictive accuracy of low vs high IQ individuals' data, this seems less likely given the flexibility of the current results.

      Second, we agree that our current study only applies to the RL models that we tested. In the context of RL, we have used a well-established and frequently applied paradigm and models. We emphasize in the discussion that simulations are required to further validate other uses for this method with other paradigms.  

      Third, we also accept that posterior predictive checks should always be used whenever possible, which is now emphasized in the discussion. However, we note that these are not always easy to interpret in a meaningful way and may not always provide details regarding model insufficiencies as described by the reviewer. It is very hard to determine what should be considered a good prediction, and since the generative model is always unknown, sometimes very low predictive accuracy can still be at the peak of possible model performance. This is because the data might be generated from a very noisy process, capping the possible predictive accuracy at a very low point. However, when strictly using theoretical modeling, it is very hard to determine what predictive accuracy to expect. Also, predictive checks are not always easy to interpret visually or otherwise. For example, in two-armed bandit tasks where there are only two actions, the prediction of choices is easier to understand in our opinion when described using a confusion matrix that summarizes the model's ability to predict the empirical behavior (which becomes similar to the predictive estimation we describe in eq 22).
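
      For readers unfamiliar with the confusion-matrix summary mentioned above, the following is a minimal sketch for a two-action task. The function names and the example choice sequences are hypothetical, and the sketch is not claimed to reproduce the predictive estimate in eq 22 of the manuscript.

```python
from typing import List, Sequence


def confusion_matrix_2x2(observed: Sequence[int], predicted: Sequence[int]) -> List[List[int]]:
    """Count trials by observed action (row) and predicted action (column).

    Actions are coded 0 or 1.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    matrix = [[0, 0], [0, 0]]
    for obs, pred in zip(observed, predicted):
        matrix[obs][pred] += 1
    return matrix


def accuracy(matrix: List[List[int]]) -> float:
    """Proportion of trials on which the predicted action matched the observed one."""
    total = sum(sum(row) for row in matrix)
    return (matrix[0][0] + matrix[1][1]) / total


# Hypothetical data: a participant's choices and a model's most likely action per trial.
observed = [0, 1, 1, 0, 1, 0, 1, 1]
predicted = [0, 1, 0, 0, 1, 1, 1, 1]

matrix = confusion_matrix_2x2(observed, predicted)
print(matrix)            # [[2, 1], [1, 4]]
print(accuracy(matrix))  # 0.75
```

      The off-diagonal cells show which action the model tends to mispredict, which a single accuracy number would hide.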

      Finally, this approach indeed requires a large dataset, with at least three sessions for each participant (training, validation, and test). Further studies might shed more light on the use of optimal epochs as a proxy for noise/complexity that can be used with less data (i.e., training and validation, without a test set).

      Please see our changes at the end of this document.  

      Reviewer #2 (Public Review):

      SUMMARY:

      In this manuscript, Ger and colleagues propose two complementary analytical methods aimed at quantifying the model misspecification and irreducible stochasticity in human choice behavior. The first method involves fitting recurrent neural networks (RNNs) and theoretical models to human choices and interpreting the better performance of RNNs as providing evidence of the misspecifications of theoretical models. The second method involves estimating the number of training iterations for which the fitted RNN achieves the best prediction of human choice behavior in a separate, validation data set, following an approach known as "early stopping". This number is then interpreted as a proxy for the amount of explainable variability in behavior, such that fewer iterations (earlier stopping) correspond to a higher amount of irreducible stochasticity in the data. The authors validate the two methods using simulations of choice behavior in a two-stage task, where the simulated behavior is generated by different known models. Finally, the authors use their approach in a real data set of human choices in the two-stage task, concluding that low-IQ subjects exhibit greater levels of stochasticity than high-IQ subjects.

      STRENGTHS:

      The manuscript explores an extremely important topic to scientists interested in characterizing human decision-making. While it is generally acknowledged that any computational model of behavior will be limited in its ability to describe a particular data set, one should hope to understand whether these limitations arise due to model misspecification or due to irreducible stochasticity in the data. Evidence for the former suggests that better models ought to exist; evidence for the latter suggests they might not.

      To address this important topic, the authors elaborate carefully on the rationale of their proposed approach. They describe a variety of simulations - for which the ground truth models and the amount of behavioral stochasticity are known - to validate their approaches. This enables the reader to understand the benefits (and limitations) of these approaches when applied to the two-stage task, a task paradigm commonly used in the field. Through a set of convincing analyses, the authors demonstrate that their approach is capable of identifying situations where an alternative, untested computational model can outperform the set of tested models, before applying these techniques to a realistic data set.

      Thank you for reviewing our work and for the positive tone. Please find below a point-by-point response to the concerns you have raised.

      WEAKNESSES:

      The most significant weakness is that the paper rests on the implicit assumption that the fitted RNNs explain as much variance as possible, an assumption that is likely incorrect and which can result in incorrect conclusions. While in low-dimensional tasks RNNs can predict behavior as well as the data-generating models, this is not *always* the case, and the paper itself illustrates (in Figure 3) several cases where the fitted RNNs fall short of the ground-truth model. In such cases, we cannot conclude that a subject exhibiting a relatively poor RNN fit necessarily has a relatively high degree of behavioral stochasticity. Instead, it is at least conceivable that this subject's behavior is generated precisely (i.e., with low noise) by an alternative model that is poorly fit by an RNN - e.g., a model with long-term sequential dependencies, which RNNs are known to have difficulties in capturing.

      These situations could lead to incorrect conclusions for both of the proposed methods. First, the model misspecification analysis might show equal predictive performance for a particular theoretical model and for the RNN. While a scientist might be inclined to conclude that the theoretical model explains the maximum amount of explainable variance and therefore that no better model should exist, the scenario in the previous paragraph suggests that a superior model might nonetheless exist. Second, in the early-stopping analysis, a particular subject may achieve optimal validation performance with fewer epochs than another, leading the scientist to conclude that this subject exhibits higher behavioral noise. However, as before, this could again result from the fact that this subject's behavior is produced with little noise by a different model. Admittedly, the existence of such scenarios *in principle* does not mean that such scenarios are common, and the conclusions drawn in the paper are likely appropriate for the particular examples analyzed. However, it is much less obvious that the RNNs will provide optimal fits in other types of tasks, particularly those with more complex rules and long-term sequential dependencies, and in such scenarios, an ill-advised scientist might end up drawing incorrect conclusions from the application of the proposed approaches.

      Yes, we understand and agree. A negative result where the RNN is unable to outperform the best-fitting theoretical model would always leave room for doubt, since a different approach might yield better results. In contrast, a dramatic improvement in predictive accuracy for the RNN is easier to interpret since it implies that the theoretical model can be improved. We have made an effort to make this issue clear and more articulated in the discussion. We specifically and directly mention in the discussion that “Equating RNN performance with the generative model should be avoided”.

      However, we would like to note that our empirical results provided a somewhat more nuanced scenario where we found that the RNN generally improved the predictive accuracy of most participants. Importantly, this improvement was found to be equal across participants with no systematic benefits for low vs high IQ participants. We understand that there is always the possibility that another model would show a systematic benefit for low vs. high IQ participants, however, we suggest that this is less likely given the current evidence. We have made an effort to clearly note these issues in the discussion.  

      In addition to this general limitation, the paper also makes a few additional claims that are not fully supported by the provided evidence. For example, Figure 4 highlights the relationship between the optimal epochs and agent noise. Yet, it is nonetheless possible that the optimal epoch is influenced by model parameters other than inverse temperature (e.g., learning rate). This could again lead to invalid conclusions, such as concluding that low-IQ is associated with optimal epoch when an alternative account might be that low-IQ is associated with low learning rate, which in turn is associated with optimal epoch. Yet additional factors such as the deep double-descent (Nakkiran et al., ICLR 2020) can also influence the optimal epoch value as computed by the authors.

      An additional issue is that Figure 4 reports an association between optimal epoch and noise, but noise is normalized by the true minimal/maximal inverse-temperature of hybrid agents (Eq. 23). It is thus possible that the relationship does not hold for more extreme values of inverse-temperature such as beta=0 (extremely noisy behavior) or beta=inf (deterministic behavior), two important special cases that should be incorporated in the current study. Finally, even taking the association in Figure 4 at face value, there are potential issues with inferring noise from the optimal epoch when their correlation is only r~=0.7. As shown in the figures, upon finding a very low optimal epoch for a particular subject, one might be compelled to infer high amounts of noise, even though several agents may exhibit a low optimal epoch despite having very little noise.
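
      To make the two special cases of inverse temperature concrete, here is a minimal sketch of a standard softmax choice rule. It assumes the common form in which choice probabilities are proportional to exp(beta * Q); the function name and the Q-values are hypothetical, and the agents in the manuscript may be parameterized differently.

```python
import math
from typing import List, Sequence


def softmax_policy(q_values: Sequence[float], beta: float) -> List[float]:
    """Choice probabilities proportional to exp(beta * Q) for beta >= 0.

    beta = 0 gives uniformly random choice; beta = inf gives deterministic
    choice of the highest-valued action (ties split evenly).
    """
    if beta < 0:
        raise ValueError("beta must be non-negative")
    if math.isinf(beta):
        best = max(q_values)
        winners = [q == best for q in q_values]
        return [w / sum(winners) for w in winners]
    scaled = [beta * q for q in q_values]
    shift = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - shift) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


q_values = [1.0, 0.0]  # hypothetical action values
for beta in (0.0, 2.0, math.inf):
    probs = softmax_policy(q_values, beta)
    # max(probs) is the best expected accuracy any predictor could reach on this trial
    print(beta, probs, max(probs))
```

      At beta = 0 even the true generative model cannot predict choices above chance (0.5 here), whereas at beta = inf the choice is fully predictable. This is the sense in which noise caps achievable predictive accuracy.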

      Thank you for these comments. Indeed, there is much we do not yet fully understand about the factors that influence optimal epochs. Currently, it is clear to us that the number of optimal epochs is influenced by a variety of factors, including network size, the data size, and other cognitive parameters, such as the learning rate. We hope that our work serves as a proof-of-concept, suggesting that, in certain scenarios, the number of epochs can be utilized as an empirical estimate. Moreover, we maintain that, at least within the context of the current paradigm, the number of optimal epochs is primarily sensitive to the amount of true underlying noise, assuming the number of trials and network size are constant. We are therefore hopeful that this proof-of-concept will encourage research that will further examine the factors that influence the optimal epochs in different behavioral paradigms.

      To address the reviewer's justified concerns, we have made several amendments to the manuscript. First, we added an additional version of Figure 4 in the Supplementary Information material, where the noise parameter values are not scaled. We hope this adjustment clarifies that the parameters were tested across a broad spectrum of values (e.g., 0 to 10 for the hybrid model), spanning the two extremes of complete randomness and high determinism. Second, we included a linear regression analysis showing the association of all model parameters (including noise) with the optimal number of epochs. As anticipated by the reviewer, the learning rate was also found to be associated with the number of optimal epochs. Nonetheless, the noise parameter appears to maintain the most substantial association with the number of optimal epochs. We have also added a specific mention of these associations in the discussion, to inform readers that the association between the number of optimal epochs and model parameters should be examined using simulation for other paradigms/models. Lastly, we acknowledge in the discussion that the findings regarding the association between the number of optimal epochs and noise warrant further investigation, considering other factors that might influence the determination of the optimal epoch point and the fact that the correlation with noise is strong, but not perfect (in the range of 0.7).
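      The kind of regression described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' analysis: the data are synthetic, the two predictors (inverse temperature and learning rate) and their effect sizes are made up, and the helper names `standardize`, `solve`, and `standardized_betas` are hypothetical. Z-scoring every variable makes the fitted coefficients comparable across parameters.

```python
import random


def standardize(values):
    """Z-score a list of numbers (population standard deviation)."""
    mean = sum(values) / len(values)
    sd = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / sd for v in values]


def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def standardized_betas(predictors, outcome):
    """Ordinary least squares on z-scored variables.

    `predictors` is a list of columns; the result has one beta per column.
    No intercept is needed because every z-scored variable has mean zero.
    """
    cols = [standardize(c) for c in predictors]
    y = standardize(outcome)
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)


# Synthetic agents (made-up values, for illustration only).
rng = random.Random(0)
inverse_temperature = [rng.uniform(0, 10) for _ in range(500)]
learning_rate = [rng.uniform(0, 1) for _ in range(500)]
optimal_epochs = [
    5 * beta + 10 * alpha + rng.gauss(0, 5)
    for beta, alpha in zip(inverse_temperature, learning_rate)
]

betas = standardized_betas([inverse_temperature, learning_rate], optimal_epochs)
print(dict(zip(["inverse_temperature", "learning_rate"], betas)))
```

      In this toy setup both parameters are associated with the optimal epoch, and comparing the standardized coefficients indicates which association is stronger. For a real paradigm the same comparison would need to be run on simulated agents from the model of interest.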

      The discussion now includes the following:

      “Several limitations should be considered in our proposed approach. First, fitting a data-driven neural network is evidently not enough to produce a comprehensive theoretical description of the data generation mechanisms. Currently, best practices for cognitive modeling \citep{wilson2019ten} require identifying under what conditions the model struggles to predict the data (e.g., using posterior predictive checks), and describing a different theoretical model that could account for these disadvantages in prediction. However, identifying conditions where the model shortcomings in predictive accuracy are due to model misspecifications rather than noisier behavior is a challenging task. We propose leveraging data-driven RNNs as a supplementary tool, particularly when they significantly outperform existing theoretical models, followed by refined theoretical modeling to provide insights into what processes were mis-specified in the initial modeling effort.

      Second, although we observed a robust association between the optimal number of epochs and true noise across varying network sizes and dataset sizes (see Fig.~\ref{figS2}), additional factors such as network architecture and other model parameters (e.g., learning rate, see Fig.~\ref{figS7}) might influence this estimation. Further research is required to allow us to better understand how and why different factors change the number of optimal epochs for a given dataset before it can be applied with confidence to empirical investigations.

      Third, the empirical dataset used in our study consisted of data collected from human participants at a single time point, serving as the training set for our RNN. The test set data, collected with a time interval of approximately $\sim6$ and $\sim18$ months, introduced the possibility of changes in participants' decision-making strategies over time. In our analysis, we neglected any possible changes in participants' decision-making strategies during that time, changes that may lead to poorer generalization performance of our approach. Thus, further studies are needed to eliminate such possible explanations.

      Fourth, our simulations, albeit illustrative, were confined to known models, necessitating in-silico validation before extrapolating the efficacy of our approach to other model classes and tasks. Our aim was to showcase the potential benefits of using a data-driven approach, particularly when faced with unknown models. However, whether RNNs will provide optimal fits for tasks with more complex rules and long-term sequential dependencies remains uncertain.

      Finally, while positive outcomes where RNNs surpass theoretical models can prompt insightful model refinement, caution is warranted in directly equating RNN performance with that of the generative model, as seen in our simulations (e.g., Figure 3). We highlight that our empirical findings depict a more complex scenario, wherein the RNN enhanced the predictive accuracy for all participants uniformly. Notably, we also provide evidence supporting a null effect among individuals, with no consistent difference in RNN improvement over the theoretical model based on IQ. Although it remains conceivable that a different data-driven model could systematically heighten the predictive accuracy for individuals with lower IQs in this task, such a possibility seems less probable in light of the current findings.”

      Reviewer #1 (Recommendations For The Authors):

      Minor comments:

      Is the t that gets fed as input to RNN just timestep?

      No: t is the last transition type (rare/common), not the timestep.

      Line 378: what does "optimal epochs" mean here?

      The number of training epochs that minimizes both underfitting and overfitting (defined around line 300).
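      As a rough illustration of this definition, the optimal epoch can be read off as the epoch at which the loss on held-out data is lowest: earlier epochs underfit and later ones overfit. The sketch below assumes a hypothetical list of per-epoch validation losses and a hypothetical helper name `optimal_epoch`; it is not the authors' training code.

```python
def optimal_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss.

    `val_losses` is a hypothetical list of held-out losses, one per epoch.
    Ties are resolved in favour of the earliest epoch.
    """
    if not val_losses:
        raise ValueError("need at least one epoch")
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1


# Made-up loss curve: it falls, then rises once the network starts to overfit.
losses = [0.69, 0.61, 0.58, 0.57, 0.59, 0.63]
print(optimal_epoch(losses))
```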

      Line 443: I don't think "identical" is the right word here - surely the authors just mean that there is not an obvious systematic difference in the distributions.

      Fixed

      I was expecting to see ~500 points in Figure 7a, but there seem to be only 50... why weren't all datasets with at least 2 sessions used for this analysis?

      We used the ~500 subjects (only 2 datasets) to pre-train the RNN, and then fine-tuned the pre-trained RNN on the other 54 subjects that have 3 datasets. The correlation of IQ and optimal epoch also holds for the 500 subjects as shown below.

      Author response image 1.

      Reviewer #2 (Recommendations For The Authors):

      Figure 3b: despite spending a long time trying to understand the meaning of each cell of the confusion matrix, I'm still unsure what they represent. Would be great if you could spell out the meaning of each cell individually, at least for the first matrix in the paper.

      We added a clarification to the Figure caption. 

      Figure 5: Why didn't the authors show this exact scenario using simulated data? It would be much easier to understand the predictions of this figure if they had been demonstrated in simulated data, such as individuals with different amounts of behavioral noise or different levels of model misspecifications.

      In Figure 5 the x-axis represents IQ. Replacing the x-axis with true noise would make what we present now as Figure 4. We have made an effort to emphasize the meaning of the axes in the caption. 

      Line 195 ("...in the action selection. Where"). Typo? No period is needed before "where".

      Fixed

      Line 213 ("K dominated-hand model"). I was intrigued by this model, but wasn't sure whether it has been used previously in the literature, or whether this is the first time it has been proposed.

      This is the first time that we know of that this model is used.  

      Line 345 ("This suggests that RNN is flexible enough to approximate a wide range of different behavioral models"): Worth explaining why (i.e., because the GRUs are able to capture dependencies across longer delays than a k-order Logistic Regression model).

      Line 356 ("We were interested to test"): Suggestion: "We were interested in testing".

      Fixed

      Line 389 ("However, as long as the number of observations and the size of the network is the same between two datasets, the number of optimal epochs can be used to estimate whether the dataset of one participant is noisier compared with a second dataset."): This is an important claim that should ideally be demonstrated directly. The paper only illustrates this effect through a correlation and a scatter plot, where higher noise tends to predict a lower optimal epoch. However, is the claim here that, in some circumstances, optimal epoch can be used to *deterministically* estimate noise? If so, this would be a strong result and should ideally be included in the paper.

      We have now omitted this sentence and toned down our claims, suggesting that while we did find a strong association between noise and optimal epochs, future research is required to establish to what extent this could be differentiated from other factors (i.e., network size, number of observations).

    1. eLife assessment

      This useful study aimed to examine the relationship of spatial frequency selectivity of single macaque inferotemporal (IT) neurons to category selectivity. There are some interesting findings in this report but some of these findings were difficult to evaluate because several critical details of the analysis are incomplete. The conclusion that single-unit spatial frequency selectivity can predict object coding needs further evidence to confirm.

    2. Reviewer #1 (Public Review):

      This study reports that spatial frequency representation can predict category coding in the inferior temporal cortex. The original conclusion was based on likely problematic stimulus timing (33 ms which was too brief). Now the authors claim that they also have a different set of data on the basis of longer stimulus duration (200 ms).

      One big issue in the original report was that the experiments used a stimulus duration that was too brief and could have weakened the effects of high spatial frequencies and confounded the conclusions. Now the authors provided a new set of data on the basis of a longer stimulus duration and made the claim that the conclusions are unchanged. These new data and the data in the original report were collected at the same time as the authors report.

      The authors may provide an explanation why they performed the same experiments using two stimulus durations and only reported one data set with the brief duration. They may also explain why they opted not to mention in the original report the existence of another data set with a different stimulus duration, which would otherwise have certainly strengthened their main conclusions.

      I suggest the authors upload both data sets and the analysis code, so that the claim can be easily examined by interested readers.

    3. Reviewer #2 (Public Review):

      Summary:

      This paper aimed to examine the spatial frequency selectivity of macaque inferotemporal (IT) neurons and its relation to category selectivity. The authors suggest in the present study that some IT neurons show a sensitivity for the spatial frequency of scrambled images. Their report suggests a shift in preferred spatial frequency during the response, from low to high spatial frequencies. This agrees with a coarse-to-fine processing strategy, which is in line with multiple studies in the early visual cortex. In addition, they report that the selectivity for faces and objects, relative to scrambled stimuli, depends on the spatial frequency tuning of the neurons.

      Strengths:

      Previous studies using human fMRI and psychophysics studied the contribution of different spatial frequency bands to object recognition, but as pointed out by the authors little is known about the spatial frequency selectivity of single IT neurons. This study addresses this gap and shows spatial frequency selectivity in IT for scrambled stimuli that drive the neurons poorly. They related this weak spatial frequency selectivity to category selectivity, but these findings are premature given the low number of stimuli they employed to assess category selectivity.

      The authors revised their manuscript and provided some clarifications regarding their experimental design and data analysis. They responded to most of my comments but I find that some issues were not fully or poorly addressed. The new data they provided confirmed my concern about low responses to their scrambled stimuli. Thus, this paper shows spatial frequency selectivity in IT for scrambled stimuli that drive the neurons poorly (see main comments below). They related this (weak) spatial frequency selectivity to category selectivity, but these findings are premature given the low number of stimuli to assess category selectivity.

      Main points.

      (1) They have provided now the responses of their neurons in spikes/s and present a distribution of the raw responses in a new Figure. These data suggest that their scrambled stimuli were driving the neurons rather poorly and thus it is unclear how well their findings will generalize to more effective stimuli. Indeed, the mean net firing rate to their scrambled stimuli was very low: about 3 spikes/s. How much can one conclude when the stimuli are driving the recorded neurons that poorly? Also, the new Figure 2- Appendix 1 shows that the mean modulation by spatial frequency is about 2 spikes/s, which is a rather small modulation. Thus, the spatial frequency selectivity the authors describe in this paper is rather small compared to the stimulus selectivity one typically observes in IT (stimulus-driven modulations can be at least 20 spikes/s).

      (2) Their new Figure 2-Appendix 1 does not show net firing rates (baseline-subtracted; as I requested) and thus is not very informative. Please provide distributions of net responses so that the readers can evaluate the responses to the stimuli of the recorded neurons.

      (3) The poor responses might be due to the short stimulus duration. The authors report now new data using a 200 ms duration which supported their classification and latency data obtained with their brief duration. It would be very informative if the authors could also provide the mean net responses for the 200 ms durations to their stimuli. Were these responses as low as those for the brief duration? If so, the concern of generalization to effective stimuli that drive IT neurons well remains.

      (4) I still do not understand why the analyses of Figures 3 and 4 provide different outcomes on the relationship between spatial frequency and category selectivity. I believe they refer to this finding in the Discussion: "Our results show a direct relationship between the population's category coding capability and the SF coding capability of individual neurons. While we observed a relation between SF and category coding, we have found uncorrelated representations. Unlike category coding, SF relies more on sparse, individual neuron representations.". I believe more clarification is necessary regarding the analyses of Figures 3 and 4, and why they can show different outcomes.

      (5) The authors found a higher separability for faces (versus scrambled patterns) for neurons preferring high spatial frequencies. This is consistent for the two monkeys but we are dealing here with a small amount of neurons. Only 6% of their neurons (16 neurons) belonged to this high spatial frequency group when pooling the two monkeys. Thus, although both monkeys show this effect I wonder how robust it is given the small number of neurons per monkey that belong to this spatial frequency profile. Furthermore, the higher separability for faces for the low-frequency profiles is not consistent across monkeys which should be pointed out.

      (6) I agree that CNNs are useful models for ventral stream processing but that is not relevant to the point I was making before regarding the comparison of the classification scores between neurons and the model. Because the number of features and trial-to-trial variability differs between neural nets and neurons, the classification scores are difficult to compare. One can compare the trends but not the raw classification scores between CNN and neurons without equating these variables.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study reports that IT neurons have biased representations toward low spatial frequency (SF) and faster decoding of low SFs than high SFs. High SF-preferred neurons, and low SF-preferred neurons to a lesser degree, perform better category decoding than neurons with other profiles (U and inverted U shaped). SF coding also shows more sparseness than category coding in the earlier phase of the response and less sparseness in the later phase. The results are also contrasted with predictions of various DNN models.

      Strengths:

      The study addressed an important issue on the representations of SF information in a high-level visual area. Data are analyzed with LDA which can effectively reduce the dimensionality of neuronal responses and retain category information.

      We would like to express our sincere gratitude for your insightful and constructive comments which greatly contributed to the refinement of the manuscript. We appreciate the time and effort you dedicated to reviewing our work and providing suggestions. We have carefully considered each of your comments and addressed the suggested revisions accordingly.

      Weaknesses:

      The results are likely compromised by improper stimulus timing and unmatched spatial frequency spectrums of stimuli in different categories.

      The authors used a very brief stimulus duration (35ms), which would degrade the visual system's contrast sensitivity to medium and high SF information disproportionately (see Nachmias, JOSAA, 1967). Therefore, IT neurons in the study could have received more degraded medium and high SF inputs compared to low SF inputs, which may be at least partially responsible for higher firing rates to low SF R1 stimuli (Figure 1c) and poorer recall performance with medium and high SF R3-R5 stimuli in LDA decoding. The issue may also to some degree explain the delayed onset of recall to higher SF stimuli (Figure 2a), preferred low SF with an earlier T1 onset (Figure 2b), lower firing rate to high SF during T1 (Figure 2c), somewhat increased firing rate to high SF during T2 (because weaker high SF inputs would lead to later onset, Figure 2d).

      We appreciate your concern regarding the coarse-to-fine nature of SF processing in the vision hierarchy and the short exposure time of our paradigm. Following your comment, we repeated the analysis of SF representation with 200ms exposure time as illustrated in Appendix 1 - Figure 4. Our recorded data contains the 200ms version of exposure time for all neurons in the main phase. As can be seen, the results are similar to what we found with 33 ms experiments.

      Next, we bring your attention to the following observations:

      (1) According to Figure 2d, the average firing rate of IT neurons for HSF could be higher than LSF in the late response phase. Therefore, the amount of HSF input received by the IT neurons is as much as LSF, however, its impact on the IT response is observable in the later phase of the response. Thus, the LSF preference is because of the temporal advantage of the LSF processing rather than contrast sensitivity.

      (2) According to Figure 3a, 6% of the neurons are HSF-preferred and their firing rate in HSF is comparable to the LSF firing rate in the LSF-preferred group. This analysis is carried out in the early phase of the response (70-170 ms). While most of the neurons prefer LSF, this observation shows that there is an HSF input that excites a small group of neurons. Furthermore, the highest separability index also belongs to the HSF-preferred profile in the early phase of the response which supports the impact of the HSF part of the input.

      (3) Similar LSF-preferred responses are also reported by Chen et al. (2018) (50ms for SC) and Zhang et al. (2023) (3.5 - 4 secs for V2 and V4) for longer duration times.

      Our results suggest that the LSF-preferred nature of the IT responses in terms of firing rate and recall, is not due to the weakness or lack of input source (or information) for HSF but rather to the processing nature of the SF in the vision hierarchy.

      To address this issue in the manuscript:

      Appendix 1 - Figure 4 is added to the manuscript and shows the recall value and onset for R1-R5 with 200ms of exposure time.

      We added the following description to the discussion:

      “To rule out the degraded contrast sensitivity of the visual system to medium and high SF information because of the brief exposure time, we repeated the analysis with 200ms exposure time as illustrated in Appendix 1 - Figure 4 which indicates the same LSF-preferred results. Furthermore, according to Figure 2, the average firing rate of IT neurons for HSF could be higher than LSF in the late response phase. It indicates that the amount of HSF input received by the IT neurons in the later phase is as much as LSF, however, its impact on the IT response is observable in the later phase of the response. Thus, the LSF preference is because of the temporal advantage of the LSF processing rather than contrast sensitivity. Next, according to Figure 3(a), 6\% of the neurons are HSF-preferred and their firing rate in HSF is comparable to the LSF firing rate in the LSF-preferred group. This analysis is carried out in the early phase of the response (70-170ms). While most of the neurons prefer LSF, this observation shows that there is an HSF input that excites a small group of neurons. Additionally, the highest SI belongs to the HSF-preferred profile in the early phase of the response which supports the impact of the HSF part of the input. Similar LSF-preferred responses are also reported by Chen et al. (2018) (50ms for SC) and Zhang et al. (2023) (3.5 - 4 secs for V2 and V4). Therefore, our results show that the LSF-preferred nature of the IT responses in terms of firing rate and recall, is not due to the weakness or lack of input source (or information) for HSF but rather to the processing nature of the SF in the IT cortex.”

      Figure 3b shows greater face coding than object coding by high SF and to a lesser degree by low SF neurons. Only the inverted-U-shaped neurons displayed slightly better object coding than face coding. Overall the results give an impression that IT neurons are significantly more capable of coding faces than coding objects, which is inconsistent with the general understanding of the functions of IT neurons. The problem may lie with the selection of stimulus images (Figure 1b). To study SF-related category coding, the images in two categories need to have similar SF spectrums in the Fourier domain. Such efforts are not mentioned in the manuscript, and a look at the images in Figure 1b suggests that such efforts are likely not properly made. The ResNet18 decoding results in Figure 6C, in that IT neurons of different profiles show similar face and object coding, might be closer to reality.

      Because of the limited number of stimuli in our experiments, it is hard to discuss the category selectivity, which needs a higher number of stimuli. To overcome the limited number of stimuli in our experiment, we fixed 60% (nine out of 15 stimuli) while varying the remaining stimuli to reduce the selective bias. To check the coding capability of the IT neurons for face and non-face objects, we evaluated the recall of face vs. non-face classification in intact stimuli (similar to classifiers stated in the manuscript). Results show that at the population level, the recall value for objects is 90.45%, and for faces is 92.45%. However, the difference is not significant (p-value=0.44). On the other hand, we note that a large difference in the SI value does not translate directly to the classification accuracy, rather it illustrates the strength of representation.

      Regarding the SF spectrums, after matching the luminance and contrast of the images we matched the power of the images concerning SF and category. Powers are calculated using the sum of the absolute value of the Fourier transform of the image. Considering all stimuli, the ANOVA analysis shows that various SF bands have similar power (one-way ANOVA, p-value=0.24). Furthermore, comparing the power of faces and objects in all SF bands (including intact) and both unscrambled and scrambled images indicates no significant difference between face and object (p-value > 0.1). Therefore, the result of Figure 3b suggests that IT employs various SF bands for the recognition of various objects.

      Comparing the results of CNNs and IT shows that the CNNs do not capture the complexities of the IT cortex in terms of SF. One source of this difference is the behavioral saliency of the face stimulus in the training of the primate visual system.

      To address this issue in the manuscript:

      The following description is added to the discussion:

      “… the decoding performance of category classification (face vs. non-face) in intact stimuli is 94.2%. The recall value for objects vs. scrambled is 90.45%, and for faces vs. scrambled is 92.45% (p-value=0.44), which indicates the high level of generalizability and validity characterizing our results.”

      The following description is added to the method section, SF filtering.

      “Finally, we equalized the stimulus power in all SF bands (intact, R1-R5). The SF power among all conditions (all SF bands, face vs. non-face and unscrambled vs. scrambled) does not vary significantly (p-value > 0.1). SF power is calculated as the sum of the square value of the image coefficients in the Fourier domain.”
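      The power computation in the quoted Methods text (sum of squared Fourier coefficients) can be sketched as below. This is a minimal, dependency-free illustration, not the authors' pipeline: the function names are hypothetical, the band limits are arbitrary, a naive DFT stands in for an FFT library, and the test image is a synthetic grating rather than a stimulus from the study.

```python
import cmath
import math


def dft2(image):
    """Naive 2-D discrete Fourier transform of a small square image."""
    n = len(image)
    return [
        [
            sum(
                image[y][x] * cmath.exp(-2j * math.pi * (u * x + v * y) / n)
                for y in range(n)
                for x in range(n)
            )
            for u in range(n)
        ]
        for v in range(n)
    ]


def band_power(image, low, high):
    """Sum of squared Fourier magnitudes with low <= radius < high.

    The radius is measured in cycles per image.
    """
    n = len(image)
    spectrum = dft2(image)
    total = 0.0
    for v in range(n):
        for u in range(n):
            # Map DFT indices to signed frequencies in cycles per image.
            fu = u if u <= n // 2 else u - n
            fv = v if v <= n // 2 else v - n
            if low <= math.hypot(fu, fv) < high:
                total += abs(spectrum[v][u]) ** 2
    return total


def scale_to_power(image, low, high, target):
    """Rescale an already band-filtered image so its band power equals `target`."""
    gain = math.sqrt(target / band_power(image, low, high))
    return [[gain * px for px in row] for row in image]


# Synthetic 8x8 grating with 2 cycles per image (illustration only).
grating = [[math.cos(2 * math.pi * 2 * x / 8) for x in range(8)] for _ in range(8)]
print(band_power(grating, 1.5, 2.5))   # power inside the band containing 2 cycles/image
print(band_power(grating, 3.0, 5.0))   # power in a higher band
equalized = scale_to_power(grating, 1.5, 2.5, 1.0)
print(band_power(equalized, 1.5, 2.5))
```

      Because the gain is applied to the whole image, this kind of rescaling only equalizes band power cleanly when each image has already been filtered to a single band.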

      Reviewer #2 (Public Review):

      Summary:

      This paper aimed to examine the spatial frequency selectivity of macaque inferotemporal (IT) neurons and its relation to category selectivity. The authors suggest in the present study that some IT neurons show a sensitivity for the spatial frequency of scrambled images. Their report suggests a shift in preferred spatial frequency during the response, from low to high spatial frequencies. This agrees with a coarse-to-fine processing strategy, which is in line with multiple studies in the early visual cortex. In addition, they report that the selectivity for faces and objects, relative to scrambled stimuli, depends on the spatial frequency tuning of the neurons.

      Strengths:

      Previous studies using human fMRI and psychophysics studied the contribution of different spatial frequency bands to object recognition, but as pointed out by the authors little is known about the spatial frequency selectivity of single IT neurons. This study addresses this gap and they show that at least some IT neurons show a sensitivity for spatial frequency and interestingly show a tendency for coarse-to-fine processing.

      We extend our sincere appreciation for your thoughtful and constructive feedback on our paper. We are grateful for the time and expertise you invested in reviewing our work. Your detailed suggestions have been instrumental in addressing several key aspects of the paper, contributing to its clarity and scholarly merit. We have carefully considered each of your comments and have made revisions accordingly.

      Weaknesses and requested clarifications:

      (1) It is unclear whether the effects described in this paper reflect a sensitivity to spatial frequency, i.e. in cycles/ deg (depends on the distance from the observer and changes when rescaling the image), or is a sensitivity to cycles /image, largely independent of image scale. How is it related to the well-documented size tolerance of IT neuron selectivity?

      Our stimuli are filtered in cycles/image, and knowing the distance of the subject to the monitor, we can calculate the corresponding cycles/degree. To the best of our knowledge, this is also the case for all other SF-related studies. To relate the observations to cycles/image versus cycles/degree, one should keep one of them fixed while changing the other; for example, changing the subject's distance to the monitor will change the SF content in terms of cycles/degree. With our current data, we cannot discriminate this effect. To address this issue, we added the following description to the discussion:

      “Finally, since our experiment maintains a fixed SF content in terms of both cycles per degree and cycles per image, further experiments are needed to discern whether our observations reflect sensitivity to cycles per degree or cycles per image.”
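      The relationship between the two units can be made concrete with a short sketch. The image size and viewing distances below are hypothetical values chosen for illustration, not the parameters of this experiment, and the function names are assumptions.

```python
import math


def image_size_deg(size_cm, distance_cm):
    """Visual angle in degrees subtended by an image of width `size_cm`."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


def cycles_per_degree(cycles_per_image, size_cm, distance_cm):
    """Convert a spatial frequency from cycles/image to cycles/degree."""
    return cycles_per_image / image_size_deg(size_cm, distance_cm)


# Hypothetical setup: a 10 cm wide image containing 8 cycles.
near = cycles_per_degree(8, size_cm=10, distance_cm=57.3)
far = cycles_per_degree(8, size_cm=10, distance_cm=114.6)
print(near, far)
```

      Moving the observer farther away leaves cycles/image unchanged but raises cycles/degree, which is why varying viewing distance (or image size) is the manipulation that would separate the two accounts.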

      (2) The authors band-pass filtered phase-scrambled images of faces and objects. The original images likely differed in their spatial frequency amplitude spectrum and thus it is unclear whether the differing bands contained the same power for the different scrambled images. If not, this could have contributed to the frequency sensitivity of the neurons.

      After equalizing the luminance and contrast of the images, we equalized their power concerning SF and category. The powers were calculated using the sum of the absolute values of the Fourier transform of the images. The results of the ANOVA analysis across all stimuli indicate that various SF bands exhibit similar power (one-way ANOVA, p-value = 0.24). Additionally, a comparison of power between faces and objects in all SF bands (including intact), for both unscrambled and scrambled images, reveals no significant differences (p-value > 0.1). To clarify this point, we have incorporated the following information into the Methods section.

      “Finally, we equalized the stimulus power in all SF bands (intact, R1-R5). The SF power among all conditions (all SF bands, face vs. non-face and unscrambled vs. scrambled) does not vary significantly (ANOVA, p-value > 0.1).”

      (3) How strong were the responses to the phase-scrambled images? Phase-scrambled images are expected to be rather ineffective stimuli for IT neurons. How can one extrapolate the effect of the spatial frequency band observed for ineffective stimuli to that for more effective stimuli, like objects or (for some neurons) faces? A distribution should be provided, of the net responses (in spikes/s) to the scrambled stimuli, and this for the early and late windows.

      The sample neuron in Figure 1c is chosen to be a good indicator of the recorded neurons. In the early response phase, the average firing rate to scrambled stimuli is 26.3 spikes/s which is significantly higher than the response in -50 to 50ms which is 23.4 spikes/s. In comparison, the mean response to intact face stimuli is 30.5 spikes/s, while object stimuli elicit an average response of 28.8 spikes/s. Moving to the late phase, T2, the responses to scrambled, face, and object stimuli are 19.5, 19.4, and 22.4 spikes/s, respectively. Moreover, when the classification accuracy for SF exceeds chance levels, it indicates a significant impact of SF bands on the IT response. This raises a direct question about the explicit coding for SF bands in the IT cortex observed for ineffective stimuli and how it relates to complex and effective stimuli, such as faces. To show the strength of neuron responses to the SF bands in scrambled images, we added Appendix 1 - Figure 2 and also added Appendix 1 - Figure 1, according to comment 4, which shows the average and std of the responses to all SF bands. The following description is added to the results section.

      “Considering the strength of responses to scrambled stimuli, the average firing rate in response to scrambled stimuli is 26.3 Hz, which is significantly higher than the response observed between -50 and 50 ms, where it is 23.4 Hz (p-value=$3\times10^{-5}$). In comparison, the mean response to intact face stimuli is 30.5 Hz, while non-face stimuli elicit an average response of 28.8 Hz. The distribution of neuron responses for scrambled, face, and non-face in T1 is illustrated in Appendix 1 - Figure 2.

      […]

      Moreover, the average firing rates of scrambled, face, and non-face stimuli are 19.5 Hz, 19.4 Hz, and 22.4 Hz, respectively. The distribution of neuron responses is illustrated in Appendix 1 - Figure 2.”

      (4) The strength of the spatial frequency selectivity is unclear from the presented data. The authors provide the result of a classification analysis, but this is in normalized units so that the reader does not know the classification score in percent correct. Unnormalized data should be provided. Also, it would be informative to provide a summary plot of the spatial frequency selectivity in spikes/s, e.g. by ranking the spatial frequency bands for each neuron based on half of the trials and then plotting the average responses for the obtained ranks for the other half of the trials. Thus, the reader can appreciate the strength of the spatial frequency selectivity, considering trial-to-trial variability. Also, a plot should be provided of the mean response to the stimuli for the two analysis windows of Figure 2c and 2d in spikes/s so one can appreciate the mean response strengths and effect size (see above).

      The classification results are normalized simply by subtracting the chance level (0.2) from every accuracy value. The values can therefore still be read as percentages, as we did in the results section. To make this clear, we removed “a.u.” from the figure and added the following description to the results section.

      “The accuracy value is normalized by subtracting the chance level (0.2).”

      Regarding the selectivity of the neurons, as suggested by your comment, we added a new figure in the appendix section, Appendix 1 - Figure 2. This figure shows the strength of SF selectivity, considering trial-to-trial variability. The following description is added to the results section:

      “The strength of SF selectivity, considering the trial-to-trial variability is provided in Appendix 1 Figure 2, by ranking the SF bands for each neuron based on half of the trials and then plotting the average responses for the obtained ranks for the other half of the trials.”
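The split-half ranking described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the random half split, and the input layout (a dictionary from SF band label to a list of single-trial firing rates for one neuron) are all assumptions made for the example.

```python
import random
from statistics import mean


def split_half_ranked_means(trials_by_band, seed=0):
    """Rank SF bands on one half of the trials, report means on the other half.

    trials_by_band: dict mapping a band label (e.g. "R1") to a list of
    single-trial firing rates (spikes/s) for ONE neuron (assumed layout).
    Returns a list of (band, held_out_mean), ordered from the band ranked
    strongest on the ranking half to the band ranked weakest.
    """
    rng = random.Random(seed)
    rank_half, test_half = {}, {}
    for band, rates in trials_by_band.items():
        if len(rates) < 2:
            raise ValueError("need at least two trials per band")
        shuffled = list(rates)
        rng.shuffle(shuffled)
        mid = len(shuffled) // 2
        rank_half[band] = shuffled[:mid]   # used only to rank the bands
        test_half[band] = shuffled[mid:]   # used only to report responses
    order = sorted(rank_half, key=lambda b: mean(rank_half[b]), reverse=True)
    return [(band, mean(test_half[band])) for band in order]


def average_by_rank(neurons, seed=0):
    """Average the held-out responses across neurons, rank by rank."""
    per_neuron = [
        [rate for _, rate in split_half_ranked_means(n, seed)] for n in neurons
    ]
    return [mean(rank_rates) for rank_rates in zip(*per_neuron)]
```

Because the ranking and the reported responses come from disjoint trials, a decline across ranks in the held-out half reflects selectivity rather than trial-to-trial noise.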

      The firing rates in Figures 2c and 2d are normalized for better illustration, since the variation in firing rates across neurons is high, as can be observed in Appendix 1 - Figure 1. Because we seek trends in the response and the baseline firing rates of neurons differ, the trend is determined by the values relative to the baseline firing rate rather than by the absolute values. To address the mean response and the strength of the SF response, the following description is added to the results section.

      “Considering the strength of responses to scrambled stimuli, the average firing rate in response to scrambled stimuli is 26.3 Hz, which is significantly higher than the response observed between -50 and 50 ms, where it is 23.4 Hz (p-value=3x10-5). In comparison, the mean response to intact face stimuli is 30.5 Hz, while non-face stimuli elicit an average response of 28.8 Hz. The distribution of neuron responses for scrambled, face, and non-face in T1 is illustrated in Appendix 1 - Figure 2.

      […]

      Moreover, the average firing rates of scrambled, face, and non-face stimuli are 19.5 Hz, 19.4 Hz, and 22.4 Hz, respectively. The distribution of neuron responses is illustrated in Appendix 1 Figure 2.”

      Furthermore, we added a figure, Appendix 1 - Figure 3, to illustrate the strength of SF selectivity in our profiles. The following is added to the results section:

      “To check the robustness of the profiles, considering the trial-to-trial variability, the strength of SF selectivity in each profile is provided in Appendix 1 - Figure 3, by forming the profile of each neuron based on half of the trials and then plotting the average SF responses with the other half of the trials.”

      (5) It is unclear why such brief stimulus durations were employed. Will the results be similar, in particular the preference for low spatial frequencies, for longer stimulus durations that are more similar to those encountered during natural vision?

      Please refer to the first comment of Reviewer 1.

      (6) The authors report that the spatial frequency band classification accuracy for the population of neurons is not much higher than that of the best neuron (line 151). How does this relate to the SNC analysis, which appears to suggest that many neurons contribute to the spatial frequency selectivity of the population in a non-redundant fashion? Also, the outcome of the analyses should be provided (such as SNC and decoding (e.g. Figure 1D)) in the original units instead of undefined arbitrary units.

      The population accuracy is approximately 5% higher than that of the best neuron. We have no reference against which to compare this effect size (the value is roughly similar for face vs. object, while the chance levels are different). Moreover, as stated in Methods, SNC is calculated for two label modes (LSF and HSF), so it cannot be directly compared to the best-neuron accuracy. Regarding the unit of SNC, it can be read directly as a percentage by multiplying by a factor of 100. We removed the “a.u.” to prevent misunderstanding and modified the results section for clarity.

      “… SNC score for SF (two labels, LSF (R1 and R2) vs. HSF (R4 and R5)) and category … (average SNC for SF=0.51%±0.02 and category=0.1%±0.04 …”

      (7) To me, the results of the analyses of Figure 3c,d, and Figure 4 appear to disagree. The latter figure shows no correlation between category and spatial frequency classification accuracies while Figure 3c,d shows the opposite.

      In Figure 3c,d, following what we observed in Figure 3a,b about the category coding capabilities of the population based on the profiles of single neurons, we tested a similar idea: whether the coding capability of single neurons for SF/category could predict the coding capability of a population of such neurons for category/SF. Both analyses therefore investigate a relation between a characteristic of single neurons and the coding capability of a population of similar neurons. In Figure 4, on the other hand, the idea is to examine the characteristics of the coding mechanisms behind SF and category coding. In Figure 4a, we check whether any relation exists between category and SF coding capability within the activity of a single neuron, without the impact of other neurons, to investigate the idea that SF coding may be a byproduct of an object recognition mechanism. In Figure 4b, we investigate the contribution of all neurons to the population decision, again to check whether the mechanisms behind SF and category coding are the same. This analysis shows how individual neurons contribute to SF or category coding at the population level. The experiments in Figures 3 and 4 therefore differ both in the analysis method and in what they were designed to investigate, and their results cannot be compared directly.

      (8) If I understand correctly, the "main" test included scrambled versions of each of the "responsive" images selected based on the preceding test. Each stimulus was presented 15 times (once in each of the 15 blocks). The LDA classifier was trained to predict the 5 spatial frequency band labels and they used 70% of the trials to train the classifier. Were the trained and tested trials stratified with respect to the different scrambled images? Also, LDA assumes a normal distribution. Was this the case, especially because of the mixture of repetitions of the same scrambled stimulus and different scrambled stimuli?

      In response to your inquiry regarding the stratification of trials, both the training and testing data were representative of the entire spectrum of scrambled images used in our experiment. To address your concern about the assumption of a normal distribution, especially given the mixture of repetitions of the same scrambled stimulus and different stimuli, our analysis of firing rates reveals a slightly left-skewed normal distribution. While there is a deviation from a perfectly normal distribution, we are confident that this skewness does not compromise the robustness of the LDA classifier.
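One way to make a 70/30 split representative of every scrambled image is to stratify on the pair (SF label, image identity). The sketch below is illustrative only and is not the authors' procedure: the function name, the tuple layout of `trials`, and the choice to truncate the per-group training count with `int` are assumptions.

```python
import random
from collections import defaultdict


def stratified_split(trials, train_frac=0.7, seed=0):
    """Split trial indices so each (sf_label, image_id) group is divided
    in roughly the same train/test proportion.

    trials: list of (sf_label, image_id, response) tuples (assumed layout).
    Returns (train_indices, test_indices), both sorted.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for idx, (sf_label, image_id, _response) in enumerate(trials):
        groups[(sf_label, image_id)].append(idx)

    train, test = [], []
    for key in sorted(groups):
        idxs = groups[key]
        rng.shuffle(idxs)
        n_train = int(train_frac * len(idxs))  # e.g. 10 of 15 repetitions
        train.extend(idxs[:n_train])
        test.extend(idxs[n_train:])
    return sorted(train), sorted(test)
```

With 15 repetitions per stimulus, each group contributes 10 trials to training and 5 to testing, so no scrambled image is absent from either set.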

      (9) The LDA classifiers for spatial frequency band (5 labels) and category (2 labels) have different chance and performance levels. Was this taken into account when comparing the SNC between these two classifiers? Details and SNC values should be provided in the original (percent difference) instead of arbitrary units in Figure 5a. Without such details, the results are impossible to evaluate.

      For both SNC and CMI calculations in SF, we considered two labels of HSF (R4 and R5) and LSF (R1 and R2). This was mentioned in the Methods section, after equation (5). According to your comment, to make it clear in the results section, we also added this description to the results section.

      “… illustrates the SNC score for SF (two labels, LSF (R1 and R2) vs. HSF (R4 and R5)) and category (face vs. non-face) … conditioned on the label, SF (LSF (R1 and R2) vs. HSF (R4 and R5)) or category, to assess the information.”

      The value of SNC can also be directly converted to the percent by a factor of 100. To make it clear, we removed “a.u.” from the y-axis.

      (10) Recording locations should be described in IT, since the latter is a large region. Did their recordings include the STS? A/P and M/L coordinate ranges of recorded neurons?

      We appreciate your suggestion regarding the recording location. Nevertheless, given the complexities associated with neurophysiological recordings and the limitations imposed by our methodologies, we face challenges in determining precisely whether each unit is located in the STS. To address your comment, we added Appendix 1 - Figure 5, which shows the SF and category coding capability of neurons along their recorded locations.

      (11) The authors should show in Supplementary Figures the main data for each of the two animals, to ensure the reader that both monkeys showed similar trends.

      We added Appendix 2 which shows the consistency of the main results in the two monkeys.

      (12) The authors found that the deep nets encoded better the spatial frequency bands than the IT units. However, IT units have trial-to-trial response variability and CNN units do not. Did they consider this when comparing IT and CNN classification performance? Also, the number of features differs between IT and CNN units. To me, comparing IT and CNN classification performances is like comparing apples and oranges.

      Deep convolutional neural networks are currently considered the state-of-the-art models of the primate visual pathway. However, as you mentioned and based on our results, they do not yet capture various complexities of the visual ventral stream. Yet studying the similarities and differences between CNN and brain regions, such as the IT cortex, is an active area of research, such as:

      a. Kubilius, Jonas, et al. "Brain-like object recognition with high-performing shallow recurrent ANNs." Advances in neural information processing systems 32 (2019).

      b. Xu, Yaoda, and Maryam Vaziri-Pashkam. "Limits to visual representational correspondence between convolutional neural networks and the human brain." Nature Communications, 12.1 (2021).

      c. Jacob, Georgin, et al. "Qualitative similarities and differences in visual object representations between brains and deep networks." Nature Communications, 12.1 (2021).

      Therefore, we believe comparing IT and CNN, despite all of the differences in terms of their characteristics, can help both fields grow faster, especially in introducing brain-inspired networks.

      (13) The authors should define the separability index in their paper. Since it is the main index to show a relationship between category and spatial frequency tuning, it should be described in detail. Also, results should be provided in the original units instead of undefined arbitrary units. The tuning profiles in Figure 3A should be in spikes/s. Also, it was unclear to me whether the classification of the neurons into the different tuning profiles was based on an ANOVA assessing per neuron whether the effect of the spatial frequency band was significant (as should be done).

      Based on your comment, we added the description of the separability index to the methods section. However, since the separability index is defined as the division of two dispersion matrices, it has no units by nature. The tuning profiles in Figure 3a are normalized for better illustration since the variation in firing rates is high. Since we seek trends in the response, the absolute values are not important. Regarding the SF profile formation, we updated the methods section to better present the SF profile assignment. Furthermore, the strength of responses for scrambled stimuli can be observed in Appendix 1 - Figures 1 and 2.

      (14) As mentioned above, the separability analysis is the main one suggesting an association between category and spatial frequency tuning. However, they compute the separability of each category with respect to the scrambled images. Since faces are a rather homogeneous category I expect that IT neurons have on average a higher separability index for faces than for the more heterogeneous category of objects, at least for neurons responsive to faces and/or objects. The higher separability for faces of the two low- and high-pass spatial frequency neurons could reflect stronger overall responses for these two classes of neurons. Was this the case? This is a critical analysis since it is essential to assess whether it is category versus responsiveness that is associated with the spatial frequency tuning. Also, I do not believe that one can make a strong claim about category selectivity when only 6 faces and 3 objects (and 6 other, variable stimuli; 15 stimuli in total) are employed to assess the responses for these categories (see next main comment). This and the above control analysis can affect the main conclusion and title of the paper.

      We appreciate your concern regarding category selectivity or responsiveness of the SF profiles. First, we note that we used SI since it overcomes the limitations of the accuracy and recall metrics as they are discrete and can be saturated. Using SI, we cannot directly calculate face vs object with SI, since this index only reports one value for the whole discrimination task. Therefore, we have to calculate the SI for face/object vs scrambled to obtain a value per category. However, as you suggested, it raises the question of whether we assess how well the neural responses distinguish between actual images (faces or objects) and their scrambled versions or if we just assess the responsiveness. Based on Figure 3b, since we have face-selective (LSF and HSF preferred profiles), object-selective (inverse U), and the U profile, where SI is the same for both face and object, we believe the SF profile is associated with the category selectivity, otherwise we would have the same face/object recall in all profiles, as we have in the U shape profile.

      To analyze this issue further, we calculated the number of face/object selective neurons in 70-170ms. We found 43 face-selective neurons and 36 object-selective neurons (FDR corrected p-value < 0.05). Therefore, the number of face-selective and object-selective neurons is similar. Next, we check the selectivity of the neurons within each profile. Number of face/object selective neurons is LP=13/3, HP=6/2, IU=3/9, U=14/13, and the remaining belong to the NP group. Results show higher face-selective neurons in LP and HP and a higher number of object-selective neurons in the IU class. The U class contains roughly the same number of face and object-selective neurons. This observation supports the relationship between category selectivity and profiles.

      Next, we examined the average neuron response to the face and object in each profile. The difference between the firing rate of the face and object in none of the profiles was significant (Ranksum with a significance level of 0.05). However, the rates are as follows. The average firing rate (spikes/s) of face/object is LP=36.72/28.77, HP=28.55/25.52, IU=21.55/27.25, U=38.48/36.28. While the differences are not significant, they support the relationship between profiles and categories instead of responsiveness.

      The following description is added to the results section to cover this point of view.

      “To assess whether the SF profiles distinguish category selectivity or merely evaluate the neuron's responsiveness, we quantified the number of face/non-face selective neurons in the 70-170ms time window. Our analysis shows a total of 43 face-selective neurons and 36 non-face-selective neurons (FDR-corrected p-value < 0.05). The results indicate a higher proportion of face-selective neurons in LP and HP, while a greater number of non-face-selective neurons is observed in the IU category (number of face/non-face selective neurons: LP=13/3, HP=6/2, IU=3/9). The U category exhibits a roughly equal distribution of face and non-face-selective neurons (U=14/13). This finding reinforces the connection between category selectivity and the identified profiles. We then analyzed the average neuron response to faces and non-faces within each profile. The difference between the firing rates for faces and non-faces in none of the profiles is significant (face/non-face average firing rate (Hz): LP=36.72/28.77, HP=28.55/25.52, IU=21.55/27.25, U=38.48/36.28, Ranksum with significance level of 0.05). Although the observed differences are not statistically significant, they provide support for the association between profiles and categories rather than mere responsiveness.”

      About the low number of stimuli, please check the next comment.

      (15) For the category decoding, the authors employed intact, unscrambled stimuli. Were these from the main test? If yes, then I am concerned that this represents a too small number of stimuli to assess category selectivity. Only 9 fixed + 6 variable stimuli = 15 were in the main test. How many faces/ objects on average? Was the number of stimuli per category equated for the classification? When possible use the data of the preceding selectivity test which has many more stimuli to compute the category selectivity.

      We used only the main phase recorded data, which contains 15 images in each session. Each image results in 12 stimuli (intact, R1-R5, and phase-scrambled version). Thus, there exists a total of 180 unique stimuli in each session. Increasing the number of images would have increased the recording time. We compensated for this limitation by increasing the diversity of images in each session by picking the most responsive ones from the selectivity phase. On average, 7.54 of the stimuli were face in each session. We added this information to the Methods section. Furthermore, as mentioned in the discussion, for each classification run, the number of samples per category is equalized. We note that we cannot use the selectivity data for analysis, since the SF-related stimuli are filtered in different bands.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      I suggest that the authors double-check their results by performing control experiments with longer stimulus duration and SF-spectrum-matched face and object stimuli.

      Thanks for your suggestion. According to your comment, we added Appendix 1 - Figure 3.

      In addition, I had a very difficult time understanding the differences between Figure 3c and Figure 4a. Please rewrite the descriptions to clarify.

      Thanks for your suggestion. We revised the description of these two figures. The following description is added to the results section for Figure 3c.

      “Next, to examine the relation between the SF (category) coding capacity of the single neurons and the category (SF) coding capability of the population level, we calculated the correlation between coding performance at the population level and the coding performance of single neurons within that population (Figure 3 c and d). In other words, we investigated the relation between single and population levels of coding capabilities between SF and category. The SF (or category) coding performance of a sub-population of 20 neurons that have roughly the same single-level coding capability of the category (or SF) is examined.”

      Lines 147-148: The text states that 'The maximum accuracy of a single neuron was 19.08% higher than the chance level'. However, in Figure 4, the decoding accuracies of individual neurons for category and SF range were between 49%-90% and 20%-40%, respectively.

      Please explain the discrepancies.

      The first number is reported according to chance level which is 20%, thus the unnormalized number is 39% which is consistent with the SF accuracy in Figure 4. We added the following description to prevent any misunderstanding.

      “… was 19.08% higher than the chance level (unnormalized accuracy is 39.08%, neuron #193, M2).”
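The conversion between chance-normalized and raw accuracy used in this reply can be checked with a few lines. This is a hedged sketch: the helper name is hypothetical, and it assumes balanced classes so that chance equals one over the number of labels (0.2 for the five SF bands, 0.5 for face vs. non-face).

```python
def to_raw_accuracy(normalized, n_classes):
    """Add back the chance level (1 / n_classes, assuming balanced classes)
    to an accuracy that was reported relative to chance."""
    return normalized + 1.0 / n_classes


# A neuron reported as 19.08 points above chance on the 5-way SF task:
raw_sf = to_raw_accuracy(0.1908, n_classes=5)
```

Here `raw_sf` is approximately 0.3908, which falls inside the 20%-40% range quoted for single-neuron SF decoding in Figure 4.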

      Lines 264-265: Should 'the alternative for R3 and R4' be 'the alternative for R4 and R5'?

      Thanks for your attention, it's “R4 and R5”. We corrected that mistake.

      Lines 551-562: The labels for SF classification are R1-R5. Is it a binary or a multi-classification task?

      It’s a multi-label classification. We made it clear in the text.

      “… labels were SF bands (R1, R2, ..., R5, multi-label classifier).”

      Figure 4b: Neurons in SF/category decoding exhibit both positive and negative weights. However, in the analysis of sparse neuron weights in Equation 6, only the magnitude of the weights is considered. Is the sign of weight considered too?

      We used the absolute value of the neuron weight to calculate sparseness. We also corrected Equation 6.

      Reviewer #2 (Recommendations For The Authors):

      (1) Line 52: what do the authors mean by coordinate processing in object recognition?

      To avoid any potential misunderstanding, we used the exact phrase in Saneyoshi and Michimata (2015). It is in fact, coordinate relations processing. Coordinate relations specify the metric information of the relative locations of objects.

      (2) About half of the Introduction is a summary of the Results. This can be shortened.

      Thanks for your suggestion.

      (3) Line 134: Peristimulus time histogram instead of Prestimulus time histogram.

      Thanks for your attention. We corrected that.

      (4) Line 162: the authors state that R1 is decoded faster than R5, but the reported statistic is only for R1 versus R2.

      It was a typo, the p-value is only reported for R1 and R5.

      (5) Line 576: which test was used to assess the statistical significance?

      The test is Wilcoxon signed-rank. We added it to the text.

      (6) How can one present a 35 ms long stimulus with a 60 Hz frame rate (the stimuli were presented on a 60Hz monitor (line 470))? Please correct.

      Thanks for your attention. We corrected that. The time of stimulus presentation is 33ms and the monitor rate is 120Hz.
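The reviewer's point follows from the fact that a monitor can only show a stimulus for a whole number of refresh frames. The sketch below (the function name is hypothetical, and rounding to the nearest frame is an assumption about how a presentation system would behave) computes the achievable duration closest to a requested one.

```python
def frames_and_duration(target_ms, refresh_hz):
    """Return (n_frames, actual_ms): the whole number of frames closest to
    the requested duration and the duration those frames really last."""
    frame_ms = 1000.0 / refresh_hz
    n_frames = max(1, round(target_ms / frame_ms))
    return n_frames, n_frames * frame_ms


n_120, dur_120 = frames_and_duration(33, 120)  # corrected values in the reply
n_60, dur_60 = frames_and_duration(35, 60)     # values questioned by the reviewer
```

At 120 Hz a frame lasts about 8.33 ms, so four frames give about 33.3 ms, matching the corrected report. At 60 Hz a frame lasts about 16.67 ms, so a 35 ms stimulus is not realizable; the nearest option is two frames, again about 33.3 ms.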

    1. Author response:

      The following is the authors’ response to the original reviews.

      These are valuable findings that support a link between low-dimensional brain network organization, patterns of ongoing thought, and trait-level personality factors, making it relevant for researchers in the field of spontaneous cognition, personality, and neuropsychiatry. While this link is not entirely new, the paper brings to bear a rich dataset and a well-conducted study, to approach this question in a novel way. The evidence in support of the findings is convincing.

      We thank the reviewers and editors for their time, feedback, and recommendations for improvement. We have revised the manuscript with those recommendations in mind and provide a point-by-point description of the revisions below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors ran an explorative analysis in order to describe how a "tri-partite" brain network model could describe the combination of resting fMRI data and individual characteristics. They utilized previously obtained fMRI data across four scanning runs in 144 individuals. At the end of each run, participants rated their patterns of thinking on 12 statements (short multi-dimensional experience sampling-MDES) using a 0-100% visual analog scale. Also, 71 personality traits were obtained on 21 questionnaires. The authors ran two separate principal component analyses (PCA) to obtain low dimensional summaries of the two individual characteristics (personality traits from questionnaires, and thought patterns from MDES). The dimensionality reduction of the fMRI data was done by means of gradient analysis, which was combined with Neurosynth decoding to visualize the functional axis of the gradients. To test the reliability of thought components across scanning time, intra-class correlation coefficients (ICC) were calculated for the thought patterns, and discriminability indices were calculated for whole gradients. The relationship between individual differences in traits, thoughts, and macro-scale gradients was tested with multivariate regression.

      The authors found: a) reliability of thought components across the one hour of scanning, b) Gradient 1 differentiated between visual regions and DMN, Gradient 2 dissociated somatomotor from visual cortices, Gradient 3 differentiated the DMN from the fronto-parietal system, c) the associations between traits/thought patterns and brain gradients revealed significant effects of "introversion" and "specific internal" thought: "Introversion" was associated with variant parcels on the three gradients, with most of parcels belonging to the VAN and then to the DMN; and "Specific internal thought" was associated with variant parcels on the three gradients with most of parcels belonging to the DAN and then the visual. The authors conclude that interactions between attention systems and the DMN are important influences on ongoing thought at rest.

      Strengths:

      The study's strength lies in its attempt to combine brain activity with individual characteristics using state-of-the-art methodologies.

      Weaknesses:

      The study protocol in its current form restricts replicability. This is largely due to missing information on the MRI protocol and data preprocessing. The article refers the reader to the work of Mendes et al 2019 which is said to provide this information, but the paper should rather stand alone with all this crucial material mentioned here, as well. Also, effect sizes are provided only for the multiple multivariate regression of the inter-class correlations, which makes it difficult to appreciate the power of the other obtained results. 

      Thank you for these comments. We have addressed both issues by adding effect sizes for the reported trait- and thought-related effects within the results table (Table 3, Line 427) and by providing more information about the fMRI protocol and preprocessing steps (Lines 162-188).

      Reviewer #2 (Public Review):

      The authors set out to draw further links between neural patterns observed at "rest" during fMRI, with their related thought content and personality traits. More specifically, they approached this with a "tri-partite network" view in mind, whereby the ventral attention network (VAN), the dorsal attention network (DAN), and the default mode network (DMN) are proposed to play a special role in ongoing conscious thought. They used a gradients approach to determine the low dimensional organisation of these networks. In concert, using PCA they reduced thought patterns captured at four time points during the scan, as well as traits captured from a large battery of questionnaires.

      The main findings were that specific thought and trait components were related to variations in the organisation of the tri-partite networks, with respect to cortical gradients.

      Strengths of the methods/results: Having a long (1 hr) resting state MRI session, which could be broken down into four separate scanning/sampling components is a strength. Importantly, the authors could show (via intra-class correlation coefficients) the similarity of thoughts and connectivity gradients across the entire session. Not only did this approach increase the richness of the data available to them, it speaks in an interesting way to the stability of these measures. The inclusion of both thought patterns during scanning along with trait-level dispositional factors is most certainly a strength, as many studies will often include either/or of these, rather than trying to reconcile across. Of the two main findings, the finding that detailed self-generated thought was associated with a decoupling of regions of DAN from regions in DMN was particularly compelling, in light of mounting literature from several fields that support this.

      Weaknesses of the methods/results: Considering the richness of the thought and personality data, I was a little surprised that only two main findings emerged (i.e., a relationship with trait introversion, and a relationship with the "specific internal" thought pattern). I wondered whether, at least in part and in relation to traits, this might stem from the large and varied set of questionnaires used to discern the traits. These questionnaires mostly comprised personality/mood, but some sampled things that do not fall into that category (e.g., musicality, internet addiction, sleep), and some related directly to spontaneous thought properties (e.g., mind wandering, musical imagery). It would be interesting to see what relationships would emerge by being more selective in the traits measured, and in the tools to measure them.

      We agree that being more selective in trait measures and measuring tools could lead to more insights into trait – brain relationships. In part the emergence of only two main findings could also be a trade-off of multiple comparison corrections inherent in our current approach (i.e. 400 separate models for all parcels). Furthermore, we have adjusted the text in the discussion in this revision to highlight that more targeted measures of personality (e.g. self-consciousness) could provide a more nuanced view of the relationship between traits and patterns of thought at rest. (Line 532):

      “In the future it may also be important to consider measures of traits that could have relationships to both neural activity and/or experience at rest (e.g. self-consciousness, de Caso et al., 2017, or autistic tendencies, Turnbull et al., 2020a).”

      Taken together, the main findings are interesting enough. However, the real significance of this work, and its impact, lie in the richness of the approach: combining across fMRI, spontaneous thought, and trait-level factors. Triangulating these data has important potential for furthering our understanding of brain-behaviour relationships across different levels of organisation.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Recommendations for improving the writing and presentation.

      - Frame the study objectives more clearly. If it's about which theoretical framework best supports the data, you might need to argue why the tri-partite approach is a more efficient framework than others. If not, the argument will beg the question: you will find an effect on this model, so you will claim that this is an informative model. For example, if the focus is on these three RSNs and thought reporting, the authors might want to contextualize it historically, like how from two networks (DMN-antagonistic; Vanhaudenhuyse JCognNeurosci 2012; Demertzi et al, NetwNeurosci 2022) we end up with three and why this is a more suitable approach. What about whole-brain connectomic approaches, such as the work by Amico et al?

      We have expanded on the objectives and rationale of the study by editing/ expanding the introduction as follows (Lines 84-87): 

      “Traditionally, it was argued that the DMN was thought to have an antagonistic relationship with systems linked to external processing (Fox et al., 2005). However, according to the ‘tri-partite’ network accounts the relationship between the DMN and other brain systems is more nuanced. From this perspective key hubs of the ventral attention network, such as the anterior insula and dorso-lateral prefrontal cortex, help gate access to conscious experience, influencing regardless of the focus of attention. This is hypothesised to occur because the VAN influences interactions between the DAN, which is more important for external mental content (Corbetta and Shulman, 2002), and the DMN which is important when states (including tasks) rely more on internal representations (Smallwood et al., 2021a).” (… and Lines 112:125):

      “Our current study explored whether this “tri-partite network” view of ongoing conscious thought derived from studies focused on understanding conscious experience, provides a useful organizing framework for understanding the relation between observed brain activity at rest and patterns of cognition/ personality traits. Such analysis is important because at rest there are multiple features of brain activity that can be identified via complex analyses that include regions that show patterns of coactivation (which are traditionally viewed as forming a cohesive network; Biswal et al., 1995) as well as patterns of anti-correlation with other regions (e.g. Fox et al., 2005). However, it is unclear which of these relationships reflect aspects of cognition or behaviour or are in fact aspects of the functional organization of the cortex (Fox and Raichle, 2007). Consequently, our study builds on foundational work (e.g. Vanhaudenhuyse et al., 2011) in order to better understand which aspects of neural function observed at rest are most likely linked to cognition and behaviour. With this aim in mind, we examined links between macro-scale neural activation and both (i) trait descriptions of individuals and (ii) patterns of ongoing thought.”

      - As there was no explicit description of the adopted design and the fMRI procedure, I deduced that it was a within-subject design with a 1-hour scanning session comprising four runs, each lasting 15 min. Is that correct? In any case, an explicit description of the design and the fMRI procedure eases reading and replicability.

      Thank you for pointing this out. We have now restructured and edited the relevant text to state those details clearly and to explain the MDES part of the procedure in the same section. It now reads (Lines 162:167):

      “Resting state fMRI with Multidimensional Experience Sampling (MDES)

      The current sample includes one hour of fully pre-processed rs-fMRI data from 144 participants (four scans from 135 participants, and three scans from nine participants whose data were missing or incomplete). The rs-fMRI was performed in four adjacent 15-minute sessions each immediately followed by MDES which retrospectively measured various dimensions of spontaneous thought during the scan.”

      - Was there a control to the analysis, such as a gradient which also associated with these characteristics? Anything else?

      In our analyses we explore multiple gradients and how they link to traits and thoughts at rest. While there is no explicit control, each analysis provides a constraint on the interpretation of the other analyses. We have added the following text to expand on this point (Line 372):

      “To this end, we performed a multiple multivariate regression with thoughts, traits, and nuisance variables (motion, age and gender) as independent variables, with whole brain functional organisation, as captured by the first three gradients, as dependent variables. In this analytic approach relationships between cognition along one gradient but not along another help identify which relationships between brain systems are most likely to relate to the feature of cognition in question (i.e. each gradient acts as a control for the other).”

      - I feel that Table 1 (list of tests) carries less information compared to Supplementary Table 1 (how spontaneous thought was reported and scored). I would suggest swapping them, unless Table 1 further contains which outcome measures per test were used for the analysis.  

      Thank you for this suggestion. The table showing the MDES questions has now been moved to the main text (Table 1, Line 194). However, as there is no other description of the questionnaires included in the main text, we have also retained the table listing personality/ trait questionnaires (Table 2, Line 200).

      - Ten group-level gradients were calculated out of which three were shown on the basis of previous work. Please, visualize all 10 gradients as complementary material to inform potential future works on how these look.  

      Thank you for this suggestion. Supplementary figure 3 now shows all 10 gradients.

      - Please provide more information on preprocessing, especially with motion artifacts and how the global signal was processed.  

      Thank you for pointing this out. We have now included the following text, summarized from Mendes et al., 2019, to describe the preprocessing in brief (Line 171:188): 

      “Motion correction parameters were derived by rigid-body realignment of the time series to the first volume (after discarding the first five volumes) with FSL MCFLIRT (Jenkinson et al., 2002). Parameters for distortion correction were calculated by rigidly registering a temporal mean image of this time series to the fieldmap magnitude image using FSL FLIRT (Jenkinson and Smith, 2001), which was then unwarped using FSL FUGUE (Jenkinson et al., 2012). Transformation parameters were derived by coregistering the unwarped temporal mean to the subject’s structural scan using FreeSurfer’s boundary-based registration algorithm (Greve and Fischl, 2009). All three spatial transformations were then combined and applied to each volume of the original time series in a single interpolation step. The time series was residualised against the six motion parameters, their first derivatives, and “outliers” identified by Nipype’s rapidart algorithm (https://nipype.readthedocs.io/en/latest/interfaces/). A CompCor (Behzadi et al., 2007) approach was implemented to remove physiological noise from the residual time series, which included the first six principal components from all the voxels identified as white matter or cerebrospinal fluid. The denoised time series were temporally filtered to a frequency range between 0.01 and 0.1 Hz using FSL, mean centered and variance normalized using Nitime (Rokem et al., 2009). Imaging and pre-processing protocols are described in detail in Mendes et al (Mendes et al., 2019).”

      - Please, describe the duration of the whole process, and when the questionnaire data were collected.

      We apologize for the lack of clarity. The “Data” section of the Methods has now been edited to explain this more clearly; it now reads (Line 146:154):

      “The dataset used here is part of the MPI-Leipzig Mind-Brain-Body (MPILMBB) database (Mendes et al., 2019). The complete dataset consists of a battery of self-reported personality measures, measures of spontaneous thought, task data, and structural and resting-state functional MRI (one hour, divided into four adjacent 15-min sessions) from participants between 20 and 75 years of age. Data were collected over a period of five days, with the MRI sessions always falling on day 3. The questionnaires were completed by participants before and after this day, using Limesurvey (https://www.limesurvey.org: version 2.00+) at their own convenience and using pen-and-paper on-site. A detailed description of the participants, measures, and data acquisition protocol has been previously published along with the dataset (Mendes et al., 2019).”

      - In light of the discussion about sample sizes and the power of the correlations, can you indicate the effect sizes of the reported results?  

      Thank you for pointing this out. Effect sizes have been added to the results table (Table 3, Line 427).

      Minor corrections to the text and figures

      - Introduction: "Our sample was a cohort....states were explanatory variables": Better move this part to Methods. Ideally, provide the hypotheses here, the ways you wanted to test them, and how you would negate them. What would it mean that you got the hypotheses confirmed? What would the opposite outcome mean? 

      We have added the following text before this part to clarify and expand on the objective of the study (Lines 112:125):

      “Our current study explored whether this “tri-partite network” view of ongoing conscious thought derived from studies focused on understanding conscious experience, provides a useful organising framework for understanding the relation between observed brain activity at rest and patterns of cognition/ personality traits. Such analysis is important because at rest there are multiple features of brain activity that can be identified via complex analyses that include regions that show patterns of coactivation (which are traditionally viewed as forming a cohesive network; Biswal et al., 1995) as well as patterns of anti-correlation with other regions (e.g. Fox et al., 2005). However, it is unclear which of these relationships reflect aspects of cognition or behaviour or are in fact aspects of the functional organisation of the cortex (Fox and Raichle, 2007). Consequently, our study builds on foundational work (e.g. Vanhaudenhuyse et al., 2011) in order to better understand which aspects of neural function observed at rest are most likely linked to cognition and behaviour. With this aim in mind, we examined links between macro-scale neural activation and both (i) trait descriptions of individuals and (ii) patterns of ongoing thought.”

      We have refrained from listing hypotheses, as the analyses we performed were data driven rather than hypothesis driven, in order to include all possible associations between large-scale connectivity patterns and individual state- and trait-level differences in personality and thought/ experience. We hope that the added text provides more context to understand this rationale.

      - Please, clarify whether "conscious thought" means "reportable".

      Thank you for this suggestion. We have now edited the first reference to thought patterns in the Discussion to read “self-reports of ongoing thought”, instead of just “ongoing thought” (Line 432).

      - Please, clarify whether "experience" and "thought" are used interchangeably. This is because experience can also be ineffable, beyond thought reporting. 

      To clarify this in the context of the current study, we have edited the first reference to “ongoing experience” in the introduction to “self-reports of ongoing experience” (Line 75).

      - To ease reading comprehension for each Figure, communicate the main findings first, before describing the figures. 

      We believe this lack of clarity is caused by including the figure reference in the heading of the results subsections. We hope this issue is fixed by editing the text in the following manner (Line 381):

      “Trait Introversion 

      Along the first gradient, a parcel within the right orbitofrontal cortex (within the executive control network, shown in orange) showed more similarity with transmodal regions for individuals high on introversion. Six parcels within the ventral attention network, including anterior insula, operculum and cingulate cortex were closer to the somatomotor end along gradient two (shown in purple). The same regions showed lower scores along the third gradient in participants with higher introversion scores, indicating stronger integration with the default mode network. A parcel within posterior cingulate cortex (control) was also more segregated from the visual end of gradient two in participants with higher introversion scores. Associations between trait “introversion” and brain-wide activity are shown in Figure 4.”

    2. eLife assessment

      These are important findings that support a link between low-dimensional brain network organisation, patterns of ongoing thought, and trait-level personality factors, making it relevant for researchers in the field of spontaneous cognition, personality, and neuropsychiatry. While this link is not entirely new, the paper brings to bear a rich dataset and a well-conducted study, to approach this question in a novel way. The evidence in support of the findings is convincing.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors ran an explorative analysis in order to describe how a "tri-partite" brain network model could describe the combination between resting fMRI data and individual characteristics. They utilized previously obtained fMRI data across four scanning runs in 144 individuals. At the end of each run, participants rated their patterns of thinking on 12 statements (short multi-dimensional experience sampling, MDES) using a 0-100% visual analog scale. Also, 71 personality traits were obtained from 21 questionnaires. The authors ran two separate principal component analyses (PCAs) to obtain low dimensional summaries of the two individual characteristics (personality traits from questionnaires, and thought patterns from MDES). The dimensionality reduction of the fMRI data was done by means of gradient analysis, which was combined with Neurosynth decoding to visualize the functional axis of the gradients. To test the reliability of thought components across scanning time, intra-class correlation coefficients (ICC) were calculated for the thought patterns, and discriminability indices were calculated for whole gradients. The relationship between individual differences in traits, thoughts, and macro-scale gradients was tested with multivariate regression. The authors found: a) reliability of thought components across the one hour of scanning; b) Gradient 1 differentiated between visual regions and DMN, Gradient 2 dissociated somatomotor from visual cortices, and Gradient 3 differentiated the DMN from the fronto-parietal system; c) the associations between traits/thought patterns and brain gradients revealed significant associations with "introversion" and "specific internal" thought: "Introversion" was associated with variant parcels on the three gradients, with most parcels belonging to the VAN and then to the DMN; and "Specific internal thought" was associated with variant parcels on the three gradients, with most parcels belonging to the DAN and then the visual network.

      The authors conclude that interactions between attention systems and the DMN are important influences on ongoing thought at rest.
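
      As a reading aid for the reliability step summarised above, here is a minimal sketch of an intra-class correlation. The summary does not state which ICC form was used, so the one-way random-effects ICC(1,1) below is an assumption, and the function name and toy scores are hypothetical.

```python
from statistics import mean


def icc_oneway(scores):
    """One-way random-effects ICC(1,1).

    `scores` is a list of rows, one per participant, each holding that
    participant's component score in every session (equal row lengths).
    """
    n = len(scores)       # participants
    k = len(scores[0])    # sessions per participant
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]

    # Between-participant and within-participant mean squares.
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (v - m) ** 2 for row, m in zip(scores, row_means) for v in row
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Toy data: three participants, four sessions, perfectly stable scores.
stable = [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3]]
print(icc_oneway(stable))
```

      In this form, values near 1 indicate that most of the variance lies between participants rather than between sessions, which is the sense in which a thought component would be called stable across the hour of scanning.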

      Strengths:

      The study's strength lies in its attempt to combine brain activity with individual characteristics using state-of-the-art methodologies.

      Weaknesses:

      The study protocol in its current form restricts replicability. This is largely due to missing information on the MRI protocol and data preprocessing. The article refers the reader to the work of Mendes et al 2019 which is said to provide this information, but the paper should rather stand alone with all this crucial material mentioned here, as well. Also, effect sizes are provided only for the multiple multivariate regression of the inter-class correlations, which makes it difficult to appreciate the power of the other obtained results.

    4. Reviewer #2 (Public Review):

      The authors set out to draw further links between neural patterns observed at "rest" during fMRI, with their related thought content and personality traits. More specifically, they approached this with a "tri-partite network" view in mind, whereby the ventral attention network (VAN), the dorsal attention network (DAN) and the default mode network (DMN) are proposed to play a special role in ongoing conscious thought. They used a gradient approach to determine the low dimensional organisation of these networks. In concert, using PCA they reduced thought patterns captured at four time points during the scan, as well as traits captured from a large battery of questionnaires.

      The main findings were that specific thought and trait components were related to variations in the organisation of the tri-partite networks, with respect to cortical gradients.

      Strengths of the methods/results: Having a long (1 hour) resting state MRI session, which could be broken down into four separate scanning/sampling components is a strength. Importantly, the authors could show (via intra-class correlation coefficients) similarity of thoughts and connectivity gradients across the entire session. Not only did this approach increase the richness of the data available to them, it speaks in an interesting way to the stability of these measures. The inclusion of both thought patterns during scanning along with trait-level dispositional factors is most certainly a strength, as many studies will often include either/or of these, rather than trying to reconcile across. Of the two main findings, the finding that detailed self-generated thought was associated with a decoupling of regions of DAN from regions in DMN was particularly compelling, in light of mounting literature from several fields that support this.

      Weaknesses of the methods/results: Considering the richness of the thought and personality data, I was a little surprised that only two main findings emerged (i.e., a relationship with trait introversion, and a relationship with the "specific internal" thought pattern). I wondered whether, at least in part and in relation to traits, this might stem from the large and varied set of questionnaires used to discern the traits. These questionnaires mostly comprised personality/mood, but some sampled things that do not fall into that category (e.g., musicality, internet addiction, sleep) and some related directly to spontaneous thought properties (e.g., mind wandering, musical imagery). It would be interesting to see what relationships would emerge by being more selective in the traits measured, and in the tools to measure them.

      Taken together, the main findings are interesting enough. However, the real significance of this work and its impact, lie in the richness of the approach: combing across fMRI, spontaneous thought, and trait-level factors. Triangulating across these data has important potential for furthering our understanding of brain-behaviour relationship across different levels of organisation.

    1. eLife assessment

      This is a fundamental study examining the role of prediction error in state allocation of memories. The data provided are convincing and largely support the conclusion that a gradual change between acquisition and extinction maintains the memory state of acquisition and thus results in extinction that is resistant to restoration. This paper is of interest to behavioural and neuroscience researchers studying learning, memory, and the neural mechanisms of those processes as well as to clinicians using extinction-based therapies in treating anxiety-based disorders.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, Kennedy et al examine how new information is organized in memory. They tested an idea based on latent theory that suggests that large prediction error leads to the formation of a new memory, whereas small prediction error leads to memory updating. They directly tested the prediction by extinguishing fear conditioned rats with gradual extinction. For their experiment, gradual extinction was carried out by progressively reducing the intensity of shocks that were co-terminated with the CS, until the CS was presented alone. Doing so resulted in diminished spontaneous recovery and reinstatement compared to Standard Extinction. The results are compelling and have important implications for the field of fear learning and memory as well as translation to anxiety-related disorders.
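
      The logic summarised above (a large prediction error creates a new memory state, a small one updates the current state) can be illustrated with a toy sketch. This is not the Cochran-Cisler state-belief model or any model from the paper: the hard threshold, the learning rate, and the normalised shock intensities are all hypothetical values chosen only to show the contrast between the two extinction schedules.

```python
def allocate_states(outcomes, expected=1.0, threshold=0.5, rate=0.5):
    """Assign each trial to a latent state with a toy prediction-error rule.

    outcomes  : shock intensity on each trial (1.0 = training intensity)
    expected  : intensity expected by the acquisition state (state 0)
    threshold : hypothetical error size above which a new state is created
    rate      : hypothetical learning rate for updating the current state
    """
    expectations = [expected]  # one expected intensity per latent state
    current = 0
    assignments = []
    for outcome in outcomes:
        error = outcome - expectations[current]
        if abs(error) > threshold:
            # Large prediction error: create a new state for this outcome.
            expectations.append(outcome)
            current = len(expectations) - 1
        else:
            # Small prediction error: update the state already in use.
            expectations[current] += rate * error
        assignments.append(current)
    return assignments


standard = [0.0] * 6                       # shock omitted abruptly
gradual = [0.8, 0.6, 0.4, 0.2, 0.0, 0.0]   # shock reduced step by step

print("standard:", allocate_states(standard))
print("gradual: ", allocate_states(gradual))
```

      With these made-up numbers, the abrupt schedule is assigned to a new state on the first trial, whereas every trial of the gradual schedule stays in the acquisition state, which is then updated toward no shock. That is the intuition behind the reduced spontaneous recovery and reinstatement described in the summary.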

      The authors carried out the Spontaneous Recovery experiment in 2 separate experiments. In one, they found differences between the Gradual and Standard Extinction groups, but in the second, they did not. It seems that their reinstatement test was more robust, and showed significant differences between the Gradual and Standard Extinction groups.

      The authors carried out important controls which enable proper contextualization of the findings. They included a "Home" group, in which rats received fear conditioning, but not an extinction manipulation. Relative to this group, the Gradual and Standard extinction groups showed a reduction in freezing.

      In Experiments 3 and 4, the authors essentially carried out clever controls which served to examine whether shock devaluation (Experiment 4) and reduction in shock intensity (rather than a gradual decrease in shock intensity) (Experiment 3) would also yield a decrease in the return of fear. In-line with a latent-cause updating explanation for accounting for the Gradual Extinction, they did not.

      In Experiment 5, the authors examined whether a prediction error produced by a change of context might contribute interference to the latent cause updating afforded by the Gradual Extinction. Such a prediction would align with a more flexible interpretation of a latent-cause model, such as those proposed by Redish (2007) and Gershman et al (2017), but not the latent-cause interpretation put forth by the Cochran-Cisler model (2019). Their findings showed that whereas Gradual Extinction carried out in the same context as acquisition resulted in less return of fear than Standard Extinction, it actually yielded a greater degree of return of fear when carried out in a different context, in support of the Redish and Gershman accounts, but not Cochran-Cisler.

      Experiment 6 extended the findings from Experiment 5 in a different state-splitting modality: timing. In this experiment, the authors tested whether a shift in temporal context also influenced the gradual extinction effect. They thus carried out the extinction sessions 21 days after conditioning. They found that while Gradual Extinction was indeed effective when carried out one day after fear conditioning, it was not when conducted 21 days later.

      The authors next carried out an omnibus analysis which included all the data from their 6 experiments, and found that overall, Gradual Extinction resulted in diminished return of fear relative to Standard Extinction. I thought the omnibus analysis was a great idea, and an appropriate way to do their data justice.

      Strengths: Compelling findings. The data support the conclusions. 6 rigorous experiments were conducted which included clever controls. Data include male and female rats. I really liked the omnibus analysis.

      Weaknesses: None noted

    3. Reviewer #2 (Public Review):

      Summary:

      The present article describes a series of experiments examining how a gradual reduction in unconditional stimulus intensity facilitates fear reduction and reduces relapse (spontaneous recovery and reinstatement) relative to a standard extinction procedure. The experiments provide compelling, if somewhat inconsistent, evidence of this effect and couch the results in a scholarly discussion surrounding how mechanisms of prediction error contribute to this effect.

      Strengths:

      The experiments are theoretically motivated and hypothesis-driven, well-designed, and appropriately conducted and analyzed. The results are clear and appropriately contextualized into the broader relevant literature. Further, the results are compelling and ask fundamental questions regarding how to persistently weaken fear behavior, which has both strong theoretical and real-world implications. I found the 'scrambled' experiment especially important in determining the mechanism through which this reduction in shock intensity persistently weakens fear behavior.

      Weaknesses:

      Overall, I found very few weaknesses with this paper. While some might view the somewhat inconsistent effects on relapse between experiments to be a substantial weakness, I appreciate the authors directly confronting this and using it as an opportunity to aggregate data to look at general trends. Further, while Experiment 1 only used males, this was corrected in the rest of the experiments and therefore is not a substantial concern.

    4. Reviewer #3 (Public Review):

      Summary:

      The manuscript examined the role of large versus small prediction errors (PEs) in creating a state-based memory distinction between acquisition and extinction. The premise of the paper is based on theoretical claims and empirical findings that gradual changes between acquisition and extinction would lead to the potential overwriting of the acquisition memory with extinction, resulting in a more durable reduction in conditioned responding (i.e. more durable extinction effect). The paper tests the hypotheses in a series of elegant experiments in which the shock intensity is decreased across extinction sessions before non-reinforced CS presentations are given. Additional manipulations include context change, shock devaluation, and controlling for lower shock intensity exposure. The critical comparison was standard non-reinforced extinction training. The critical tests were done in spontaneous recovery and reinstatement.

      Strengths:

      The findings are of tremendous importance in understanding how memories can be updated and reveal a well-defined role of PE in this process. It is well-established that PE is critical for learning, so delineating how PE is critical for generating memory states and the role it serves in keeping memories dissociable (or not) is exciting and clever. As such the paper addresses a fundamental question in the field.

      The studies test clear and defined predictions derived from simulations of the state-belief model of Cochran & Cisler (2019). The designs are excellent: well-controlled and address the question.

      The authors have done an excellent job at explaining the value of the latent state models.

      The authors have studied both sexes in the studies presented, providing generality across the sexes in their findings. The figures depict the individual data points for males and females allowing the reader to see the responses for each sex.

      The authors have addressed the previously raised weaknesses. They noted that procedurally it would be difficult to provide independent evidence that delivering a lower intensity shock will generate a smaller PE than say no shock. The differences in the data obtained based on error vs shock devaluation are convincing, although direct evidence for shock devaluation would have strengthened the argument.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In "Prediction error determines how memories are organized in the brain: a study of Pavlovian fear extinction in rats", Kennedy et al examine how new information is organized in memory. They tested an idea based on latent theory that suggests that a large prediction error leads to the formation of a new memory, whereas a small prediction error leads to memory updating. They directly tested the prediction by extinguishing fear-conditioned rats with gradual extinction. For their experiment, gradual extinction was carried out by progressively reducing the intensity of shocks that were co-terminated with the CS, until the CS was presented alone. Doing so resulted in diminished spontaneous recovery and reinstatement compared to Standard Extinction. The results are compelling, and have important implications for the field of fear learning and memory as well as translation to anxiety-related disorders.

      The authors carried out the Spontaneous Recovery experiment in 2 separate experiments. In one, they found differences between the Gradual and Standard Extinction groups, but in the second, they did not. It seems that their reinstatement test was more robust, and showed significant differences between the Gradual and Standard Extinction groups.

      The authors carried out important controls that enable proper contextualization of the findings. They included a "Home" group, in which rats received fear conditioning, but not extinction manipulation. Relative to this group, the Gradual and Standard extinction groups showed a reduction in freezing.

      In Experiments 3 and 4, the authors essentially carried out clever controls that served to examine whether shock devaluation (Experiment 4) and reduction in shock intensity (rather than a gradual decrease in shock intensity) (Experiment 3) would also yield a decrease in the return of fear. In line with a latent-cause updating explanation for accounting for the Gradual Extinction, they did not.

      In Experiment 5, the authors examined whether a prediction error produced by a change of context might contribute interference to the latent cause updating afforded by the Gradual Extinction. Such a prediction would align with a more flexible interpretation of a latent-cause model, such as those proposed by Redish (2007) and Gershman et al (2017), but not the latent-cause interpretation put forth by the Cochran-Cisler model (2019). Their findings showed that whereas Gradual Extinction carried out in the same context as acquisition resulted in less return of fear than Standard Extinction, it actually yielded a greater degree of return of fear when carried out in a different context, in support of the Redish and Gershman accounts, but not Cochran-Cisler.

      Experiment 6 extended the findings from Experiment 5 in a different state-splitting modality: timing. In this experiment, the authors tested whether a shift in temporal context also influenced the gradual extinction effect. They thus carried out the extinction sessions 21 days after conditioning. They found that while Gradual Extinction was indeed effective when carried out one day after fear conditioning, it was not when conducted 21 days later.

      The authors next carried out an omnibus analysis which included all the data from their 6 experiments, and found that overall, Gradual Extinction resulted in diminished return of fear relative to Standard Extinction. I thought the omnibus analysis was a great idea and an appropriate way to do their data justice.

      Strengths:

      Compelling findings. The data support the conclusions. 6 rigorous experiments were conducted which included clever controls. Data include male and female rats. I really liked the omnibus analysis.

      We thank the reviewer for their positive comments – they are appreciated.

      Weaknesses:

      None noted

      Reviewer #2 (Public Review):

      Summary:

      The present article describes a series of experiments examining how a gradual reduction in unconditional stimulus intensity facilitates fear reduction and reduces relapse (spontaneous recovery and reinstatement) relative to a standard extinction procedure. The experiments provide compelling, if somewhat inconsistent, evidence of this effect and couch the results in a scholarly discussion surrounding how mechanisms of prediction error contribute to this effect.

      Strengths:

      The experiments are theoretically motivated and hypothesis-driven, well-designed, and appropriately conducted and analyzed. The results are clear and appropriately contextualized into the broader relevant literature. Further, the results are compelling and ask fundamental questions regarding how to persistently weaken fear behavior, which has both strong theoretical and real-world implications. I found the 'scrambled' experiment especially important in determining the mechanism through which this reduction in shock intensity persistently weakens fear behavior.

      We thank the reviewer for their positive comments – they are appreciated.

      Weaknesses:

      Overall, I found very few weaknesses in this paper. While some might view the somewhat inconsistent effects on relapse between experiments as a substantial weakness, I appreciate the authors directly confronting this and using it as an opportunity to aggregate data to look at general trends. Further, while Experiment 1 only used males, this was corrected in the rest of the experiments and therefore is not a substantial concern.

      Reviewer #3 (Public Review):

      Summary:

      The manuscript examined the role of large versus small prediction errors (PEs) in creating a state-based memory distinction between acquisition and extinction. The premise of the paper is based on theoretical claims and empirical findings that gradual changes between acquisition and extinction would lead to the potential overwriting of the acquisition memory with extinction, resulting in a more durable reduction in conditioned responding (i.e. more durable extinction effect). The paper tests the hypotheses in a series of elegant experiments in which the shock intensity is decreased across extinction sessions before non-reinforced CS presentations are given. Additional manipulations include context change, shock devaluation, and controlling for lower shock intensity exposure. The critical comparison was standard non-reinforced extinction training. The critical tests were done in spontaneous recovery and reinstatement.

      Strengths:

      The findings are of tremendous importance in understanding how memories can be updated and reveal a well-defined role of PE in this process. It is well-established that PE is critical for learning, so delineating how PE is critical for generating memory states and the role it serves in keeping memories dissociable (or not) is exciting and clever. As such the paper addresses a fundamental question in the field.

      The studies test clear and defined predictions derived from simulations of the state-belief model of Cochran & Cisler (2019). The designs are excellent: well-controlled and address the question.

      The authors have done an excellent job of explaining the value of the latent state models.

      The authors have studied both sexes in the study presented, providing generality across the sexes in their findings. However, depicting the individual data points in the bar graphs and noting which data represent males and which represent females would be of great value.

      We thank the reviewer for their positive comments. We have included individual data points in the bar graphs and indicated which represent males and females.

      Weaknesses:

      (1) While it seems obvious that delivering a lower intensity shock will generate a smaller PE than say no shock, it would have been nice to see data from say a compound testing procedure that confirms this.

      It would be great if we could provide independent evidence that shifting from a 0.8 mA shock to a 0.4 mA shock (first session of gradual extinction) produces a smaller prediction error than shifting from a 0.8 mA shock to no shock at all (first session of standard extinction). In theory, this could be assessed using Rescorla’s (2000) compound test procedure. However, application of this procedure requires the use of a within-subject design and latent state theories would not predict the gradual extinction effect in such a design (as all prediction errors generated in such a design would affect the state-splitting process). That is, the between-subject design used to generate the gradual extinction effect is not amenable to application of the compound test procedure; and the within-subject design in which the compound test procedure could be applied is unlikely to generate the gradual extinction effect. Thus, we instead rely on the high degree of similarity between our results and those predicted by Cochran & Cisler (2019) to argue that the gradual extinction protocol produces a series of smaller prediction errors than does the standard extinction protocol: hence the present pattern of results.

      (2) The devaluation experiment is quite clever, but it also would be strengthened if there was evidence in the paper that this procedure does indeed lead to shock devaluation.

      The aim of Experiment 3 was to determine whether the gradual extinction effect is due to prediction error-based memory updating or shock devaluation. If the effect was due to shock devaluation, the group that received the gradual extinction treatment should have displayed the same low level of spontaneous recovery as the group that only experienced the shock at its lowest (0.1 mA) intensity (i.e., the shock devaluation group). Contrary to this prediction, the results showed that the gradually extinguished group displayed less spontaneous recovery than the shock devaluation group. That is, in this experiment, the slow and progressive reduction in shock intensity was processed differently to the repeated 0.1 mA shock exposures but the results were inconsistent with any shock devaluation effect. Hence, we conclude that the gradual extinction effect does not involve shock devaluation but instead is due to prediction error-based memory updating.

      (3) It would have been very exciting to see even more parametric examinations of this idea, like maintaining shock intensity but gradually reducing shock duration, which would have increased the impact of the paper.

      We appreciate the reviewer’s point. As each shock was presented for just 0.5 s, we are not confident that rats would detect gradual and progressive changes in its duration in the same way as they can obviously detect gradual and progressive changes in its intensity. We are, however, investigating the effects of gradual extinction in a second order conditioning protocol, which will allow us to examine the full range of parameters that are important for its regulation, including manipulations of stimulus duration. In our second-order conditioning protocol, rats are first exposed to pairings of a 10 s S1 and a 0.5 s foot shock US; and then exposed to pairings of a 30 s S2 and the 10 s S1. Across the latter pairings, rats acquire second-order conditioned fear responses to S2. Importantly, these responses can be extinguished through repeated presentations of the S2 in the absence of its S1-associate; and the duration of the S1 can be progressively and gradually reduced from 10 s to 0 s across the shift to this extinction. These experiments are currently in progress and will eventually represent an extension of the present findings.

      (4) Individual data points should be represented in the test figures (see above also).

      We have updated the figures to show these data points.

      Rescorla, R. A. (2000). Associative changes in excitors and inhibitors differ when they are conditioned in compound. Journal of Experimental Psychology: Animal Behavior Processes, 26(4), 428.

      Reviewing Editor (Recommendations For The Authors):

      The eLife assessment relates to the present form of the paper. However, following a discussion with the reviewers, the significance of the findings could be bolstered to fundamental if you decided to revise the current manuscript by scaling up the investigation to examine a wider set of parameters and conditions under which error can influence state allocation of memories. One way of doing this, but not limited to this, is suggested in the reviews (e.g. maintaining shock intensity, reducing its duration). Relatedly, a more extensive discussion of the Gershman et al. (2013) paper would be relevant.

      As noted in our response to Reviewer 3, we are currently investigating the effects of gradual extinction in a second order conditioning protocol, which will allow us to examine the full range of parameters that are important for its regulation, including manipulations of stimulus duration. These experiments are in-progress and will eventually represent an extension of the present findings. They are not, however, ready to be included as part of the present study.

      We have further referenced the Gershman et al., (2013) paper as well as the related Bouton et al., (2004) paper on the effects of gradually reducing the frequency of the US across extinction. This appears in the fifth paragraph of the Discussion: “The present study adds to a growing body of evidence that manipulations applied across the shift from CS-US pairings to presentations of the CS alone can influence the effectiveness of extinction. For example, Gershman et al., (2013) and Bouton et al., (2004) showed that gradually reducing the proportion of reinforced CS presentations results in less spontaneous recovery and slower reacquisition, respectively; though both studies left open fundamental questions about the basis of their findings (see also Woods & Bouton, 2007).”

      Reviewer #1 (Recommendations For The Authors):

      I don't have any strong recommendations. I think the paper is really great as is.

      One minor suggestion to consider:

      The authors carried out the Spontaneous Recovery experiment in 2 separate experiments. In one, they found differences between the Gradual and Standard Extinction groups, but in the second, they did not. This is perhaps not entirely surprising, since their extinction test was conducted 2 weeks post-extinction, and not all rats show spontaneous recovery within that timeframe. The authors mention that the lack of SR might be due to the low level of freezing reported in their test, but since they are showing group mean data, they might consider showing the individual data points to showcase the range of SR freezing as an additional way to make sense of the variability (ie, maybe a few rats that showed very low freezing carried the mean down in the Standard Extinction group, while others showed return of fear).

      We agree and have included individual data points for test results in Figures 2D, 2F, 3D, 3H, 4D and 4H. Hence, these figures now reflect both group and individual freezing levels.

      Reviewer #2 (Recommendations For The Authors):

      Overall, I thought this was an exceptional paper. Aside from the comments listed above which I'm not sure are inherently addressable, the only real changes I would like to see are that individual data points should be depicted in the main testing figures, as is becoming more conventional in the field.

      We thank the reviewer for their positive comments. As indicated in our response to the other reviewers, we have added individual data points to the histograms showing test results.

      Reviewer #3 (Recommendations For The Authors):

      Figures

      (1) The test data are presented as bars, but I did wonder if there were differences between the groups from the start of testing or if those emerged across testing (SR vs extinction savings).

      We have added two new figures to the supplementary section, Figures 8 and 9. These display the trial-by-trial data from spontaneous recovery and reinstatement tests in each experiment. The data clearly show that the between-group differences in freezing were very stable across the test sessions.

      (2) While I understand the importance of presenting the last extinction session, I felt depicting the entire CS session would be more informative. Alternatively, removing this altogether and leaving the information from the extinction session in the supplemental would focus the reader on the key test data.

      We appreciate the reviewer’s point. It is important to show that the groups displayed equivalent freezing in the final extinction session prior to testing. Given that the test data are conveniently and best presented in a histogram, we have chosen to present the data from the final extinction session in the same way. The full, trial-by-trial trajectory of freezing across conditioning and extinction, as well as the analyses of these data, are presented in the supplementary A.

      (3) I did not find the figures to be very aesthetically pleasing (in part because some panels were unnecessarily large). For example, I found it rather odd that the simulation panels were split in Figure 1. One suggestion of how this figure could look better is to keep the size of panels B, C, and D the same and align them on the same row with the design figure above them. The other option is to have the design figure above the test figure and the two simulation figures above each other and next to the design and test. Also, there are grey lines that appear around the simulation figures on my PDF.

      We have updated the figures so that they are consistent across experiments and more aesthetically pleasing. Specifically, we have consistently: 1) inserted the simulations of Cochran & Cisler (2019) next to the design schematic; 2) inserted the extinction and test data beneath the design schematic; and 3) made the sizing of figures more uniform across Experiments 1-6.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment 

      This study presents valuable findings as it shows that sleep rhythm formation and memory capabilities depend on a balanced and rich diet in fly larvae. The evidence supporting the claims of the authors is convincing with rigorous behavioral assays and state-of-the-art genetic manipulations. The work will be of interest to researchers working on sleep and memory. 

      Public Reviews: 

      Summary: 

      This manuscript investigates how energetic demands affect the sleep-wake cycle in Drosophila larvae. L2 stage larvae do not show sleep rhythm and long-term memory (LTM), however, L3 larvae do. The authors manipulate food content to provide insufficient nutrition, which leads to more feeding, no LTM, and no sleep even in older larvae. Similarly, activation of NPF neurons suppresses sleep rhythm. Furthermore, they try to induce a sleep-like state using pharmacology or genetic manipulations in L2 larvae, which can mimic some of the L3 behaviours. A key experimental finding is that activation of DN1a neurons activates the downstream DH44 neurons, as assayed by GCaMP calcium imaging. This occurs only in the third instar and not in the second instar, in keeping with the development of sleep-wake and feeding separation. The authors also show that glucose metabolic genes are required in Dh44 neurons to develop sleep rhythm and that DH44 neurons respond differently in malnutrition or younger larvae.

      Strengths: 

      Previous studies from the same lab have shown that sleep is required for LTM formation in the larvae, and that this requires DN1a and DH44 neurons. The current work builds upon this observation and addresses in more detail when and how this might develop. The authors can show that low quality food exposure and enhanced feeding during larval stage of Drosophila affects the formation of sleep rhythm and long-term memory. This suggests that the development of sleep and LTM are only possible under well fed and balanced nutrition in fly larvae. Non-sleep larvae were fed in low sugar conditions and indeed, the authors also find glucose metabolic genes to be required for a proper sleep rhythm. The paper presents precise genetic manipulations of individual classes of neurons in fly larvae followed by careful behavioural analysis. The authors also combine thermogenetic or peptide bath application experiments with direct calcium imaging of specific neurons.

      Weaknesses: 

      The authors tried to induce sleep in younger L2 larvae, however the behavioral results suggest that they were not able to induce proper sleep behaviour as in normal L3 larvae. Thus, they cannot show that sleep during L2 stage would be sufficient to form LTM. 

      We agree that the experiments with Gaboxadol feeding in L2 did not perfectly mimic L3 sleep behaviors. However, genetic induction of sleep in L2 was effective in increasing sleep duration and depth similar to that observed in normal L3. As noted below in response to specific reviewer comments, because gaboxadol feeding is standard in the field for adult sleep induction, we prefer to still include this data in the manuscript for transparency. Moreover, the gaboxadol manipulation did cause a significant decrease in arousal threshold compared to control larvae. Together these approaches support the hypothesis that sleeping more/more deeply is not sufficient to promote LTM in L2.

      The authors suggest that larval Dh44 neurons may integrate "information about the nutritional environment through the direct sensing of glucose levels to modulate sleep-wake rhythm development". They identify glucose metabolism genes (e.g., Glut1) in the downstream DH44 neurons as being required for the organization of the sleep-wake-feeding rhythm, and CCHa signaling from DN1a neurons to the DH44 cells via the receptor. However, how this is connected is not well explained. Do the authors think that the nutrient sensing is only occurring in the DH44 neurons and not in DN1a or other neurons? Would not knocking down glucose metabolism in any neuron lead to a functional defect? What is the evidence that Dh44 neurons are specific sensors of nutritional state? For example, do the authors think that e.g. the overexpression of Glut1 in Dh44 neurons, a manipulation that can increase transport of glucose into cells, would rescue the effects of low-sugar food?

      We thank the reviewer for these suggestions and have added the experiment proposed. We found that knockdown of Hex-C in DN1a neurons did not disrupt sleep-wake rhythms (Fig. S4G-I) suggesting that Dh44 neurons are specialized in requiring glucose metabolism to drive sleep-wake rhythms. We have also added further clarification in the text regarding the existing evidence that Dh44 neurons act as nutrient sensors.

      Some of the genetic controls seem to be inconsistent suggesting some genetic background effects. In Figure 2B, npf-gal4 flies without the UAS show no significant circadian change in sleep duration, whereas UAS-TrpA flies do. The genetic control data in Figure 2D are also inconsistent. Npf-Gal4 seems to have some effect by itself without the UAS. The same is not seen with R76G11-Gal4. Suppl Fig 2: Naïve OCT and AM preference in L3 expressing various combinations of the transgenes show significant differences. npf-Gal4 alone seems to influence preference. 

      The sleep duration and bout number/length data are highly variable. 

      All experiments are performed in an isogenized background, so the variability seen in genetic controls likely reflects the stochastic nature of behavioral experiments. Indeed, adult sleep data also shows a great deal of variability within the same genetic background (PMID: 29228366). We agree it is an important point, and we attempt to minimize variability as much as possible with backcrossing of flies and tight control of environmental conditions.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Low sugar exposure and activation of NPF neurons might not induce the same behavioral changes. LS exposure does not enhance mouth hook movements, but overall food intake. NPF activation seems to enhance mouth hook movements, but the data for food intake is not shown. This information would be necessary to compare the two different manipulations. 

      We thank the reviewer for this suggestion. However, we elected not to perform food intake experiments with the NPF activation experiments. Since we are not directly comparing the low sugar and NPF manipulations to each other, we think that both experiments together support the conclusion that immature food acquisition strategies (whether food intake or feeding rate) limit LTM performance. 

      The authors write that the larval feeding assays run for 4 hours, can they explain why that long? Larvae should already have processed food within 4 hours, so that the measurement would not include all eaten food.

      We clarified the rationale for doing 4 hour feeding assays in the results section. We did 4 hours on blue dyed food because initial experiments of 1 hour with control L3 at CT1-4 were difficult to interpret. The measurement does not include all of the eaten food in the 4 hours but does reflect more long-term changes in food intake.

      Sleep induction with Gaboxadol seems to not really work - sleep duration, bout number and length are not enhanced, and arousal threshold is only slightly lower. Thus, the authors should not use this data as an example for inducing sleep behaviour. 

      We agree this approach did not have a large effect in larvae. However, because gaboxadol feeding is standard in the field for adult sleep induction, we prefer to still include this data in the manuscript for transparency. Moreover, the Gaboxadol manipulation did cause a mild (but significant) decrease in arousal threshold compared to control larvae. Gaboxadol feeding also caused a significant decrease in total body weight compared to control larvae indicating that even slightly deeper sleep could be detrimental to younger animals.

      Activation of R76G11 with TrpA1 seems to work better for inducing sleep like behaviour. However, the authors describe that they permanently activated neurons. To induce a "normal" sleep pattern, the authors might try to only activate these neurons during the normal enhanced sleep time in L3 (CT13?) and not during the whole day. This might also allow larvae to eat during day time and gain more weight. 

      We apologize that this point was not clearer, but we did do acute activation of R76G11(+) neurons, as proposed by the reviewer. We have clarified the text to make this point.

      It would be interesting to see how larvae fed with high sucrose and low protein diet would behave in this assay. Do the authors suggest that sugar is most important for the development of sleep behaviour or that it is a combination of sugar and protein that might be required? 

      We agree that feeding larvae a high sucrose and low protein diet would be interesting. However, we initially tried a low protein diet and observed significant developmental delays. Therefore, we are concerned that developmental defects on a high sucrose and low protein diet would confound behavioral results. Additionally, the Dh44 manipulations (glucose & GCN2 signaling) suggest that sugar is the most important for the development of sleep behaviors.

      Reviewer #3 (Recommendations For The Authors): 

      The authors could discuss if the interaction between DN1a clock neurons and Dh44 neurons is mediated synaptically or by volume transmission following the extracellular release of the CCHa1 neuropeptide. They write that "the development of Dh44 neuronal competency to receive clock-driven cues" and that "DN1a clock neurons anatomically and functionally connect to Dh44" but a discussion about volume vs. synaptic signalling would be of interest.

      We thank the reviewer for this suggestion. We revised the discussion to address this point.

      line 223 " demonstrating that post-synaptic processes likely". It would be interesting to read a discussion on whether it is known if these are postsynaptic or peptide-mediated volume effects? 

      We added additional text to the discussion to address these points.

      - The authors may want to include a schematic of the circuit and its position in the general anatomy of the fly larva.

      We thank the reviewer for this suggestion. We have added a model figure to Fig. S6.

      "Dh44 neurons act through glucose metabolic genes" - consider rewording e.g. require glucose metabolic genes 

      We revised the text.

      - line 45 "Early in development, young animals must obtain enough nutrients to ensure proper growth" - this is too general, many animals do not feed in early life-cycle stages (e.g. lecithotrophic development), consider rewording

      We revised the text to be more specific.

      - line 90 "however, L3 at CT1 consume more than L3 at CT12 (Figure S1A)" - typo CT13, also consider rewording to match the structure of the sentence before 'however, L3 consumed more at CT1 than at CT13' 

      We revised the text to fix this error.

      - Line 111 "and loss of deep sleep" - how is deep sleep defined and measured in the larvae? It is not clear from the data or the text. 

      We revised the text to define deep sleep in the results section. We also have a description of how arousal threshold is calculated in the methods.

      - In Figure 3B and G the individual data points are not shown 

      We did not show individual data points for those graphs because we are plotting the average percentage of 4 biological replicates.

      Typo: 

      Figure 1 legend "F, n= n=100-172 " 

      We revised the text to fix this typo.

    2. eLife assessment

      This study presents valuable findings as it shows that sleep rhythm formation and memory capabilities depend on a balanced and rich diet in fly larvae. The evidence supporting the claims of the authors is convincing with rigorous behavioral assays and state-of-the-art genetic manipulations. The work will be of interest to researchers working on sleep and memory.

    3. Joint Public Review:

      Summary:

      This manuscript investigates how energetic demands affect the sleep-wake cycle in Drosophila larvae. L2 stage larvae do not show sleep rhythm and long-term memory (LTM), however, L3 larvae do. The authors manipulate food content to provide insufficient nutrition, which leads to more feeding, no LTM, and no sleep even in older larvae. Similarly, activation of NPF neurons suppresses sleep rhythm. Furthermore, they try to induce a sleep-like state using pharmacology or genetic manipulations in L2 larvae, which can mimic some of the L3 behaviours. A key experimental finding is that activation of DN1a neurons activates the downstream DH44 neurons, as assayed by GCaMP calcium imaging. This occurs only in the third instar and not in the second instar, in keeping with the development of sleep-wake and feeding separation. The authors also show that glucose metabolic genes are required in Dh44 neurons to develop sleep rhythm and that DH44 neurons respond differently in malnutrition or younger larvae.

      Strengths:

      Previous studies from the same lab have shown that sleep is required for LTM formation in the larvae, and that this requires DN1a and DH44 neurons. The current work builds upon this observation and addresses in more detail when and how this might develop. The authors can show that low quality food exposure and enhanced feeding during larval stage of Drosophila affects the formation of sleep rhythm and long-term memory. This suggests that the development of sleep and LTM are only possible under well fed and balanced nutrition in fly larvae. Non-sleep larvae were fed in low sugar conditions and indeed, the authors also find glucose metabolic genes to be required for a proper sleep rhythm. The paper presents precise genetic manipulations of individual classes of neurons in fly larvae followed by careful behavioural analysis. The authors also combine thermogenetic or peptide bath application experiments with direct calcium imaging of specific neurons.

      Weaknesses:

      The authors tried to induce sleep in younger L2 larvae with Gaboxadol feeding, however, the behavioral results suggest that they were not able to induce proper sleep behaviour as in normal L3 larvae.

      Some of the genetic controls seem to be inconsistent. Given that the experiments were carried out in isogenized background, this is likely due to the high variability of some of the behaviours.

    1. eLife assessment

      This important study provides new insights into the maturation of ribbon synapses in zebrafish neuromast hair cells. Convincing evidence, based on live-cell imaging and pharmacological and genetic manipulations, is provided to show that the formation of this synaptic organelle is a dynamic process involving the fusion of presynaptic elements and microtubule transport. These findings will be of interest to neuroscientists studying synapse formation and function and should inspire further research into the molecular basis for synaptic ribbon maturation.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript by Hussain and collaborators aims at deciphering the microtubule-dependent ribbon formation in zebrafish hair cells. By using confocal imaging, pharmacology tools, and zebrafish mutants, the group of Katie Kindt convincingly demonstrated that ribbon, the organelle that concentrates glutamate-filled vesicles at the hair cell synapse, originates from the fusion of precursors that move along the microtubule network. This study goes hand in hand with a complementary paper (Voorn et al.) showing similar results in mouse hair cells.

      Strengths:

      This study clearly tracked the dynamics of the microtubules and of the microtubule-associated ribbons, and demonstrated ribbon fusion events. In addition, the authors have identified the critical role of kinesin Kif1aa in the fusion events. The results are compelling and the images and movies are magnificent.

      Weaknesses:

      The lack of functional data regarding the role of Kif1aa. Although it is difficult to probe and interpret the behavior of zebrafish after nocodazole treatment, I wonder whether deletion of kif1aa in hair cells may result in a functional deficit that could be easily tested in zebrafish?

      Impact:

      Synaptogenesis in the auditory sensory cell remains elusive. Here, this study indicates that the formation of the synaptic organelle is a dynamic process involving the fusion of presynaptic elements. This study will undoubtedly boost a new line of research aimed at identifying the specific molecular determinants that target ribbon precursors to the synapse and govern the fusion process.

    3. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors set out to resolve a long-standing mystery in the field of sensory biology - how large, presynaptic bodies called "ribbon synapses" migrate to the basolateral end of hair cells. The ribbon synapse is found in sensory hair cells and photoreceptors, and is a critical structural feature of a readily-releasable pool of glutamate that excites postsynaptic afferent neurons. For decades, we have known these structures exist, but the mechanisms that control how ribbon synapses coalesce at the bottom of hair cells are not well understood. The authors addressed this question by leveraging the highly-tractable zebrafish lateral line neuromast, which exhibits a small number of visible hair cells, easily observed in time-lapse imaging. The approach combined genetics, pharmacological manipulations, high-resolution imaging, and careful quantifications. The manuscript commences with a developmental time course of ribbon synapse development, characterizing both immature and mature ribbon bodies (defined by position in the hair cell, apical vs. basal). Next, the authors show convincing (and frankly mesmerizing) imaging data of plus end-directed microtubule trafficking toward the basal end of the hair cells, and data highlighting the directed motion of ribbon bodies. The authors then use a series of pharmacological and genetic manipulations showing the role of microtubule stability and one particular kinesin (Kif1aa) in the transport and fusion of ribbon bodies, which is presumably a prerequisite for hair cell synaptic transmission. The data suggest that microtubules and their stability are necessary for normal numbers of mature ribbons and that Kif1aa is likely required for fusion events associated with ribbon maturation. Overall, the data provide a new and interesting story on ribbon synapse dynamics.

      Strengths:

      (1) The manuscript offers comprehensive Introduction and Discussion sections that will inform generalists and specialists.

      (2) The use of Airyscan imaging in living samples to view and measure microtubule and ribbon dynamics in vivo represents a strength. With rigorous quantification and thoughtful analyses, the authors generate datasets often only obtained in cultured cells or more diminutive animal models (e.g., C. elegans).

      (3) The number of biological replicates and the statistical analyses are strong. The combination of pharmacology and genetic manipulations also represents strong rigor.

      (4) One of the most important strengths is that the manuscript and data spur on other questions - namely, do (or how do) ribbon bodies attach to Kinesin proteins? Also, and as noted in the Discussion, do hair cell activity and subsequent intracellular calcium rises facilitate ribbon transport/fusion?

      Weaknesses:

      (1) Neither the data nor the Discussion address a direct or indirect link between Kinesins and ribbon bodies. Showing Kif1aa protein in proximity to the ribbon bodies would add strength.

      (2) Neither the data nor the Discussion address the functional consequences of loss of Kif1aa or ribbon transport. Presumably, both manipulations would reduce afferent excitation.

      (3) It is unknown whether the drug treatments or genetic manipulations are specific to hair cells, so we can't know for certain whether any phenotypic defects are secondary.

    4. Reviewer #3 (Public Review):

      Summary:

      The manuscript uses live imaging to study the role of microtubules in the movement of ribeye aggregates in neuromast hair cells in zebrafish. The main findings are that

      (1) Ribeye aggregates, assumed to be ribbon precursors, move in a directed motion toward the active zone;

      (2) Disruption of microtubules and kif1aa increases the number of ribeye aggregates and decreases the number of mature synapses.

      The evidence for point 2 is compelling, while the evidence for point 1 is less convincing. In particular, the directed motion conclusion is dependent upon fitting of mean squared displacement, which can be prone to error and to variance due to stochasticity that is not accounted for in the analysis. Only a small subset of the aggregates meet this criterion, and one wonders whether the focus on this subset misses the bigger picture of what is happening with the majority of spots.

      Strengths:

      (1) The effects of Kif1aa removal and nocodazole on ribbon precursor number and size are convincing and novel.

      (2) The live imaging of Ribeye aggregate dynamics provides interesting insight into ribbon formation. The movies showing the fusion of ribeye spots are convincing and the demonstrated effects of nocodazole and kif1aa removal on the frequency of these events are novel.

      (3) The effect of nocodazole and kif1aa removal on precursor fusion is novel and interesting.

      (4) The quality of the data is extremely high and the results are interesting.

      Weaknesses:

      (1) To image ribeye aggregates, the investigators overexpressed Ribeye-a TAGRFP under the control of a MyoVI promoter. While it is understandable why they chose to do the experiments this way, expression is not under the same transcriptional regulation as the native protein, and some caution is warranted in drawing some conclusions. For example, the reduction in the number of puncta with maturity may partially reflect the regulation of the MyoVI promoter with hair cell maturity. Similarly, it is unknown whether overexpression has the potential to saturate binding sites (for example motors), which could influence mobility.

      (2) The examples of punctae colocalizing with microtubules look clear (Figures 1 F-G), but the presentation is anecdotal. It would be better and more informative, if quantified.

      (3) It appears that any directed transport may be rare. Simply having an alpha >1 is not sufficient to declare movement to be directed (motor-driven transport typically has an alpha approaching 2). The randomness of a random walk and errors in fits to imperfect data will yield some spread in alpha even for movement driven by Brownian motion. Many of the tracks in Figure 3H look as though they might be reasonably fit by a straight line (i.e. alpha = 1).

      (4) The "directed motion" shown here does not really resemble motor-driven transport observed in other systems (axonal transport, for example) even in the subset that has been picked out as examples here. While the role of microtubules and kif1aa in synapse maturation is strong, it seems likely that this role may be something non-canonical (which would be interesting).

      (5) The effect of acute treatment with nocodazole on microtubules in movie 7 and Figure 6 is not obvious to me and it is clear that whatever effect it has on microtubules is incomplete.
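
      The concern in point (3), that fitted alpha values scatter around 1 even for purely diffusive motion, can be illustrated with a small simulation. The following is a minimal sketch, not the authors' analysis pipeline: the track length (30 steps), the maximum lag (10), the step size, and the drift value are all hypothetical choices made only for illustration. It simulates 2D random walks, computes a time-averaged MSD per track, and fits alpha as the slope of log(MSD) against log(lag).

```python
import math
import random


def simulate_track(n_steps, step_sd, drift=0.0, rng=random):
    """Return a 2D track: Gaussian steps plus an optional constant drift along x."""
    x, y = 0.0, 0.0
    track = [(x, y)]
    for _ in range(n_steps):
        x += drift + rng.gauss(0.0, step_sd)
        y += rng.gauss(0.0, step_sd)
        track.append((x, y))
    return track


def time_averaged_msd(track, max_lag):
    """Time-averaged mean squared displacement for lags 1..max_lag."""
    msd = []
    for lag in range(1, max_lag + 1):
        squared = [
            (track[i + lag][0] - track[i][0]) ** 2
            + (track[i + lag][1] - track[i][1]) ** 2
            for i in range(len(track) - lag)
        ]
        msd.append(sum(squared) / len(squared))
    return msd


def fit_alpha(msd):
    """Least-squares slope of log(MSD) vs log(lag), i.e. alpha in MSD ~ lag**alpha."""
    xs = [math.log(lag) for lag in range(1, len(msd) + 1)]
    ys = [math.log(m) for m in msd]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def alpha_distribution(n_tracks, n_steps, max_lag, drift=0.0, seed=0):
    """Fit alpha for many independent simulated tracks (hypothetical parameters)."""
    rng = random.Random(seed)
    return [
        fit_alpha(time_averaged_msd(simulate_track(n_steps, 1.0, drift, rng), max_lag))
        for _ in range(n_tracks)
    ]


if __name__ == "__main__":
    brownian = alpha_distribution(500, 30, 10)             # no drift: pure diffusion
    directed = alpha_distribution(500, 30, 10, drift=3.0)  # strong drift along x
    frac_above_1 = sum(a > 1.0 for a in brownian) / len(brownian)
    print("Brownian: mean alpha =", round(sum(brownian) / len(brownian), 2))
    print("Brownian: fraction of tracks with alpha > 1 =", round(frac_above_1, 2))
    print("Directed: mean alpha =", round(sum(directed) / len(directed), 2))
```

      In a sketch like this, a sizeable fraction of purely Brownian tracks typically ends up with a fitted alpha above 1, which is why a threshold of alpha > 1 alone may not separate directed from diffusive motion. Comparing the observed alpha distribution against such a simulated null distribution is one possible way to account for this.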

    5. Author response:

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      The manuscript by Hussain and collaborators aims at deciphering the microtubule-dependent ribbon formation in zebrafish hair cells. By using confocal imaging, pharmacology tools, and zebrafish mutants, the group of Katie Kindt convincingly demonstrated that the ribbon, the organelle that concentrates glutamate-filled vesicles at the hair cell synapse, originates from the fusion of precursors that move along the microtubule network. This study goes hand in hand with a complementary paper (Voorn et al.) showing similar results in mouse hair cells.

      Strengths: 

      This study clearly tracked the dynamics of the microtubules and of the microtubule-associated ribbons, and demonstrated ribbon fusion events. In addition, the authors have identified the critical role of kinesin Kif1aa in the fusion events. The results are compelling and the images and movies are magnificent.

      Weaknesses: 

      The lack of functional data regarding the role of Kif1aa. Although it is difficult to probe and interpret the behavior of zebrafish after nocodazole treatment, I wonder whether deletion of kif1aa in hair cells may result in a functional deficit that could be easily tested in zebrafish? 

      We have examined functional deficits in kif1aa mutants in another paper (David et al. 2024; in submission, preprint available):

      https://www.biorxiv.org/content/10.1101/2024.05.20.595037v1

      In addition to playing a role in ribbon fusions, Kif1aa is also responsible for enriching glutamate-filled secretory vesicles at the presynaptic active zone. In kif1aa mutants (and crispants), vesicles are no longer localized to the hair cell base, and there is a reduction in the number of vesicles associated with presynaptic ribbons. Kif1aa mutants also have functional defects including reductions in spontaneous vesicle release and evoked postsynaptic calcium responses. Behaviorally, kif1aa mutants exhibit impaired rheotaxis, indicating defects in the lateral-line system and an inability to accurately detect water flow.  Since our paper focuses on microtubule-associated ribbon movement and dynamics early in hair cell development, we have only discussed the effects of Kif1aa directly related to ribbon dynamics during this time window in this paper. In our revision, we will reference this recently submitted work.

      Impact: 

      Synaptogenesis in auditory sensory cells remains elusive. This study indicates that the formation of the synaptic organelle is a dynamic process involving the fusion of presynaptic elements. This study will undoubtedly boost a new line of research aimed at identifying the specific molecular determinants that target ribbon precursors to the synapse and govern the fusion process.

      Reviewer #2 (Public Review): 

      Summary:

      In this manuscript, the authors set out to resolve a long-standing mystery in the field of sensory biology - how large, presynaptic bodies called "ribbon synapses" migrate to the basolateral end of hair cells. The ribbon synapse is found in sensory hair cells and photoreceptors, and is a critical structural feature of a readily-releasable pool of glutamate that excites postsynaptic afferent neurons. For decades, we have known these structures exist, but the mechanisms that control how ribbon synapses coalesce at the bottom of hair cells are not well understood. The authors addressed this question by leveraging the highly-tractable zebrafish lateral line neuromast, which exhibits a small number of visible hair cells, easily observed in time-lapse imaging. The approach combined genetics, pharmacological manipulations, high-resolution imaging, and careful quantifications. The manuscript commences with a developmental time course of ribbon synapse development, characterizing both immature and mature ribbon bodies (defined by position in the hair cell, apical vs. basal). Next, the authors show convincing (and frankly mesmerizing) imaging data of plus end-directed microtubule trafficking toward the basal end of the hair cells, and data highlighting the directed motion of ribbon bodies. The authors then use a series of pharmacological and genetic manipulations showing the role of microtubule stability and one particular kinesin (Kif1aa) in the transport and fusion of ribbon bodies, which is presumably a prerequisite for hair cell synaptic transmission. The data suggest that microtubules and their stability are necessary for normal numbers of mature ribbons and that Kif1aa is likely required for fusion events associated with ribbon maturation. Overall, the data provide a new and interesting story on ribbon synapse dynamics. 

      Strengths: 

      (1) The manuscript offers comprehensive Introduction and Discussion sections that will inform generalists and specialists.

      (2) The use of Airyscan imaging in living samples to view and measure microtubule and ribbon dynamics in vivo represents a strength. With rigorous quantification and thoughtful analyses, the authors generate datasets often only obtained in cultured cells or more diminutive animal models (e.g., C. elegans). 

      (3) The number of biological replicates and the statistical analyses are strong. The combination of pharmacology and genetic manipulations also represents strong rigor. 

      (4) One of the most important strengths is that the manuscript and data spur on other questions - namely, do (or how do) ribbon bodies attach to Kinesin proteins? Also, and as noted in the Discussion, do hair cell activity and subsequent intracellular calcium rises facilitate ribbon transport/fusion? 

      These are important strengths and we do plan to investigate adaptors and how hair cell activity impacts ribbon fusion and transport in the future!

      Weaknesses: 

      (1) Neither the data nor the Discussion address a direct or indirect link between Kinesins and ribbon bodies. Showing Kif1aa protein in proximity to the ribbon bodies would add strength.

      This is a great point, and we are working to create a transgenic line with fluorescently labelled Kif1aa to directly visualize its association with ribbons. At present, we have not obtained a transgenic line, and localization of Kif1aa and ribbons in live hair cells is beyond the scope of this paper. In our revision we will discuss this caveat.

      (2) Neither the data nor the Discussion address the functional consequences of loss of Kif1aa or ribbon transport. Presumably, both manipulations would reduce afferent excitation.

      Excellent point. Please see the response above to Reviewer #1 weaknesses.  

      (3) It is unknown whether the drug treatments or genetic manipulations are specific to hair cells, so we can't know for certain whether any phenotypic defects are secondary. 

      This is correct and is a caveat of our Kif1aa and drug experiments. However, to mitigate this in the pharmacological experiments, we have done the drug treatments at 3 different timescales: long-term (overnight), short-term (4 hr) and fast (30 min) treatments. The fast experiment, done after a 30 min drug treatment, is where we observe reduced directional motion and fusions. This latter experiment should not be affected by any long-term changes or developmental defects that could be caused by the drugs, as hair cell development occurs over 8-12 hrs. However, we acknowledge that these treatments and genetic experiments could have secondary phenotypic defects that are not hair-cell specific. In our revision, we will discuss these issues.

      Reviewer #3 (Public Review): 

      Summary: 

      The manuscript uses live imaging to study the role of microtubules in the movement of ribeye aggregates in neuromast hair cells in zebrafish. The main findings are that 

      (1) Ribeye aggregates, assumed to be ribbon precursors, move in a directed motion toward the active zone; 

      (2) Disruption of microtubules and kif1aa increases the number of ribeye aggregates and decreases the number of mature synapses. 

      The evidence for point 2 is compelling, while the evidence for point 1 is less convincing. In particular, the directed motion conclusion is dependent upon fitting of mean squared displacement, which can be prone to error and to variance due to stochasticity that is not accounted for in the analysis. Only a small subset of the aggregates meet this criterion, and one wonders whether the focus on this subset misses the bigger picture of what is happening with the majority of spots.

      Strengths: 

      (1) The effects of Kif1aa removal and nocodazole on ribbon precursor number and size are convincing and novel.

      (2) The live imaging of Ribeye aggregate dynamics provides interesting insight into ribbon formation. The movies showing the fusion of ribeye spots are convincing and the demonstrated effects of nocodazole and kif1aa removal on the frequency of these events are novel.

      (3) The effect of nocodazole and kif1aa removal on precursor fusion is novel and interesting.

      (4) The quality of the data is extremely high and the results are interesting. 

      Weaknesses: 

      (1) To image ribeye aggregates, the investigators overexpressed Ribeye-a TAGRFP under the control of a MyoVI promoter. While it is understandable why they chose to do the experiments this way, expression is not under the same transcriptional regulation as the native protein, and some caution is warranted in drawing some conclusions. For example, the reduction in the number of puncta with maturity may partially reflect the regulation of the MyoVI promoter with hair cell maturity. Similarly, it is unknown whether overexpression has the potential to saturate binding sites (for example motors), which could influence mobility. 

      We agree that overexpression in transgenic lines is a common issue and would have loved to do these experiments with endogenously expressed fluorescent proteins under a native promoter. However, this was not technically possible for us. In 2014, we characterized several transgenic Ribeye lines (myo6b:ribb-mcherry, myo6b:riba-tagRFP and myo6b:riba-GFP) to ensure they have normal ribbon numbers and size. Unfortunately, we no longer have the raw data from this analysis. In our revision, we will repeat our immunolabeling on myo6b:riba-tagRFP transgenic fish, examine ribbon numbers and size, and show what impact (or not) exogenous Ribeye expression has on ribbon formation.

      (2) The examples of punctae colocalizing with microtubules look clear (Figures 1 F-G), but the presentation is anecdotal. It would be better and more informative, if quantified. 

      We attempted a co-localization study between microtubules and ribbons but decided not to move forward with it due to several issues:

      (1)  Hair cells have an extremely crowded environment, especially since the nucleus occupies the majority of the cell. All proteins are pushed together in the small space surrounding the nucleus and hence co-localization is not meaningful because the distances are so small.

      (2) We also attempted to segment microtubules in these images and quantify how many ribbons were associated with microtubules, but 3D microtubule segmentation was not accurate in these hair cells due to highly varying filament intensities, and diffuse cytoplasmic tubulin signal.

      Therefore, we decided that a better measure of ribbon-microtubule association would be a demonstration that individual ribbons keep their association with microtubules over time (in our time lapses), rather than a co-localization study. We see that ribbons localize to microtubules in all our timelapses, including the examples shown. We observed that if a ribbon dissociates, it is just to switch from one filament to another. We have not observed free-floating ribbons in our study.

      (3) It appears that any directed transport may be rare. Simply having an alpha >1 is not sufficient to declare movement to be directed (motor-driven transport typically has an alpha approaching 2). The randomness of a random walk and errors in fits to imperfect data will yield some spread in alpha even for movement driven by Brownian motion. Many of the tracks in Figure 3H look as though they might be reasonably fit by a straight line (i.e. alpha = 1).

      As we have stated in the paper, we only see a small subset of the ribbon precursors moving directionally. The majority of the ribbons are stationary. We cannot say for sure what is happening with the stationary ribbons, but our hypothesis is that these ribbons eventually exhibit directed motion. This idea is supported by the fact that we have seen ribbons that are stationary begin movement, and ribbons that are moving come to a stop during the acquisition of our timelapses. The ribbons that are stationary may not have enough motors attached, or they may be in a sort of ‘seeding’ phase where the ribeye protein could be condensing on the ribbon. We have discussed the possibility of ribbons being biomolecular condensates in our Discussion.

      In our revision we will discuss why ribbon transport does not resemble typical motor-driven transport (also see response to point 4 below). We will also reexamine our MSD data in more detail as suggested by Reviewer 3 and provide distributions of alpha values in our revision.

      (4) The "directed motion" shown here does not really resemble motor-driven transport observed in other systems (axonal transport, for example) even in the subset that has been picked out as examples here. While the role of microtubules and kif1aa in synapse maturation is strong, it seems likely that this role may be something non-canonical (which would be interesting). 

      One major difference between axonal and ribbon transport is that microtubules are very stable and linear in axonal transport. Therefore, the directed motion observed is ‘canonical’. In hair cells, the microtubules are extremely dynamic, especially towards the hair cell base. Within a single time frame (60-100 s), we see the network changing (moving and branching). This dynamic network adds another layer of complexity onto the motion of the ribbon, as the filament track itself is changing. Therefore, we see a lot of stalling, filament switching, and reversals of ribbon movement in our movies. However, we have demonstrated, both in our movies and using MSD analysis, that a subset of ribbons exhibits directional motion. In our revision we will discuss why directed motion in hair cells does not resemble canonical motor-driven transport in axons.

      (5) The effect of acute treatment with nocodazole on microtubules in movie 7 and Figure 6 is not obvious to me and it is clear that whatever effect it has on microtubules is incomplete.

      When using Nocodazole, it is important to optimize the concentration of the drug such that there is minimal cytotoxicity, while still being effective. Microtubules in the apical region of hair cells are very stable and do not respond well to Nocodazole treatment at concentrations that are tolerable to hair cells. While a few stable filaments remain largely at the cell apex, there are almost no filaments at the hair cell base, which is different from the wild-type hair cells. In addition, Nocodazole-treated hair cells have more cytoplasmic YFP-tubulin signal compared to wild type. We will add additional images and quantification in our revision to illustrate these points.

    1. Reviewer #1 (Public Review):

      Summary:

      The study by Pudlowski et al. investigates how the intricate structure of centrioles is formed by studying the role of a complex formed by delta- and epsilon-tubulin and the TEDC1 and TEDC2 proteins. For this, they employ knockout cell lines, EM, and ultrastructure expansion microscopy as well as pull-downs. Previous work has indicated a role of delta- and epsilon-tubulin in triplet microtubule formation. Without triplet microtubules centriolar cylinders can still form, but are unstable, resulting in futile rounds of de novo centriole assembly during S phase and disassembly during mitosis. Here the authors show that all four proteins function as a complex and knockout of any of the four proteins results in the same phenotype. They further find that mutant centrioles lack inner scaffold proteins and contain an extended proximal end including markers such as SAS6 and CEP135, suggesting that triplet microtubule formation is linked to limiting proximal end extension and formation of the central region that contains the inner scaffold. Finally, they show that mutant centrioles seem to undergo elongation during early mitosis before disassembly, although it is not clear if this may also be due to prolonged mitotic duration in mutants.

      Strengths:

      Overall this is a well-performed study, well presented, with conclusions mostly supported by the data. The use of knockout cell lines and rescue experiments is convincing.

      Weaknesses:

      In some cases, additional controls and quantification would be needed, in particular regarding cell cycle and centriole elongation stages, to make the data and conclusions more robust.

    2. eLife assessment

      The study by Pudlowski et al. shows that a protein complex composed of delta- and epsilon-tubulin together with TEDC1 and TEDC2, which was previously identified, functions in generating centriolar triplet microtubules, and that this is crucial for the proper formation of centriolar subdomains and the stability of centrioles throughout the cell cycle. The findings are valuable for a better understanding of centriole biogenesis and structure and are largely supported by solid evidence based on knockout cell lines, immunoprecipitation, and ultrastructure expansion microscopy. The work is of interest to cell biologists, in particular researchers with interest in centrosome biology.

    1. Reviewer #2 (Public Review):

      Summary:

      This study looks at sex differences in alcohol drinking behaviour in a well-validated model of binge drinking. They provide a comprehensive analysis of drinking behaviour within and between sessions for males and females, as well as looking at the calcium dynamics in neurons projecting from the anterior insula cortex to the dorsolateral striatum.

      Strengths:

      Examining specific sex differences in drinking behaviour is important. This research question is currently a major focus for preclinical researchers looking at substance use. Although we have made a lot of progress over the last few years, there is still a lot that is not understood about sex-differences in alcohol consumption and the clinical implications of this.

      Identifying the lateralisation of activity is novel, and has fundamental importance for researchers investigating functional anatomy underlying alcohol-driven behaviour (and other reward-driven behaviours).

      Weaknesses:

      Very small and unequal sample sizes, especially females (9 males, 5 females). This is probably ok for the calcium imaging, especially with the G-power figures provided, however, I would be cautious with the outcomes of the drinking behaviour, which can be quite variable.

      For female drinking behaviour, rather than this being labelled "more efficient", could it simply be that female mice (being substantially smaller than male mice) don't need to consume as much liquid to reach the same g/kg? In which case, the interpretation might not be so much that females are more efficient, as that mice are very good at titrating their intake to achieve the desired dose of alcohol.

      I may be mistaken, but is ANCOVA, with sex as the covariate, the appropriate way to test for sex differences? My understanding was that with an ANCOVA, the covariate is a continuous variable that you are controlling for, not looking for differences in. In that regard, given that sex is not continuous, can it be used as a covariate? I note that in the results, sex is defined as the "grouping variable" rather than the covariate. The analysis strategy should be clarified.
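
      The body-mass point raised above can be made concrete with simple dose arithmetic. The sketch below is illustrative only: the body masses (20 g and 28 g), the 20% v/v ethanol solution, and the 3 g/kg target dose are hypothetical values and are not taken from the manuscript. It shows that, for a fixed g/kg target, the required volume scales linearly with body mass, so a smaller animal reaches the same dose with proportionally less liquid.

```python
ETHANOL_DENSITY_G_PER_ML = 0.789  # approximate density of pure ethanol near 20 °C


def dose_g_per_kg(volume_ml, ethanol_fraction_vv, body_mass_g):
    """Ethanol dose in g/kg for a consumed volume of a v/v ethanol solution."""
    ethanol_g = volume_ml * ethanol_fraction_vv * ETHANOL_DENSITY_G_PER_ML
    return ethanol_g / (body_mass_g / 1000.0)


def volume_for_dose_ml(target_g_per_kg, ethanol_fraction_vv, body_mass_g):
    """Volume of solution (mL) an animal must drink to reach a target g/kg dose."""
    ethanol_g = target_g_per_kg * body_mass_g / 1000.0
    return ethanol_g / (ethanol_fraction_vv * ETHANOL_DENSITY_G_PER_ML)


if __name__ == "__main__":
    # Hypothetical example values, chosen only to illustrate the scaling.
    target = 3.0        # g/kg
    fraction = 0.20     # 20% v/v ethanol
    for label, mass_g in (("smaller mouse", 20.0), ("larger mouse", 28.0)):
        volume = volume_for_dose_ml(target, fraction, mass_g)
        print(f"{label} ({mass_g:.0f} g): {volume:.3f} mL for {target} g/kg")
```

      Under these assumptions the ratio of required volumes equals the ratio of body masses, which is the alternative explanation the reviewer suggests should be ruled out before intake is described as "more efficient".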

    1. eLife assessment

      Using electrophysiological recordings in freely moving rats, this valuable study investigates the role of different gamma frequency bands in the development of spatial representations in the hippocampus. However, the evidence is incomplete as the methods and data analysis need significant improvement. Critically, alternative interpretations and analyses must be provided, especially regarding the nature of gamma oscillations in the hippocampus and their interaction with neuronal firing dynamics and theta sequence features. This study will be of interest to neuroscientists working in the field of spatial navigation and neuronal dynamics.